dunzhang/stella-mrl-large-zh-v3.5-1792d

Name: dunzhang/stella-mrl-large-zh-v3.5-1792d
Rating: 5 (50 reviews)
Author: dunzhang

Tags: sentence-similarity, sentence-transformers, pytorch, safetensors, bert, feature-extraction, mit

1 Open-source model

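This release open-sources the stella-mrl-large-zh-v3.5-1792d model, which was trained on top of stella-large-zh-v3-1792d using the MRL (Matryoshka Representation Learning) method. Its main feature is a variable embedding dimension.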

2 Usage

from sentence_transformers import SentenceTransformer
from sklearn.preprocessing import normalize

model = SentenceTransformer("infgrad/stella-mrl-large-zh-v3.5-1792d")
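# Do not normalize yet: take the first n dimensions, then normalize.
vectors = model.encode(["text1", "text2"], normalize_embeddings=False)
print(vectors.shape)  # (2, 1792)
# Larger n_dims gives better quality but costs more time and memory.
# Choose a multiple of 128, because the model was trained that way.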
n_dims = 768
cut_vecs = normalize(vectors[:, :n_dims])
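The truncate-then-normalize step can also be written with NumPy alone, which makes the order of operations explicit. This is a minimal sketch, not part of the original model card: `truncate_and_normalize` is a hypothetical helper name, and the input is a random stand-in for the output of `model.encode(..., normalize_embeddings=False)`, so the similarity value carries no meaning.

```python
import numpy as np


def truncate_and_normalize(vectors: np.ndarray, n_dims: int) -> np.ndarray:
    """Keep the first n_dims columns, then L2-normalize each row."""
    if not 0 < n_dims <= vectors.shape[1]:
        raise ValueError("n_dims must be between 1 and the full embedding width")
    cut = vectors[:, :n_dims].astype(np.float64)
    norms = np.linalg.norm(cut, axis=1, keepdims=True)
    # Guard against all-zero rows to avoid division by zero.
    return cut / np.clip(norms, 1e-12, None)


# Assumption: random data standing in for real, unnormalized embeddings.
rng = np.random.default_rng(0)
fake_vectors = rng.normal(size=(2, 1792))

cut_vecs = truncate_and_normalize(fake_vectors, 768)
# For unit-length rows, the dot product equals the cosine similarity.
similarity = float(cut_vecs[0] @ cut_vecs[1])
print(cut_vecs.shape, similarity)
```

Normalizing after truncation matters because the first n dimensions of a unit-length 1792-d vector are generally not unit length themselves, so dot products on them would no longer be cosine similarities.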

3 CMTEB scores at different embedding dimensions

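stella-mrl-large-zh-v3.5-1792d_1024 means the first 1024 dimensions are used. The overall trend is that larger dimensions score better, although the gains flatten out beyond a few hundred dimensions.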

Model	Retrieval	STS	PairClassification	Classification	Reranking	Clustering	CMTEB-Score
stella-mrl-large-zh-v3.5-1792d_128	70.01	62.17	87.99	70.67	66.77	53.55	67.16
stella-mrl-large-zh-v3.5-1792d_256	72.19	62.41	88.09	71.22	68.32	53.38	68.02
stella-mrl-large-zh-v3.5-1792d_384	72.77	62.43	88.26	71.34	68.31	53.87	68.25
stella-mrl-large-zh-v3.5-1792d_512	73.11	62.45	88.16	71.46	68.32	53.28	68.29
stella-mrl-large-zh-v3.5-1792d_640	73.27	62.49	88.21	71.46	68.69	53.63	68.42
stella-mrl-large-zh-v3.5-1792d_768	73.38	62.5	88.19	71.49	68.64	53.77	68.47
stella-mrl-large-zh-v3.5-1792d_896	73.37	62.5	88.14	71.51	68.44	54.13	68.49
stella-mrl-large-zh-v3.5-1792d_1024	73.43	62.51	88.16	71.52	68.59	53.43	68.44
stella-mrl-large-zh-v3.5-1792d_1152	73.46	62.49	88.16	71.57	68.55	53.67	68.49
stella-mrl-large-zh-v3.5-1792d_1280	73.48	62.51	88.12	71.55	68.44	53.74	68.48
stella-mrl-large-zh-v3.5-1792d_1408	73.48	62.51	88.14	71.58	68.46	53.69	68.48
stella-mrl-large-zh-v3.5-1792d_1536	73.49	62.5	88.11	71.55	68.5	54.06	68.52
stella-mrl-large-zh-v3.5-1792d_1664	73.56	62.49	88.06	71.56	68.47	54.28	68.56
stella-mrl-large-zh-v3.5-1792d_1792	73.51	62.48	88.09	71.56	68.45	54.39	68.56

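In the table above, stella-mrl-large-zh-v3.5-1792d_1792 scores 68.56, which differs from the leaderboard score of 68.55. The discrepancy is related to the weight data type; a difference this small can be ignored.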
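
To help pick n_dims, the CMTEB-Score column can be read as a trade-off against storage. The sketch below is an illustration rather than part of the original evaluation: the scores are copied from the table, while the 4-bytes-per-value figure assumes vectors are stored as float32, and `summarize` is a hypothetical helper name.

```python
# (dimension, CMTEB-Score) pairs copied from the table above.
SCORES = {
    128: 67.16, 256: 68.02, 384: 68.25, 512: 68.29, 640: 68.42,
    768: 68.47, 896: 68.49, 1024: 68.44, 1152: 68.49, 1280: 68.48,
    1408: 68.48, 1536: 68.52, 1664: 68.56, 1792: 68.56,
}
BYTES_PER_VALUE = 4  # assumption: float32 storage


def summarize(scores: dict[int, float]) -> list[tuple[int, int, float]]:
    """Return (dims, bytes per vector, fraction of the full-dimension score)."""
    full_score = scores[max(scores)]
    return [
        (dims, dims * BYTES_PER_VALUE, score / full_score)
        for dims, score in sorted(scores.items())
    ]


rows = summarize(SCORES)
for dims, n_bytes, retained in rows:
    print(f"{dims:>5} dims  {n_bytes:>5} bytes/vector  {retained:.2%} of full score")
```

By this measure, the first 128 dimensions keep roughly 98% of the full CMTEB score at one fourteenth of the storage, which is why a smaller n_dims can be a reasonable choice when index size or search latency matters.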
