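This release open-sources the stella-mrl-large-zh-v3.5-1792d model, which was trained from stella-large-zh-v3-1792d using the MRL (Matryoshka Representation Learning) method. Its main feature is a variable embedding dimension.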
from sentence_transformers import SentenceTransformer
from sklearn.preprocessing import normalize
model = SentenceTransformer("infgrad/stella-mrl-large-zh-v3.5-1792d")
# Do not normalize yet! Take the first n dimensions, then normalize.
vectors = model.encode(["text1", "text2"], normalize_embeddings=False)
print(vectors.shape) # shape is [2,1792]
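# A larger n_dims gives better results but costs more time and memory.
# Multiples of 128 are recommended, because that is how the model was trained.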
n_dims = 768
cut_vecs = normalize(vectors[:, :n_dims])
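The truncate-then-normalize step above can also be wrapped in a small helper. The following is a minimal sketch, not part of the original model card: the function name `truncate_and_normalize` is hypothetical, and random vectors stand in for the output of `model.encode(..., normalize_embeddings=False)` so that the snippet runs without downloading the model.

```python
import numpy as np


def truncate_and_normalize(vectors: np.ndarray, n_dims: int) -> np.ndarray:
    """Keep the first n_dims columns, then L2-normalize each row.

    Hypothetical helper for illustration; it mirrors
    sklearn's normalize(vectors[:, :n_dims]) from the snippet above.
    """
    if not 0 < n_dims <= vectors.shape[1]:
        raise ValueError(f"n_dims must be in 1..{vectors.shape[1]}, got {n_dims}")
    cut = vectors[:, :n_dims]
    norms = np.linalg.norm(cut, axis=1, keepdims=True)
    # Guard against division by zero for all-zero rows.
    safe_norms = np.maximum(norms, 1e-12)
    return (cut / safe_norms).astype(np.float32)


# Assumption: random data standing in for unnormalized model output of shape [2, 1792].
rng = np.random.default_rng(0)
fake_vectors = rng.standard_normal((2, 1792)).astype(np.float32)

cut_vecs = truncate_and_normalize(fake_vectors, 768)

# Rows are unit length, so a dot product equals cosine similarity.
similarity = cut_vecs @ cut_vecs.T

# Storage per vector in bytes for float32 at the chosen dimension.
bytes_per_vector = cut_vecs.shape[1] * cut_vecs.itemsize
print(cut_vecs.shape, bytes_per_vector)
```

Because normalization happens after truncation, each shortened vector has unit length again, so dot products between truncated vectors can be used directly as cosine similarities. Storage grows linearly with `n_dims`, which is the space cost mentioned in the comment above.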
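In the table below, stella-mrl-large-zh-v3.5-1792d_1024 means that the first 1024 dimensions are used. The overall trend is that more dimensions give better scores, although the gains are small beyond about 768 dimensions (68.47 at 768 versus 68.56 at 1792).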
| Model | Retrieval | STS | PairClassification | Classification | Reranking | Clustering | CMTEB-Score |
|---|---|---|---|---|---|---|---|
| stella-mrl-large-zh-v3.5-1792d_128 | 70.01 | 62.17 | 87.99 | 70.67 | 66.77 | 53.55 | 67.16 |
| stella-mrl-large-zh-v3.5-1792d_256 | 72.19 | 62.41 | 88.09 | 71.22 | 68.32 | 53.38 | 68.02 |
| stella-mrl-large-zh-v3.5-1792d_384 | 72.77 | 62.43 | 88.26 | 71.34 | 68.31 | 53.87 | 68.25 |
| stella-mrl-large-zh-v3.5-1792d_512 | 73.11 | 62.45 | 88.16 | 71.46 | 68.32 | 53.28 | 68.29 |
| stella-mrl-large-zh-v3.5-1792d_640 | 73.27 | 62.49 | 88.21 | 71.46 | 68.69 | 53.63 | 68.42 |
| stella-mrl-large-zh-v3.5-1792d_768 | 73.38 | 62.5 | 88.19 | 71.49 | 68.64 | 53.77 | 68.47 |
| stella-mrl-large-zh-v3.5-1792d_896 | 73.37 | 62.5 | 88.14 | 71.51 | 68.44 | 54.13 | 68.49 |
| stella-mrl-large-zh-v3.5-1792d_1024 | 73.43 | 62.51 | 88.16 | 71.52 | 68.59 | 53.43 | 68.44 |
| stella-mrl-large-zh-v3.5-1792d_1152 | 73.46 | 62.49 | 88.16 | 71.57 | 68.55 | 53.67 | 68.49 |
| stella-mrl-large-zh-v3.5-1792d_1280 | 73.48 | 62.51 | 88.12 | 71.55 | 68.44 | 53.74 | 68.48 |
| stella-mrl-large-zh-v3.5-1792d_1408 | 73.48 | 62.51 | 88.14 | 71.58 | 68.46 | 53.69 | 68.48 |
| stella-mrl-large-zh-v3.5-1792d_1536 | 73.49 | 62.5 | 88.11 | 71.55 | 68.5 | 54.06 | 68.52 |
| stella-mrl-large-zh-v3.5-1792d_1664 | 73.56 | 62.49 | 88.06 | 71.56 | 68.47 | 54.28 | 68.56 |
| stella-mrl-large-zh-v3.5-1792d_1792 | 73.51 | 62.48 | 88.09 | 71.56 | 68.45 | 54.39 | 68.56 |
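In the table above, stella-mrl-large-zh-v3.5-1792d_1792 scores 68.56, whereas the leaderboard reports 68.55. The discrepancy is related to the weight type; a difference this small can be ignored.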