dunzhang/stella-mrl-large-zh-v3.5-1792d

1 Open-Source Model

This release open-sources the stella-mrl-large-zh-v3.5-1792d model. It was trained on top of stella-large-zh-v3-1792d using the MRL (Matryoshka Representation Learning) method. Its main feature is a variable embedding dimension: you can keep only the first n dimensions of each vector and still get a usable embedding.

2 Usage

from sentence_transformers import SentenceTransformer
from sklearn.preprocessing import normalize

model = SentenceTransformer("infgrad/stella-mrl-large-zh-v3.5-1792d")
# Note: do NOT normalize at encoding time; take the first n dimensions first, then normalize.
vectors = model.encode(["text1", "text2"], normalize_embeddings=False)
print(vectors.shape)  # shape is [2,1792]
# Larger n_dims gives better quality at higher time and memory cost.
# Choose a multiple of 128, since that is how the model was trained.
n_dims = 768
cut_vecs = normalize(vectors[:, :n_dims])
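
After cutting and normalizing, the truncated vectors can be used like any other sentence embeddings. A minimal follow-up sketch, reusing cut_vecs from the snippet above: because normalize() gives each row unit L2 norm, cosine similarity reduces to a dot product.

# Cosine similarity between "text1" and "text2" on the truncated vectors.
# Each row of cut_vecs has unit L2 norm, so the dot product is the cosine.
similarity = cut_vecs @ cut_vecs.T
print(similarity[0, 1])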

3 CMTEB Scores at Different Embedding Dimensions

stella-mrl-large-zh-v3.5-1792d_1024 means taking the first 1024 dimensions. The overall trend is that scores improve as the dimension grows.

| Model | Retrieval | STS | PairClassification | Classification | Reranking | Clustering | CMTEB-Score |
|---|---|---|---|---|---|---|---|
| stella-mrl-large-zh-v3.5-1792d_128 | 70.01 | 62.17 | 87.99 | 70.67 | 66.77 | 53.55 | 67.16 |
| stella-mrl-large-zh-v3.5-1792d_256 | 72.19 | 62.41 | 88.09 | 71.22 | 68.32 | 53.38 | 68.02 |
| stella-mrl-large-zh-v3.5-1792d_384 | 72.77 | 62.43 | 88.26 | 71.34 | 68.31 | 53.87 | 68.25 |
| stella-mrl-large-zh-v3.5-1792d_512 | 73.11 | 62.45 | 88.16 | 71.46 | 68.32 | 53.28 | 68.29 |
| stella-mrl-large-zh-v3.5-1792d_640 | 73.27 | 62.49 | 88.21 | 71.46 | 68.69 | 53.63 | 68.42 |
| stella-mrl-large-zh-v3.5-1792d_768 | 73.38 | 62.5 | 88.19 | 71.49 | 68.64 | 53.77 | 68.47 |
| stella-mrl-large-zh-v3.5-1792d_896 | 73.37 | 62.5 | 88.14 | 71.51 | 68.44 | 54.13 | 68.49 |
| stella-mrl-large-zh-v3.5-1792d_1024 | 73.43 | 62.51 | 88.16 | 71.52 | 68.59 | 53.43 | 68.44 |
| stella-mrl-large-zh-v3.5-1792d_1152 | 73.46 | 62.49 | 88.16 | 71.57 | 68.55 | 53.67 | 68.49 |
| stella-mrl-large-zh-v3.5-1792d_1280 | 73.48 | 62.51 | 88.12 | 71.55 | 68.44 | 53.74 | 68.48 |
| stella-mrl-large-zh-v3.5-1792d_1408 | 73.48 | 62.51 | 88.14 | 71.58 | 68.46 | 53.69 | 68.48 |
| stella-mrl-large-zh-v3.5-1792d_1536 | 73.49 | 62.5 | 88.11 | 71.55 | 68.5 | 54.06 | 68.52 |
| stella-mrl-large-zh-v3.5-1792d_1664 | 73.56 | 62.49 | 88.06 | 71.56 | 68.47 | 54.28 | 68.56 |
| stella-mrl-large-zh-v3.5-1792d_1792 | 73.51 | 62.48 | 88.09 | 71.56 | 68.45 | 54.39 | 68.56 |

In the table above, the score of 68.56 for stella-mrl-large-zh-v3.5-1792d_1792 differs from the 68.55 reported on the leaderboard. The difference comes from the weight data type; please ignore this small discrepancy.
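
To see how much similarity quality changes with the truncation size on your own data, one option is to encode once at the full 1792 dimensions and slice the same vectors to several candidate sizes. A minimal sketch, using placeholder sentences (not from the model card):

from sentence_transformers import SentenceTransformer
from sklearn.preprocessing import normalize

model = SentenceTransformer("infgrad/stella-mrl-large-zh-v3.5-1792d")

# Placeholder sentences; replace with your own data.
sentences = ["今天天气不错", "今天天气很好", "股票市场大幅下跌"]

# Encode once at full dimensionality; the first n dimensions of the same
# vectors are reused for every candidate size (multiples of 128).
full_vecs = model.encode(sentences, normalize_embeddings=False)

for n_dims in (256, 512, 768, 1024, 1792):
    vecs = normalize(full_vecs[:, :n_dims])
    sim = vecs @ vecs.T  # cosine similarities, since rows are unit-normalized
    print(n_dims, round(float(sim[0, 1]), 4), round(float(sim[0, 2]), 4))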
