facebook/mms-300m

HuggingFace

Massively Multilingual Speech (MMS) - 300m

Facebook's MMS checkpoint with 300 million parameters.

MMS (Massively Multilingual Speech) is Facebook AI's massively multilingual pretrained model for speech. It is pretrained with Wav2Vec2's self-supervised training objective on about 500,000 hours of speech data in over 1,400 languages.

When using the model, make sure that your speech input is sampled at 16 kHz.
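
To illustrate the 16 kHz requirement, here is a minimal sketch of bringing a mono waveform from another sample rate to 16 kHz. The helper name `resample_linear` is hypothetical and not part of the model card or of any library. It uses plain linear interpolation to stay dependency-free, which is a simplification: it applies no anti-aliasing filter, so for real audio a library resampler with low-pass filtering is likely the better choice.

```python
def resample_linear(samples, orig_rate, target_rate=16_000):
    """Resample a mono signal by linear interpolation (no anti-alias filter).

    `samples` is a sequence of floats; the rates are in Hz.
    This is an illustrative helper, not an API of the model or of a library.
    """
    if orig_rate <= 0 or target_rate <= 0:
        raise ValueError("sample rates must be positive")
    if orig_rate == target_rate or len(samples) < 2:
        return list(samples)

    # Keep the duration constant: n_out / target_rate == n_in / orig_rate.
    n_out = max(1, round(len(samples) * target_rate / orig_rate))
    step = orig_rate / target_rate  # input samples advanced per output sample
    last = len(samples) - 1

    out = []
    for i in range(n_out):
        pos = min(i * step, last)   # fractional position in the input
        lo = int(pos)               # input sample at or before pos
        hi = min(lo + 1, last)      # next input sample, clamped at the end
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out


# One second of silence recorded at 44.1 kHz becomes 16,000 samples.
one_second = [0.0] * 44_100
resampled = resample_linear(one_second, 44_100)
print(len(resampled))
```

The resampled list can then be passed to whichever feature extractor or preprocessing step the fine-tuning setup uses, with the sampling rate declared as 16,000 Hz.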

Note: This model should be fine-tuned on a downstream task, such as Automatic Speech Recognition, Translation, or Classification. See the How to finetune section or this blog for more information about ASR.

How to finetune

Coming soon...

Model details

Developed by: Vineel Pratap et al.
Model type: Multi-Lingual Automatic Speech Recognition model
Language(s): 1000+ languages
License: CC-BY-NC 4.0 license
Num parameters: 300 million
Cite as:

@article{pratap2023mms,
        title={Scaling Speech Technology to 1,000+ Languages},
        author={Vineel Pratap and Andros Tjandra and Bowen Shi and Paden Tomasello and Arun Babu and Sayani Kundu and Ali Elkahky and Zhaoheng Ni and Apoorv Vyas and Maryam Fazel-Zarandi and Alexei Baevski and Yossi Adi and Xiaohui Zhang and Wei-Ning Hsu and Alexis Conneau and Michael Auli},
        journal={arXiv},
        year={2023}
}
