NLLB-200’s 3.3B variant is a machine translation model intended primarily for research, with a particular focus on low-resource languages. It supports single-sentence translation among 200 languages, including Faroese. Its intended users are researchers and the machine translation community. The model was evaluated with the automatic metrics BLEU, spBLEU, and chrF++, as well as with human evaluation. The training data consists of parallel multilingual data and monolingual data from Common Crawl, preprocessed with SentencePiece.
Other variants of NLLB-200: 1.3B, 1.3B distilled, and 600M distilled
Release: 2022
Contact: https://github.com/facebookresearch/fairseq/issues
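
To give a rough sense of what a character n-gram metric such as chrF measures, here is a minimal Python sketch. It is a simplified illustration only: it omits the word n-grams that chrF++ adds, it is not the evaluation code used for NLLB-200, and the function names `char_ngrams` and `simple_chrf` are hypothetical. The defaults (n-grams up to length 6, beta of 2, whitespace ignored) are assumptions that follow common chrF settings.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count character n-grams of length n, ignoring spaces (an assumption)."""
    chars = text.replace(" ", "")
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def simple_chrf(hypothesis, reference, max_n=6, beta=2.0):
    """Simplified character n-gram F-score in the range [0, 1].

    Precision and recall are averaged over the n-gram orders for which
    both strings contain at least one n-gram, then combined into an
    F-beta score (beta > 1 weights recall more heavily than precision).
    """
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # skip orders longer than either string
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    return (1 + beta ** 2) * p * r / (beta ** 2 * p + r)


if __name__ == "__main__":
    # Hypothetical system output compared against a reference translation.
    print(simple_chrf("the cat sat on a mat", "the cat sat on the mat"))
```

For reported results, an established implementation such as the one in sacreBLEU is the safer choice, since tokenization and averaging details affect the score.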




