The NLLB-200 model supports many languages. However, the initial integration in Content Translation supports a smaller set of 23 languages. To improve machine translation support, we want to enable languages that currently lack machine translation from the available services but could be covered by NLLB-200.
Candidate languages identified so far
These languages have no MT support enabled at present, but NLLB-200 could provide it:
- Kashmiri (ks/kas_Arab) (view request and T326541) – ✅ Enabled as part of T337290.
- Santali (sat/sat_Olck) (view request) – ✅ Enabled as part of T337290.
- Tumbuka (tum/tum_Latn) (view request) – ✅ Enabled as part of T337290.
- Fulah/Nigerian Fulfulde (ff/fuv_Latn) – ✅ Enabled as part of T337290.
- Kabyle (kab/kab_Latn) – ✅ Enabled as part of T337290.
- Balinese (ban/ban_Latn) – ✅ Enabled as part of T337290.
- Banjar (bjn/bjn_Latn) – ✅ Enabled as part of T337290.
- South Azerbaijani (azb/azb_Arab) – ✅ Enabled as part of T337290.
- Pangasinan (pag/pag_Latn) – ✅ Enabled as part of T337290.
- Tibetan (bo/bod_Tibt) – ✅ Enabled as part of T337290.
- Crimean Tatar (crh/crh_Latn) – Planned as part of T337669.
- Lombard (lmo/lmo_Latn) – ✅ Enabled as part of T337669.
- Silesian (szl/szl_Latn) – ✅ Enabled as part of T337669.
- Venetian (vec/vec_Latn) – ✅ Enabled as part of T337669.
- Ligurian (lij/lij_Latn) – ✅ Enabled as part of T337669.
- Waray (war/war_Latn) – ✅ Enabled as part of T337669.
- Limburgish (li/lim_Latn) – ✅ Enabled as part of T337669.
- Faroese (fo/fao_Latn) – ✅ Enabled as part of T337669.
- Shan (shn/shn_Mymr) – ✅ Enabled as part of T337669.
- Friulian (fur/fur_Latn) – ✅ Enabled as part of T337669.
- Sicilian (scn/scn_Latn) – ✅ Enabled as part of T337834.
- Acehnese (ace/ace_Latn) – ✅ Enabled as part of T337834.
- Buginese (bug/bug_Latn) – ✅ Enabled as part of T337834.
- Tok Pisin (tpi/tpi_Latn) – ✅ Enabled as part of T337834.
- Fijian (fj/fij_Latn) – ✅ Enabled as part of T337834.
- Southwestern Dinka (din/dik_Latn) – ✅ Enabled as part of T337834.
- Rundi (rn/run_Latn) – ✅ Enabled as part of T337834.
- Kabiyè (kbp/kbp_Latn) – ✅ Enabled as part of T337834.
- Latgalian (ltg/ltg_Latn) – ✅ Enabled as part of T337834.
- Dzongkha (dz/dzo_Tibt) – ✅ Enabled as part of T337834.
- Egyptian Arabic (arz/arz_Arab) – ✅ Enabled as part of T338123.
- Moroccan Arabic (ary/ary_Arab) – ✅ Enabled as part of T338123.
- Kikuyu (ki/kik_Latn) – ✅ Enabled as part of T338123.
- Sango (sg/sag_Latn) – ✅ Enabled as part of T338123.
- Awadhi (awa/awa_Deva) – ✅ Enabled as part of T338123.
- Minangkabau (min/min_Latn) – ✅ Enabled as part of T340953.
- Sardinian (sc/srd_Latn) – ✅ Enabled as part of T340953.
- Fon (fon/fon_Latn) (still in incubator, considered for T336683)
- Akan (ak/aka_Latn) (Akan may be supported using Twi)
- Kanuri/Central Kanuri (kr/knc_Arab/knc_Latn) (Wikipedia was closed, back to incubator, considered for T336683)
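Each entry in the list above pairs a wiki language code with an NLLB-200 code, and the two often differ (for example `ff` and `fuv_Latn`, or `din` and `dik_Latn`). A minimal sketch of such a lookup follows. The names `WIKI_TO_NLLB` and `to_nllb_code` are hypothetical and only illustrate the pairing; this is not the actual Content Translation or MinT configuration.

```python
from typing import Optional

# Illustrative subset copied from the list above: wiki code -> NLLB-200 code.
# Hypothetical table, not the real service configuration.
WIKI_TO_NLLB = {
    "ks": "kas_Arab",   # Kashmiri
    "sat": "sat_Olck",  # Santali
    "ff": "fuv_Latn",   # Fulah / Nigerian Fulfulde
    "bo": "bod_Tibt",   # Tibetan
    "din": "dik_Latn",  # Southwestern Dinka
    "rn": "run_Latn",   # Rundi
}


def to_nllb_code(wiki_code: str) -> Optional[str]:
    """Return the NLLB-200 code for a wiki language code, or None if unmapped."""
    return WIKI_TO_NLLB.get(wiki_code.lower())
```

A caller would treat a `None` result as "no NLLB-200 mapping known" rather than passing the wiki code through unchanged, since codes such as `ff` are not valid NLLB-200 codes.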
Languages already supported by NLLB-200 as their only option
Of the languages currently supported by NLLB-200, the following are not supported by any other service:
- Asturian (ast/ast_Latn)
- Kongo/Kikongo (kg/kon_Latn)
- Northern Sotho (nso/nso_Latn)
- Occitan (oc/oci_Latn)
- Swati (ss/ssw_Latn)
- Tswana (tn/tsn_Latn)
- Wolof (wo/wol_Latn)
- Cantonese/Yue Chinese (zh-yue/yue_Hant) (disabled as per T333835)
(Central Bikol is another language with MinT as the only option, but it is supported by Opus MT instead: T262253)
Languages listed on the NLLB-200 documentation
These languages are supported by the NLLB-200 model.
Languages with existing MT support are marked "(has MT)"; unmarked languages have none.
- Acehnese (Arabic script) (ace_Arab)
- Acehnese (Latin script) (ace_Latn)
- Mesopotamian Arabic (acm_Arab) no wiki yet
- Ta’izzi-Adeni Arabic (acq_Arab) no wiki yet
- Tunisian Arabic (aeb_Arab) no wiki yet
- Afrikaans (afr_Latn) (has MT)
- South Levantine Arabic (ajp_Arab) no wiki yet
- Akan (aka_Latn)
- Amharic (amh_Ethi) (has MT)
- North Levantine Arabic (apc_Arab) no wiki yet
- Modern Standard Arabic (arb_Arab) (has MT)
- Modern Standard Arabic (Romanized) (arb_Latn) (has MT)
- Najdi Arabic (ars_Arab) no wiki yet
- Moroccan Arabic (ary_Arab)
- Egyptian Arabic (arz_Arab)
- Assamese (asm_Beng) (has MT)
- Asturian (ast_Latn) (has MT)
- Awadhi (awa_Deva)
- Central Aymara (ayr_Latn) (has MT)
- South Azerbaijani (azb_Arab)
- North Azerbaijani (azj_Latn) (has MT)
- Bashkir (bak_Cyrl) (has MT)
- Bambara (bam_Latn) (has MT)
- Balinese (ban_Latn)
- Belarusian (bel_Cyrl) (has MT)
- Bemba (bem_Latn) no wiki yet
- Bengali (ben_Beng) (has MT)
- Bhojpuri (bho_Deva) (has MT)
- Banjar (Arabic script) (bjn_Arab)
- Banjar (Latin script) (bjn_Latn)
- Standard Tibetan (bod_Tibt)
- Bosnian (bos_Latn) (has MT)
- Buginese (bug_Latn)
- Bulgarian (bul_Cyrl) (has MT)
- Catalan (cat_Latn) (has MT)
- Cebuano (ceb_Latn) (has MT)
- Czech (ces_Latn) (has MT)
- Chokwe (cjk_Latn) no wiki yet
- Central Kurdish (ckb_Arab) (has MT)
- Crimean Tatar (crh_Latn)
- Welsh (cym_Latn) (has MT)
- Danish (dan_Latn) (has MT)
- German (deu_Latn) (has MT)
- Southwestern Dinka (dik_Latn)
- Dyula (dyu_Latn) no wiki yet
- Dzongkha (dzo_Tibt)
- Greek (ell_Grek) (has MT)
- English (eng_Latn) (has MT)
- Esperanto (epo_Latn) (has MT)
- Estonian (est_Latn) (has MT)
- Basque (eus_Latn) (has MT)
- Ewe (ewe_Latn) (has MT)
- Faroese (fao_Latn)
- Fijian (fij_Latn)
- Finnish (fin_Latn) (has MT)
- Fon (fon_Latn) wiki still in incubator
- French (fra_Latn) (has MT)
- Friulian (fur_Latn)
- Nigerian Fulfulde (fuv_Latn)
- Scottish Gaelic (gla_Latn) (has MT)
- Irish (gle_Latn) (has MT)
- Galician (glg_Latn) (has MT)
- Guarani (grn_Latn) (has MT)
- Gujarati (guj_Gujr) (has MT)
- Haitian Creole (hat_Latn) (has MT)
- Hausa (hau_Latn) (has MT)
- Hebrew (heb_Hebr) (has MT)
- Hindi (hin_Deva) (has MT)
- Chhattisgarhi (hne_Deva) no wiki yet
- Croatian (hrv_Latn) (has MT)
- Hungarian (hun_Latn) (has MT)
- Armenian (hye_Armn) (has MT)
- Igbo (ibo_Latn) (has MT)
- Ilocano (ilo_Latn) (has MT)
- Indonesian (ind_Latn) (has MT)
- Icelandic (isl_Latn) (has MT)
- Italian (ita_Latn) (has MT)
- Javanese (jav_Latn) (has MT)
- Japanese (jpn_Jpan) (has MT)
- Kabyle (kab_Latn)
- Jingpho (kac_Latn) no wiki yet
- Kamba (kam_Latn) no wiki yet
- Kannada (kan_Knda) (has MT)
- Kashmiri (Arabic script) (kas_Arab)
- Kashmiri (Devanagari script) (kas_Deva)
- Georgian (kat_Geor) (has MT)
- Central Kanuri (Arabic script) (knc_Arab)
- Central Kanuri (Latin script) (knc_Latn)
- Kazakh (kaz_Cyrl) (has MT)
- Kabiyè (kbp_Latn)
- Kabuverdianu (kea_Latn) no wiki yet
- Khmer (khm_Khmr) (has MT)
- Kikuyu (kik_Latn)
- Kinyarwanda (kin_Latn) (has MT)
- Kyrgyz (kir_Cyrl) (has MT)
- Kimbundu (kmb_Latn) no wiki yet
- Northern Kurdish (kmr_Latn) (has MT)
- Kikongo (kon_Latn) (has MT)
- Korean (kor_Hang) (has MT)
- Lao (lao_Laoo) (has MT)
- Ligurian (lij_Latn)
- Limburgish (lim_Latn)
- Lingala (lin_Latn) (has MT)
- Lithuanian (lit_Latn) (has MT)
- Lombard (lmo_Latn)
- Latgalian (ltg_Latn)
- Luxembourgish (ltz_Latn) (has MT)
- Luba-Kasai (lua_Latn) no wiki yet
- Ganda (lug_Latn) (has MT)
- Luo (luo_Latn) no wiki yet
- Mizo (lus_Latn) no wiki yet
- Standard Latvian (lvs_Latn) (has MT)
- Magahi (mag_Deva) no wiki yet
- Maithili (mai_Deva) (has MT)
- Malayalam (mal_Mlym) (has MT)
- Marathi (mar_Deva) (has MT)
- Minangkabau (Arabic script) (min_Arab)
- Minangkabau (Latin script) (min_Latn)
- Macedonian (mkd_Cyrl) (has MT)
- Plateau Malagasy (plt_Latn) (has MT)
- Maltese (mlt_Latn) (has MT)
- Meitei (Bengali script) (mni_Beng) (has MT)
- Halh Mongolian (khk_Cyrl) (has MT)
- Mossi (mos_Latn) no wiki yet
- Maori (mri_Latn) (has MT)
- Burmese (mya_Mymr) (has MT)
- Dutch (nld_Latn) (has MT)
- Norwegian Nynorsk (nno_Latn) (has MT)
- Norwegian Bokmål (nob_Latn) (has MT)
- Nepali (npi_Deva) (has MT)
- Northern Sotho (nso_Latn) (has MT)
- Nuer (nus_Latn) no wiki yet
- Nyanja (nya_Latn) (has MT)
- Occitan (oci_Latn) (has MT)
- West Central Oromo (gaz_Latn) (has MT)
- Odia (ory_Orya) (has MT)
- Pangasinan (pag_Latn)
- Eastern Panjabi (pan_Guru) (has MT)
- Papiamento (pap_Latn) (has MT)
- Western Persian (pes_Arab) (has MT)
- Polish (pol_Latn) (has MT)
- Portuguese (por_Latn) (has MT)
- Dari (prs_Arab) no wiki yet
- Southern Pashto (pbt_Arab) (has MT)
- Ayacucho Quechua (quy_Latn) (has MT)
- Romanian (ron_Latn) (has MT)
- Rundi (run_Latn)
- Russian (rus_Cyrl) (has MT)
- Sango (sag_Latn)
- Sanskrit (san_Deva) (has MT)
- Santali (sat_Olck)
- Sicilian (scn_Latn)
- Shan (shn_Mymr)
- Sinhala (sin_Sinh) (has MT)
- Slovak (slk_Latn) (has MT)
- Slovenian (slv_Latn) (has MT)
- Samoan (smo_Latn) (has MT)
- Shona (sna_Latn) (has MT)
- Sindhi (snd_Arab) (has MT)
- Somali (som_Latn) (has MT)
- Southern Sotho (sot_Latn) (has MT)
- Spanish (spa_Latn) (has MT)
- Tosk Albanian (als_Latn) (has MT)
- Sardinian (srd_Latn)
- Serbian (srp_Cyrl) (has MT)
- Swati (ssw_Latn) (has MT)
- Sundanese (sun_Latn) (has MT)
- Swedish (swe_Latn) (has MT)
- Swahili (swh_Latn) (has MT)
- Silesian (szl_Latn)
- Tamil (tam_Taml) (has MT)
- Tatar (tat_Cyrl) (has MT)
- Telugu (tel_Telu) (has MT)
- Tajik (tgk_Cyrl) (has MT)
- Tagalog (tgl_Latn) (has MT)
- Thai (tha_Thai) (has MT)
- Tigrinya (tir_Ethi) (has MT)
- Tamasheq (Latin script) (taq_Latn) no wiki yet
- Tamasheq (Tifinagh script) (taq_Tfng) no wiki yet
- Tok Pisin (tpi_Latn)
- Tswana (tsn_Latn) (has MT)
- Tsonga (tso_Latn) (has MT)
- Turkmen (tuk_Latn) (has MT)
- Tumbuka (tum_Latn)
- Turkish (tur_Latn) (has MT)
- Twi (twi_Latn) (has MT)
- Central Atlas Tamazight (tzm_Tfng) no wiki yet
- Uyghur (uig_Arab) (has MT)
- Ukrainian (ukr_Cyrl) (has MT)
- Umbundu (umb_Latn) no wiki yet
- Urdu (urd_Arab) (has MT)
- Northern Uzbek (uzn_Latn) (has MT)
- Venetian (vec_Latn)
- Vietnamese (vie_Latn) (has MT)
- Waray (war_Latn)
- Wolof (wol_Latn) (has MT)
- Xhosa (xho_Latn) (has MT)
- Eastern Yiddish (ydd_Hebr) (has MT)
- Yoruba (yor_Latn) (has MT)
- Yue Chinese (yue_Hant) (has MT)
- Chinese (Simplified) (zho_Hans) (has MT)
- Chinese (Traditional) (zho_Hant) (has MT)
- Standard Malay (zsm_Latn) (has MT)
- Zulu (zul_Latn) (has MT)
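The codes in this list combine a three-letter language code with a four-letter script code, which is why some languages appear more than once (for example `ace_Arab` and `ace_Latn`). The sketch below splits such codes and groups the scripts per language. The helper names `split_nllb_code` and `scripts_by_language` are hypothetical, and the sample is a small subset of the list, chosen for illustration only.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def split_nllb_code(code: str) -> Tuple[str, str]:
    """Split a code such as 'kas_Arab' into ('kas', 'Arab')."""
    language, script = code.split("_", 1)
    return language, script


def scripts_by_language(codes: Iterable[str]) -> Dict[str, List[str]]:
    """Group script codes under their language code, sorted for stable output."""
    grouped = defaultdict(set)
    for code in codes:
        language, script = split_nllb_code(code)
        grouped[language].add(script)
    return {language: sorted(scripts) for language, scripts in grouped.items()}


# Small sample taken from the list above.
SAMPLE_CODES = ["ace_Arab", "ace_Latn", "kas_Arab", "kas_Deva", "tum_Latn"]

# Keep only the languages that are listed in more than one script.
multi_script = {
    language: scripts
    for language, scripts in scripts_by_language(SAMPLE_CODES).items()
    if len(scripts) > 1
}
```

Grouping this way makes it easy to spot where a wiki has to choose a single script variant, as with Kashmiri, where the candidate list pairs `ks` with `kas_Arab` only.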
Full list of languages supported by NLLB-200 in 3-letter ISO codes:
Related: T336683: Enable MinT support for languages with no Wikipedia yet