Appendix:Vocabulary lists of Southeast Asian languages
Below are vocabulary lists for Southeast Asian language branches and reconstructed proto-languages.
Many of the lists are glossed in Chinese, with several also in Vietnamese, Russian, and French.
Introduction
Welcome to Wiktionary's vocabulary lists series. This series aims to have representative word lists for all language families of the world.
- Purpose: As linguistic lexicographical works, the vocabulary lists are designed with historical-comparative linguistics research goals in mind, such as classifying languages, reconstructing proto-languages, and identifying loanwords. Frequency lists and pedagogical resources are not included.
- Glosses: Each list maintains original glosses (definitions, meanings) as found in the original sources. Translated glosses are sometimes added as additional columns if the original glosses are not in English. Translations that are not in the original source are noted in the lists, and do not replace the original glosses. Unlike Swadesh lists and other standardized lexicostatistical word lists, the vocabulary lists here do not consist of lists with predetermined glosses. Instead, the vocabulary lists here can serve as "raw building blocks" for compiling Swadesh lists.
- Content: The lists are typically in the 50-1,000 item range for lexical entries. Definitions are typically concise and focus on basic vocabulary concepts such as numerals, body parts, and natural phenomena.
- Scope: Emphasis is placed on divergent language isolates, families, and branches that would likely be crucial for etymological reconstruction and classification. Proto-languages are included whenever possible. Many of these language groups are sparsely documented and/or extinct. As a result, some of these lists may actually be the only extant documentation of a language or even language group.
- Sources: The word lists are adapted from academic sources published by linguists. Thus, all lists must be properly referenced with adequate notes and metadata. Many of these sources are out of print, with highly limited distribution and accessibility.
- Digitization: As with Wikisource texts, the lists are individually and painstakingly digitized using a variety of methods, such as optical character recognition (OCR), manual typing, and document conversion.
- Encoding: Unicode.
Open-access online lexical databases that are similar in design, content, and research goals include STEDT, MKED, RefLex, Chirila, and Starling.
Navigation template
Hmong-Mien
- Appendix:Proto-Hmong-Mien reconstructions - Ratliff (2010); Chen (2013) [in Chinese]
- Appendix:Hmong-Mien comparative vocabulary list - Chen (2013) [in Chinese]
- Appendix:Proto-Hmongic reconstructions - Wang (1994) [in Chinese, with English translations]
- Appendix:Pa-Hng comparative vocabulary list - Mao & Li (1997) [in Chinese]; Niederer (1996); others
- Appendix:Xong comparative vocabulary list - Yang (2004) [in Chinese]
- Appendix:Pana word list - Taguchi (2001); Chen (2001) [in Chinese]; Hu (2018) [in Chinese]
- Appendix:She word list - Lan (2020) [in Chinese]
- Appendix:Proto-Mienic reconstructions - Luang-Thongkum (1993); Liu (2021) [in Chinese]
- Appendix:Mienic comparative vocabulary list - Mao (2004) [in Chinese]
- Appendix:Mien (Gongcheng) word list - Ouyang (2015) [in Chinese]
- Appendix:Biao Min (Shikou) word list - Feng (2020) [in Chinese]
Kra-Dai
- Appendix:Proto-Kra-Dai reconstructions - Ostapirat (2018); Norquest (2020)
- Appendix:Proto-Tai reconstructions - Pittayaporn (2009)
- Appendix:Zhuang comparative vocabulary list - Zhang et al. (1999) [in Chinese]
- Appendix:Zhuang (Tiandeng) word list - Liang (2022) [in Chinese]
- Appendix:Bouyei comparative vocabulary list - Snyder et al. (2007) [in Chinese and English]
- Appendix:Proto-Hlai reconstructions - Norquest (2015) [in Chinese and English]; Ostapirat (2004)
- Appendix:Jiamao word list - Liu (2008) [in Chinese]; Norquest (2015)
- Appendix:Proto-Ong-Be reconstructions - Chen (2018) [in Chinese and English]
- Appendix:Jizhao Swadesh list - Li & Wu (2017) [in Chinese]
- Appendix:Proto-Lakkia reconstructions - Luang-Thongkum (1992)
- Appendix:Biao word list - Liang (2002) [in Chinese]
- Appendix:Hunan Kam-Sui languages comparative vocabulary list - various [in Chinese]
- Appendix:Proto-Kam-Sui reconstructions - Thurgood (1988)
- Appendix:Proto-Kra reconstructions - Ostapirat (2000)
- Appendix:Laha word list - Solntseva & Hoang (1986) [in Vietnamese and Russian]
- Appendix:Qabiao word list - Hoang & Vu (1992) [in Vietnamese]
- Appendix:Gelao Swadesh lists - Samarina (2011) [in Russian and Vietnamese]
- Appendix:Gelao comparative vocabulary list [in Russian and Vietnamese] (Russian Wiktionary)
Austronesian
Austroasiatic
- Appendix:Proto-Austroasiatic reconstructions - Sidwell & Rau (2015)
- Appendix:Proto-Austroasiatic Swadesh list - Sidwell (2024)
- Appendix:Proto-Munda reconstructions - Sidwell & Rau (2015); Rau (2019)
- Appendix:Proto-Khasian reconstructions - Sidwell (2018)
- Appendix:Proto-Palaungic reconstructions - Sidwell (2015)
- Appendix:Quang Lam word list - Nguyen (1975) [in Vietnamese]
- Appendix:Proto-Khmuic reconstructions - Sidwell (2013)
- Appendix:Proto-Pakanic reconstructions - Hsiu (2016)
- Appendix:Proto-Vietic reconstructions - Ferlus (2007) [in French, with English translations]
- Appendix:Proto-Katuic reconstructions - Sidwell (2005)
- Appendix:Proto-Bahnaric reconstructions - Sidwell (2011)
- Appendix:Proto-Pearic reconstructions - Headley (1985)
- Appendix:Proto-Khmeric reconstructions - Sidwell & Rau (2015)
- Appendix:Proto-Monic reconstructions - Diffloth (1984)
- Appendix:Proto-Aslian reconstructions - Phillips (2012)
- Appendix:Proto-Nicobarese reconstructions - Sidwell (2018)
Sino-Tibetan
- Appendix:Proto-Tibeto-Burman reconstructions - Matisoff (2015)
- Sinitic
- Appendix:Baxter-Sagart Old Chinese reconstruction - Baxter & Sagart (2014)
- Appendix:Old Chinese basic vocabulary - Sagart & Ma (2020)
- Appendix:Proto-Southern Min reconstructions - Kwok (2018)
- Appendix:Greater Bai comparative vocabulary list - various [in Chinese]
- Appendix:Proto-Tujia reconstructions - Zhou (2020) [in Chinese and English]
- Lolo-Burmese
- Appendix:Proto-Lolo-Burmese reconstructions - Li (2011)
- Appendix:Akha comparative vocabulary list - Hayashi (2016, 2018)
- Appendix:Woni word list - Yang (2016) [in Chinese]
- Appendix:Axi word list - Pan (2018) [in Chinese]
- Appendix:Nesu word list - Wu (2020) [in Chinese]
- Appendix:Yi (Mihei) word list - Yang (2020) [in Chinese]
- Appendix:Proto-Lalo reconstructions - Yang (2010) [in Chinese and English]
- Appendix:Lalo word list - Wu (2022) [in Chinese]
- Appendix:Guiqiong word list - Li (2015)
- Appendix:Proto-Naish reconstructions - Jacques & Michaud (2011)
- Appendix:Proto-Ersuic reconstructions - Yu (2012)
- Appendix:Horpa (Zongke) word list - Li (2020) [in Chinese]
- Appendix:Kathu word list - Wu (2004) [in Chinese]
- Appendix:Gong vocabulary lists - Rujjanavet (1986); Thawornpat (2006); and others
- Appendix:Proto-Karenic reconstructions - Luang-Thongkum (2019)
- Appendix:Zakhring word list - Li & Jiang (2001) [in Chinese]
- Appendix:Proto-Luish reconstructions - Huziwara (2012)
- Appendix:Proto-Bodo-Garo reconstructions - Joseph & Burling (2006)
- Appendix:Kuki-Chin Swadesh lists - Otsuka (2016)
- Jejara - Lubbe, Priest & Lew (2022)
- Appendix:Suansu word list - Ivani (2019)
- Appendix:Mru word list - Luce (1985)
- Appendix:Tshangla comparative vocabulary list - Abraham (2018)
- Appendix:Kho-Bwa comparative vocabulary lists - Lieberherr & Bodt (2017); Abraham (2018)
- Appendix:Mey word lists - Blench (2012)
- Appendix:Puroik comparative vocabulary list - Lieberherr (2015)
- Appendix:Hrusish comparative vocabulary lists - Bodt & Lieberherr (2015); Abraham (2018)
- Appendix:Koro word lists - Abraham (2018); Anderson (2010); Blench (2018)
- Appendix:Greater Siangic comparative vocabulary list - Modi (2013)
- Appendix:Swadesh lists for Tibeto-Burman languages of Nepal - various
- Appendix:Swadesh lists for Raji-Raute languages - Fortier (2019)
- Appendix:Raji-Raute comparative vocabulary list - Fortier (2019)
- Appendix:Dhimalish comparative vocabulary list - Grollmann & Gerber (2017)
- Appendix:Baram-Thangmi comparative vocabulary list - Kansakar (2010); Regmi (2014)
- Appendix:Dura word list - Schorer (2016)
- Appendix:Bunan word list - Widmer (2014)
- Appendix:Proto-Kham reconstructions - Watters (2002)
- Tibetic
- Appendix:Proto-Western Tibetan reconstructions - Backstrom (1994)
- Appendix:Amdo Tibetan Swadesh list - Yang (2017) [in Chinese]
- Appendix:Tibetan (Lajiao) word list - Xu (2020) [in Chinese]
- Appendix:Stable lexical roots in Sino-Tibetan languages - Matisoff (2009)
Branches
Open-access online lexical resources for each Sino-Tibetan branch are listed below. Branches for which lexical data are available in the Sino-Tibetan Etymological Dictionary and Thesaurus (2015) are marked (STEDT).
- Western Himalayas
- West Himalayish: (STEDT); w:West Himalayish languages; Appendix:Bunan word list
- Dura: Appendix:Dura word list
- Raji-Raute: w:Raji–Raute languages#Vocabulary (merged into Appendix:Swadesh lists for Tibeto-Burman languages of Nepal); Appendix:Swadesh lists for Raji-Raute languages
- Magar: (STEDT)
- Chepang: (STEDT); Appendix:Bhujel Swadesh list
- Kham: Appendix:Proto-Kham reconstructions, w:Kham language#Reconstruction
- Newaric
- Newar: (STEDT)
- Baram-Thangmi: Appendix:Baram-Thangmi comparative vocabulary list
- Dhimalish: Appendix:Dhimalish comparative vocabulary list; w:Dhimalish languages, w:Dhimal language#Vocabulary, w:Toto language#Vocabulary
- Sal
- Kiranti: list 1, list 2 (STEDT); list 3
- Lepcha: (STEDT)
- Tamangic: list (STEDT)
- Eastern Himalayas
- Tibetic: w:Tibetic languages#Reconstruction; Appendix:Proto-Western Tibetan reconstructions
- East Bodish: w:East Bodish languages#Reconstruction
- Gongduk: w:Gongduk language#Vocabulary
- 'Ole: w:'Ole language#Vocabulary
- Siangic: w:Siangic languages#Reconstruction
- Tshangla: Appendix:Tshangla comparative vocabulary list
- Kho-Bwa: Appendix:Kho-Bwa comparative vocabulary lists
- Hrusish: Appendix:Hrusish comparative vocabulary lists, w:Hrusish languages#Reconstruction
- Tani: list (STEDT)
- Miju-Meyor: (STEDT); Appendix:Zakhring word list
- Idu-Taraon: (STEDT)
- Lamo: w:Lamo language
- Central Sino-Tibetan branches
- Sal
- Bodo–Garo: Appendix:Proto-Bodo-Garo reconstructions
- Northern Naga: list (STEDT)
- Luish: Appendix:Proto-Luish reconstructions, w:Luish languages#Reconstruction
- Jingpho: (STEDT)
- Kuki-Chin-Naga
- Karbi (publication forthcoming)
- Central Naga: list (STEDT)
- Angami-Pochuri: (STEDT)
- Zeme: (STEDT)
- Meithei: (STEDT)
- Tangkhulic: list (STEDT)
- Kuki-Chin: list (STEDT); Appendix:Kuki-Chin Swadesh lists
- Mruic: Appendix:Mru word list
- Taman: w:Taman language (Sino-Tibetan)#Lexicon
- Pyu: w:Pyu language (Burma)#Vocabulary
- Karenic: list (STEDT); Appendix:Proto-Karenic reconstructions
- Gong: Appendix:Gong vocabulary lists
- China
- Nungish: (STEDT)
- rGyalrongic: (STEDT); lists
- Horpa: (STEDT); w:Horpa language#Vocabulary; lists
- Burmo-Qiangic
- Tangut
- Baima
- Lavrung: (STEDT)
- Queyu: (STEDT)
- Zhaba: (STEDT)
- Guiqiong: (STEDT); Appendix:Guiqiong word list
- Muya: (STEDT)
- Rma: (STEDT); list
- Prinmi: (STEDT); list
- Ersuic: Appendix:Proto-Ersuic reconstructions
- Naish: Appendix:Proto-Naish reconstructions
- Lolo-Burmese: list (STEDT); Appendix:Proto-Lolo-Burmese reconstructions
- Loloish: list (STEDT)
- Bai: Appendix:Greater Bai comparative vocabulary list
- Kathu: Appendix:Kathu word list
- Tujia: (STEDT)
- Old Chinese: list (Baxter & Sagart 2014)
Others
- Appendix:Burushaski comparative vocabulary list - Backstrom (1992)
- Appendix:Kusunda word list - Watters (2006)
- Appendix:Nihali word list - Nagaraja (2014)
- Appendix:Kenaboi word list - Hajek (1998)
- Appendix:Proto-Ongan reconstructions - Blevins (2007)
- Appendix:Proto-Trans-New Guinea reconstructions - Pawley & Hammarström (2017)
- Appendix:Proto-Japanese Swadesh list - Vovin (1994)
- Appendix:Proto-Ainu reconstructions - Vovin (1993)
- Appendix:Proto-Nivkh reconstructions - Fortescue (2016)
See also
External links
- STEDT (Sino-Tibetan Etymological Dictionary and Thesaurus)
- MKED (Mon-Khmer Etymological Dictionary)
- Munda Etymological Dictionary
- ACD (Austronesian Comparative Dictionary)
- ABVD (Austronesian Basic Vocabulary Database)