Wikidata:Property proposal/Vietnamese pronunciation


Vietnamese reading


Originally proposed at Wikidata:Property proposal/Generic

Description: Reading of Han character in Quốc Ngữ.
Data type: String
Domain: Han characters
Allowed values: Any valid Quốc Ngữ syllable, with a mandatory qualifier sinogram reading pattern (P5244) taking one or both of the following values: chữ Hán (Q1378119) and chữ Nôm (Q875344); and an optional qualifier writing system (P282) with the value simplified Chinese characters (Q185614) or traditional Chinese characters (Q178528) (only used if the item has the corresponding writing system)
Example 1: (Q4025820) -> ① nhất (sinogram reading pattern (P5244): chữ Hán (Q1378119) and chữ Nôm (Q875344)); ② nhứt (sinogram reading pattern (P5244): chữ Hán (Q1378119))
Example 2: (Q3863965) -> ① đấy (sinogram reading pattern (P5244): chữ Nôm (Q875344)); ② đế (sinogram reading pattern (P5244): chữ Hán (Q1378119))
Example 3: (Q3863998) -> ① hải (sinogram reading pattern (P5244): chữ Hán (Q1378119) and chữ Nôm (Q875344)); ② hẩy (sinogram reading pattern (P5244): chữ Nôm (Q875344))
Source: Vietnamese Nôm Preservation Foundation (http://nomfoundation.org/), WinVNKey (http://winvnkey.sourceforge.net/), Unihan database
Number of IDs in source: 17,565 items (based on Nôm Preservation Foundation)
Expected completeness: eventually complete (Q21873974)
Robot and gadget jobs: Yes
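
For anyone planning a bot job, the allowed-values rule as originally proposed here could be checked roughly as follows. This is only a sketch: the `Reading` class and the `problems` function are hypothetical names invented for illustration, not part of any Wikidata tool, and the syllable check is a crude stand-in (non-empty, no whitespace) rather than a real test for valid Quốc Ngữ. The P and Q identifiers are the ones listed in the proposal.

```python
from dataclasses import dataclass, field

# Identifiers copied from the proposal text.
P_READING_PATTERN = "P5244"  # sinogram reading pattern (mandatory qualifier)
P_WRITING_SYSTEM = "P282"    # writing system (optional qualifier)
READING_PATTERNS = {"Q1378119", "Q875344"}  # chữ Hán, chữ Nôm
WRITING_SYSTEMS = {"Q185614", "Q178528"}    # simplified, traditional


@dataclass
class Reading:
    """Hypothetical in-memory model of one Vietnamese reading statement."""
    value: str
    qualifiers: dict[str, set[str]] = field(default_factory=dict)


def problems(reading: Reading) -> list[str]:
    """Return the ways a statement departs from the proposed rule."""
    issues = []
    syllable = reading.value.strip()
    # Crude assumption: one syllable means non-empty with no whitespace.
    if not syllable or any(ch.isspace() for ch in syllable):
        issues.append("value must be a single syllable")
    patterns = reading.qualifiers.get(P_READING_PATTERN, set())
    if not patterns:
        issues.append("missing mandatory qualifier P5244")
    elif not patterns <= READING_PATTERNS:
        issues.append("unexpected P5244 value")
    systems = reading.qualifiers.get(P_WRITING_SYSTEM, set())
    if not systems <= WRITING_SYSTEMS:
        issues.append("unexpected P282 value")
    return issues


# Example 1 from the proposal (Q4025820), expressed in this model.
EXAMPLE_1 = [
    Reading("nhất", {P_READING_PATTERN: {"Q1378119", "Q875344"}}),
    Reading("nhứt", {P_READING_PATTERN: {"Q1378119"}}),
]
```

Calling `problems` on each entry of `EXAMPLE_1` returns an empty list, while a reading with no P5244 qualifier is reported as missing the mandatory qualifier.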

Motivation

This is another property of Hanzi. GZWDer (talk) 19:03, 20 July 2018 (UTC)

Discussion

Support. Additional data from the Nôm Foundation is available here: Module:vi/nom-data. KevinUp (talk) 02:26, 21 July 2018 (UTC)

@KevinUp, Mxn, GZWDer, Okkn: ✓ Done: Vietnamese reading (P5625). − Pintoch (talk) 08:14, 12 August 2018 (UTC)

@KevinUp: I just remembered that sinogram reading pattern (P5244) is constrained to be set to Sino-Vietnamese vocabulary (Q908017) rather than chữ Hán (Q1378119). Sino-Vietnamese vocabulary (Q908017) is more appropriate, since it refers to the method by which the character is assigned a pronunciation, rather than the use of Chinese characters to write Chinese, irrespective of pronunciation. – Minh Nguyễn 💬 09:05, 12 August 2018 (UTC)
Mxn: I just added "Chữ Hán" (also known as chữ Hán (Q1378119)) as a property constraint for sinogram reading pattern (P5244). In my opinion, the scope of Sino-Vietnamese vocabulary (Q908017) (Từ Hán-Việt) is a bit too wide, and "Chữ Hán" is more appropriate because there is a difference between 'chữ' (single character word) and 'từ' (compound word that consists of at least two characters). Readings obtained from individual "Chữ Hán" are usually not meaningful on their own unless they are used in combination with other "Chữ Hán" to form Sino-Vietnamese vocabulary (Q908017) (Từ Hán-Việt). Since we are dealing with individual Han characters, "Chữ Hán" rather than "Từ Hán-Việt" would be the more appropriate qualifier. Nevertheless, Sino-Vietnamese vocabulary (Q908017) can still be used for the reading pattern of compound words or lexemes. KevinUp (talk) 00:30, 13 August 2018 (UTC)
By the way, Wikipedia pages written in languages other than Vietnamese offer the following explanation for "Chữ Nho" (which has the same meaning as "Chữ Hán" in Vietnamese): "Chữ Nho" or "Chữ Hán" is used in the writing of classical Chinese literature or Sino-Vietnamese vocabulary whereas "Chữ Nôm" is used in the writing of native Vietnamese vocabulary. This seems to be much more refined compared to the Vietnamese wiki page for "Chữ Hán" which is the same as "Chinese character" on English Wikipedia. From a translingual perspective, "Chữ Hán" (or chữ Hán (Q1378119)), kanji (Q82772) and Hanja (Q485619) are generic native terms for Chinese characters (Q8201) used in the regions of Vietnam (Q881), Japan (Q17) and Korea (Q18097) respectively, whereas chữ Nôm (Q875344), kokuji (Q1185862) (also known as 和製漢字) and gukja (Q1554195) are more specific terms that refer to native characters created in the regions of Vietnam (Q881), Japan (Q17) and Korea (Q18097) respectively that are not found or used in China (Q29520). KevinUp (talk) 00:30, 13 August 2018 (UTC)
KevinUp: A couple points of clarification. Chữ Hán primarily refers to Chinese characters in general. Chữ nho means Chinese characters as opposed to chữ nôm (demotic characters), but sometimes chữ Hán is also used in this sense. Từ Hán-Việt refers to the practice of loaning words from Chinese, as opposed to từ thuần Việt (native words). (Từ in modern usage is equivalent to the Western concept of a word and does not necessarily refer to a compound word, which would be cụm từ.) For example, mùi is considered native while vị is considered Hán-Việt, but both are meaningful on their own. Note that it isn't chữ Hán-Việt: từ Hán-Việt can also refer to the same words written alphabetically or spoken verbally. As such, phiên âm Hán-Việt (Sino-Vietnamese reading (Q10805375)) is the proper way to refer to the practice of transcribing Chinese characters representing Chinese words alphabetically in quốc ngữ. What isn't necessarily meaningful on its own is âm Hán-Việt, though the distinction between âm Hán-Việt and từ Hán-Việt is quite obscure. Above, I conflated từ Hán-Việt with phiên âm Hán-Việt; sorry for the confusion. – Minh Nguyễn 💬 01:05, 13 August 2018 (UTC)
Mxn: Thanks for the clarification. It seems a new item will need to be created for "native Vietnamese reading", the counterpart of Sino-Vietnamese reading (Q10805375). Since chữ Nôm (Q875344) refers to characters formerly used in the writing system of Vietnam, it is not suitable as a qualifier for sinogram reading pattern (P5244). What do you think? Shall I create "native Vietnamese reading" and use it along with Sino-Vietnamese reading (Q10805375) for the qualifier sinogram reading pattern (P5244)? KevinUp (talk) 01:29, 13 August 2018 (UTC)
I just realized that the English Wikipedia link for Sino-Vietnamese reading (Q10805375) redirects to "Sino-Vietnamese vocabulary". Should I create a separate item for "Từ Hán-Việt" and put w:Sino-Vietnamese vocabulary under that new item instead? Sometimes new items need to be created on Wikidata to isolate specific concepts, e.g. sinogram (Q53764738) and Chinese characters (Q8201). KevinUp (talk) 01:40, 13 August 2018 (UTC)
Never mind. Turns out Sino-Vietnamese vocabulary (Q908017) already exists and is not to be confused with Sino-Vietnamese reading (Q10805375). I think I will go ahead and create a new item for "native Vietnamese reading". KevinUp (talk) 02:51, 13 August 2018 (UTC)
Mxn: The property constraint for sinogram reading pattern (P5244) (to be used with this property) is now chữ Nôm reading (Q56066660) and Sino-Vietnamese reading (Q10805375) which is more consistent with Japanese kun'yomi (Q1147749) and on'yomi (Q718498). Also, you might want to check or review the following items on Wikidata:
So rather than using chữ Nôm (Q875344) or chữ Hán (Q1378119) as values for the qualifier sinogram reading pattern (P5244) (as shown in the examples above), chữ Nôm reading (Q56066660) and Sino-Vietnamese reading (Q10805375) will be used. I think the issue is now resolved. KevinUp (talk) 04:37, 13 August 2018 (UTC)
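
The substitution described in this comment can be written down as a small lookup, which may help anyone updating statements that were drafted with the original example values. This is a sketch only: `REPLACEMENTS` and `migrate` are hypothetical names, and the code works on plain sets of QIDs rather than calling any Wikidata API.

```python
# Old P5244 qualifier value -> replacement, as stated in the discussion.
REPLACEMENTS = {
    "Q875344": "Q56066660",   # chữ Nôm -> chữ Nôm reading
    "Q1378119": "Q10805375",  # chữ Hán -> Sino-Vietnamese reading
}


def migrate(patterns: set[str]) -> set[str]:
    """Swap superseded qualifier values; leave all other QIDs unchanged."""
    return {REPLACEMENTS.get(qid, qid) for qid in patterns}


# Qualifier values of reading ① in Example 1 of the proposal.
old_values = {"Q1378119", "Q875344"}
new_values = migrate(old_values)
```

Values that are already in the new form pass through unchanged, so running the function twice on the same set is harmless.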
Thanks KevinUp. Distinguishing between chữ Nôm (Q875344) and chữ Nôm reading (Q56066660) might be splitting hairs for most Vietnamese speakers, but it parallels chữ Hán (Q1378119) and Sino-Vietnamese reading (Q10805375), which is important. – Minh Nguyễn 💬 07:04, 13 August 2018 (UTC)
Mxn: You're welcome. Perhaps you may be interested in Wikidata:WikiProject CJKV character. Thank you very much for your participation in this discussion. Now we can all start using this property with Nôm and Sino-Vietnamese readings clearly distinguished from one another. KevinUp (talk) 09:33, 13 August 2018 (UTC)