Wiktionary:Grease pit/2020/July
r:obastan -> R:obastan
Can someone with a bot please replace all instances of * {{r:obastan}} with * {{R:obastan}}? Allahverdi Verdizade (talk) 15:33, 1 July 2020 (UTC)
- That template doesn't exist; you mean {{R:az:Obastan}}. But since there's a redirect from {{r:obastan}} to {{R:az:Obastan}}, why not leave them alone and allow the redirect to do its job? —Mahāgaja · talk 16:06, 1 July 2020 (UTC)
- Sure, why not. Allahverdi Verdizade (talk) 16:16, 1 July 2020 (UTC)
Software change
The mw:New requirements for user signatures will begin on Monday, 6 July 2020. This is a change to MediaWiki software that will prevent editors from accidentally setting certain types of custom signatures, such as a custom signature that creates Special:LintErrors (such as <span>...<span> instead of <span>...</span>) or a signature that does not link to the local account.
Few editors will be affected. If you want to know whether your signature (or any individual editor's) is okay, you can check it at https://signatures.toolforge.org/check. You are not required to fix an invalid custom signature immediately. Starting Monday, editors will not be able to save new invalid signatures in Special:Preferences. Later, we will contact affected editors. Eventually, invalid custom signatures will stop working. There will be an announcement in m:Tech/News then. You can subscribe to m:Tech/News. You can also put mw:New requirements for user signatures on your watchlist.
If you have questions, then please ping me or ask questions at mw:Talk:New requirements for user signatures. Thanks, Whatamidoing (WMF) (talk) 03:47, 2 July 2020 (UTC)
I'm not sure whether this is a Grease Pit or Beer Parlour matter - I am often baffled why WT:NORM is added to a page - sometimes I can track it down to spacing, but more often than not I am puzzled, like here. DonnanZ (talk) 15:12, 4 July 2020 (UTC)
This paragraph may offer a clue:
- Some edits are tagged with "WT:NORM" by an experimental filter. This means that the wikitext of the page violates one of these rules. It does not mean that the edit is bad, though bad edits will sometimes trigger the filter. DonnanZ (talk) 15:45, 4 July 2020 (UTC)
- There was a space at the end of a line. I wish I could click on the WT:NORM tag link and have the system generate an edit normalizing the page. Vox Sciurorum (talk) 16:12, 4 July 2020 (UTC)
- That sounds like a useful idea, if it can be made to work. My 72-year-old eyes didn't pick that one up, thanks for finding it. DonnanZ (talk) 16:51, 4 July 2020 (UTC)
- I forget what it's called now, but there used to be a bot which targeted pages tagged with WT:NORM. It may have stopped operating. In fact I think the bot was used before the tag was introduced. DonnanZ (talk) 12:36, 5 July 2020 (UTC)
- Maybe you're thinking of my bot, User:ToilBot, though I added the tag before creating the bot script. I run the script on relatively small batches of pages because the edit logs end up being so long. I need to figure out how to cut up the log into bite-size pieces (a size that doesn't take too long to load). — Eru·tuon 04:41, 6 July 2020 (UTC)
- User:TheDaveBot used to do that sort of thing a few years ago, but I believe it was before the tag was created. Chuck Entz (talk) 06:13, 6 July 2020 (UTC)
- It must be TheDaveBot I'm thinking of. DonnanZ (talk) 09:27, 6 July 2020 (UTC)
w inside coinage doesn't work
If I write "{{coinage|en|{{w|Donald Trump}}}}" I get "Coined by [[w:Donald Trump|Donald Trump]]". Something is preventing substitution of the [[]] generated by {{w}}. Vox Sciurorum (talk) 16:10, 4 July 2020 (UTC)
- {{coinage|en|w:Donald Trump}} doesn't work, even though {{l|en|w:Donald Trump}} does. So this seems like a bug in the former. —Rua (mew) 16:29, 4 July 2020 (UTC)
- The template links to Wikipedia by default. So you can just unwrap Donald Trump out of {{w}}: {{coinage|en|Donald Trump}}. — Eru·tuon 18:46, 4 July 2020 (UTC)
- Thanks. I should have RTFM a few more times. Vox Sciurorum (talk) 22:26, 4 July 2020 (UTC)
Kyrgyz declension template
The template {{ky-decl-noun}} is incapable of handling singulare and plurale tantum declensions. Could someone add that ability? İʟᴀᴡᴀ–Kᴀᴛᴀᴋᴀ (talk) (edits) 20:58, 5 July 2020 (UTC)
- @Ilawa-Kataka See {{ky-decl-noun-sg}} and {{ky-decl-noun-pl}}. I rewrote these two as well as {{ky-decl-noun}} using a module; they no longer require any parameters. Benwing2 (talk) 01:37, 24 July 2020 (UTC)
- @Benwing2: Thank you! @Ilawa-Kataka: I have applied {{ky-decl-noun-sg}} to some country names where appropriate. --Anatoli T. (обсудить/вклад) 02:13, 24 July 2020 (UTC)
- @Benwing2, @Atitarev: Thank you both! İʟᴀᴡᴀ–Kᴀᴛᴀᴋᴀ (talk) (edits) 03:09, 24 July 2020 (UTC)
- @Benwing2 The module cannot handle capital letters such as at the entry Ош. Can you fix that? İʟᴀᴡᴀ–Kᴀᴛᴀᴋᴀ (talk) (edits) 20:31, 25 July 2020 (UTC)
- @Ilawa-Kataka Fixed. Benwing2 (talk) 21:01, 25 July 2020 (UTC)
- @Benwing2 Thanks! Ош is uncountable by the way. İʟᴀᴡᴀ–Kᴀᴛᴀᴋᴀ (talk) (edits) 21:10, 25 July 2020 (UTC)
- @Benwing2 One more thing—could you add possessive suffix support? I think the best resource for that is https://www.researchgate.net/publication/313774172_Kyrgyz_Orthography_and_Morphotactics_with_Implementation_in_NUVE at page 7 (the key is at page 3). The forms with "з" in it are polite forms. İʟᴀᴡᴀ–Kᴀᴛᴀᴋᴀ (talk) (edits) 13:31, 29 July 2020 (UTC)
This does not work with Wikipedia language codes not recognized in Wiktionary. For example, we treat azb under az, but there is a https://azb.wikipedia.org/ Wikipedia to which I would like to link. --Vahag (talk) 07:30, 6 July 2020 (UTC)
@Sgconlaw Currently, if an entry using Template:quote-book doesn't contain a year, it gets categorised into Category:Requests for date - which is good. It would be even better to categorise according to author, so the quote at abbreviature (#* {{quote-book|en|title=Via Pacis|author={{w|Jeremy Taylor}}| passage=This is an excellent '''abbreviature''' of the whole duty of a Christian.}}) would get categorised into Category:Requests for date/Jeremy Taylor. Obviously, I can't touch that template as I'm a total n00b, so I'm hoping someone else can. --Dada por viva (talk) 23:56, 6 July 2020 (UTC)
Special Characters in the Search Bar
How can I escape * in the search? I'm getting 60,000 results for "*le-".
I would put this question in the newbie forum, but it does not get any exposure there, and if I remember correctly, there is just no solution besides downloading and grepping a db-dump, so this is likely a vain feature request. 109.41.0.27 19:04, 8 July 2020 (UTC)
- Are you trying to use the asterisk as a wildcard? Our searchbox doesn't accommodate that. —Mahāgaja · talk 20:20, 8 July 2020 (UTC)
- @Mahagaja no I'm grepping for a reconstruction that is not indexed. If it's red-linked, I should try the what-links-here page. 109.41.3.82
- OK, since reconstructions are in a separate namespace, and the asterisk isn't actually part of the page name in the Reconstruction namespace, you have to go to Special:Search, then click "Add namespaces…" to restrict your search to the Reconstruction namespace, and then search for "le". I just did so and the closest things I can find to what you're looking for are Reconstruction:Proto-Slavic/a le and Reconstruction:Proto-Samoyedic/lë. —Mahāgaja · talk 07:14, 10 July 2020 (UTC)
- @Mahagaja thanks, I knew, theoretically, but I'm looking for a mention in the main namespace (asserting *le- > PIE *leg'-). I could just go through my browser history one by one, but even if that were successful, it would not resolve this request in general. To escape means adding an escape character to create an escape sequence. 109.41.2.63 18:24, 10 July 2020 (UTC)
- If you made a red link to the page, you can use Special:WhatLinksHere to find the page it appears on (as you already mentioned yourself). I knew what you meant by "escape", but I don't think the search page has that function. —Mahāgaja · talk 20:15, 10 July 2020 (UTC)
- If you're trying to search for a literal *, I think that's not supported as a word query (or whatever the term is) because only words are indexed, and *, and probably -, are not word characters. But you can use the insource feature with regex (insource:/\*le-/). insource searches with regex often don't finish, so you can do insource:"le" insource:/\*le-/ to speed it up. The insource with quotes does a word search in the wikitext, which is about as fast as a word query, and shortens the list of pages that have to be searched with regex. — Eru·tuon 22:00, 10 July 2020 (UTC)
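The two-stage idea in the last reply (a cheap word filter first, then the expensive regex on what is left) can be tried out locally on wikitext taken from a dump. This is only a rough Python sketch of the principle, not how the search backend is implemented; the function name and the sample pages are made up for illustration.

```python
import re

def two_stage_search(pages, word, pattern):
    """Return titles whose text matches `pattern`, checking `word` first.

    `pages` is an iterable of (title, wikitext) pairs. The word check plays
    the role of insource:"le"; the regex plays the role of insource:/\\*le-/.
    """
    word_re = re.compile(r"\b" + re.escape(word) + r"\b")
    regex = re.compile(pattern)
    # Stage 1: cheap filter, keeps only pages containing the whole word.
    candidates = [(title, text) for title, text in pages if word_re.search(text)]
    # Stage 2: run the regex only on the surviving pages.
    return [title for title, text in candidates if regex.search(text)]

# Hypothetical sample data, not real entries.
sample_pages = [
    ("page A", "possibly from *le- 'to gather'"),
    ("page B", "le chat est sur la table"),
    ("page C", "a table-top, compare *leg"),
]
hits = two_stage_search(sample_pages, "le", r"\*le-")
```

Note that the asterisk has to be escaped inside the regex (`\*`), since an unescaped `*` is a quantifier there.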
The archaic letter ѣ (jě) was used in Russian pre-1918 orthography. It normally corresponds to е (je) in the modern spelling, but in some cases it corresponds to ё (jo), which native speakers usually replace with е (je) in running text. In the pre-1918 orthography ё (jo) was used even less, and ѣ (jě) was never written as ё (jo) except to show how the word was pronounced. We use "ѣ" with a combining diaeresis where "ѣ" should be read as "ё", and the diaeresis is added automatically where necessary: for example, the nominative plural, etc., of гнѣздо́ (gnězdó) is displayed as гнѣ̈зда (gnjǒ́zda) (ѣ with a diaeresis) in the declension table. Where "ѣ" should be read as "ё" in the lemma, the diaeresis has to be added explicitly, as in гнѣ̈здышко (gnjǒ́zdyško), the pre-reform spelling of гнёздышко (gnjózdyško).
@Benwing2: Can we please remove the diaeresis from links, as with acute accents, so that гнѣ̈здышко (gnjǒ́zdyško) is linked to гнѣздышко? I don't know which module is responsible for it.
I will update WT:RU TR so that users know about the usage of diaeresis over "ѣ", it's purely our Wiktionary convention. --Anatoli T. (обсудить/вклад) 00:23, 9 July 2020 (UTC)
- @Atitarev Fixed. The module in question is Module:languages/data2. Benwing2 (talk) 00:54, 9 July 2020 (UTC)
- @Benwing2: Thank you for the quick fix. This usage of the diaeresis made me think that maybe we should also use ея̈ (jejǫ́) when it corresponds to modern её (jejó)? It was only one word where "я" = "ё", though, if I'm not mistaken and "её" is a more modern pronunciation but there were a lot of double pronunciations and confusions at the time when "ё" was first introduced. --Anatoli T. (обсудить/вклад) 01:05, 9 July 2020 (UTC)
- @Atitarev There are a lot of similar cases, right? E.g. neut/fem pl бѣ́лыя instead of modern бе́лые. Benwing2 (talk) 01:13, 9 July 2020 (UTC)
- @Benwing2: There are some cases but I didn't have this one in mind. бѣ́лыя could be read as it was written without any issues. It could be read as бѣ́лые or as it's spelled, very little difference with unstressed vowels. In fact, it was grammatically helpful but burdensome for learners. What I find troublesome is old "больша́го" for "большо́го". "больша́го" was not pronounced that way long before the reform, unless by clergy. (The introduction of letter "ё" was first mocked by many as (1) too colloquial or (2) in some cases labelled as "просторечие".) --Anatoli T. (обсудить/вклад) 02:02, 9 July 2020 (UTC)
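For readers wondering what "removing the diaeresis from links" amounts to, here is a minimal Python sketch. It is not the code in Module:languages/data2 (that module is Lua and its actual rules are not quoted in this thread); the function name is hypothetical, and the sketch assumes the text is not in decomposed (NFD) form, so that a precomposed ё is left alone.

```python
ACUTE = "\u0301"      # combining acute accent (stress mark)
DIAERESIS = "\u0308"  # combining diaeresis (used over ѣ by our convention)

def entry_name(display_form):
    """Strip the combining marks that are shown in links but absent from page names.

    Only the standalone combining characters are removed. A precomposed
    ё (U+0451) contains no separate U+0308, so it survives unchanged.
    """
    return display_form.replace(ACUTE, "").replace(DIAERESIS, "")

# ѣ is U+0463; the combining marks are written as escapes to keep them visible.
linked_page = entry_name("гн\u0463\u0308здышко")
```

If the input could arrive in NFD, a decomposed ё (е followed by U+0308) would lose its diaeresis under this sketch, so a real implementation would need to handle that case separately.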
Alphabetical order in Ojibwe
(Please feel free to point me to the correct spot for this question if this is not the right one.)
I want to alphabetize lists that appear in Ojibwe pages - both namespace pages as well as categories - according to the (Latin) Ojibwe alphabet (a, aa, b, ch, d, e, g, h, ’, i, ii, j, k, m, n, o, oo, p, s, sh, t, w, y, z, zh).
Simply put, double vowels are considered as one letter, as are ch, sh and zh, and the glottal stop (apostrophe) " ’ " follows h. There are also several letters that don't appear, as well as Canadian syllabics, but I don't think these issues should pose a particular problem, as they don't alter the order internal to the existing letters.
I'm not certain of the technical changes that are necessary for this, nor the scope of the solution, but I see that lists in other languages (Swedish, Spanish, for example) are organized according to those languages' alphabets, so i thought i would ask.
Thanks in advance for any help you can offer. SteveGat (talk) 18:43, 9 July 2020 (UTC)
- We can fix the alphabetical order in categories the same way we do for Welsh, by specifying a sort key at Module:languages/data2. For automatically sorted lists of Derived terms and Related terms, you would have to use {{col-u}} and sort the list manually. —Mahāgaja · talk 19:47, 9 July 2020 (UTC)
- Manual sorting isn't necessary because column templates such as {{der3}} will use the language's sortkey to alphabetize the words. They can also use an arbitrary sortkey, generated by a function specified in Module:collation, to sort more accurately than categories. Currently Egyptian is the only language that makes use of this feature, using a function in Module:egy-utilities. So for instance for Ojibwe, the function could replace a with 1, aa with 2, b with 3, ch with 4, etc. (Actually I would use different replacement characters, but you get the idea.) Column templates can use an arbitrary sortkey because they don't display the sortkey anywhere, but categories can't because they display the first code point of the sortkey in the column headers. — Eru·tuon 22:50, 9 July 2020 (UTC)
- Cool, I wasn't aware that {{der3}} and its allies now use the language's sortkey. —Mahāgaja · talk 06:08, 10 July 2020 (UTC)
- Just coming back to this. I think i understand theoretically what is being suggested, but i'm certain i don't know how to do it. Could someone explain it to me with instructions i might be able to follow or, if it really is so simple, maybe just do it? SteveGat (talk) 13:49, 27 July 2020 (UTC)
- @SteveGat: Unless you have template editor or admin rights, you can't edit the module anyway, but I can do it for you, sort of. I can make it so that in categories, words beginning with "aa" are sorted at the bottom of words starting with "a", but they'll still be in the "A" section, and likewise for "ii" at the bottom of the "I" section, "sh" at the bottom of the "S" section, and "zh" at the bottom of the "Z" section. Since "c" doesn't exist outside of the digraph "ch", no special sorting is necessary for it, but the section will still be labeled "C" in categories. As for the ʼ character, I can make it appear at the bottom of the "H" section. Is that acceptable? In lists generated with {{der3}}, {{rel3}} and the like, the alphabetization will be right. Incidentally, we should be using ʼ U+02BC modifier letter apostrophe, not ' U+0027 apostrophe or ’ U+2019 right single quotation mark, for Ojibwe, since it's functioning as a letter and not as a punctuation mark. —Mahāgaja · talk 14:51, 27 July 2020 (UTC)
- @Mahagaja: Thanks, that would be great. It's less elegant than having an "a" section and an "aa" section, but at least things will be in order. Will it also work within a word, so that, for example, "adwe" is before "adaam"? SteveGat (talk) 14:58, 27 July 2020 (UTC)
- @SteveGat:, yes, that will work too. —Mahāgaja · talk 15:00, 27 July 2020 (UTC)
- @SteveGat: OK, this is done, but it may take a couple of days for it to filter through and actually work in all the categories. Also, I have only sorted the "modifier letter apostrophe", which means any entries using the normal typewriter apostrophe or the curly quote will not sort properly. Those entries should be moved to entries using the modifier letter apostrophe. —Mahāgaja · talk 15:06, 27 July 2020 (UTC)
- @Mahagaja: Great, thanks. I'll check back through the list of lemmas in a few days to see if any apostrophe problems pop up. It is not a common letter, and never appears word-initially, so troubleshooting should be straightforward. SteveGat (talk) 15:10, 27 July 2020 (UTC)
- @SteveGat:. Great, let me know if you have any questions or issues. —Mahāgaja · talk 15:55, 27 July 2020 (UTC)
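The sortkey idea discussed above (treat aa, ch, ii, oo, sh, zh and ʼ as single letters and give each letter a rank) can be sketched as follows. This is a Python illustration only: the on-wiki implementation would be Lua, and the names here are hypothetical. It assumes digraphs are always matched greedily and that any character outside the alphabet (space, hyphen) sorts before all letters.

```python
# Ojibwe (Latin) alphabet in the order given at the top of this thread,
# using U+02BC modifier letter apostrophe for the glottal stop.
OJIBWE_ALPHABET = [
    "a", "aa", "b", "ch", "d", "e", "g", "h", "\u02bc", "i", "ii", "j", "k",
    "m", "n", "o", "oo", "p", "s", "sh", "t", "w", "y", "z", "zh",
]
RANK = {letter: i for i, letter in enumerate(OJIBWE_ALPHABET)}
# Try two-character letters before one-character ones, so "aa" beats "a".
LONGEST_FIRST = sorted(OJIBWE_ALPHABET, key=len, reverse=True)

def ojibwe_sortkey(word):
    """Turn a word into a tuple of letter ranks usable as a sort key."""
    word = word.lower()
    key = []
    i = 0
    while i < len(word):
        for letter in LONGEST_FIRST:
            if word.startswith(letter, i):
                key.append(RANK[letter])
                i += len(letter)
                break
        else:
            # Assumption: characters outside the alphabet sort first.
            key.append(-1)
            i += 1
    return tuple(key)

# "ab" is an illustrative string, not necessarily a real entry.
ordered = sorted(["aaba\u02bc", "ab", "ishpayi\u02bcii", "i\u02bciw"], key=ojibwe_sortkey)
```

With this key, a word starting with "a" plus another letter sorts before any word starting with "aa", and ʼ sorts between h and i, which plain code-point order gets wrong in both cases.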
@SteveGat: Here's a list of Ojibwe entry names with straight apostrophes, as of July 20th. There weren't any with the curly apostrophe.
- a'aw
- aaba'
- aate'ishkodawewinini
- aazhawa'oodoon
- akii-mazina'igan
- anaamaya'ii
- anaamayi'ii
- anama'e-giizhik
- anami'e-giizhik
- anami'egiizhigad
- awegonen i'iw
- ayi'ii
- bagamidaabii'iwe
- bagamidaabii'iweyaang
- biinda'oojigan
- booni'
- daataagwa'igan
- diba'igaans
- diba'igan
- en'
- enyanh'
- gaanda'an
- gaandakii'igan
- gekinoo'amaaged
- gibaakwa'odiiwigamig
- gizaagi'in
- gookooko'oo
- gwiingwa'aage
- i'iw
- ishkwaa-anami'egiizhigad
- ishpayi'ii
- maajidaabii'iwe
- maajidaabii'iweyaang
- mazina'igan
- mazinibii'igan
- mazinibii'igewinini
- mi'iingan
- nanaa'idaabaanewinini
- ningaabii'ani-noodin
- ningwa'
- ningwa'w-
- niningwa'waanaanig
- nitaawigi'
- o'o
- o'ow
- ode'imini-giizis
- wa'aw
- waakaa'igan
- waawiyebii'igan
- wiiji'iwe
- wiiji'iweyaan
- zaaga'igan
— Eru·tuon 18:42, 27 July 2020 (UTC)
- @Erutuon: Thanks for the list. After Mahāgaja moved a few entries to modify curly apostrophes, i did the first row of your list (i didn't know how to do that without creating a redirect). It's pretty arduous, and raised a couple of questions.
- 1) Why not treat regular apostrophes like curly ones on the back end? and
- 2) How do we avoid people recreating the problem by creating new entries using the "straight" apostrophe? SteveGat (talk) 15:55, 29 July 2020 (UTC)
- @SteveGat: The software treats the modifier letter ʼ and the curly apostrophe ’ differently, because they're used for different purposes. The apostrophe (regardless of whether we use the curly kind or the straight kind) is treated as a punctuation mark, while the modifier letter is treated as a letter, equivalent to "a", "b", "c", etc. As for making sure Wiktionarians use the right character in the future, ideally there should be a page WT:About Ojibwe where all of the conventions are outlined. But of course Ojibwe isn't a language that there are ever going to be dozens of editors working on! And don't worry about the redirects; only admins can move a page without creating a redirect. As long as Ojibwe is the only language that uses a certain spelling, keeping the redirect doesn't hurt anything. —Mahāgaja · talk 18:20, 29 July 2020 (UTC)
- I've now moved all the entries from the list @Erutuon: provided, and did a further search through the Ojibwe lemmas and non-lemma forms. The lists seem to be re-alphabetizing, and now i'll add a note to the project page. Anything else i should do? SteveGat (talk) 17:18, 4 August 2020 (UTC)
Requested edit to Template:cite-meta
For the benefit of Wikipedians who cross projects occasionally, it would be swell if {{cite-meta}} (on which {{cite web}}, {{cite book}}, &c. rely) could understand access-date, the normative form in the CS1 Wikipedia templates. Please change all occurrences of {{{accessdate}}} to {{{accessdate|{{{access-date}}}}}}. Psiĥedelisto (talk) 20:21, 10 July 2020 (UTC)
- Pinging some admins. @SemperBlotto, Equinox, Metaknowledge: can y'all do this, please? Psiĥedelisto (talk) 08:40, 25 July 2020 (UTC)
- Done —Suzukaze-c (talk) 23:07, 27 July 2020 (UTC)
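For anyone unfamiliar with nested parameter defaults, the lookup order requested above can be mimicked in Python. This is a sketch with a hypothetical function name, and it models the variant {{{accessdate|{{{access-date|}}}}}} (with an empty final default) rather than the exact wikitext in the request, since a triple-brace parameter with no default is rendered literally when it is not supplied.

```python
def resolve_access_date(params):
    """Pick the access date the way the nested default would.

    `params` maps template parameter names to the values the caller passed.
    A parameter that is passed but empty still counts as passed, which is
    why membership is tested instead of truthiness.
    """
    if "accessdate" in params:
        return params["accessdate"]
    # Fall back to the CS1-style spelling, or to nothing at all.
    return params.get("access-date", "")

wiktionary_style = resolve_access_date({"accessdate": "10 July 2020"})
wikipedia_style = resolve_access_date({"access-date": "10 July 2020"})
```

When both spellings are supplied, the existing accessdate wins, so current Wiktionary usage is unaffected.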
Adding and Using Pali Transliteration
I have now got the Pali transliteration working for the 9 supported abugidas using module {{Module:pi-translit}}. However, it is not possible to always automatically determine whether a word is written in an abugida or an alphabet for the Thai and Lao scripts. Should I register an artificial script for these alphabets? How do I go about the appropriate registration processes? --RichardW57 (talk) 21:22, 11 July 2020 (UTC)
For highly derived subsidiary script form පාචයන්ත් (pācayant) I have the gloss
# {{pi-sc|Sinh|pācayant}}, {{inflection of|pi|පාචෙති||present|participle|tr={{l|pi|pāceti}}}}, {{inflection of|pi|පචති||causative|tr={{l|pi|pacati}}|t=to cook}}
which renders, converting '#' to '*', as:
- Sinhala script form of pācayant, present participle of පාචෙති (pāceti), causative of පචති (pacati, “to cook”)
Can I take advantage of automatic transliteration to make the transliteration a link, or would I have to create a new template, possibly backed up with a new module? An additional complication is that there are words whose transliteration is not the Latin script form of the word, e.g. ທັມມະ (damma, “dharma”), whose Latin script equivalent is dhamma, so I need to be able to not make it a link. --RichardW57 (talk) 21:22, 11 July 2020 (UTC)
vi-etym-sino
{{vi-etym-sino}} is very hard to adopt in other language projects (needs to be translated). Could you simplify it with module logic instead? --Octahedron80 (talk) 03:49, 15 July 2020 (UTC)
- I think that is a bad idea for a Roman script language. At 0.5MB per module used, such a conversion could make further pages run out of memory during rendering. Indenting the code (hiding white space within comments if need be) would make the code much more readable than it is now, thus making it easier to translate. --RichardW57 (talk) 12:11, 15 July 2020 (UTC)
In trying to puzzle out why baba, an entry with only a few translations, had run out of memory, I stumbled on this near-singularity galactic gravity well of a data module- 12.35 MB of lua memory in preview for this and a couple of much smaller modules. I'm not very good at either Hungarian or lua, but it looks like Module:R:ErtSz is loading in a table of basically every word in the Hungarian language so it can find a single line using an alphabetic key.
This leads to the obvious question: can't this be split up into sub-modules, based on the first letter of the key (like the Module:languages data modules)? Granted, the vowels are complicated a bit by diacritics that aren't reflected in ordering of the data, but that seems like a minor problem.
Pinging @Erutuon, Adam78 who have been working on this. Chuck Entz (talk) 06:40, 15 July 2020 (UTC)
@Chuck Entz, as far as I'm concerned, any improvement you (or anybody else) can possibly implement would be very welcome! (I'm not familiar with Lua myself.) Adam78 (talk) 13:19, 15 July 2020 (UTC)
- I'm sorry if my colorful description of the problem gave even the slightest impression of disapproval; my impression was that this was a sketchy first draft of the code, and that it might benefit from a suggestion to speed up the optimization.
- In patrolling CAT:E over the years, I've gotten fairly good at getting a rough idea of what code is probably doing, but actually writing the code requires mastery of the details- and I know better than to implement anything myself. Unless it's correcting an obvious typo that's causing thousands of module errors, I keep my hands off the coding part of modules.
- If I understand the code, the simplest implementation would be to generate the name of the data module that's called by adding the first letter of the term to "Module:R:ErtSz/data", and have all the data that starts with "b" in a module called "Module:R:ErtSz/datab". That's what they did with Module:languages/data when it became obvious that loading in data for all the language codes every time would be unsustainable (see Category:Language data modules). The main choice would be whether to have "Module:R:ErtSz/dataa", "Module:R:ErtSz/dataá", "Module:R:ErtSz/dataä", etc. or to strip the diacritics first so it can all be in "Module:R:ErtSz/dataa".
- If you think about it, this module is called with only one search term per entry, so it's best to avoid loading the data for every word in the language. Theoretically, you could go as far as having a separate module for every word, but that would be a ridiculous waste of time to set up. I think a module for each combination of the first one or two letters should be more than enough to solve the memory problems. If I can help in setting up the data modules, I'd be happy to pitch in. Thanks! Chuck Entz (talk) 15:01, 15 July 2020 (UTC)
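The naming scheme proposed above, with the "strip the diacritics first" option, could look roughly like this. It is a Python sketch of the logic only (the real module is Lua), and the function name is hypothetical; it relies on Unicode decomposition to separate a base letter from its accent.

```python
import unicodedata

def data_module_for(term, prefix="Module:R:ErtSz/data"):
    """Name the data submodule for `term` from its first letter, accents removed."""
    if not term:
        raise ValueError("term must not be empty")
    first = term[0].lower()
    # NFD splits e.g. "á" into "a" + combining acute; drop the combining part.
    decomposed = unicodedata.normalize("NFD", first)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return prefix + base

module_for_baba = data_module_for("baba")
```

Under this option, words beginning with á, ö or ő end up in the same submodules as words beginning with a or o, so the number of submodules stays at the size of the basic Latin alphabet.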
- Splitting up the data by letter of the alphabet is a good idea, but I tried the less drastic measure of changing Module:R:ErtSz/data and Module:R:ErtSz/homonyms to search for the word in the data string and return its code or codes, without parsing the whole data string into a table. Module:R:ErtSz/data is something like 1.5 MB, but parsed into a Lua table it is much larger. This change removed the error in the old version of baba, which I tested in WT:SAND. The memory usage of {{R:ErtSz}} on its own is still about 4 MB, so it might still be worth splitting the module up. Probably a good idea to get it out of the way so I might write a bot script to do it. — Eru·tuon 18:52, 15 July 2020 (UTC)
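To illustrate the "search the data string instead of parsing it into a table" approach, here is a Python sketch. The actual layout of Module:R:ErtSz/data is not shown in this thread, so the format below (one "word, tab, code" pair per line) is purely an assumption, as are the function name and the sample codes.

```python
def find_codes(data, word):
    """Return every code listed for `word` without building a lookup table.

    Assumed format: lines of "word<TAB>code". A leading newline is added so
    that the first line can be matched the same way as all the others.
    """
    haystack = "\n" + data
    needle = "\n" + word + "\t"
    codes = []
    start = 0
    while True:
        pos = haystack.find(needle, start)
        if pos == -1:
            break
        line_end = haystack.find("\n", pos + 1)
        if line_end == -1:
            line_end = len(haystack)
        codes.append(haystack[pos + len(needle):line_end])
        start = line_end
    return codes

# Hypothetical sample data; the codes are invented.
sample_data = "baba\tB001\nbab\tB002\nbaba\tB003\n"
baba_codes = find_codes(sample_data, "baba")
```

Anchoring the needle with a newline on one side and a tab on the other keeps "bab" from matching inside "baba", and homonyms simply yield more than one code.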
It sounds great. All I could do now was replace this template with the old-style version (which requires one to enter the unique dictionary ID manually) on page baba, so the error is solved here but of course it's bound to arise elsewhere until the matter is solved. Adam78 (talk) 16:10, 15 July 2020 (UTC)
- @Erutuon There are lots of module errors now. Benwing2 (talk) 05:06, 16 July 2020 (UTC)
- @Benwing2: Thanks! Should have tested the new version of Module:R:ErtSz/homonyms before saving. Fixed. — Eru·tuon 05:25, 16 July 2020 (UTC)
@Erutuon, thank you! 🙏 Adam78 (talk) 15:16, 16 July 2020 (UTC)
Seeing as the French changed the spelling again, and that the circonflexe has pretty much been taken off, we would be wise to make alternative spellings of the circumflexed words. I started with the creation of hopital (which, as it turns out, is wrong). To start with, I propose the creation of Category:French terms spelled with Â, Category:French terms spelled with Ê, Category:French terms spelled with Î, Category:French terms spelled with Ô, Category:French terms spelled with Û. OK, not all circumflexed words are affected (it seems the rules just apply on the I and the U), but the categories would be useful anyway. See this website, this one and the BBC for more information. --CasiObsoleto (talk) 08:49, 15 July 2020 (UTC)
- "Seeing as the French changed the spelling again": I don't know what you're talking about. The last spelling reform took place in 1990, and there has been much back-and-forth since, but that's all, afaik. And I see no value in having those categories. PUC – 15:12, 15 July 2020 (UTC)
- You're probably right. I should have read up more on the subject before making such a newbie post. --CasiObsoleto (talk) 17:51, 15 July 2020 (UTC)
‘post’ and ‘ante’ should be equivalent to ‘circa’ in Module:Quotations
It's a bit hard to explain, but look, for example, at Module:Quotations/la/data and at the page coruscō. You’ll see that a ‘c.’ in the source code is treated as a special word that calls up a correctly formatted link to circa. Now look at the quotation from Juvenal in that entry. See how the module doesn’t treat ‘p.’ the same as ‘c.’? How would that be fixed? --Biolongvistul (talk) 09:27, 15 July 2020 (UTC)
English Word Dump
I know there are dumps that have a list of all words defined by Wiktionary in any and all languages. Is there a dump like that for only English words? It does not have to include the definitions, just the entry words would be perfect. — This unsigned comment was added by 2600:1700:E40:B5E0:C8A9:602C:1A45:323C (talk) at 05:12, 16 July 2020 (UTC).
- I don't believe so. The latest dumps are here: [1]. I see no filename that would suggest "English only". Equinox ◑ 08:30, 16 July 2020 (UTC)
- It isn't too hard to process the entire dump to extract all the English words. It is a bit harder to extract only lemmas, only non-obsolete forms, etc. Would you want all the content? Just the (current?) definitions? Just the headwords? All of these things are relatively easy using any language that supports regular expressions (e.g., Perl, Python) once one understands the structure of our entries. DCDuring (talk) 18:20, 16 July 2020 (UTC)
- Perhaps we should create a little library of the main relevant regular expressions in each of the main flavors of regular expressions. DCDuring (talk) 18:25, 16 July 2020 (UTC)
- I have lists of each language's entry names from the latest dump (the titles of the pages that have a given language header), but the English one is 10.7 MB so it would need to be uploaded to a place off-wiki to be available for download. Any suggestions? I could start a Toolforge site for this if more people would be interested. — Eru·tuon 19:13, 16 July 2020 (UTC)
- Maybe we should refer such requests to WikiData. DCDuring (talk) 21:55, 16 July 2020 (UTC)
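For anyone who wants to try the regular-expression approach described above, here is a minimal Python sketch. It assumes the dump uses the usual `<page>`, `<title>`, `<ns>` and `<text>` elements and that an English entry is marked by a level-2 `==English==` header; the function names and the tiny in-memory sample are invented for illustration and are not taken from any existing script.

```python
import io
import re
import xml.etree.ElementTree as ET

# A level-2 language header on its own line, e.g. "==English==".
ENGLISH_HEADER = re.compile(r"^==\s*English\s*==\s*$", re.MULTILINE)


def local_name(tag):
    """Strip an XML namespace: '{http://...}page' -> 'page'."""
    return tag.rsplit("}", 1)[-1]


def english_titles(dump_file):
    """Yield titles of main-namespace pages that have an English section."""
    title, ns, text = None, None, ""
    for _event, elem in ET.iterparse(dump_file, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "ns":
            ns = elem.text
        elif name == "text":
            text = elem.text or ""
        elif name == "page":
            if ns == "0" and ENGLISH_HEADER.search(text):
                yield title
            title, ns, text = None, None, ""
            elem.clear()  # release memory; real dumps are large


# Invented three-page sample standing in for a real dump file.
SAMPLE = """<mediawiki>
  <page><title>water</title><ns>0</ns>
    <revision><text>==English==\n===Noun===\n{{en-noun}}</text></revision></page>
  <page><title>víz</title><ns>0</ns>
    <revision><text>==Hungarian==\n===Noun===\n{{hu-noun}}</text></revision></page>
  <page><title>Wiktionary:Grease pit</title><ns>4</ns>
    <revision><text>==English==</text></revision></page>
</mediawiki>""".encode("utf-8")

print(list(english_titles(io.BytesIO(SAMPLE))))  # ['water']
```

Restricting the list to lemmas or non-obsolete forms would need further checks on the headword templates inside the English section, which this sketch does not attempt.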
Please change him/her to them. translatewiki:MediaWiki:Anontalkpagetext/en was updated on 2018-12-14. Thanks. --沈澄心✉ 05:46, 16 July 2020 (UTC)
Template importation request: {{tq}}
It's used for quoting things other people have said on talk pages, in a more visually distinctive way than putting quotation marks around them. I looked for some other template used for that purpose here and didn't find one. It's present and commonly used on English Wikipedia and on Commons, and probably other wikis. I assume one of those can be brought over here somehow, but I don't know how. (In particular, lots of templates on English Wikipedia use Lua, when I have no idea why most of them need it, which could complicate things.) ← PointyOintment ❬t & c❭ 17:42, 17 July 2020 (UTC)
- What's the point? To my knowledge, nobody has ever wanted that template except you, someone who has not made any contributions in mainspace. —Μετάknowledgediscuss/deeds 18:04, 17 July 2020 (UTC)
- I don't know of anyone else requesting it either—I expect that would have come up in my searching, and I probably wouldn't have requested it myself if so (because either the template would already be here or I would've read the previous discussion and learned that it's diswanted and why). But that doesn't mean nobody else has ever wanted it; maybe they just didn't get around to asking. I first thought of requesting it a couple of weeks ago, and didn't do so until today. Or maybe they just didn't realize such a thing was available to ask for. Maybe now that it's been raised, other people will (state that they) want it too.
- And I may not be much of a dictionarist, but does that mean contributions in other ways (such as choosing words of the day, which I was about to start doing) are unwelcome? (You may also note that I have a reasonably long history of constructive editing on English Wikipedia, as well as a userpage there, which I just haven't gotten around to starting here—I'm guessing you looked at my contributions here at least in part because my signature is red?) ← PointyOintment ❬t & c❭ 18:21, 17 July 2020 (UTC)
- I don't think I'd use it, but I wouldn't slap it with {{rfd}} either. DCDuring (talk) 21:01, 19 July 2020 (UTC)
I can see that this parameter links to the right language section (e.g. at number, see Swahili desc). But at sexy the alt form of the Spanish descendants isn't linking because it's called from the same page. Ultimateria (talk) 17:30, 18 July 2020 (UTC)
Possible bot work, looking for feedback
I just edited an alternative spelling entry to ensure that it has the same subject categories as the other spelling. This seems uncontroversial to me. So 1.) is it a good idea to have a bot that automatically ensures that alternative spellings have the same categories and 2.) is there anyone who can actually do this work to make a bot that scans alternative spellings, as I am not smart enough to do it myself? —Justin (koavf)❤T☮C☺M☯ 03:30, 19 July 2020 (UTC)
- I've actually removed categories from alt forms plenty of times. There are 70,000 alternative forms and spellings in English alone; the potential for cluttering our categories with essentially duplicate pages to sift through is high enough for me to oppose these edits. Ultimateria (talk) 06:16, 19 July 2020 (UTC)
- I agree. Not only is this not uncontroversial, I think it's an actively bad idea, and it's most certainly against our usual practice. —Μετάknowledgediscuss/deeds 06:34, 19 July 2020 (UTC)
- I also agree that it's a bad idea. Only the primary entry should be in topic categories. —Mahāgaja · talk 08:22, 19 July 2020 (UTC)
Okay, consensus sussed out, would you (@Ultimateria, Metaknowledge, Mahagaja) be in favor of a bot that does the opposite? —Justin (koavf)❤T☮C☺M☯ 02:36, 30 July 2020 (UTC)
- I would. It's also worth manually going through and removing other info (I think etymologies are especially common), first checking that it's at the main entry. If someone creates it, I'll go through a list of entries whose only definition line has an alt form/spelling template and includes etymology or translation sections. I think topic cats are very likely to be at both entries, so I'm not concerned about removing those by bot. Ultimateria (talk) 02:54, 30 July 2020 (UTC)
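The "opposite" bot discussed above could be sketched roughly as follows. This is only an illustration in Python, not an existing bot script: the list of alt-form template names and the `[[Category:xx:Topic]]` pattern are assumptions about how such entries are usually written, and a real run would also have to handle topic-category templates and check that the main entry carries the category.

```python
import re

# Assumed names of the "alt form/spelling" definition templates.
ALT_TEMPLATE = re.compile(
    r"\{\{(?:alternative form of|alt form|alternative spelling of|alt sp)\|"
)
# Top-level definition lines: "# ..." but not "#:", "#*" or "##".
DEFINITION_LINE = re.compile(r"^#(?![:*#]).*$", re.MULTILINE)
# Raw topic categories such as [[Category:en:Sweets]].
TOPIC_CATEGORY = re.compile(r"^\[\[Category:[a-z-]+:[^\]]+\]\]\n?", re.MULTILINE)


def is_alt_form_only(wikitext):
    """True if every top-level definition line is an alt-form template."""
    definitions = DEFINITION_LINE.findall(wikitext)
    return bool(definitions) and all(ALT_TEMPLATE.search(d) for d in definitions)


def strip_topic_categories(wikitext):
    """Remove raw topic categories, but only from alt-form-only entries."""
    if not is_alt_form_only(wikitext):
        return wikitext
    return TOPIC_CATEGORY.sub("", wikitext)


alt_entry = (
    "==English==\n\n===Noun===\n{{en-noun}}\n\n"
    "# {{alt form|en|dragon's beard candy}}\n\n[[Category:en:Sweets]]\n"
)
main_entry = (
    "==English==\n\n===Noun===\n{{en-noun}}\n\n"
    "# A traditional Chinese sweet.\n\n[[Category:en:Sweets]]\n"
)
print("Category:en:Sweets" in strip_topic_categories(alt_entry))   # False
print("Category:en:Sweets" in strip_topic_categories(main_entry))  # True
```

Entries with etymology or translation sections are deliberately left alone here, since the discussion suggests reviewing those by hand.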
Not even following the link from semicolon --Backinstadiums (talk) 19:32, 19 July 2020 (UTC)
- @Backinstadiums: It's a server bug that has been reported at phab:T238285. — Eru·tuon 19:50, 19 July 2020 (UTC)
- Meanwhile [2] --Backinstadiums (talk) 19:53, 19 July 2020 (UTC)
- I managed to sort of fix the link at semicolon by having it link to one of the other semicolon-like characters ("︔") on that page, which redirects to the correct page. It's a real amateurish kludge, but it gets where it needs to go (Oddly enough, the Greek question mark (";") is the only one of those characters that gets the automatic redirect, and that's the only one that's semantically not a semicolon). Apparently redirects work okay. Once you get to that page, you can edit it, but it goes to the main page instead of displaying the new version (your edit does show up in the edit history).
- Just to see if I could, I created a redirect at Category:English terms spelled with ﹔ that goes to Category:English terms spelled with ;, but that only works if you search for or link to it- the category link on the entry page goes to Category:English terms spelled with. I'm still tinkering with manually changing urls for other pages to see what else I can get access to. Chuck Entz (talk) 20:58, 19 July 2020 (UTC)
- Here's Pages that link to ";" (you have to modify the url for some other whatlinkshere query to get this). Also, I notice from a usage note that semicolon is most commonly used in Greek online instead of the dedicated Greek question mark, so that explains the automatic redirect. Chuck Entz (talk) 21:26, 19 July 2020 (UTC)
- It's not exactly a redirect; the Greek question mark is normalized to a semicolon by the MediaWiki backend, so it is replaced with a semicolon whenever anyone tries to insert it into a page or use it in a title. The character was added in version 1.1 in 1993 so Unicode probably decided on its normalization early on (though I'm not sure when they came up with normalization). As with other characters that are changed by normalization, the Greek semicolon can only be displayed by using a HTML character reference (;). — Eru·tuon 20:26, 20 July 2020 (UTC)
- On the talk page for semicolon, someone posted the suggestion to use en.wiktionary.org/w/index.php?title=;&redirect=no, which does work. Chuck Entz (talk) 21:33, 19 July 2020 (UTC)
- We should probably just make it an unsupported title. DTLHS (talk) 20:30, 20 July 2020 (UTC)
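The normalization behaviour described in this thread can be reproduced with Python's standard `unicodedata` module. The sketch assumes that the normalization form involved is NFC; it only demonstrates the Unicode rule and says nothing about how the MediaWiki backend implements it.

```python
import unicodedata

GREEK_QUESTION_MARK = "\u037e"  # looks like ";" but is a separate code point

# Canonical normalization replaces it with the ordinary semicolon.
normalized = unicodedata.normalize("NFC", GREEK_QUESTION_MARK)

# A numeric character reference is one way to write the original code point.
char_ref = f"&#x{ord(GREEK_QUESTION_MARK):X};"

print(unicodedata.name(GREEK_QUESTION_MARK))  # GREEK QUESTION MARK
print(unicodedata.name(normalized))           # SEMICOLON
print(normalized == ";")                      # True
print(char_ref)                               # &#x37E;
```

This is why a title typed with the Greek question mark ends up at the same page as the plain semicolon: after normalization the two strings are identical.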
normalizing the position of {{crh-latin-verb}}
Can we have a bot move all instances of misplaced {{crh-latin-verb}} preceding the definition down, and create a ====Conjugation==== header? This should look like this. Allahverdi Verdizade (talk) 22:45, 19 July 2020 (UTC)
- @Allahverdi Verdizade I ran a bot to do this. Let me know if it missed anything or messed anything up. Benwing2 (talk) 04:09, 23 July 2020 (UTC)
- @Benwing2 Thank you! Allahverdi Verdizade (talk) 15:20, 23 July 2020 (UTC)
Search-fu: searching for words ending in X
It's easy enough to find words in language ABC that start in XYZ: just go to the category for terms in that language.
- But what if I wanted to find all words in language ABC that end in XYZ? Any ideas on how to do that?
Advanced topic: For example, say I wanted to find all Japanese terms where the romanization ends in -ita. Japanese hiragana is a syllabary, so if I tried searching by kana (syllabic letter), I'd have to find all terms ending in each of いた・きた・した・ちた・にた・ひた・みた・りた. Or, for Korean, hangul is alphabetic, but it's composed into set glyphs that comprise multiple individual jamo (letters). If I tried searching by jamo for all Korean words ending in -i, it'd be a similar mess (I don't have a Korean IME installed, so I'll forgo listing examples).
- How would I find all words in a language ABC that uses a non-Latin, non-alphabetic script, where the romanization ends in XYZ?
Curious, ‑‑ Eiríkr Útlendi │Tala við mig 03:58, 21 July 2020 (UTC)
- https://dixtosa.toolforge.org/ Chuck Entz (talk) 06:29, 21 July 2020 (UTC)
- Thank you, Chuck! Cool stuff.
- Unfortunately, it doesn't seem to work for romanizations. For example, if I search for ri in the category Korean_lemmas, I'd hope to see entries such as 머리 (meori) and 다리 (dari). Instead, I get nothing. I have to enter 리 as my search string. Testing also confirms that I cannot use uncomposed jamo for searching -- attempting to search for ㅣ (the lone jamo representing /i/) finds only the [[ㅣ]] page.
- Any other possible search-fu moves? :) ‑‑ Eiríkr Útlendi │Tala við mig 21:39, 21 July 2020 (UTC)
- I don't know of a way to search romanizations, but I could probably make a Toolforge site for it if I can motivate myself. The backend for the site would need a minimal Lua module infrastructure to generate the transliterations for the search index, which could be a little complicated. A search engine just for titles, which would allow you to find Japanese entry titles matching a regex [いきしちにひみり]た$, would be easier, because I already have a program to generate an index of entry titles for every language based on language headers. (Unfortunately the index currently doesn't list the Han-script entries for the various Chinese languages that are under the Chinese header.) The MediaWiki intitle:// search feature would allow suffix searches if the developers added support for $, which is puzzlingly missing.
- It would be an interesting project. Not sure what the name of the Toolforge site would be. Maybe wiktionary-entry-names. I worry about it not being more general, in case I come up with some other idea besides searching entry names and romanizations of entry names and want to include it. — Eru·tuon 08:04, 23 July 2020 (UTC)
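Given a plain list of entry titles (such as one extracted from a dump), both kinds of suffix search raised in this thread can be tried offline. The Python sketch below uses the regex quoted above for Japanese and, for Korean, decomposes each title with Unicode NFD so that the final jamo can be tested. The helper names and the short word lists are made up for the example; this is not the proposed Toolforge tool.

```python
import re
import unicodedata

# Regex from the discussion: an i-row kana followed by た at the end.
ITA_SUFFIX = re.compile(r"[いきしちにひみり]た$")

# U+1175 HANGUL JUNGSEONG I, the conjoining form of the vowel ㅣ.
HANGUL_VOWEL_I = "\u1175"


def titles_matching(titles, pattern):
    """Return the titles in which the pattern is found."""
    return [t for t in titles if pattern.search(t)]


def ends_in_vowel_i(word):
    """True if the last syllable ends in plain ㅣ with no final consonant."""
    return unicodedata.normalize("NFD", word).endswith(HANGUL_VOWEL_I)


japanese = ["した", "きた", "たべる", "みたい"]
korean = ["머리", "다리", "사람"]

print(titles_matching(japanese, ITA_SUFFIX))      # ['した', 'きた']
print([w for w in korean if ends_in_vowel_i(w)])  # ['머리', '다리']
```

Note that this is still a search on the script rather than on the romanization: titles ending in a compound vowel such as ㅟ also romanize with a final -i but would need their own jamo added to the test.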
Language settings at menu
I know it is not en.wikt-specific, but there is something wrong about the Language settings - left hand menu.
- Choosing language: but we still get the endonym if we click set in English e.g. Ελληνικά instead of Greek, etc. Some scripts are incomprehensible...
- alphabetically in the chosen language:
- #if set in English → English - en / Greek - el
- #if set in Greek → Αγγλικά - en / Ελληνικά -el
- It would be nice to have an extra feature by code in the chosen language e.g.
- el - Greek
- el - Ελληνικά
Plus a) choose the languages you wish to view and b) view all languages Thank you ‑‑Sarri.greek ♫ | 01:33, 22 July 2020 (UTC)
Template:sisterprojects
Template:sisterprojects edit request. Please replace Wikiversity's "Free learning tools" with "Free learning resources". See Wikipedia:Template:Wikipedia's sister projects and associated discussion. Thanks! -- Dave Braunschweig (talk) 18:57, 22 July 2020 (UTC)
- Done – Jberkel 22:06, 22 July 2020 (UTC)
Specifying "chiefly countable"
Is there any way, using the standard templates such as "en-noun" or "head", to specify that a noun is "chiefly countable", or words to that effect? I'm talking about the main heading, not an individual numbered sense. Mihia (talk) 20:00, 22 July 2020 (UTC)
- tlb (Template:term-label) is just like lb but covers the whole term instead of one sense. Equinox ◑ 20:52, 22 July 2020 (UTC)
- Thank you. Mihia (talk) 21:18, 22 July 2020 (UTC)
"In other languages" sidebar - always expanded?
Is there a way (in preferences?) to make "In other languages" on the left always expanded, so that I could see ALL interwiki links? --Anatoli T. (обсудить/вклад) 05:24, 23 July 2020 (UTC)
- @Atitarev: Go to Special:Preferences, "Appearance" tab, scroll to the bottom and uncheck the box "Use a compact language list, with languages relevant to you.". —Mahāgaja · talk 09:10, 23 July 2020 (UTC)
- @Mahagaja: Great, thank you! It makes it much easier to go to a specific interwiki link while editing. --Anatoli T. (обсудить/вклад) 09:19, 23 July 2020 (UTC)
Homonyms as a new category to Category:Terms by lexical property subcategories by language?
I wonder if there is a way to have homonyms of a language collected in a category, namely those pages that have the "Etymology 2" string within the section of a given language. I suppose there must be a way to make the software populate categories, ideally by means of a {{cln}} category. I know there are hundreds (possibly a few thousands) of terms that could be included among Hungarian entries: it would be useful for language learners (e.g. to help them get used to different ways of parsing words). I suppose it would be useful and interesting for other language editions as well. Any ideas? Adam78 (talk) 19:43, 24 July 2020 (UTC)
@Benwing2, I do hope not to take advantage of your kindness. I'm wondering if you think you might also address this question of mine above. Checking 26,000 lemmas and 37,000 non-lemma forms would be impossible manually and this condition seems fairly straightforward for a bot. Afterwards I'd like to subcategorize Hungarian homonyms like "Hungarian homonyms between non-lemma noun forms and verb lemmas" etc., so adding {{cln|hu|homonyms}} to each instance would be great, as they could be specified manually later. I thought that later, newly created homonymous entries could be located by a bot that runs once in a while (perhaps the way anagrams are handled), in a way that it avoids re-adding a term to the general category if it's already included in any of its subcategories ("…… homonyms" with or without any extension). Adam78 (talk) 17:44, 6 August 2020 (UTC)
- @Adam78 There isn't currently a "homonyms" category, but there's a similar category "terms with multiple etymologies", e.g. Category:English terms with multiple etymologies. There's no such Hungarian category currently, but I could create it by bot. I can also create the lemma-and-non-lemma category, if we can come up with a suitable name and if you give me lists of Hungarian headword lemma and non-lemma templates. Benwing2 (talk) 02:54, 7 August 2020 (UTC)
- "Homonyms" is ambiguous in some languages since it can refer to homophones or homographs (same sound or same spelling) and for some languages (though I think usually not for Hungarian) that is different. There's already Category:Hungarian terms with homophones, so the question is whether that's enough, or whether you want to combine the concepts of homophone and homograph with a homonym category. If so, the homonym category should not be populated for languages such as English where homonyms and homographs are different, because it could lead to confusion. — Eru·tuon 20:12, 7 August 2020 (UTC)
- Oops, failed to read the category description. Category:Hungarian terms with homophones is actually homophones that have different spellings (homophone heterographs?), so it's more restrictive than the homonyms category proposed here. — Eru·tuon 20:33, 7 August 2020 (UTC)
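The condition proposed at the top of this section (an "Etymology 2" header inside one language's section) is simple to test on raw wikitext. The following Python sketch shows one possible check; the function names and the sample text are hypothetical, and it is not the bot code used later in this thread.

```python
import re

# Numbered etymology headers such as "===Etymology 2===".
ETYMOLOGY_N = re.compile(r"^===\s*Etymology \d+\s*===\s*$", re.MULTILINE)


def language_section(wikitext, language):
    """Return the text under the ==language== header, or '' if absent."""
    pattern = re.compile(
        r"^==\s*" + re.escape(language) + r"\s*==\s*$(.*?)(?=^==[^=]|\Z)",
        re.MULTILINE | re.DOTALL,
    )
    match = pattern.search(wikitext)
    return match.group(1) if match else ""


def has_multiple_etymologies(wikitext, language):
    """True if the language section has two or more numbered etymologies."""
    section = language_section(wikitext, language)
    return len(ETYMOLOGY_N.findall(section)) >= 2


# Invented page with one English etymology and two Hungarian ones.
SAMPLE = """==English==
===Etymology===
===Noun===
==Hungarian==
===Etymology 1===
====Noun====
===Etymology 2===
====Verb====
"""

print(has_multiple_etymologies(SAMPLE, "Hungarian"))  # True
print(has_multiple_etymologies(SAMPLE, "English"))    # False
```

Limiting the search to the language's own section matters because a page can have several etymologies in one language and only one in another.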
@Benwing2 Wow, you're amazing. :) What about names like "Hungarian terms with non-lemma noun form and verb lemma etymologies" (with the constant elements bolded)? Noun comes before Verb, and lemma comes before non-lemma, so these could be considered for the order of mentioning. Or when the POS and the lemma-state are the same, then e.g. "Hungarian terms with multiple non-lemma verb form etymologies". (You are a native speaker of English, I am not, so your suggestions are very welcome.) Actually, if a word has e.g. multiple non-lemma verb etymologies and a single non-lemma noun etymology (like kísértetek, altogether five, including one noun), it could be categorized in two categories, one for the homonymy between verb forms, and another for the homonymy between noun form and verb form. I hope things are not getting overly complicated. :)
The relevant Hungarian-specific headword templates are the following: {{hu-adj}}, {{hu-adv}}, {{hu-noun}}, {{hu-pron}}, and {{hu-verb}}. For the other parts of speech, the generic {{head|hu|……}} is used. The categories are the following: category:Hungarian adjectives (including their forms), category:Hungarian adverbs (including their forms), category:Hungarian conjunctions (indeclinable), category:Hungarian determiners (including their forms, as well as 3 articles, but this latter group doesn't matter, only "determiner" does), category:Hungarian interjections (indeclinable), category:Hungarian nouns (including their forms), category:Hungarian numerals (including their forms), category:Hungarian particles (indeclinable), category:Hungarian postpositions (indeclinable), category:Hungarian pronouns (including their forms), and category:Hungarian verbs (including their forms). There is also category:Hungarian morphemes, which includes quite a few sets of multiple etymologies, but they can be homonymous only within the category (due to the hyphen), so their category can be named like "Hungarian terms with multiple morpheme etymologies". The categories that we can ignore this time are category:Hungarian phrases (due to the spaces inside) and category:Hungarian proper nouns (since no other POS is capitalized). Is this answer sufficient for you? Adam78 (talk) 12:19, 7 August 2020 (UTC)
- @Adam78 I looked into implementing this. I have written code to add the three categories Category:Hungarian terms with multiple lemma etymologies (327 pages), Category:Hungarian terms with lemma and non-lemma form etymologies (297 pages), and Category:Hungarian terms with multiple non-lemma form etymologies (490 pages); 1078 pages total, with some pages in more than one category. The code can also add more specific categories such as Category:Hungarian terms with noun and verb lemma etymologies, Category:Hungarian terms with noun lemma and noun non-lemma form etymologies, Category:Hungarian terms with noun and verb non-lemma form etymologies and Category:Hungarian terms with multiple verb form etymologies. However, there are a lot of these categories. Here's a table of all pairs of POS's found in different etymology sections:
POS1 | POS2 | Count |
---|---|---|
noun form | verb form | 204 |
verb form | verb form | 175 |
noun form | noun form | 114 |
noun | noun | 87 |
noun | noun form | 84 |
noun | verb | 74 |
morpheme | morpheme | 55 |
verb | noun form | 51 |
adjective | noun | 49 |
noun | verb form | 48 |
adjective | noun form | 47 |
verb | verb form | 35 |
noun form | pronoun form | 19 |
interjection | noun | 15 |
verb | verb | 13 |
adjective | verb form | 13 |
adverb | noun form | 12 |
adjective | verb | 11 |
pronoun form | verb form | 11 |
adjective | adjective | 10 |
adverb | noun | 9 |
adverb | verb | 8 |
letter | noun | 8 |
adjective form | noun form | 7 |
adjective form | verb form | 7 |
adverb | verb form | 6 |
noun | pronoun | 5 |
adjective | adverb | 4 |
adverb | adjective form | 4 |
pronoun | noun form | 3 |
adverb | numeral form | 3 |
adjective | interjection | 3 |
noun | pronoun form | 3 |
interjection | letter | 3 |
pronoun | verb form | 3 |
numeral | numeral form | 2 |
numeral | verb | 2 |
interjection | verb | 2 |
proper noun | proper noun | 2 |
noun | particle | 2 |
adverb | pronoun | 2 |
verb | pronoun form | 2 |
determiner | pronoun form | 2 |
verb | adjective form | 2 |
conjunction | verb form | 2 |
conjunction | verb | 2 |
noun | numeral | 2 |
letter | pronoun | 2 |
adjective form | adjective form | 2 |
pronoun | pronoun form | 2 |
interjection | verb form | 1 |
adjective | postposition | 1 |
letter | verb | 1 |
adjective | letter | 1 |
pronoun | pronoun | 1 |
conjunction | interjection | 1 |
adjective form | adverb form | 1 |
pronoun | verb | 1 |
numeral | noun form | 1 |
postposition | verb | 1 |
conjunction | noun | 1 |
adverb | determiner | 1 |
determiner | determiner form | 1 |
noun | determiner form | 1 |
postposition | pronoun | 1 |
interjection | noun form | 1 |
determiner | letter | 1 |
conjunction | letter | 1 |
adverb | pronoun form | 1 |
noun | postposition | 1 |
interjection | pronoun | 1 |
There would be 71 of these categories. Are you sure you want all of them created? Benwing2 (talk) 03:42, 8 August 2020 (UTC)
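A table like the one above can be produced by pairing up the parts of speech found in different etymology sections of each page and counting the pairs. The Python sketch below shows the idea only; it is not Benwing2's script, it orders each pair alphabetically (which the table above does not), and apart from the make-up of kísértetek described earlier (four verb-form etymologies and one noun-form etymology) the data is invented.

```python
from collections import Counter
from itertools import combinations


def pos_pairs(etymology_pos):
    """etymology_pos holds one list of POS labels per etymology section.

    Returns the set of unordered POS pairs formed across different sections.
    """
    pairs = set()
    for first, second in combinations(etymology_pos, 2):
        for a in first:
            for b in second:
                pairs.add(tuple(sorted((a, b))))
    return pairs


pages = {
    # Composition as described in the discussion above.
    "kísértetek": [["verb form"]] * 4 + [["noun form"]],
    # Invented placeholder page with a noun and a verb etymology.
    "example": [["noun"], ["verb"]],
}

counts = Counter(
    pair for sections in pages.values() for pair in pos_pairs(sections)
)
for (pos1, pos2), count in sorted(counts.items()):
    print(pos1, "|", pos2, "|", count)
```

Because each page contributes a given pair at most once, a page such as kísértetek lands in both the "noun form and verb form" and the "multiple verb form" groups, which matches the two-category outcome suggested earlier in the thread.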
@Benwing2, to me, it sounds fantastic, so definitely YES! Thanks a lot! I can hardly wait to see them! Adam78 (talk) 09:33, 8 August 2020 (UTC)
- @Adam78 They are added but the categories themselves haven't been created. I still need to add {{auto cat}} support for them. Benwing2 (talk) 00:11, 9 August 2020 (UTC)
- There was a bug in my code that generated some bad categories in certain cases, and I have decided to simplify the names to follow only two formats: either Category:Hungarian terms with multiple POS etymologies or Category:Hungarian terms with POS1 and POS2 etymologies. There are various reasons for this, one of which is that the old format won't work so well with participles and participle forms, both of which are non-lemma forms. Benwing2 (talk) 02:28, 9 August 2020 (UTC)
All right, whichever way you find it the most fitting! I'm sorry this task proved even more complex than it seemed. It's a huge contribution indeed. Hungarian homonyms have probably never been collected so extensively and systematically (and it's not even complete yet, despite the 1,000 examples). Wiktionary is becoming a veritable treasure trove of the Internet. Thanks a lot!!! Adam78 (talk) 11:36, 9 August 2020 (UTC)
- @Adam78 I finished adding the {{auto cat}} support for the various categories. You can create the missing categories yourself using {{auto cat}}, or wait till tomorrow when I do a run to create missing categories in Special:WantedCategories, which gets populated every 3 days. Benwing2 (talk) 21:02, 9 August 2020 (UTC)
@Benwing2: What shall I say? Wonderful. Brilliant! Thank you. :) I added some of the major categories but I think there are some more, so I'll wait. – A question. I suppose future entries will need to be added to these categories manually. Or do you plan to run such a bot later, once in a while? Adam78 (talk) 21:51, 9 August 2020 (UTC)
- @Adam78 Either way. It's easy enough to rerun the bot script periodically, as it doesn't require any manual intervention. Currently there's no support for removing categories that are no longer needed, e.g. in case someone decides to merge two etymologies, so those categories would have to be manually removed. Benwing2 (talk) 22:00, 9 August 2020 (UTC)
@Benwing2: If you can periodically rerun this script, I'll greatly appreciate it, as I'll not bother to add these categories myself. (I've just found another homonym, cégére, and I was wondering if I could leave this job to the bot.) By the way, auto cat doesn't seem to work here: Category:Multiple etymology subcategories by language. Adam78 (talk) 23:39, 9 August 2020 (UTC)
@Benwing2: Is there anything to be done manually if the categories Category:Hungarian pronoun and verb form etymologies, Category:Hungarian postposition and verb form etymologies, and Category:Hungarian adverb and verb form etymologies are not accepted by {{auto cat}}? Adam78 (talk) 09:57, 2 February 2023 (UTC)
- @Adam78 They're not accepted because they're in the wrong format; they should be Category:Hungarian terms with pronoun and verb form etymologies etc. i.e. the 'terms with' portion is missing. Benwing2 (talk) 11:02, 2 February 2023 (UTC)
- @Benwing2: I am sorry. Fixed them. Thank you! Adam78 (talk) 11:32, 2 February 2023 (UTC)
"Prank vandalism" abuse filter blocking an edit
I am attempting to change dragon's beard candy to the following content:
==English==
{{wikipedia}}
[[File:Dragons beard candy.JPG|thumb|A container of '''dragon's beard candy'''.]]

===Etymology===
{{calque|en|zh|-}} {{zh-l|龍鬚糖}}.

===Noun===
{{en-noun|head=[[dragon]][['s]] [[beard]] [[candy]]|-}}

# A traditional [[Chinese]] [[sweet]] similar to [[halva]] or [[cotton candy]] made from [[fine]] [[white sugar]], peanuts, [[desiccated]] [[coconut]], white sesame seeds, [[corn syrup]] and [[glutinous rice]] [[flour]].

====Translations====
{{trans-top|traditional Chinese sweet}}
* Chinese:
*: Mandarin: {{t|cmn|龍鬚糖|tr=lóngxūtáng}}
{{trans-mid}}
* Finnish: {{t|fi|[[lohikäärmeen]] [[parta]]}}
* Portuguese: {{t-needed|pt}}
{{trans-bottom}}

[[Category:en:Sweets]]
Unfortunately, I am met with the message "Error: This action has been automatically identified as harmful, and therefore disallowed. If you believe your action was constructive, please inform an administrator of what you were trying to do. A brief description of the abuse rule which your action matched is: Prank vandalism".
Can anyone provide any guidance or information? Additionally, the error message would probably be more helpful if it linked to the Grease Pit and to a page that explains edit filters or the specific one in more detail. Thanks for any help and please ping me. —The Editor's Apprentice (talk) 20:19, 25 July 2020 (UTC)
- @Chuck Entz You last touched this private filter. Vox Sciurorum (talk) 15:11, 27 July 2020 (UTC)
- I was able to make the edit without issue, perhaps because I'm an admin, though of course that shouldn't be a prerequisite for such a simple edit. —Mahāgaja · talk 15:26, 29 July 2020 (UTC)
- @Mahagaja: I did a bit of research for information about the abuse filter and have come upon some details. First, the specific filter which blocked my edit was abuse filter number 49. Since its visibility is set to "private" I am unable to view almost any details about the filter. The only ones that I can see are those listed in the Special:AbuseFilter table. These details show that the filter was last modified by Chuck Entz at 23:38, July 27, 2020, about 8 hours after being pinged by Vox Sciurorum, presumably in response to the ping and my post. They further show that the filter is currently disabled. This explains why you were able to make the edit without issue. Something that I personally find interesting is that the change Chuck Entz made is not recorded in the contributions log. I hope this information is helpful and/or interesting to you, it was for me. —The Editor's Apprentice (talk) 18:09, 31 July 2020 (UTC)
- The filter looks for a couple of specific words and variations thereof that people insert randomly into text as a prank. Sometimes a large block of text will coincidentally include words containing parts that match variations of those words in the correct order. This is extremely, extremely rare, but I disabled the filter until I can figure out how to keep it from happening at all. The practice I'm trying to prevent isn't common (though those who do it are trying very hard to be subtle and non-obvious), so it's not worth the disruption. Chuck Entz (talk) 18:24, 31 July 2020 (UTC)
- You could enable the filter but have it not block the edit, and check the list of filter matches every once in a while to see what needs to be reverted. Vox Sciurorum (talk) 14:10, 2 August 2020 (UTC)
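The kind of false positive described above, and the log-only mode suggested as a workaround, can be illustrated in a few lines of Python. This is not AbuseFilter syntax and not the private filter's rule: "foo" and "bar" are placeholder trigger words, and both patterns and the function name are invented for the example.

```python
import re

# Loose rule: the fragments may appear inside longer words, in order.
LOOSE = re.compile(r"foo.*bar", re.IGNORECASE | re.DOTALL)
# Stricter rule: both fragments must be whole words.
STRICT = re.compile(r"\bfoo\b.*\bbar\b", re.IGNORECASE | re.DOTALL)


def filter_action(text, pattern, block):
    """Return what a filter would do: 'allow', 'log' or 'disallow'."""
    if not pattern.search(text):
        return "allow"
    return "disallow" if block else "log"


innocent = "Seafood stalls often sell it next to barley candy."
prank = "A traditional foo sweet sold at the bar."

print(filter_action(innocent, LOOSE, block=True))   # disallow (false positive)
print(filter_action(innocent, STRICT, block=True))  # allow
print(filter_action(prank, STRICT, block=True))     # disallow
print(filter_action(innocent, LOOSE, block=False))  # log
```

A longer text gives a loose pattern more chances to match by coincidence, which fits the observation that a large block of new content triggered the filter; running in log-only mode trades prevention for a list of matches that someone reviews afterwards.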
Transcription errors
Even though I specified the transcription as outlined at Template:alter at өсүмдүк, the untranscribed text was still returned. I also tested it with Template:link and the same thing happened so it's a problem with a module somewhere. İʟᴀᴡᴀ–Kᴀᴛᴀᴋᴀ (talk) (edits) 03:09, 27 July 2020 (UTC)
- kk has "override_translit = true" set in Module:languages/data2 which determines the behavior you are seeing. DTLHS (talk) 03:56, 27 July 2020 (UTC)
- I came up with a solution: the transliteration function in Module:ky-translit now returns nothing if the text is in Arabic script, so your manual transliteration is shown. — Eru·tuon 04:59, 27 July 2020 (UTC)
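The behaviour described in the two replies above can be sketched roughly as follows. This is only an illustration in JavaScript, not the actual Lua code in Module:ky-translit or Module:languages/data2: the function names, the tiny character table, and the `overrideTranslit` argument are all hypothetical stand-ins.

```javascript
// Hypothetical, heavily abbreviated Cyrillic-to-Latin table (illustration only;
// the real module covers the whole alphabet).
const KY_CYRILLIC_TO_LATIN = {
  "ө": "ö", "с": "s", "ү": "ü", "м": "m", "д": "d", "к": "k",
};

// Matches any character in the main Arabic-script Unicode blocks.
const ARABIC_SCRIPT = /[\u0600-\u06FF\u0750-\u077F\uFB50-\uFDFF\uFE70-\uFEFF]/;

// Automatic transliteration. Returns null for Arabic-script input,
// mirroring "returns nothing if the text is in Arabic script".
function autoTransliterate(text) {
  if (ARABIC_SCRIPT.test(text)) {
    return null;
  }
  return Array.from(text)
    .map((ch) => KY_CYRILLIC_TO_LATIN[ch] ?? ch)
    .join("");
}

// With overrideTranslit set, an available automatic result replaces the
// manual one; the manual value is only used when there is no automatic result
// or when the override is off.
function chooseTransliteration(text, manual, overrideTranslit) {
  const automatic = autoTransliterate(text);
  if (overrideTranslit && automatic !== null) {
    return automatic;
  }
  return manual ?? automatic;
}
```

Under these assumptions, a manual transliteration supplied for Cyrillic text is ignored while the override is on, but one supplied for Arabic-script text is shown, because the automatic function has nothing to offer.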
Verb and Noun templates for the Norwegian language
Hi! Not too familiar with the discussions on Wiktionary so hope I am posting this in the right place. If not let me know.
I want to request templates for Norwegian (specifically Bokmål) for verbs and nouns, and maybe even adjectives. Both Danish and Swedish have their own, but Norwegian has been left out! For example, if you look at the Danish entry "have" (which is both a verb and a noun), you will find templates for both the noun and the verb, as well as the "main" forms written out next to the base form. Norwegian entries only have these main forms written at the top, but never a template with all the forms. Certain verb forms are never included at all in Norwegian entries, such as the non-finite forms, which could add at least 6 forms to a verb that Wiktionary readers would otherwise not know. For nouns, the genitive could be added, as it is never included in Norwegian entries either. The way it is now is very inconsistent. Some entries have a lot of information, others have very little, as there is no unifying template giving some kind of pointer as to how much information is required. Hope someone can help out :) Supevan (talk) 05:30, 29 July 2020 (UTC)
- For verbs, the template would need the following forms: infinitive active, infinitive passive, present indicative active, present indicative passive, past indicative active, past indicative passive, subjunctive, imperative, imperfective participle, perfective participle (masculine, feminine and neuter, where applicable) and perfective plural/definite.
- Looks like the Swedish Wiktionary already has several templates for Norwegian at sv:Mall:no-subst for nouns, verbs, articles and pronouns. These could be adapted, though I have yet to figure out how templates work on MediaWiki. Kritixilithos (talk) 15:08, 2 August 2020 (UTC)
- Wow, you're right! They look pretty great. I hope someone can adapt them to English Wiktionary; I have no idea how that works, but would love to be able to use them. Supevan (talk) 08:30, 3 August 2020 (UTC)
- @Supevan: Here's a thread started yesterday on Norwegian templates, User_talk:Donnanz#Splitting_Template:no-noun-infl, in case you want to chime in, and for future reference.
Disabling LQT without losing archives
Does anyone know if there's any way to remove the old LiquidThreads system from a page without losing all of the old threads? --Yair rand (talk) 05:53, 29 July 2020 (UTC)
- (If not, is it possible to move the old LT page to an archive subpage and then 'reformat' the 'main' page?) - -sche (discuss) 18:25, 31 July 2020 (UTC)
Feedback on Quiet Quentin with {{quote-text}} templates
I made a modified version of QQ that uses {{quote-book}} and {{quote-journal}} instead of raw wikitext. I've accounted for all the edge cases I could find, but there are probably more out there. Is this something anybody wants?
You can test it out by disabling QQ in your preferences and adding the following to your /common.js:
importScript('User:Enoshd/QQ-test.js');
importStylesheet('User:Enoshd/QQ-test.css');
mw.loader.load(['jquery.ui']);
—Enosh (talk) 14:49, 30 July 2020 (UTC)
- @Enoshd: I have tested out the code by following the instructions you provided and it has worked well for me. Thanks for doing the work to create what is, in my mind, an improved version of Quiet Quentin. If you are taking requests, there are a few capabilities that I would enjoy being added to the gadget. —The Editor's Apprentice (talk) 22:40, 30 July 2020 (UTC)
- @The Editor's Apprentice: Glad to hear and happy to take requests. —Enosh (talk) 07:50, 31 July 2020 (UTC)
- @Enoshd: Awesome. The first thing that comes to mind for me about Quiet Quentin (or I guess, more accurately, your modified version of it) is that the window opens to about the height of my screen when a search is conducted, so by default the "more results" link is off my screen. I would like either the last-set window dimensions to be remembered from page to page, so that I only have to resize the window once, or the default window height to be reduced. Another request concerns the "..." that is currently appended to the end of quotations. The documentation pages for Template:nb... and Template:... recommend that Template:nb... be used instead, to help differentiate ellipses that are present in the original text from those added by editors. As for dating quotes, it looks like Google Books provides a standardized YYYY-MM-DD format for sources, so I would appreciate month and day information also being included in the generated quotation code. Thanks again. —The Editor's Apprentice (talk) 17:20, 31 July 2020 (UTC)
- @Enoshd: Hello again, I've gotten around to using your version of QQ again and have seen the changes that you've made in accordance with my requests. I just want to come back to this thread and say that they've made my experience much better and that I appreciate the improvements. —The Editor's Apprentice (talk) 03:58, 22 August 2020 (UTC)
- @Enoshd I just discovered that this was a thing. Thank you so much! I've been wanting this since I started using QQ. Andrew Sheedy (talk) 17:13, 10 October 2020 (UTC)
- @Enoshd On Firefox, I get a transparent overlay. – Jberkel 13:07, 13 October 2020 (UTC)
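For anyone curious what the requests earlier in this thread amount to (a trailing {{nb...}} on truncated passages, and full dates taken from a YYYY-MM-DD string), here is a minimal sketch. It is not the code in User:Enoshd/QQ-test.js; `dateParams` and `buildQuoteBook` are hypothetical names, and the parameter names (`date`, `year`, `month`, `author`, `title`, `passage`, language code as first positional parameter) reflect my reading of the {{quote-book}} documentation and should be checked against it.

```javascript
// English month names, used to turn "05" into "May".
const MONTHS = [
  "January", "February", "March", "April", "May", "June",
  "July", "August", "September", "October", "November", "December",
];

// Convert "YYYY", "YYYY-MM" or "YYYY-MM-DD" into template parameters.
// Anything else yields no date parameters at all.
function dateParams(isoDate) {
  const match = /^(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?$/.exec(isoDate || "");
  if (!match) {
    return [];
  }
  const [, year, month, day] = match;
  const monthName = month ? MONTHS[Number(month) - 1] : undefined;
  if (!monthName) {
    return [`year=${year}`]; // no month given, or month out of range
  }
  if (day) {
    return [`date=${Number(day)} ${monthName} ${year}`];
  }
  return [`year=${year}`, `month=${monthName}`];
}

// Assemble the template call. Values are inserted as-is, so a "|" or "="
// inside a value would need escaping in real use (not handled here).
function buildQuoteBook({ lang, author, title, date, passage, truncated }) {
  const parts = ["quote-book", lang, ...dateParams(date)];
  if (author) parts.push(`author=${author}`);
  if (title) parts.push(`title=${title}`);
  // Mark an editor-added ellipsis with {{nb...}} rather than a bare "...".
  parts.push(`passage=${passage}${truncated ? "{{nb...}}" : ""}`);
  return `{{${parts.join("|")}}}`;
}
```

Falling back to `year=` alone when the month or day is missing keeps the output usable for sources that only give a year.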