User talk:tbm
Welcome
Hello, welcome to Wiktionary, and thank you for your contributions so far.
If you are unfamiliar with wiki-editing, take a look at Help:How to edit a page. It is a concise list of technical guidelines to the wiki format we use here: how to, for example, make text boldfaced or create hyperlinks. Feel free to practice in the sandbox. If you would like a slower introduction we have a short tutorial.
These links may help you familiarize yourself with Wiktionary:
- Entry layout (EL) is a detailed policy on Wiktionary's page formatting; all entries must conform to it. The easiest way to start off is to copy the contents of an existing same-language entry, and then adapt it to fit the entry you are creating.
- Check out Language considerations to find out more about how to edit for a particular language.
- Our Criteria for Inclusion (CFI) defines exactly which words can be added to Wiktionary; the most important part is that Wiktionary only accepts words that have been in somewhat widespread use over the course of at least a year, and citations that demonstrate usage can be asked for when there is doubt.
- If you already have some experience with editing our sister project Wikipedia, then you may find our guide for Wikipedia users useful.
- If you have any questions, bring them to Wiktionary:Information desk or ask me on my talk page.
- Whenever commenting on any discussion page, please sign your posts with four tildes (~~~~), which automatically produces your username and timestamp.
- You are encouraged to add a BabelBox to your userpage to indicate your self-assessed knowledge of languages.
Enjoy your stay at Wiktionary! Ultimateria (talk) 02:03, 5 August 2021 (UTC)
Babble Box
Hey! Now that you're a pretty regular editor, you might want to consider setting up a {{Babel}} box :) Vininn126 (talk) 08:55, 31 March 2022 (UTC)
- @Vininn126 I avoided this for too long in an attempt to hide my ignorance ;), but I've finally done it. tbm (talk) 07:26, 24 January 2023 (UTC)
changes to Template:desc
Hi. Just FYI, {{desc}} has changed a bit; please use |t= in place of |4= (the gloss/definition), and |alt= in place of |3= (the display/alternative form). This is because the template will soon support multiple terms. Thanks! Benwing2 (talk) 02:52, 28 June 2022 (UTC)
- That's good to know, thanks for letting me know! MartinMichlmayr (talk) 02:54, 28 June 2022 (UTC)
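The parameter change described above can be sketched as a small Python helper. This is only an illustration, not the tool anyone in this thread used: the function name `migrate_desc` is hypothetical, the sample entry is made up, and the sketch assumes that the first two positional parameters of {{desc}} are the language code and the term. It does not handle nested templates or piped links inside the call.

```python
import re

# Matches simple {{desc|...}} calls with no nested templates or [[links]].
DESC_CALL = re.compile(r"\{\{desc\|([^{}\[\]]*)\}\}")


def migrate_desc(wikitext):
    """Rewrite |3= / |4= (or the 3rd/4th positional value) as |alt= / |t=."""

    def fix(match):
        positional, named = [], []
        for part in match.group(1).split("|"):
            key, sep, value = part.partition("=")
            if not sep:
                positional.append(part)        # unnamed parameter
            elif key.strip() == "3":
                named.append("alt=" + value)   # display/alternative form
            elif key.strip() == "4":
                named.append("t=" + value)     # gloss/definition
            else:
                named.append(part)             # other named parameters stay as-is
        if len(positional) > 4:
            return match.group(0)              # unexpected shape: leave untouched
        if len(positional) > 2 and positional[2]:
            named.append("alt=" + positional[2])
        if len(positional) > 3 and positional[3]:
            named.append("t=" + positional[3])
        # Assumption: positional 1 and 2 are the language code and the term.
        return "{{desc|" + "|".join(positional[:2] + named) + "}}"

    return DESC_CALL.sub(fix, wikitext)


print(migrate_desc("* {{desc|sw|kitabu||book}}"))
```

Calls that already use |t= and |alt=, or that have no third or fourth parameter, pass through unchanged.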
Templates
Hey! I see you are using the {{bor}} templates (and the like?). Seeing as you are essentially the sole editor for Swahili, you have consensus within that community (you should just know not everyone likes it! But it's fine here.) You might be able to get @Benwing2 to help convert pages to using them. Do you also want to use {{inh}} or stick to just bor? Vininn126 (talk) 13:19, 10 December 2022 (UTC)
- @Vininn126: I'm just a junior editor and I'm waiting for Meta to return to get input on a number of things. However, I quite like {{bor}} and I think Meta has used it before, so I decided to go ahead with the conversion. I have a number of changes on my list that should hopefully not be controversial. I'll look into {{inh}}. Thanks. MartinMichlmayr (talk) 05:25, 13 December 2022 (UTC)
N'Ko for Swahili
What do you think of the Nasema writing system at Omniglot? -- Apisite (talk) 06:43, 5 February 2023 (UTC)
Thanks for fixing the transplanted piece of text. Not sure what I did that might have caused the problem; certainly it was unintentional. I noticed the issue before I saved my edits, but the Preview seemed to look OK, so I left it alone. – HelpMyUnbelief (talk) 00:36, 20 February 2023 (UTC)
umpaasmdhjfdsf
hiii, why do you edit Swahili when you could edit cool languages like Polish, or Czech, or Estonian, or Latvian, or Lithuanian??? Shumkichi (talk) 08:43, 7 March 2023 (UTC)
"Zana" in Swahili
Dear user, I noticed you uploaded the File:Sw-ke-zana.flac for the word zana#Swahili. However, it sounds audibly clipped and the speaker is yelling! Is it possible to upload another version if you have one, or to restore the older one? Thanks. --Esperfulmo (talk) 15:50, 30 May 2023 (UTC)
KamusiBot
Almost 150 test edits should be more than enough. — SURJECTION / T / C / L / 04:45, 13 September 2023 (UTC)
- Ok, thanks for the feedback. I'll draft the vote soon. tbm (talk) 04:59, 13 September 2023 (UTC)
Swahili lists
Hey, thanks for adding a link to Hurskainen's frequency list for Swahili. I hope it's ok that I've reverted the edit, since that link is actually already included at the subpage for Swahili. Unfortunately the NC licence isn't compatible so we can't really use it though, but I'm wondering if it would be any use to you if I cleaned (to the best of my abilities) and uploaded a wikilinked list for Swahili using the Leipzig corpora list? You can have a look around at Lithuanian/Mixed web to see what this might look like or else have a zoom around for example the corpora. I'm referring specifically to the list generated from the 2011 sw.wikipedia.org corpus, since the swa macrolanguage code is treated here as several distinct languages and at best I could generate plain wikilinks that would need to be checked individually (let me know if that would also be of interest though). Helrasincke (talk) 19:17, 26 September 2023 (UTC)
- @Helrasincke Sorry, I thought I was editing the Swahili sub-page and not the main page. I must have been confused. Thanks for reverting my bad edit! Regarding Swahili, the problem is that the NLP toolkits don't support lemmatization for Swahili, so I'm not sure how useful a frequency list based on Wikipedia is at the moment. tbm (talk) 05:02, 27 September 2023 (UTC)
- @Tbm I won't be using any NLP, just using a semi-assisted process to remove entries in the wrong scripts and punctuation (in this case I'll leave Latin- and Arabic-script terms). Lemmatisation certainly has its uses but I personally think for our purposes it's overkill and even risks missing the point since we can and do include entries for most non-lemma forms as well. If a form is common, then it gets attention earlier, regardless of what form would ultimately be considered the traditional dictionary form. Evidently the user must be willing and able to use their own judgement as for where the main entry information then goes, but then that seems to work quite well here for the most part. Helrasincke (talk) 06:45, 29 September 2023 (UTC)
- Anyway the list is now live so feel free to give any feedback. I'm assuming a lot of the capitalised words are regular words which occur a lot at the beginning of a sentence. Helrasincke (talk) 08:46, 29 September 2023 (UTC)
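The cleaning step described in this thread (dropping entries in the wrong scripts and bare punctuation, keeping Latin- and Arabic-script terms, no lemmatisation) could look roughly like the Python sketch below. It is not the process Helrasincke actually used; the names `keep_term` and `clean_list` are hypothetical, and allowing apostrophes and hyphens inside terms is an assumption made here for spellings such as ng'ombe.

```python
import unicodedata

ALLOWED_SCRIPTS = ("LATIN", "ARABIC")  # scripts kept, per the thread
ALLOWED_EXTRA = {"'", "-"}             # assumption: word-internal marks to keep


def keep_term(term):
    """Return True if the term contains only Latin- or Arabic-script characters."""
    if not any(ch.isalpha() for ch in term):
        return False                   # numbers, punctuation, empty strings
    for ch in term:
        if ch in ALLOWED_EXTRA:
            continue
        if unicodedata.category(ch)[0] not in "LM":
            return False               # not a letter or a combining mark
        if not unicodedata.name(ch, "").startswith(ALLOWED_SCRIPTS):
            return False               # letter from another script
    return True


def clean_list(rows):
    """Filter (term, count) pairs, keeping the original order and counts."""
    return [(term, count) for term, count in rows if keep_term(term)]
```

Capitalised entries are deliberately left alone, since deciding whether they are sentence-initial forms of regular words still needs a human check.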
It looks like an Arabic mu- participle of a verb, imported into the m-wa class. I looked around for a suitable root, and found ع ظ م, which lo and behold has a suitable participle معظم. I can't be sure this is right, but it fits. --2A04:4A43:979F:FE19:30E2:C3DE:BB65:E8A6 11:42, 12 January 2024 (UTC)
- Thanks for your input. I usually put some notes as a comment (if I have any) and hope that an Arabic editor will fill in the missing parts (@Fenakhay has done a lot recently). If you look at the comment, you'll see I mention adhama which has the root you point out. The participle you found seems like a good match (I missed that). BTW, I think you made other edits to Swahili under a different IP address. Would you consider creating an account? tbm (talk) 12:21, 12 January 2024 (UTC)
Swahili ideas
Hello,
I appreciate your contributions to Swahili entries on Wiktionary and would like to invite you to take a look at how I am modeling Swahili on Wikidata Lexicographical data. The idea behind the Wikidata lexeme project is to maintain structured linguistic data that can be used across Wikimedia projects and elsewhere. There are a few things I have been doing differently from how Swahili is modeled here:
- Ajami (Arabic script) representations are included. There are some features of Swahili such as aspiration and word stress which are shown in the Ajami script but not the Latin script. See for example here, where the adjective has an "intensive" form which shifts stress to the final syllable: L1227819
- The noun class labels such as "ma class" which group together singular and plural class pairs are not used in favor of treating the agreement class itself as a replacement for grammatical number, and allowing each noun to have any number of forms across the classes. See for example on Marekani: L1233484. While Marekani (NC9) is singular it does not require a NC10 plural, and Umarekani (NC14) does not have to be interpreted as singular or plural to be grouped with the same lexeme.
- As Swahili is an agglutinating language, the inflectional morphemes used with verbs are treated separately and verb lexemes only contain the forms based on the stem necessary to derive the conjugations. The forms this includes can be seen on cha for example: L1230739. The way in which this verb can be combined with inflectional morphemes can be observed in this proverb as an example: L1230803.
You can use the Ordia tool to browse the Swahili lexemes that have been added so far on Wikidata. If you are interested it would be great if you could help add more. I would be keen to discuss and implement tools for integrating this data into Wiktionary. It would be possible, for example, to create a Module which renders all possible forms of a noun in an inflection table (like those on Marekani above), with their Arabic script representations shown alongside the Latin script ones. Etymological data, usage examples, and references can also all be queried from Wikidata. عُثمان (talk) 12:54, 3 February 2024 (UTC)
- I'm very interested in this. I believe a lot of the information from Wiktionary would benefit from a proper database (like Wikidata). Unfortunately, I don't know when I'll have time to look into this as I'm very busy at the moment. However, I hope we can work on this together in the future. tbm (talk) 09:51, 8 February 2024 (UTC)
- @Tbm I am glad you are interested! No worries about the timing. I am learning as I go, I think there is a lot of potential in using linked data for modeling languages which do not fit within the “conventional” format on Wiktionary that well. عُثمان (talk) 15:30, 8 February 2024 (UTC)
Hungarian hyphenation
Thank you for creating the hyphenation error list - it was very useful. I corrected most of them. Some of them are correct: vissza is hyphenated as visz‧sza, hellyel-közzel as hely-lyel-köz‧zel (hyphens are retained and no vertical bar is added), össze- as ösz‧sze. The rules are summarized in Appendix:Hungarian hyphenation. Panda10 (talk) 17:17, 29 April 2024 (UTC)
- @Panda10 thanks so much for fixing the issues and for giving me some background information. I hope I can ask some follow-up questions:
- 1) The reason it is complaining about vissza- and össze- is that they have a dash at the end, which is dropped in the hyphenation. I'm not sure if the dash at the end should be in the hyphenation or not (maybe not?). BTW, I do take the ssz -> sz-sz change into account in my script.
- 2) You write "hellyel-közzel as hely-lyel-köz‧zel (hyphens are retained": I'm not sure I understand this. There's a hyphen between l-k but where does the hyphen between ly-ly suddenly come from? Shouldn't it be hely‧lyel-köz‧zel?
- Thanks again! tbm (talk) 00:42, 30 April 2024 (UTC)
- 1) The dash should not be added at the end of hyphenation for Hungarian prefixes.
- 2) You're right about hellyel-közzel, it should be hely‧lyel-köz‧zel. I corrected it. Thank you. Panda10 (talk) 16:24, 30 April 2024 (UTC)
- @Panda10 thank you! tbm (talk) 00:20, 1 May 2024 (UTC)
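The check discussed in this thread can be sketched in Python as follows. This is not tbm's actual script, which is not shown here; the names `rejoin` and `hyphenation_matches` are hypothetical. The sketch assumes the hyphenation uses "‧" (U+2027) as the separator, collapses split long digraphs such as sz‧sz back to ssz, and ignores the trailing dash of a prefix headword, as agreed above.

```python
SEP = "\u2027"  # hyphenation point used in the examples in this thread

# Long digraph as written in the headword -> digraph doubled when hyphenated.
LONG_DIGRAPHS = {
    "ccs": "cs", "ddz": "dz", "ggy": "gy", "lly": "ly",
    "nny": "ny", "ssz": "sz", "tty": "ty", "zzs": "zs",
}


def rejoin(hyphenation):
    """Rebuild the spelling implied by a hyphenation string."""
    word = hyphenation
    for long_form, digraph in LONG_DIGRAPHS.items():
        # e.g. "sz‧sz" in the hyphenation corresponds to "ssz" in the headword
        word = word.replace(digraph + SEP + digraph, long_form)
    return word.replace(SEP, "")  # real hyphens are kept, separators removed


def hyphenation_matches(headword, hyphenation):
    """True if the hyphenation spells the headword (prefix dash ignored)."""
    return rejoin(hyphenation) == headword.rstrip("-")


print(hyphenation_matches("vissza", "visz\u2027sza"))
print(hyphenation_matches("hellyel-k\u00f6zzel", "hely-lyel-k\u00f6z\u2027zel"))
```

One known limitation of this approach: a compound whose parts happen to meet as digraph plus the same digraph would be collapsed too, so flagged entries still need a manual look.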
Question
You created "vanielje geursel", but why not "vanielje" and "geursel"? Those are separate words. You should add them to the dictionary! Mihai Popa 😃📃 Talk to me! 💬 『My contributions! 🕔🕖』 05:45, 10 June 2024 (UTC)
- @MihaiDictonaryWiki frankly, because adding compound words is much easier than other words. For compound words, I can just copy & paste a template and fill in some info (the compound, the plural, the translation and maybe a category).
- Thanks for fixing the wrong language code. That's because I was copy&pasting something from Swahili and I forgot to change the code.
- BTW, I don't think your message on my talk page added a lot of value. Of course I know I should add the other words, but I don't have time for everything I'd like to do. Just some feedback. I don't mind but others might get annoyed by your message. I saw on your profile page that you're young. It's true what someone said that you sometimes need a thick skin because not everyone is nice online, but then again a lot of people are. I got involved in open source about the same age you are (maybe I was a year older or so). tbm (talk) 05:58, 10 June 2024 (UTC)
- Franklin, you are boring to only do 2 entries on the dictionary? Mihai Popa 😃📃 Talk to me! 💬 『My contributions! 🕔🕖』 06:02, 10 June 2024 (UTC)