SVG Images


Hello. I used CorelDRAW X4. The steps are:

  • Press Ctrl+N to create a new blank file.
  • Layout --> Page Setup; edit the size, unit (I preferred to choose pixels), resolution, etc.
  • Click the text tool and type the text. You can easily resize it.
  • File --> Export. I usually use the default settings (which appear when you're about to export the SVG), except that for "presets" I chose "Text as curves PNG image", and the bitmap export type is PNG. I also choose "link images".
  • That's all, and I hope you can do it as easily as I did.

M. Adiputra (talk) 02:08, 15 April 2011 (UTC)Reply

I created it with the steps above, and didn't create it from a .png or convert any .png. That's all. M. Adiputra (talk) 02:51, 17 April 2011 (UTC)Reply
The result is OK using Inkscape. However, I never use it. It could be an alternative for me. M. Adiputra (talk) 07:21, 25 April 2011 (UTC)Reply

Ethiopic script


I wish I knew of a good source to tell you which scripts use which "extra" symbols. Even if you find a post-1991 source, it may be out of date since some language communities have switched to Roman script after initial Ethiopic script efforts. Sorry! Pete unseth (talk) 21:25, 26 February 2012 (UTC)Reply

Your submission at Articles for creation: Palmyrene alphabet has been accepted

 
Palmyrene alphabet, which you submitted to Articles for creation, has been created.
The article has been assessed as Start-Class, which is recorded on the article's talk page. You may like to take a look at the grading scheme to see how you can improve the article.

You are more than welcome to continue making quality contributions to Wikipedia. Note that because you are a logged-in user, you can create articles yourself, and don't have to post a request. However, you may continue submitting work to Articles for Creation if you prefer.

Thank you for helping improve Wikipedia!

MatthewVanitas (talk) 01:54, 12 July 2014 (UTC)Reply

Common script & Unicode


About your edit in Arabic (Unicode block) [1]. I think this is not the way to do it.

A character in Unicode can be used in multiple scripts. So in Unicode, that single character gets the script id 'common'. All fine. But a set of characters like 'arabic script' is looking in the other direction: for a script it does not matter if a character is used elsewhere. All chars in the arabic script are arabic chars.

The same for inherited characters: they belong to the arabic script, full stop. There is no need to note that they may be used elsewhere.

So the infobox should say: all Arabic characters. -DePiep (talk) 20:28, 27 August 2014 (UTC)Reply

@DePiep: I'd agree with your usage of 'script' if this was an infobox on the Arabic script or language page but this is an infobox specifically for a Unicode block. The script value(s) are taken directly from the Unicode Standard and are also reflected on the Unicode block page. Script has a very specific meaning in the Unicode Standard. The Unicode block infobox has a parameter for 'Major alphabets' which I think is being muddled here with the script parameter. Apart from 'Major alphabets' and 'Note' fields all the information in the Unicode block infobox comes from the Standard itself. I don't want to be the arbiter of the script value for each block... I think it should come from the Standard. I can already envision the arguments over Assamese vs Bengali if we can't point to a definitive source for the script value. If we go down to the character level there's bound to be an argument over whether U+067E ARABIC LETTER PEH is Arabic or Persian as Wikipedia describes it as Pe (Persian letter). DRMcCreedy (talk) 00:26, 28 August 2014 (UTC)Reply
I don't want to be the arbiter - you are right, we cannot have OR. And indeed, some characters in this block have 'common' or 'inherited' for script. Still, I think the Standard has more information than just the script code (which gives one script per character). For example, the name can contain the script name: U+06DA ۚ ARABIC SMALL HIGH JEEM has script code 'inherited' (being a diacritic), but we can conclude that it belongs to the Arabic script (conclude from name and inherit properties). For 'common' script we again can look at the name, and check other scripts (e.g., the numbers). Note that 'Persian' script is not defined in Unicode. Still, if we research the whole block, there still can be characters we cannot attribute to 'Arabic', so the problem might not be solvable. Btw, your U+067E example is a bad Wiki naming, in Unicode it is defined Arabic. -DePiep (talk) 12:11, 30 August 2014 (UTC)Reply
Are you then OK with me synching the Standard's script names with the Unicode block articles? I would always note Common and Inherited after any specific script names. DRMcCreedy (talk) 18:00, 30 August 2014 (UTC)Reply
Yes, go ahead. That would be factually correct by the source. I was just freewheeling on how to make it more precise. But my algorithm here is not finished. (By the way, do you know this site? It nicely adds "Persian, Urdu, ..." for U+067E). -DePiep (talk) 18:39, 30 August 2014 (UTC)Reply
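The ordering agreed above (specific script names first, then Common and Inherited) could be sketched roughly as follows. This is only an illustration: the function name `order_scripts` and the sample input are hypothetical, and the script values themselves would still have to be taken from the Standard.

```python
# Hypothetical helper: order a block's script names so that specific
# scripts come first and the catch-all values Common and Inherited last.
CATCH_ALL = ("Common", "Inherited")


def order_scripts(scripts):
    """Return unique script names: specific ones alphabetically,
    followed by Common and then Inherited if they occur."""
    unique = set(scripts)
    specific = sorted(s for s in unique if s not in CATCH_ALL)
    trailing = [s for s in CATCH_ALL if s in unique]
    return specific + trailing


# Made-up input: one script value per character of some block.
print(", ".join(order_scripts(["Inherited", "Arabic", "Common", "Arabic"])))
```

Keeping the catch-all values in a fixed tuple means the trailing order never depends on the order in which the characters were read.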

Chart titles & refs


In Template:Unicode chart Miscellaneous Symbols, I've been pushing and pulling the title a bit. Added v-t-e box, move ref links to bottom. But I'm not completely happy with the visual outcome. Any ideas from you? -DePiep (talk) 11:28, 30 August 2014 (UTC)Reply

@DePiep: I've had a look and here are my suggestions:
  1. I like the V-T-E but think the title is much too large and competes with the section headings. I tried out a few sizes at User:Drmccreedy/sandbox4 purposefully using "Unified Canadian Aboriginal Syllabics Extended" because it's currently the longest block name. "small" is my favorite font size. "smaller" could work but I still think it distracts from the headings and text.
  2. I like moving the official chart link to the footnotes. I think using "Unicode code chart" for the link is too vague because it appears inside a Unicode code chart. I'd leave the wording "Official Unicode Consortium code chart". I seem to remember a heated discussion over "official" and think this wording was the compromise. In any case I think we should always make it footnote #2 because we'll always have #1 for version and #2 for the official chart.
  3. Ref names containing spaces, like {{ref label|Chart U2600|2}}|...}}, do not work. Clicking on the [x] number no longer takes you to the reference.
  4. Lastly, for the title wording you have "Unicode chart BlockName" instead of simply BlockName. While it works, I think it's redundant to have "Unicode" in the title because the template's likely to be in a section that already mentions Unicode and the word will appear at least twice in references/notes at the bottom of each table. I tried a few different wordings at User:Drmccreedy/sandbox4 for the title. "Chart for BlockName block" is my favorite (even though you could say "block" is also redundant) followed by "Chart for BlockName" but in reality any would work. It's that balance between explicit and concise. I'm also keeping in mind Unicode chart templates with the subset feature like Halfwidth and Fullwidth Forms at Katakana#Unicode. It would be nice if any new title syntax easily allows the subset to be noted, like "Chart for foobar subset of the BlockName block" (for example, "Chart for katakana subset of the Halfwidth and Fullwidth Forms block") or "Chart for foobar subset of BlockName" ("Chart for katakana subset of Halfwidth and Fullwidth Forms"). DRMcCreedy (talk)
Useful links

Two chart templates have sandboxes:

In the future, we could use two templates to convey standard showing. Today, this is just a development.
Time to ping BabelStone. -DePiep (talk) 20:39, 30 August 2014 (UTC)Reply
First: I've added some links & sandboxes, for us to play. I'll do my demo's in the Math one (guess what's yours).
re 1.a. Yes, the small font is best. End of problem.
re 1b. But it is a simple table title, so it should be bold. (This looks bad now, because the table is made "font big" for the characters, but it spoils the title. See Miscellaneous Symbols/sandbox below for my proposed changes). My solution is: the characters can be big, but the table top should be regular.
re 2. Yes, that chart link should be in the footnote. OK then, it's just a link.
But no, no need to call it "official". (I was the one who heated up the discussion about this; I still claim that "official" wrt Unicode is weasel wording. Unicode does not have anything not official.)
I suggest we use {{cite web}} for this. Like, quickly: "Unified Canadian Aboriginal Syllabics (Unicode chart)" (PDF). Unicode Consortium. 16 June 2014. Retrieved 2014-08-30.
re 3. OK.
re 4. Title wording. Yes, this is not straightforward. I'd say the table should be self-contained (not relying on surrounding text). So somehow it should mention 'Unicode'. Maybe "Miscellaneous Symbols (Unicode chart)"?
re 5 (new) I've started {{Unicode chart/header}}, to be used for all 200 charts. Once we know what to do in this, it will be useful. -DePiep (talk) 20:58, 30 August 2014 (UTC)Reply
See Template:Unicode chart Miscellaneous Symbols/sandbox [2] (being bold, but only a bit). -DePiep (talk) 21:08, 30 August 2014 (UTC)Reply
It's getting there... @DePiep:@BabelStone:
  • We'll need the first codepoint parameter for both the header and the footer in order to create unique ref labels. It's currently called "block id" in the header template but I'd suggest using something like first or start in part because I don't trust spaces in parameter names. For that reason I'd go with name instead of "block name", grey instead of "has grey" and extranotes or notes instead of "extra notes".
Don't get distracted. See my reply below (all in good faith). -DePiep (talk) 22:39, 30 August 2014 (UTC)Reply
  • I don't think we need a version parameter on the header, just the footer. Shouldn't have to change it in two places each time Unicode releases a new version.
Distraction
  • I see the dates for the cite of the Unicode chart being problematic. It's another thing we'll have to change for each release. Same with retrieved date. We might have the template add it based on version number.
Distraction
  • We'll only need to repeat the block name for the footer if we use it for the cite. Generic wording would avoid this parameter.
Distraction
  • I'm assuming "extra note" would just add the passed text as-is after the two standard references, right?
Distraction
  • Is there a way to have the reference numbers automatically generated within the chart instead of hardcoding them as we do today?
I don't know, will take a look.
We disagree. I like the block name being in bold (after all, it's the table title), and have the Unicode mentioned. It's my text proposal for the title.
Distractions: I started the templates {{Unicode chart/header}} and {{Unicode chart/footer}}, but of course they only will reproduce what we agree (wish I had not mentioned them). Template technicalities are a distraction. Please let us focus on the visual outcome, on what we show in mainspace. E.g., I'd like to see your suggestions in here. It's free. -DePiep (talk) 22:39, 30 August 2014 (UTC)Reply
Move the whole talk to Template talk:Unicode chart? -DePiep (talk)
@DePiep: I've updated Template:Unicode chart Unified Canadian Aboriginal Syllabics Extended/sandbox with a mock up as requested.
The title is bold and identifies it as a Unicode block
I removed the ref numbers. We'll always have at least two notes and all the notes have ref numbers just on the title so the numbers seem superfluous and distracting.
I agree this discussion should move to the template talk page for a wider audience.
And I get distracted easily... DRMcCreedy (talk) 06:37, 31 August 2014 (UTC)Reply
Yes, looks good. I changed the cellspacing from 5px to 2px (as a suggestion); but width settings are still active. Do you think the character should be larger (say 150%)? And should we reduce whitespace (cell padding & widths)? -DePiep (talk) 09:52, 31 August 2014 (UTC)Reply
  • @DePiep: I'm not sure if it's just my computer but I don't really see a difference between 5px and 2px. In fact, removing border="1" cellspacing="0" cellpadding="5" and just relying on class="wikitable" seems about the same to me. If it does to you as well maybe we can just leave it up to wikitable.
  • The character size is currently set to "large". Looking at the manual of style using a percentage, like 150%, might be a better way to go. For me 150% and large are quite similar, at least for this block. No doubt thanks to varying fonts for different blocks some look quite large, like Phags-pa and some quite small, like Mongolian. 150% seems fine.
  • I think the width is important to keep for visually consistent columns. It doesn't make much of a difference for this block but I remember that some looked terrible without it. DRMcCreedy (talk) 23:42, 31 August 2014 (UTC)Reply
Yes, let's see what happens when we leave it to basic class="wikitable" (It's not about your browser). Note that there is also this width setting, per column in the hex-numbers row (2nd row): style="width:20pt". So these set column width (too; it is co-affecting). In Template:Unicode chart Miscellaneous Symbols/sandbox I've removed them, to see the effect.
  • So, as for whitespace we are still playing around. I don't know what looks best, but we should know the control options (there are more than those mentioned). I prefer the number ("150%") above the "large" wording option to have more control, see below.
  • About font size. When font-size:150%; is set in the table top row, it applies to the whole table. But this should only be for the characters, not for headers etc. (we already concluded). So setting to regular size is done by style="font-size:67%" (150% × 67% ≈ 100%). For now, I think we should use this to set the sizes for regular text (not the Unicode characters it is about).
  • Not applied: adding a row-header sign (! instead of | pipe). Will bolden the cell text. All these are applied now in the Misc Symb sandbox mentioned.
  • About equal column width. (later more) -DePiep (talk) 09:22, 1 September 2014 (UTC)Reply
  • {{Unicode chart Phags-pa}} uses {{Phagspa}} to format the individual characters & cell. I propose to leave this out of the topic. When we have found a good general setup, we can see into that exception. Similar Template:Unicode chart Mongolian using {{MongolUnicode}}: forget for now. -DePiep (talk) 09:46, 1 September 2014 (UTC)Reply
  • In a separate research, I am looking to get these replacement characters show more nicely: NBSP. Should not influence this topic here (must flow nicely in the chart tables). -DePiep (talk) 09:59, 1 September 2014 (UTC)Reply
@DePiep:That all sounds reasonable. When you look at equal column width later... I set up a test page without widths at User:Drmccreedy/sandbox4 to compare against Cyrillic script in Unicode#Blocks. DRMcCreedy (talk) 17:48, 1 September 2014 (UTC)Reply
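The font-size arithmetic mentioned above (150% on the table, 67% on the cells that should look regular) relies on relative percentages multiplying when they are nested. A minimal sketch of that calculation, with a hypothetical function name, assuming both sizes are given as percentages:

```python
def reset_percentage(outer_percent):
    """Percentage to apply inside a container scaled to `outer_percent`
    so that the nested text renders at 100% of the base size."""
    return 100.0 * 100.0 / outer_percent


# For a table set to 150%, the exact reset value is about 66.67%;
# rounding it to a whole number gives the 67% used in the discussion.
inner = reset_percentage(150)
rounded = round(inner)

# Effective size of the nested text, as a percentage of the base size.
effective = 150 * rounded / 100
print(rounded, effective)
```

Because 67% is a rounded value, the nested text ends up marginally larger than the base size rather than exactly equal to it.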

(arbitrary break)

Yes. Do you know of any chart table that has irregular column widths, today?
If I am correct, we are looking for the right amount of whitespace:character_size at the moment. It takes some patience, but I promise we can control the lot. I also note that page-layout designers love whitespace (at wikimedia, and enwiki pagelayout setting etc), so maybe our outcome will look like ... the current live view.
Just to show, I've added a center-table setting in your sandbox4 '-). Later more. -DePiep (talk) 18:28, 1 September 2014 (UTC)Reply
@DePiep:I think the worst are Template:Unicode_chart_Javanese for irregular size and Template:Unicode_chart_Cyrillic Extended-A for smallness. DRMcCreedy (talk) 22:22, 1 September 2014 (UTC)Reply

Sync script with Unicode Standard


If we are going to list all scripts associated with a given Unicode block (which I think is a good idea), then maybe we should parenthetically indicate how many characters in the block belong to each script, otherwise the list of scripts may be misleading to the reader. e.g. for Tibetan: "Tibetan (207 characters), Common (4 characters)". What do you think? BabelStone (talk) 10:12, 14 September 2014 (UTC)Reply

@BabelStone:Great idea. I should be able to do that tomorrow. DRMcCreedy (talk) 16:31, 14 September 2014 (UTC)Reply
Great. I'm going to be too busy over the next three weeks to be able to spend much time with this or any of my other suggestions, but I'm always available to help out if needed. BabelStone (talk) 19:15, 14 September 2014 (UTC)Reply
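A per-script summary such as "Tibetan (207 characters), Common (4 characters)" could be produced along these lines. This is a sketch under assumptions: Python's standard `unicodedata` module does not expose the Script property, so the mapping from code point to script is assumed to have been built beforehand (for instance from the Unicode Character Database file Scripts.txt). The function name and the tiny sample mapping are hypothetical and do not reflect real counts.

```python
from collections import Counter


def summarize_scripts(script_by_code_point):
    """Format a 'Script (N characters)' summary, largest count first.

    `script_by_code_point` maps each assigned code point of one block
    to its script name.
    """
    counts = Counter(script_by_code_point.values())
    parts = []
    for script, n in counts.most_common():
        noun = "character" if n == 1 else "characters"
        parts.append(f"{script} ({n} {noun})")
    return ", ".join(parts)


# Illustrative sample only, far smaller than any real block.
sample = {
    0x0F00: "Tibetan",
    0x0F01: "Tibetan",
    0x0F02: "Tibetan",
    0x0FD5: "Common",
}
print(summarize_scripts(sample))
```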

Chart view vs List view for Unicode blocks


The Unicode block code chart templates give a good visual overview of the block (if the font stuff works!), but is unsatisfactory in many other ways. I have been thinking that in addition to the visual chart view it would be useful to have a list view of assigned characters in each block that provides all the hidden details about the characters. Maybe we should have a button on the Unicode code chart templates that expands the code chart view into a list view (list view hidden by default), with the code point, character name, script, general category, etc. for each assigned character in the block listed in a table. You wouldn't want it for the CJK unified ideographs and Hangul symbols blocks, but I think it would be very useful for other blocks. BabelStone (talk) 10:18, 14 September 2014 (UTC)Reply

This should be useful. Here are the first things that come to mind:
  • Can it include derived age and annotations (from NamesList.txt file)? I'd hope for a separate column for derived age but annotations and other notes could be included with the character name on a new line.
  • Derived age is definitely useful. I'm not sure if there is enough room for annotations, and if you add annotations and notes from NamesList.txt then random editors are going to start editing these notes and adding their own. There is also a copyright issue for the notes which I think would preclude us including them. BabelStone (talk) 19:12, 14 September 2014 (UTC)Reply
  • Sortable columns would be nice.
  • Should this be a different template to accommodate differing widths? Put another way, I think this data would be wider than the current charts.
  • Should we remove the pop-up information for each character once this is in place? Or would we leave that duplicate information (codepoint, name, alias)?
  • I think yes. The pop-up names are far less useful now than they used to be because a few editors have been keenly (over-keenly in my opinion) adding wikilinks to as many characters as possible, and unless you carefully position your cursor in just the right position the Wikilink target name appears instead of the popup names. BabelStone (talk) 19:12, 14 September 2014 (UTC)Reply
  • Would that size of the data be an issue for larger blocks?
  • Possibly. I would leave out CJK Unified ideographs and Hangul syllables (and any future blocks with algorithmic names such as Tangut and Nushu), which would mean the largest block would be Yi syllables with 1,165 characters which may or may not be too long. For CJK unified ideographs etc. we could just have two rows for first and last character in the block. BabelStone (talk) 19:12, 14 September 2014 (UTC)Reply
  • I hope not. I think we should restrict ourselves to the most useful properties, and avoid mission creep by adding more and more properties just for the sake of being complete. I would say that character name, formal aliases, general category, script, and derived age are very useful; but combining class category, bidi class, decomposition type/mapping, numeric type/value, case mapping, etc. are not essential and should not be included. I would also avoid repeating the actual character shown in the chart template. BabelStone (talk) 19:12, 14 September 2014 (UTC)Reply
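As a rough illustration of what a list view row might contain, the sketch below collects code point, character name and general category for a range of code points using Python's standard `unicodedata` module. The function name `list_view` is hypothetical. Script and derived age are not available from that module and would need separate data, so they are left out here.

```python
import unicodedata


def list_view(first, last):
    """Return (code point, name, general category) rows for the named
    characters in the inclusive range first..last."""
    rows = []
    for cp in range(first, last + 1):
        ch = chr(cp)
        name = unicodedata.name(ch, None)
        if name is None:
            # Unassigned code points and control characters have no name.
            continue
        rows.append((f"U+{cp:04X}", name, unicodedata.category(ch)))
    return rows


# Example: the first three capital letters of Basic Latin.
for row in list_view(0x0041, 0x0043):
    print(*row, sep="\t")
```

Note that the names and categories returned depend on the Unicode version bundled with the Python interpreter in use.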

Wikidata for Unicode data


I think it is wrong, or at least inefficient, for us to be maintaining so much Unicode data for scripts, blocks and characters by hand on en:wp. Such data is needed by all the big wikipedias, but at present it needs to be duplicated for each language version of an article or template. It seems to me that it would be best to maintain Unicode data on Wikidata so that it is available to all Wikimedia projects independent of language. Perhaps this would mean Wikidata entries for each Unicode script, each ISO 15924 code, each Unicode block, and each and every assigned Unicode character. I've never used Wikidata so don't know how we would go about doing this, but it is something I think we should try to consider. BabelStone (talk) 10:31, 14 September 2014 (UTC)Reply

I agree, especially if we expand the amount of data for "list view". I'm not familiar with Wikidata either but it would be good to look into it further. DRMcCreedy (talk) 16:31, 14 September 2014 (UTC)Reply
Yes, might be worth trying to involve someone who is involved with Wikidata. I don't think it's urgent, and I have very limited time for Wikipedia at present, but perhaps we can do something about it over the next year. BabelStone (talk) 19:19, 14 September 2014 (UTC)Reply
Actually this is not what Wikidata does. It does not store central data, but only connects same topics across wikis. Simply, it replaced interwiki links. E.g., Moscow (this enwiki), de:Moskau and ru:Москва now all link to wikidata:Q649, the non-language coded id for this object (the city).
Storing the data centrally (a sound idea) could be done through commons? or wikibooks? (did not research this yet). -DePiep (talk) 11:05, 16 September 2014 (UTC)Reply
Actually, I believe you are wrong. Interwikilinks is just one aspect of Wikidata, but going forward it is intended to act as a central repository of data for Wikimedia projects (Wikidata Introduction). BabelStone (talk) 19:11, 16 September 2014 (UTC)Reply
You are right. Language-free data, so fit for core Unicode. However, that is long term. Last time I tried it was not even possible to simply retrieve existing data from wd (to use in enwiki).
At the moment we have Template:Planes (Unicode), that links code ranges to wikibooks. It's an old template here, but I've never bothered to update those links myself. -DePiep (talk) 19:32, 16 September 2014 (UTC)Reply

Abbr?


I see you write "char." for "character" in the Unicode template. Nothing wrong with that, but I ask you to consider: write full "character" when possible. It makes much more pleasant reading. Of course, in a table or for informed correspondence we could use the abbrev., but for the wiki Reader-is-King it takes an unneeded mental step. -DePiep (talk) 19:17, 20 September 2014 (UTC)Reply

I've updated Template:Unicode_blocks to use the full word. I've left the abbreviation in the infoboxes (for example, Basic Latin (Unicode block)) where space is at a premium. DRMcCreedy (talk) 19:40, 20 September 2014 (UTC)Reply
Sounds good. (I did not even check). -DePiep (talk) 20:11, 20 September 2014 (UTC)Reply

Templates for Unicode blocks with no code charts


Hi, you may wish to contribute to the discussion of templates for Unicode blocks with no code charts here. BabelStone (talk) 18:20, 11 October 2014 (UTC)Reply

Updating Unicode version for Unicode chart templates


Hi, I noticed that when Unicode 7.0 was released you updated all the 200 Unicode block templates that had not been affected by the Unicode 7.0 release to indicate "As of Unicode version 7.0" -- which is a good thing, but seems really time-consuming. I wonder, would it be a good idea to replace "As of Unicode version 7.0" in all the Unicode block templates with "As of Unicode version {{UnicodeVersion}}" where "UnicodeVersion" is a template containing the current version string (e.g. "8.0" at present), and then when a new version of Unicode is released, all we need to do is update the "UnicodeVersion" template after all the affected Unicode block templates have been updated. What do you think? Pinging DePiep who is the expert at Unicode templates. BabelStone (talk) 12:44, 20 June 2015 (UTC)Reply

I like your idea. I'd want the documentation to include the description you give here and an admonition to not change it until all the templates are updated/verified after a release. DRMcCreedy (talk) 15:09, 20 June 2015 (UTC)Reply
Yes, that was what concerned me, but if anybody does prematurely change the template it would be easy enough to change it back. I'll wait a day or two before doing it in case anyone else has any comments. BabelStone (talk) 17:16, 20 June 2015 (UTC)Reply
Sure, I have thought about this often. But we cannot have a "current version number template" (like {{UnicodeVersion}}) that would update each and every page automatically, unchecked. So I say: no. -DePiep (talk) 00:06, 21 June 2015 (UTC)Reply
Oh well, then I guess we'll just have to update all the templates manually. BabelStone (talk) 20:10, 21 June 2015 (UTC)Reply
(edit conflict)Let me rephrase for clarity. If that simple template existed, it could be applied to all Unicode templates and tables etc. after sound checking (manual editing). That is great. Now when a new Unicode version is published, we'd change that version number in the template once and voila, we are stating that all those tables are "as of Unicode version (latest)". That is a tough statement to make. (What we really need is an option to visit & edit just those pages that are affected by a new version). -DePiep (talk) 20:28, 21 June 2015 (UTC)Reply
re BabelStone 20:10: note that today a sourcing statement "as of version 7.0" is correct! iow, only articles with topics that have changed, like in block additions, are outdated (and still not wrong). -DePiep (talk) 20:28, 21 June 2015 (UTC)Reply
I agree "as of version 7.0" is technically correct for blocks that have not been updated for 8.0, but it can be confusing for readers because they may think that the table is out of date if they know that the current version of Unicode is 8.0. I therefore think that it is preferable to specify "as of" the latest version of Unicode for all blocks, even if they have not been updated since version 1.0. BabelStone (talk) 07:53, 22 June 2015 (UTC)Reply
Agree. It is just to ease any panic about incorrectness. -DePiep (talk) 08:05, 22 June 2015 (UTC)Reply
  • Since Unicode is web-published, I was thinking along this other line. We could make a template {{Cite Unicode}} with optional parameters version=, chapter=, script=, plane=, TR=. Of course it links nicely and automatically to a site page. Not just chart pages, but also chapters and appendices. The version number does not alter with a new release. However, any time we can check all articles using this template. Maybe even in maintenance categories ("Cat:Articles with Unicode version 1 cites"). When for example Unicode 11 changes chapter 5 (number & content), we can do that maintenance check and update the Cite reference. Note that before someone would do that, the old cite is not wrong (because for the old version, the chapter cite is correct). But alas, this takes some development time & thinking. -DePiep (talk) 20:24, 21 June 2015 (UTC)Reply
  • One more idea, BabelStone. Why not create {{Unicode v8.0}} that has text like "As of version 8.0"? You can add that today where you need it (i.e. add the template, not change the version text). Then next year you can more easily revisit those pages (via What-links-here) for a v9 check/update. No harm in sight. -DePiep (talk) 08:46, 22 June 2015 (UTC)Reply
DePiep, thanks for the alternative suggestions. I'll think about them, but for the present I think it is simplest just to manually update the pages. BabelStone (talk) 21:50, 25 June 2015 (UTC)Reply
Yes. -DePiep (talk) 21:55, 25 June 2015 (UTC)Reply

unicode chart styling


I get what you're saying about chart headings (uniformity is nice), but faced with 3k/9k of redundancy, which accounts for over of the documents at present, is it really that big an issue that the chart headings display in the same font as the rest of the row? Maybe bandwidth usage/optimization doesn't matter to most people, but there's the maintainability issue, too. If a new supporting font comes along, it's a bunch easier to change 𝒳 styles than 16𝒳 styles. The sad part is, if we could add a single line to a global stylesheet (e.g. "table.unicodeChart > tbody > tr > th { font-family:whatever;}" — or td:first-child instead of th, depending on the markup), consistency could be enforced throughout WP for those headings while keeping life easier for ordinary editors. ⇔ ChristTrekker 13:54, 27 August 2015 (UTC)Reply

I don't see 6k as a huge waste of bandwidth especially considering the size of the images in the article and the relatively small number of times the page will be viewed. If preserving bandwidth is the goal we could trim a little from the most popular articles in Wikipedia and accomplish far more savings.
Many other Unicode templates, like Template:Unicode_chart_Balinese, employ the same style: headings in the default font of the user and non-Latin characters using a list of fonts. Are they all to be changed? What should we do with the fonts that don't have Latin character support? Exclude them as an option?
If maintainability is the goal we can add style templates for a given script. For example, Template:Unicode_chart_Bamum, which has a single template that lists the fonts. That way to add or remove a font you change it only in one place. But that doesn't address your bandwidth issue.
I agree that a td level default setting would be great, it's just not there. An even better one would be a "skip adding style to this cell". DRMcCreedy (talk) 15:28, 27 August 2015 (UTC)Reply

Rewriting Cirth


«Your draft article on cirth is a great improvement over the existing article. I'd like to encourage you to publish it. DRMcCreedy (talk) 15:06, 3 October 2015 (UTC)»

Thank you so much for your support DRMcCreedy, but I'm not sure I can substitute the existing article with mine. Can I?   ᚪᛋᚦᚩᚾᛏ (Asþont) | Talk  20:40, 5 October 2015 (UTC)Reply

Adding categories directly into templates?


Hi there,

I see that you are doing good work with the Unicode templates... I do have one question/concern, however. I happened to stumble across your latest edit to Template:Unicode chart Greek and Coptic, if I may use that as an example. I see that you removed the following "boilerplate" comment:

<!-- PLEASE ADD THIS TEMPLATE'S CATEGORIES TO THE /doc SUBPAGE, THANKS -->

and replaced it with the category:

[[Category:Unicode charts|Greek and Coptic]]

Can I ask your rationale for doing so? Are you attempting to categorize the template itself, or pages that utilize said template? As best as I can tell, this practice will cause several undesired effects. Please see WP:CAT#T and WP:TCAT, which recommends against this approach for several reasons. Given that you are updating many, many templates, I fear that this might inadvertently be creating a lot of cleanup work for someone.... Thoughts? Thanks! grolltech(talk) 17:44, 2 February 2016 (UTC)Reply

@Grolltech: My only intention was to make those four templates (Unicode chart Halfwidth and Fullwidth Forms, Unicode chart Greek and Coptic, Unicode chart Enclosed Alphanumeric Supplement, and Unicode chart Enclosed CJK Letters and Months) like the other 260 Unicode chart templates. The other templates have their names in the Category (with a few exceptions). It is done within a noinclude tag so maybe the issues noted at WP:CAT#T and WP:TCAT don't apply. I honestly don't know. DRMcCreedy (talk) 20:21, 2 February 2016 (UTC)Reply
@Grolltech: After some research I’m pretty sure the noinclude tag makes the category apply only to the template, not the articles it’s transcluded into. If I’m wrong let me know and I’ll fix any issues. Thanks. DRMcCreedy (talk) 16:21, 12 February 2016 (UTC)Reply

Roadmap to the Unicode SMP


Hi David, I notice you uploaded updated versions of the Unicode roadmap images for 9.0. I assume that you base the geographic classification of scripts on the chapters in the Unicode Standard, but as the Unicode 9.0 core specification will not be available until August I guess you determined the classification of new 9.0 scripts yourself. Ideographic Symbols and Punctuation, Tangut and Tangut Components are coloured dark green for South and Central Asian scripts, but Tangut was used entirely within what is now China, and I understand that Tangut will be included in the East Asian chapter of the Unicode Standard. Therefore, I suggest changing the colour for Ideographic Symbols and Punctuation, Tangut and Tangut Components to red for East Asian scripts. Thanks, BabelStone (talk) 18:13, 24 June 2016 (UTC)Reply

Thanks @BabelStone:. I'll update it tonight. DRMcCreedy (talk) 18:24, 24 June 2016 (UTC)Reply
I've updated the SMP roadmap image. @BabelStone: Is there a way to know which chapter a new script will go into prior to the core spec release? That would be useful the next time. DRMcCreedy (talk) 22:33, 24 June 2016 (UTC)Reply
Thanks. It would be useful to know the ToC in advance, but unfortunately that information does not seem to be publicly available at present. BabelStone (talk) 23:04, 24 June 2016 (UTC)Reply

Into Russian

Hello, Drmccreedy. Nice work on the Roadmap; I like it. I looked at your script and translated it into Russian. Unfortunately, I don't know programming languages like Perl, so I only translated the text. In the output images, however, all Cyrillic letters come out as gibberish, because I don't know how to switch the encoding. Can you help me with it? You can edit the translated script on my page. Thanks. ← Alex Great talkrus? 10:26, 11 July 2016 (UTC)Reply

@Alex Great: I'm happy to help but it will take a day or two. DRMcCreedy (talk) 14:42, 11 July 2016 (UTC)Reply
@Alex Great: I've created a new multilingual image called Roadmap_to_Unicode_BMP_multilingual.svg. I'm still refining it but take a look to see what you think. You'll need to click Russian in the "Render this image in" drop-down or add [[File:Roadmap to Unicode BMP multilingual.svg|lang=ru]] to a test page to see it. (At the moment it's rendered in all the contained languages at User:Drmccreedy/sandbox.) DRMcCreedy (talk) 02:10, 14 July 2016 (UTC)Reply
Very good work. Excellent. Everything displays correctly, thank you very much! Could you make one small fix: in the BMP image legend, delete the apostrophe (') at the start of the line in the 5th legend description (purple key). Thanks a lot. ← Alex Great talkrus? 07:02, 14 July 2016 (UTC)Reply
Oh. One problem: "Китайское письмо" means "Chinese script", but this whole range is "CJK characters". Can you change it to "Идеограммы ККЯ"? Thanks. ← Alex Great talkrus? 07:05, 14 July 2016 (UTC)Reply
@Alex Great:I've uploaded new images. Have a look at User:Drmccreedy/roadmap_test_page#Russian_.28ru.29 and let me know if everything looks good now. DRMcCreedy (talk) 03:00, 15 July 2016 (UTC)Reply
Yes, all is perfect. Thanks for this work. I have a question: what about creating an image for plane 3 (Tertiary Ideographic Plane, TIP)? I know that this plane is unallocated at the moment. The image would consist of only one legend key, "Unallocated code points", and the whole table would be white. What do you think? ← Alex Great talkrus? 05:06, 15 July 2016 (UTC)Reply
@Alex Great:I've updated the script to generate images for all 17 Unicode planes but I think only BMP, SMP, SIP, and SSP should be used or put into Wikimedia/Wikipedia. If you're curious what plane 3 would look like completely unallocated, run the updated script. The Tertiary Ideographic Plane wasn't part of Unicode 9.0 so it remains only a proposal. Maybe we can add it in Unicode 10.0 next year. DRMcCreedy (talk) 00:35, 16 July 2016 (UTC)Reply
I would not add the TIP until it is no longer empty, which I don't think will be before 12.0. BabelStone (talk) 07:58, 16 July 2016 (UTC)Reply
OK, thanks again for your work. Can you add Ukrainian and Belarusian to your script?
   # Ukrainian
   uk => {
      Africa      => "Писемності Африки",
      Americas    => "Писемності Америки",
      AsiaEast    => "Писемності Східної Азії",
      AsiaSC      => "Писемності Південної і\nЦентральної Азії",
      AsiaSE      => "Писемності Південно-Східної Азії",
      asOfVersion => "Станом на версію Юнікоду %s",
      cuneiform   => "Клинопис",
      Europe      => "Нелатинські європейські писемності",
      Han         => "Ідеограми ККЯ",
      hieroglyphs => "Ієрогліфи",
      IndOcean    => "Писемності Індонезії і\nТихого океану",
      Latin       => "Латинська писемність",
      ME          => "Писемності Середньої Європи і\nПівденно-Західної Азії",
      misc        => "Різні символи",
      notation    => "Системи нотописі",
      private     => "Область для приватного використання",
      surrogates  => "Сурогатні пари UTF-16",
      symbols     => "Знаки",
      tags        => "Теги",
      unallocated => "Вільні кодові позиції",
      variation   => "Варіантні селектори",
   },
   # Belarusian
   be => {
      Africa      => "Пісьменства Афрыкі",
      Americas    => "Пісьменства Амерыкі",
      AsiaEast    => "Пісьменства Усходняй Азіі",
      AsiaSC      => "Пісьменства Паўднёвай і\nЦэнтральнай Азіі",
      AsiaSE      => "Пісьменства Паўднёва-Усходняй Азіі",
      asOfVersion => "Па стане на версію Унікода %s",
      cuneiform   => "Клінапіс",
      Europe      => "Нелацінскія еўрапейскія пісьменства",
      Han         => "Ідэаграмы ККЯ",
      hieroglyphs => "Іерогліфы",
      IndOcean    => "Пісьменства Інданезіі і\nЦіхага акіяна",
      Latin       => "Лацінская пісьменнасць",
      ME          => "Пісьменства Сярэдняй Еўропы і\nПаўднёва-Заходняй Азіі",
      misc        => "Розныя сімвалы",
      notation    => "Сістэмы нотапісу",
      private     => "Вобласць для прыватнага выкарыстання",
      surrogates  => "Сурагатныя пары UTF-16",
      symbols     => "Знакі",
      tags        => "Тэгі",
      unallocated => "Свабодныя кодавыя пазіцыі",
      variation   => "Варыянтныя селектары",
   },

Thanks a lot. ← Alex Great talkrus? 06:58, 16 July 2016 (UTC)Reply

@Alex Great:I've updated the scripts and images for Belarusian and Ukrainian. DRMcCreedy (talk) 20:29, 16 July 2016 (UTC)Reply
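
Editorial sketch: the thread above describes garbled Cyrillic output and a multilingual SVG selected with lang=ru. The following is a minimal Python illustration of one way such a legend entry could be produced, not the Perl script discussed here. It assumes the image uses the standard SVG `<switch>` element with `systemLanguage` attributes and that the garbling came from output not being written as UTF-8. The `LABELS` table, the English strings, and the function names are hypothetical; the Ukrainian and German strings are taken from this page.

```python
from xml.sax.saxutils import escape

# Hypothetical label table shaped like the per-language hashes on this page.
# The "en" entries are assumed defaults, not taken from the real script.
LABELS = {
    "en": {"Latin": "Latin script", "asOfVersion": "As of Unicode version %s"},
    "uk": {"Latin": "Латинська писемність", "asOfVersion": "Станом на версію Юнікоду %s"},
    "de": {"Latin": "Lateinische Schriften und Symbole", "asOfVersion": "Stand: Unicode %s"},
}


def legend_switch(key, x, y, labels=LABELS, default_lang="en"):
    """Return an SVG <switch> with one <text> per language plus a fallback."""
    parts = ["<switch>"]
    for lang, table in labels.items():
        if lang == default_lang:
            continue
        parts.append(
            f'  <text x="{x}" y="{y}" systemLanguage="{lang}">{escape(table[key])}</text>'
        )
    # The last child has no systemLanguage, so it is used when nothing matches.
    parts.append(f'  <text x="{x}" y="{y}">{escape(labels[default_lang][key])}</text>')
    parts.append("</switch>")
    return "\n".join(parts)


def write_svg(path, body):
    """Write the file explicitly as UTF-8 so non-Latin labels survive."""
    with open(path, "w", encoding="utf-8") as fh:
        fh.write('<?xml version="1.0" encoding="UTF-8"?>\n')
        fh.write('<svg xmlns="http://www.w3.org/2000/svg" width="400" height="60">\n')
        fh.write(body + "\n</svg>\n")
```

The version line can be filled in the same way the `%s` placeholder suggests, for example `LABELS["de"]["asOfVersion"] % "9.0"`.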

German

I fixed a few typos in the German translation and also added translations for the file description pages there. Here is the updated code for the Perl script:

   # German
   de => {
      Africa      => "Afrikanische Schriften",
      Americas    => "Amerikanische Schriften",
      AsiaEast    => "Ostasiatische Schriften",
      AsiaSC      => "Süd- und Mittelasiatische\nSchriften",
      AsiaSE      => "Südostasiatische Schriften",
      asOfVersion => "Stand: Unicode \%s",
      cuneiform   => "Keilschrift",
      Europe      => "Andere europäische Schriften",
      Han         => "CJK-Ideogramme",
      hieroglyphs => "Hieroglyphen",
      IndOcean    => "Indonesische und ozeanische\nSchriften",
      Latin       => "Lateinische Schriften und Symbole",
      ME          => "Nahost- und Südwestasiatische\nSchriften",
      misc        => "Verschiedene Zeichen",
      notation    => "Notationssysteme",
      private     => "Privater Nutzungsbereich",
      surrogates  => "UTF-16-Surrogates",
      symbols     => "Symbole",
      tags        => "Tags",
      unallocated => "Nicht belegte Codebereiche",
      variation   => "Variantenselektoren",
   },

--Schnark (talk) 07:59, 31 July 2016 (UTC)Reply

Thanks. I've updated the images. DRMcCreedy (talk) 15:53, 31 July 2016 (UTC)Reply

Czech

Hi David and Alex. Here is the Czech version of the texts:

   # Czech
   cs => {
      Africa      => "Africká písma",
      Americas    => "Americká písma",
      AsiaEast    => "Východoasijská písma",
      AsiaSC      => "Písma jižní a střední Asie",
      AsiaSE      => "Písma jihovýchodní Asie",
      asOfVersion => "V Unicode \%s",
      cuneiform   => "Klínové písmo",
      Europe      => "Nelatinková evropská písma",
      Han         => "Čínština, japonština a korejština",
      hieroglyphs => "Hieroglyfy",
      IndOcean    => "Písma Indonésie a Oceánie",
      Latin       => "Latinka",
      ME          => "Písma Blízkého a Středního východu",
      misc        => "Různé znaky",
      notation    => "Notační systémy",
      private     => "Pro soukromé použití",
      surrogates  => "Náhradní páry UTF-16 (surrogate pairs)",
      symbols     => "Symboly",
      tags        => "Jmenovky (tags)",
      unallocated => "Nepřidělené kódové body",
      variation   => "Selektory variant",
   },

Before, I was not aware that SVG allows multilingual texts. That's interesting.

Please, can you re-generate and upload the SVG file? Thank you. Kolarp (talk) 07:40, 5 June 2020 (UTC)Reply

Thanks for the translations @Kolarp:. I've updated the SVGs. Take a look at User:Drmccreedy/roadmap test page#Czech (cs) to make sure everything looks right. DRMcCreedy (talk) 19:03, 5 June 2020 (UTC)Reply
That's great, thank you. I have already used the pictures on cs:Rovina_(Unicode) and they seem to be good. Kolarp (talk) 20:06, 5 June 2020 (UTC)Reply

The Question of Bitcoin

To @Template:Unicode chart Currency Symbols:

The revision was reverted because the Bitcoin sign is not officially encoded in Unicode, although it is proposed for code point U+20BF.

Good ways to “encode” the Bitcoin symbol

Using existing Unicode characters:

  • Use B⃦ (can be “overtyped” using an uppercase B and a combining double pipe, but will not display or align correctly in some browsers and fonts)
    • Or use the combining double slash (B⃫)
  • Use ฿ (the Baht sign, which is similar to the Bitcoin sign)
  • Use Ƀ, a Latin Extended-B letter (the best way to handle Bitcoin without using the PUA, but it will display as squares on most modern Android phones)

Using images/PUA

  • It is possible to add an image for Bitcoin, since it has never been officially encoded in Unicode. It is also recommended to use the HTML alt parameter if you can (like alt="BTC").[1]
  • Or use the Private Use Area. You can use any one of the PUAs (BMP, SPUA-A, or SPUA-B).

46.130.57.191 (talk) 16:09, 23 September 2016 (UTC)Reply
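
Editorial sketch: the stand-ins listed above can be spelled out as code point sequences. The Python below is only an illustration; the dictionary keys and the `describe` helper are hypothetical names, and the code points are those of the characters shown in the list (U+20E6 and U+20EB for the two combining overlays, U+0E3F for the Baht sign, U+0243 for B with stroke).

```python
import unicodedata

# Candidate stand-ins from the list above, written with escapes so the
# sequences stay readable even where fonts fail to render them.
CANDIDATES = {
    "B + combining double vertical stroke": "B\u20E6",
    "B + combining long double solidus": "B\u20EB",
    "Thai baht sign": "\u0E3F",
    "Latin capital B with stroke": "\u0243",
}


def describe(text):
    """Return (U+XXXX, character name) pairs for every character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]


for label, text in CANDIDATES.items():
    print(label, describe(text))
```

The two combining variants are two-character sequences, which is one reason they may align poorly: the renderer has to position the overlay on the base letter.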

Footnotes

Unicode blocks history section

Hey there, I am super excited to see your efforts on adding the history section to Unicode blocks. I would like to share my opinion on the style of the table. The current tables are sortable and look like the following:

Final code point Count Version L2 ID WG2 ID Document
Block sections 1 16 1.0.0 1 Doc 1
2-1 2-2 Doc 2
3-1 3-2 Doc 3
Block sections 2 20 2.0.0 4-2 Doc 4
5-1 Doc 5

I do not think this is a good idea. The problems:

(1) If you actually sort the table by any column, say, Version, the visual result is poor, and there is no way to revert to the original order.

(2) The width of the table should be adjusted. If you look at Kannada (Unicode block), you'll immediately see what I mean.

I propose the following table:

Version Final code points Count L2 ID WG2 ID Document
1.0.0 U+0C82..0C83, 0C85..0C8C, 0C8E..0C90, 0C92..0CA8, 0CAA..0CB3, 0CB5..0CB9, 0CBE..0CC4 16 1 Doc 1
L2/11-438 2-2 Doc 2
3-1 3-2 Doc 3
2.0.0 Block sections 2 20 (+4) 4-2 Doc 4
5-1 Doc 5

It is unsortable, with the Version column placed at first. Some columns have fixed width: Version = 60, Final code points =180 (so if there are many sections, this column will not be too wide), count = 70, L2 ID = 95 (good for the doc ID length), WG2 ID = 65, Document no width restriction. Of course the width may be fine-tuned depending on the case. The count column records the total number of code points, as well as the number of newly added ones.

If you have no objection, I will systematically convert the tables to this form. Sofeshue (talk) 08:16, 10 April 2017 (UTC)Reply

I certainly agree that Drmccreedy has been doing an amazing job of documenting the history of Unicode blocks, and it is a really important contribution as there is nowhere else on the internet where this information is available in this format. As to the layout of the table, I agree that sorting is not really necessary, and as the rows are ordered by version it makes sense to put the Version column first, so I support the table layout suggested by Sofeshue. BabelStone (talk) 09:29, 10 April 2017 (UTC)Reply
@Sofeshue:@BabelStone:I'm glad my hard work is being appreciated. While scriptsource has some of the same information it's incomplete and usually not available at the code point level.
On the sort issue... The rows aren't ordered by version, they're ordered by code point. Cyrillic is a good illustration. Envision someone trying to find out why/how code point U+04FB was encoded. The task of finding the right document is harder if the table is sorted by version.
If the goal is to determine the code points added in a given release, I can easily create a small, separate table that lists the code points by release.
Sorting the document numbers is useful, at least to me, because they don't always go in date order. If I'm looking at a list of doc numbers it's easier for me to find them in the table if I can sort those columns.
While I'm not keen on changing the column order or removing the ability to sort them, I'm in total agreement that browsers do a lousy job determining an aesthetically pleasing column width for this data. I think the main issue here is the size of the code points column. I can limit the size of that column based on the size of the data but should it be a fixed number or a percentage? For example, 25% or 180px? My concern is accessibility. What if someone has a large font size. Does the display still work with fixed pixels specified?
If the code points column is subdued, is there any reason to impose limits on the document id columns?
DRMcCreedy (talk) 17:17, 10 April 2017 (UTC)Reply
There does seem to be an issue with the L2 ID wrapping because of the dash. A few people have added a non-breaking dash but that makes "find" stop working. I think the most sensible solution is for me to wrap individual L2 IDs in {{nobr|...}}. I can do this for all the block histories once we figure out the desired width of the code point column.DRMcCreedy (talk) 19:52, 10 April 2017 (UTC)Reply
I don't mind keeping the sorting property.
I feel sorting by version is better. Let's still take Cyrillic as an example. The final code points column is taken out and an index column is added (the table below). Suppose I want to find U+04D8. I need to scan the code ranges from the first row. When I reach the end of row 7, with code range 04EC..04ED, which is greater than 04D8, can I conclude that 04D8 is within the first seven rows? No. It appears in the 9th row. The problem is that, although the starting codes of the rows are increasing, we cannot guarantee that the starting code on the (n+1)th row is larger than the ending code on the nth row. If we sort by Version and, within each version, order the codes in increasing order, then the starting codes are still in increasing order, and the starting code on the (n+1)th row may still fail to be larger than the ending code on the nth row. To sum up, the two sorting methods essentially have the same property on the Code column, but sorting by Version is additionally clearer on the Version property, so it is better. I hate to use such mathematical language on computer science... :-) Sofeshue (talk) 08:49, 11 April 2017 (UTC)Reply
| Final Code Points | Index |
| --- | --- |
| U+0400, 040D, 0450, 045D | 1 |
| U+0401..040C, 040E..044F, 0451..045C, 045E..0486, 0490..04C4, 04C7..04C8, 04CB..04CC | 2 |
| U+0487 | 3 |
| U+0488..0489 | 4 |
| U+048A..048B, 04C5..04C6, 04C9..04CA, 04CD..04CE | 5 |
| U+048C..048D | 6 |
| U+048E..048F, 04EC..04ED | 7 |
| U+04CF | 8 |
| U+04D0..04EB, 04EE..04F5, 04F8..04F9 | 9 |
| U+04F6..04F7 | 10 |
| U+04FA..04FF | 11 |
Wow, I didn't know of {{nobr|...}} before, and it does solve the L2 ID, WG2 ID columns. I did an experiment, one just adds it to the LONGEST file name in L2 ID and it suffices. As to the Final Code points column, what about this method: add a <br> every two sections. For example:
U+0401..040C, 040E..044F, 0451..045C, 045E..0486, 0490..04C4, 04C7..04C8, 04CB..04CC should become
U+0401..040C, 040E..044F, <br> 0451..045C, 045E..0486, <br> 0490..04C4, 04C7..04C8, <br> 04CB..04CC
Sofeshue (talk) 08:49, 11 April 2017 (UTC)Reply
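
Editorial sketch: the lookup problem argued over above (which row holds a given code point when rows contain several disjoint ranges) can be made concrete. The Python below is an illustration only; `parse_ranges`, `find_row`, and `ROWS` are hypothetical names, and the row data is copied from the Cyrillic index table in this thread.

```python
def parse_ranges(cell):
    """Parse a cell like 'U+0401..040C, 040E..044F' into (start, end) pairs."""
    ranges = []
    for part in cell.replace("U+", "").split(","):
        part = part.strip()
        if not part:
            continue
        lo, _, hi = part.partition("..")
        # A single code point such as '0487' is treated as a one-element range.
        ranges.append((int(lo, 16), int(hi or lo, 16)))
    return ranges


# Final code point cells of the Cyrillic index table, rows 1 to 11.
ROWS = [
    "U+0400, 040D, 0450, 045D",
    "U+0401..040C, 040E..044F, 0451..045C, 045E..0486, 0490..04C4, 04C7..04C8, 04CB..04CC",
    "U+0487",
    "U+0488..0489",
    "U+048A..048B, 04C5..04C6, 04C9..04CA, 04CD..04CE",
    "U+048C..048D",
    "U+048E..048F, 04EC..04ED",
    "U+04CF",
    "U+04D0..04EB, 04EE..04F5, 04F8..04F9",
    "U+04F6..04F7",
    "U+04FA..04FF",
]


def find_row(code_point, rows=ROWS):
    """Return the 1-based index of the row containing code_point, or None."""
    for index, cell in enumerate(rows, start=1):
        if any(lo <= code_point <= hi for lo, hi in parse_ranges(cell)):
            return index
    return None
```

Because a row's ranges can interleave with those of later rows, every row has to be checked; stopping at the first row whose last range exceeds the target (row 7 for U+04D8) gives the wrong answer, which is the point made in the comment above.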
I strongly think that ordering the tables by version is best, as a common scenario is users trying to work out what additions were made in which version and why. If someone wants to find out more about a particular code point then it is not hard for them to find the code point in the Final code points column. I still do not think that sorting is necessary in these tables. BabelStone (talk) 08:57, 11 April 2017 (UTC)Reply
I did a manual mockup for Cyrillic. I think it incorporates all the changes discussed. I kept the width=180 instead of using br's because that seems cleaner. I'll only set the width if we have a long range of code points. @Sofeshue:@BabelStone:: If this meets everyone's requirements I'll retrofit the existing histories using this format.DRMcCreedy (talk) 16:48, 11 April 2017 (UTC)Reply
| Version | Final code points[a] | Count | L2 ID | WG2 ID | Document |
| --- | --- | --- | --- | --- | --- |
| 1.0.0 | U+0401..040C, 040E..044F, 0451..045C, 045E..0486, 0490..04C4, 04C7..04C8, 04CB..04CC | 188 | | | (to be determined) |
| | | | L2/00-164 | | Hudson, John (2000-05-01), Rendering Serbian italic forms with OpenType |
| | | | L2/00-176 | | Everson, Michael (2000-06-01), Some Türkmen alphabets |
| | | | L2/00-219 | | Everson, Michael (2000-07-09), The case of the Cyrillic letter PALOCHKA |
| | | | L2/05-287 | | Kryukov, Alexey (2005-10-02), U+047C/U+047D CYRILLIC OMEGA WITH TITLO |
| | | | L2/06-011 | | Cleminson, Ralph (2006-01-10), Cyrillic Omega with Titlo |
| | | | L2/06-033 | | McGowan, Rick (2006-01-30), PRI #83: Changing Glyph for U+047C/U+047D Cyrillic Omega with Titlo |
| | | | L2/06-192 | | Anderson, Deborah (2006-05-08), Request to Change Glyphs for U+0485 and U+0486 |
| | | | L2/06-292 | | Anderson, Deborah (2006-08-07), Re: Public Review Issue #83: Glyph change for Cyrillic Omega with Titlo |
| | | | L2/06-329 | | Cleminson, Ralph (2006-10-11), Histoire d'O (omega with titlo) |
| | | | L2/06-357 | N3184 | Everson, Michael; Birnbaum, David; Cleminson, Ralph; Derzhanski, Ivan; Dorosh, Vladislav; Kryukov, Alexey; Paliga, Sorin (2006-10-30), On CYRILLIC LETTER OMEGA WITH TITLO and on CYRILLIC LETTER UK |
| | | | L2/06-389 | | Birnbaum, David (2006-11-13), Diacritics for Early Cyrillic |
| | | | L2/08-144 | N3435R | Everson, Michael; Priest, Lorna (2008-04-11), Proposal to encode two Cyrillic characters for Abkhaz |
| | | | L2/15-014 | | Andreev, Aleksandr; Shardt, Yuri; Simmons, Nikita (2015-01-26), Proposal to Change Annotations on Some Cyrillic Characters |
| | | | L2/15-182 | | Whistler, Ken (2015-07-20), Suggested Responses to Suggestions re Cyrillic in L2/15-014 |
| 1.1 | U+04D0..04EB, 04EE..04F5, 04F8..04F9 | 38 | | | (to be determined) |
| 3.0 | U+0400, 040D, 0450, 045D | 4 | | N1323 | Kardalev, Ratislav; Jerman-Blazic, Borka; Everson, Michael (1996-01-16), Proposal and Summary for addition of Cyrillic characters |
| | | | | N1407 | Kardalev, Ratislav (1996-05-15), Reconsideration of the ISO/IEC JTC1/SC2/WG2 N 1323 document |
| | U+0488..0489 | 2 | L2/98-211 | N1744 | Everson, Michael (1998-05-25), Additional Cyrillic characters for the UCS |
| | | | L2/98-301 | N1847 | Everson, Michael (1998-09-12), Responses to NCITS/L2 and Unicode Consortium comments on numerous proposals |
| | | | L2/98-372 | N1884 | Whistler, Ken; et al. (1998-09-22), Additional Characters for the UCS |
| | U+048C..048D | 2 | | | (to be determined) |
| | U+048E..048F, 04EC..04ED | 4 | L2/97-146 | N1590 | Trosterud, Trond (1997-06-09), Proposal to add 10 Cyrillic Sámi characters to ISO/IEC 10646 |
| | | | L2/98-211 | N1744 | Everson, Michael (1998-05-25), Additional Cyrillic characters for the UCS |
| | | | L2/98-301 | N1847 | Everson, Michael (1998-09-12), Responses to NCITS/L2 and Unicode Consortium comments on numerous proposals |
| | | | L2/98-372 | N1884 | Whistler, Ken; et al. (1998-09-22), Additional Characters for the UCS |
| 3.2 | U+048A..048B, 04C5..04C6, 04C9..04CA, 04CD..04CE | 8 | L2/98-258 | N1813 | Trosterud, Trond (1997-06-09), Proposal to add 10 Cyrillic Sámi characters to ISO/IEC 10646 |
| | | | L2/98-276 | N1813 | Kuruch, Rimma; et al. (1998-07-20), Norwegian comments on Cyrillic Sámi |
| | | | L2/00-082 | N2173 | Everson, Michael; et al. (2000-03-03), Proposal to add 8 Cyrillic Sámi characters to ISO/IEC 10646 |
| 4.1 | U+04F6..04F7 | 2 | L2/02-452 | N2560 | Brase, Jim; Constable, Peter (2002-12-06), Proposal for Encoding Additional Cyrillic Characters for Siberian Yupik |
| 5.0 | U+04CF | 1 | | N2942 | Freytag, Asmus; Whistler, Ken (2005-08-12), Proposal to add nine lowercase characters |
| | U+04FA..04FF | 6 | L2/05-080R2 | | Priest, Lorna (2005-08-02), Proposal to Encode Additional Cyrillic Characters (rev 2005/08/18) |
| | | | L2/05-215 | | Anderson, Deborah (2005-08-03), Feedback on Cyrillic letters EL WITH HOOK and HA WITH HOOK (L2/05-080) |
| | | | L2/05-230 | | Priest, Lorna (2005-08-11), Nameslist annotations for new Cyrillic letters |
| 5.1 | U+0487 | 1 | L2/06-042 | | Cleminson, Ralph (2006-01-26), Proposal for additional Cyrillic characters |
| | | | L2/06-181 | | Anderson, Deborah (2006-05-08), Responses to the UTC regarding L2/06-042, Proposal for Additional Cyrillic Characters |
| | | | L2/06-359 | | Cleminson, Ralph (2006-10-31), Proposal for additional Cyrillic characters |
| | | | L2/07-003 | N3194 | Everson, Michael; Birnbaum, David; Cleminson, Ralph; Derzhanski, Ivan; Dorosh, Vladislav; Kryukov, Alexey; Paliga, Sorin; Ruppel, Klaas (2007-01-12), Proposal to encode additional Cyrillic characters in the BMP of the UCS |
| | | | L2/07-055 | | Cleminson, Ralph (2007-01-19), Comments on Additional Cyrillic Characters (L2/07-003 = WG2 N3194) |

  a. ^ Proposed code points and character names may differ from final code points and names
That's good enough. One minor point, should we record the total number of codepoints and the number of newly added ones? For example, in the above table, Version 1.1, column 3, 38 ---> 226 (+38). Sofeshue (talk) 17:16, 11 April 2017 (UTC)Reply
I considered that but it would duplicate the information already in the Unicode block Infobox while eating up horizontal space.DRMcCreedy (talk) 17:29, 11 April 2017 (UTC)Reply
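
Editorial sketch: the "total (+added)" count format suggested above can be derived from the code point cells themselves. The Python below is an illustration only; the function names are hypothetical, and the two cells are the Version 1.0.0 and 1.1 rows of the Cyrillic mockup.

```python
def count_code_points(cell):
    """Count the code points in a cell like 'U+04D0..04EB, 04EE..04F5'."""
    total = 0
    for part in cell.replace("U+", "").split(","):
        lo, _, hi = part.strip().partition("..")
        # 'hi or lo' handles single code points that have no '..' range.
        total += int(hi or lo, 16) - int(lo, 16) + 1
    return total


def running_counts(cells):
    """Return 'total (+added)' strings for rows listed in version order."""
    labels, total = [], 0
    for cell in cells:
        added = count_code_points(cell)
        total += added
        labels.append(f"{total} (+{added})")
    return labels


V1_0_0 = "U+0401..040C, 040E..044F, 0451..045C, 045E..0486, 0490..04C4, 04C7..04C8, 04CB..04CC"
V1_1 = "U+04D0..04EB, 04EE..04F5, 04F8..04F9"
print(running_counts([V1_0_0, V1_1]))
```

The running total only makes sense when rows are in version order, which ties this suggestion to the column-order question debated earlier in the thread.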
BTW @Sofeshue:, does width= work for you with Microsoft Edge/Internet Explorer. It seems to work fine for me in Firefox but not MS Edge.DRMcCreedy (talk) 23:55, 11 April 2017 (UTC)Reply
Indeed it does not work on Edge. It works on Explorer and other browsers. Sofeshue (talk) 04:00, 12 April 2017 (UTC)Reply
Experiment shows that at least <br> does work in Edge, so, consider using it again? Sofeshue (talk) 07:54, 12 April 2017 (UTC)Reply
After several hours I determined that width has to be on each and every table data row for Edge to honor it! br does work on Edge but the forced breaks get weird when the screen is resized so I'm going with width. I'll also use the nobr template for all IDs that can split. I played around with only doing it on the longest ID and it's time consuming and problematic so I'll put it on all IDs with dashes. I'll update the tables now. DRMcCreedy (talk) 21:13, 12 April 2017 (UTC)Reply
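
Editorial sketch: wrapping every dashed L2 ID in the nobr template, as described above, is easy to script. The Python below is an illustration, not the script actually used for these tables. The regular expression assumes IDs of the form L2/NN-NNN with an optional R revision suffix (the shapes that appear on this page), and `wrap_l2_ids` is a hypothetical name.

```python
import re

# Match L2 document IDs such as L2/06-357 or L2/05-080R2, skipping any that
# already sit directly inside {{nobr|...}}.
L2_ID = re.compile(r"(?<!\{\{nobr\|)L2/\d{2}-\d{3}(?:R\d*)?")


def wrap_l2_ids(wikitext):
    """Wrap each bare L2 ID in {{nobr|...}} so it cannot break at the dash."""
    return L2_ID.sub(lambda m: "{{nobr|" + m.group(0) + "}}", wikitext)


print(wrap_l2_ids("L2/06-357 N3184"))
```

Running it twice leaves already wrapped IDs alone. A real run would need to be limited to the ID column, since this pattern would also wrap IDs quoted inside document titles.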
OK, let's go with width. Be careful to adjust the Code width to contain two block sections, as the codes from the second plane (Supplementary Multilingual Plane) onwards get longer (e.g., U+10000..10FFFF). I'll help. Sofeshue (talk) 03:22, 13 April 2017 (UTC)Reply
All updates are done. Let me know if I missed anything. DRMcCreedy (talk) 03:39, 13 April 2017 (UTC)Reply
Damn, how can you be so fast bro? There can be some minor improvement though. E.g. Hatran, the Code column width should be a little larger to accommodate two sections. Sofeshue (talk) 03:46, 13 April 2017 (UTC)Reply
I'm not sure how much time I'd spend customizing each block because the output isn't going to look the same for everyone. For example, if I make the Hatran code point column large enough to fit on one line, the document citation wraps to a second line (based on my normal window size). I suspect it might fit all on one line for you based on your suggestion. But if you feel there's a compelling reason to change some of the widths go ahead and update them. I'll add exceptions to my scripting so that I don't overlay them with the old width values if/when I add new documents to a given block. DRMcCreedy (talk) 06:34, 13 April 2017 (UTC)Reply
I realized that it must be the default font thing. I think on your browser, Hatran Code column folds after the second section, but on mine, it folds at the first and the second (so three lines), leaving a white space of about half of the width. I don't have any custom .css and I use the system default font (Arial) at default size (10.5), which is the font I believe most people use. I'll do some experiments and do fine-tunings to the width if necessary. Sofeshue (talk) 07:04, 13 April 2017 (UTC)Reply

Highway Gothic

Hi, I am designing texts for a Unicode 11.0 Highway Gothic font. It comprises over 22,000 characters and covers all currently existing scripts (except Hiragana, Katakana, CJK, and Korean). --cyɾʋs ɴɵtɵɜat bʉɭagɑ!!! (Talk | Contributions) 11:29, 22 September 2017 (UTC)Reply

Template editor granted


Your account has been granted the "templateeditor" user permission, allowing you to edit templates and modules that have been protected with template protection. It also allows you to bypass the title blacklist, giving you the ability to create and edit editnotices. Before you use this user right, please read Wikipedia:Template editor and make sure you understand its contents. In particular, you should read the section on wise template editing and the criteria for revocation.

You can use this user right to perform maintenance, answer edit requests, and make any other simple and generally uncontroversial edits to templates, modules, and editnotices. You can also use it to enact more complex or controversial edits, after those edits are first made to a test sandbox, and their technical reliability as well as their consensus among other informed editors has been established. If you are willing to process edit requests on templates and modules, keep in mind that you are taking responsibility to ensure the edits have consensus and are technically sound.

This user right gives you access to some of Wikipedia's most important templates and modules; it is critical that you edit them wisely and that you only make edits that are backed up by consensus. It is also very important that no one else be allowed to access your account, so you should consider taking a few moments to secure your password.

If you do not want this user right, you may ask any administrator to remove it for you at any time.

If you were granted the permission on a temporary basis you will need to re-apply for the permission a few days before it expires including in your request a permalink to the discussion where it was granted and a {{ping}} for the administrator who granted the permission. You can find the permalink in your rights log.

Happy template editing! — xaosflux Talk 03:24, 14 September 2018 (UTC)Reply

  • Hi Drmccreedy, per Special:PermaLink/886347525 I've added this access for you. Please be sure to review the information above carefully. While this lets you bypass protections, ensure that you act carefully - especially if you want to branch out in to areas you have never edited such as luascript modules. If you are in doubt of an edit to a highly used template, after sandboxing and testing, filing an edit request for someone else to review is the best course of action. Best regards, — xaosflux Talk 03:27, 14 September 2018 (UTC)Reply
Thank you. DRMcCreedy (talk) 14:00, 14 September 2018 (UTC)Reply

ISO code redirects

Hi Drmccreedy, thanks for fixing the wrong redirect from ISO 15924:Pauc to Pau Cin Hau instead of to Pau Cin Hau script.

I am currently working on a project using IANA language subtags (cf. IETF language tag) and machine-generated ISO-based links to WP articles of languages and language groups (e.g. ISO 639:myp, ISO 639:gmw), scripts (e.g. ISO 15924:Visp), and regions and regional subdivisions (e.g. ISO 3166-1:VU, ISO 3166-2:VU). One of my jobs is to check if the links exist and lead users to the correct article. Other than the wrong ISO 15924:Pauc redirect I already discovered a redirect using a nonexistent ISO code (ISO 639:fl redirected to Filipino language) as well as the still erroneous redirect of ISO 3166-1:NL to the country of the Netherlands instead of to the Kingdom of the Netherlands which includes the countries of Aruba NL-AW, Curaçao NL-CW, and Sint Maarten NL-SX, see https://www.iso.org/obp/ui/#iso:code:3166:NL. I am likely to discover more of those.

Now my question is: How can I fix wrong ISO code redirects myself? Or, in case I am not authorized to do so: Where can I report wrong ISO redirects? Love —LiliCharlie (talk) 16:40, 5 October 2018 (UTC)Reply

@LiliCharlie: Here's how I'd do it: When you click on ISO 3166-1:NL you'll end up redirected to the Netherlands article. Notice it says "(Redirected from ISO 3166-1:NL)" just below the article title. Clicking on the ISO 3166-1:NL link takes you to the redirect itself (automatically using &redirect=no in the URL). Click "edit this page" just like a normal article. In this case, you would change #REDIRECT Netherlands to #REDIRECT Kingdom of the Netherlands. I don't know if there are edit restrictions but that may vary by the redirect and the user. Give it a try. I don't think there's an "official" place to report ISO redirect problems but if you're unable to update them, let me know and I'll give it a try. Thanks for improving the accuracy of Wikipedia. DRMcCreedy (talk) 17:03, 5 October 2018 (UTC)Reply
I've done that. I didn't dare because on ISO_3166-1:NL&redirect=no it says: "Do not replace these redirected links with piped links." Muchísimas gracias. Love —LiliCharlie (talk) 17:19, 5 October 2018 (UTC)Reply
You won't need piped links for your task, so you'll be fine. Glad the NL redirect update worked. DRMcCreedy (talk) 17:44, 5 October 2018 (UTC)Reply
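
Editorial sketch: the manual steps described above (open the redirect page itself with redirect=no, then change its target) can be supported by a small helper when many ISO redirects need checking. The Python below only builds the URL and the replacement wikitext and does not contact Wikipedia. The function names are hypothetical; the index.php URL form with title and redirect=no parameters and the `#REDIRECT [[Target]]` syntax are standard MediaWiki.

```python
from urllib.parse import urlencode

BASE = "https://en.wikipedia.org/w/index.php"


def redirect_page_url(title, base=BASE):
    """URL that opens the redirect page itself instead of following it."""
    query = urlencode({"title": title.replace(" ", "_"), "redirect": "no"})
    return f"{base}?{query}"


def redirect_wikitext(target):
    """Wikitext for a redirect page pointing at the given article title."""
    return f"#REDIRECT [[{target}]]"


print(redirect_page_url("ISO 3166-1:NL"))
print(redirect_wikitext("Kingdom of the Netherlands"))
```

For a machine-generated link list, the same helper can produce one URL per ISO code for manual review.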

Sharada script visibility?

Can you see the Sharada script in Unicode? I cannot see it on my PC, Mac or iPhone. 174.3.181.199 says he can see it on his Linux machine. Malaiya (talk) 04:26, 17 December 2018 (UTC)Reply

@Malaiya: I can see it on my Windows 10 machine but I have the TWB01p and TWB01q fonts from Thomas Buchleither's font page installed. I don't know if these are truly compliant or if they fully support Sharada rendering requirements so I haven't added any information about them on the multilingual support page. They at least have glyphs, but I haven't confirmed they match the Unicode assignments completely. Hope that's some help. DRMcCreedy (talk) 05:35, 17 December 2018 (UTC)Reply
Thanks, that helps. 174.3.181.199 is apparently a researcher in fonts, I presume you are too. I think Sharada script page should say that those intending to see Sharada script, should install the fonts. I used to install fonts back in pre-Unicode days, and occasionally still do to read older documents, but generally I expect standard fonts should be enough. Malaiya (talk) 00:12, 18 December 2018 (UTC)Reply
@Malaiya: I'm not in favor of adding information about these specific fonts because they don't appear to meet minimum requirements for representing Sharada script. Looking briefly at the Unicode proposal, I tried out the vowel positioning of u (ku 𑆑𑆶 U+11191 U+111B6) and a basic consonant conjunct (kka 𑆑𑇀𑆑 U+11191 U+111C0 U+11191). The TWB01 fonts don't properly support either. But if you do add the information to Help:Multilingual support, it should clearly note this limitation. Look at Code2000 under the Buhid section for an example. Likewise, the "Contains special characters" template on the Buhid alphabet page is the normal way of pointing people to the font information for a given script. Cheers. DRMcCreedy (talk) 05:23, 18 December 2018 (UTC)Reply
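
Editorial sketch: the two test sequences mentioned above can be rebuilt from their code points to try in any font. The Python below is an illustration only; the helper and constant names are hypothetical. It produces the strings but cannot tell whether a font shapes them correctly, which still has to be judged by eye.

```python
def from_code_points(*code_points):
    """Build a string from integer code points."""
    return "".join(chr(cp) for cp in code_points)


def to_notation(text):
    """Render a string as space-separated U+XXXX notation."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


# ku: letter plus dependent vowel sign; kka: letter, virama, letter (conjunct).
KU = from_code_points(0x11191, 0x111B6)
KKA = from_code_points(0x11191, 0x111C0, 0x11191)

print(KU, to_notation(KU))
print(KKA, to_notation(KKA))
```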

thanks

for fixing up all the Glottocode and ISO-639 errors. I'm going over them now, and the ISO errors especially are pulling up a lot of other confusion. I think they're mostly caused by page moves or split articles, where the ISO redirects were not fixed to match, so those are usually now wrong too. As are alt-name redirects. — kwami (talk) 04:31, 6 February 2019 (UTC)Reply

You're welcome. Glad I could help. DRMcCreedy (talk) 15:46, 6 February 2019 (UTC)Reply

Thank you

Thanks for your correction in https://en.wikipedia.org/w/index.php?title=Kannada_alphabet . I'm just creating a dataset and your action helped me to correct it in time. Awesome! :)

Glad I could help. DRMcCreedy (talk) 21:09, 22 February 2019 (UTC)Reply

Irminones

Hello. I am wondering if you have picked the correct Glottolog name. The distinction between the Irminones and the other Germanic tribes was only made classically, in a period before the split into lower and upper West Germanic, and so maybe West Germanic is more correct. Of course there is some uncertainty about such a historical term, but this surely seems the most obvious conclusion.--Andrew Lancaster (talk) 07:05, 3 April 2019 (UTC)Reply

Hi @Andrew Lancaster: I was just matching the existing Glottolog code in Wikipedia to the name used in the (latest version of) Glottolog. (high1286 = Upper German and high1287 = West Middle German.) Unfortunately I have no idea if those are the right codes to use in the Infobox. There is a Glottolog code for West Germanic (west2793). You could use that instead if it's more accurate. DRMcCreedy (talk) 15:09, 3 April 2019 (UTC)Reply
Yes that is what I was thinking might be an alternative.--Andrew Lancaster (talk) 06:51, 4 April 2019 (UTC)Reply

Unicode code chart template -- expandable?

Hi, one thing which I have been thinking about for a long time (several years) is to make the Unicode code chart templates expandable to show a list of all character names (and formal character name aliases). I think this would be very helpful to users as at present the only way to know what the character name is is to hover the mouse over the character cell whilst carefully avoiding hovering over the link that people so love to add to the characters; but the mouseover text is not copyable, so it is of limited use. I have made a rough mock up of what I mean in my sandbox. What do you think? Please feel free to tweak or improve it. (I suggest that this approach is not applied for large blocks with algorithmic names). BabelStone (talk) 20:42, 22 August 2019 (UTC)Reply

@BabelStone: I think it's definitely doable if you think it's useful. And if you've been thinking about it for that long it's probably useful.
  • I changed the "Character names" title to "List of character names" to be painfully clear.
It is better.
  • Should the list be sortable? This involves adding a header. I've mocked it up in your sandbox. I'd skip this on the algorithmic ones though because the code point and name always sort the same.
Personally, I don't think sortable is particularly useful, but I don't mind.
  • I'm assuming aliases will use the same format as the current charts: FOO (alias BAR)
Seems reasonable.
  • Can we agree that there should be NO LINKED CHARACTERS in that list? If someone wants to link each character they can do so in the existing part of the chart as far as I'm concerned. Latin Extended-B is an example of this.
I fully agree that there should be no links in the names list.
  • There's likely to be some duplication between the template and the article text. Latin Extended-B again is a good example. I'm thinking that article text with a list of characters can be removed once this is in place so long as they don't add additional information. (I would count the decimal values provided in Latin Extended-B as not adding information except to anyone who doesn't know you can use &#xHHHH; notation.)
Yes.
  • Lastly, do we need to worry about added character counts for articles that include multiple charts? Could this cause them to exceed size limits?
Probably not because articles with multiple code chart templates are generally not for very large blocks, and the huge blocks with algorithmic names will only have a slight increase in size.
DRMcCreedy (talk) 22:24, 22 August 2019 (UTC)Reply
I definitely think it is useful. If users want an overview of the character names, at present they have to click on the link to Unicode code charts or go to another website. (Other replies inline above) BabelStone (talk) 10:53, 23 August 2019 (UTC)Reply
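
On the point above about decimal values versus &#xHHHH; notation, a minimal sketch in Python showing that the two forms of numeric character reference name the same character. The helper name `numeric_references` is hypothetical; `html.unescape` is the standard-library decoder.

```python
import html

def numeric_references(char):
    """Return the hexadecimal and decimal numeric character references for char."""
    cp = ord(char)
    return f"&#x{cp:04X};", f"&#{cp};"

# U+0180 is the first code point of the Latin Extended-B block.
hex_ref, dec_ref = numeric_references("\u0180")
print(hex_ref, dec_ref)

# Both references decode to the same character.
assert html.unescape(hex_ref) == html.unescape(dec_ref) == "\u0180"
```

So a list that gives the hexadecimal code point already carries everything a decimal column would.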
I made the ogham table sortable, but when you sort by code point it does not sort in the expected order (hex values with A..F are sorted separately from hex values comprising 0..9 only). We could overcome this by putting the code point in a {{sort}} template with a fixed width decimal value for the hidden sort parameter, but this seems like too much trouble for a marginally useful feature. BabelStone (talk) 11:06, 23 August 2019 (UTC)Reply
In light of that, let's ditch sorting. DRMcCreedy (talk) 16:07, 23 August 2019 (UTC)Reply
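
For the record, the fixed-width decimal sort key described above could be generated along these lines. This is only a sketch in Python under stated assumptions: the `sort_key` name and the "U+XXXX" label format are illustrative, and how the key would be fed into a {{sort}} template is not shown.

```python
def sort_key(label, width=7):
    """Return a zero-padded decimal key for a code point label such as 'U+168A'.

    Seven digits are enough for the largest code point, U+10FFFF (1114111).
    """
    hex_part = label[2:] if label.upper().startswith("U+") else label
    return f"{int(hex_part, 16):0{width}d}"

# A few Ogham code points, deliberately out of order.
labels = ["U+168A", "U+1680", "U+169F", "U+1690"]
print(sorted(labels, key=sort_key))
```

Because every key has the same width and contains only digits, sorting the keys as plain text gives code point order, which is what the hidden sort parameter would need.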
Agreed. Here are a few more comments and questions I have before we start implementing the change to three hundred templates BabelStone (talk) 16:55, 23 August 2019 (UTC)Reply
  • For blocks with algorithmic character names I think best to only list first and last assigned characters in the block. I currently put "..." between the two rows -- is that OK, or is there a better way of indicating omission of the intervening rows?
I noticed that and thought it was intuitive.
  • Many or most blocks have hard-coded fonts applied (in the template or using css) to the code chart glyphs (which I personally don't like). For the names list it is useful to put the character after the code point, but I don't want to hard-code the fonts to use, so I was thinking of not specifying fonts for the names list part of the table. What do you think?
I'm OK with this but anticipate others will want to add font info. I'd say let's leave font info off for now and see if there's push back.
  • Do we want to add any other core data for the characters? For example, we could provide a column for general category or script. Is that perhaps overkill?
I thought of that too. Probably overkill. My concern is there's almost no end of info we could add.
  • Should the List of character names go above or below the Notes? I'm happy with current placement below the notes, but maybe it makes more sense to put the notes at the very bottom.
I like the notes at the very bottom logically, but the list is probably easier to spot if we don't wedge it between the chart and the notes. So let's leave the list as the last item.

@BabelStone: DRMcCreedy (talk) 17:07, 23 August 2019 (UTC)Reply

Thanks for all the feedback. I think we're about there now, but I don't want to rush into making quite a large change to a large number of templates, so I'll sit on it for a week or so in case you or me or anyone else has any suggestions for improving how we do it. BabelStone (talk) 20:05, 23 August 2019 (UTC)Reply
Sounds good. The only other question that's popped into my head is combining characters. Often in the chart we'll use a dotted circle (◌) or a space with them. I'm thinking if the purpose of the table is copy-and-paste, maybe we should skip that. Not sure I feel strongly either way but that should be nailed down before the charts are created. DRMcCreedy (talk) 20:43, 23 August 2019 (UTC)Reply
I've added an example for a block with combining characters (Combining Diacritical Marks for Symbols), with plain characters for the first row and prefixed with nbsp for the second row. The unprefixed characters do not look good as they straddle the code point column, so I think prefixing with nbsp is best (I don't like the dotted circle as that often interferes with the combining mark, and makes it difficult to see clearly). BabelStone (talk) 11:24, 24 August 2019 (UTC)Reply
I've also added an example with a character name alias. BabelStone (talk) 12:58, 24 August 2019 (UTC)Reply
Looks good. I like the linked "alias". DRMcCreedy (talk) 15:56, 24 August 2019 (UTC)Reply

Revert

Hi, Drmccreedy, I noticed your revert :-) look for Amurskaja oblast' and you'll see there is a redirect to Amur Oblast... Lotje (talk) 15:55, 31 August 2019 (UTC)Reply

Sorry. I've restored the oblast link I reverted. My concern is only that the spelling match the source spelling at https://www.iso.org/obp/ui/#iso:code:3166:RU, not with links. DRMcCreedy (talk) 16:37, 31 August 2019 (UTC)Reply

Roadmap color keys

Reason: [3]

I'm sorry to say this, but I don't feel right about the latest version of the roadmap. It is not as representative as it could be, e.g. for South Asian and Central Asian scripts. I don't feel they are represented the right way, and this might cause confusion. As such I demand some changes to the roadmap. I'm going to use this reference: [4]

  • Separate South Asian and Central Asian scripts. The color representing Central Asian scripts is added back.
  • The color used for "Notational Systems" in the current version will have to be expanded so that it can be used for combining marks (diacritics), punctuation, modifier letters, and notational systems. Because of this, the color key will be renamed "Systematic characters". What I mean by that is that these characters aren't essentially symbols but systematic. This will be applied in the BMP and SMP roadmaps.
  • The color used for "Linguistic scripts" will be used for Specials and the C0 and C1 control characters. This color block will be named "Control characters", since they are used by the system. This will be extended to the Variation Selectors and to Variation Selectors Supplement in Plane 14 (under the different name of Variation Selectors Supplement) to conserve space, and because they are control characters too. Take a look at Unicode control characters for the reason.
  • Because Variation Selectors have to use new colors, the color previously used by Variation Selectors in Plane 14 might have to be reused for a new style of script (Ideographs). Please note that Ideographs are separate from CJK Ideographs. Any characters that can't be considered CJK but are ideographic will be included in this key. This key will be used in Plane 1 and beyond, for example for the Linear B ideograms. As more ideographic scripts are added in the future, it will be expanded along with the addition of these scripts.

That's all of my requests to you. I'm sorry if I was a little bit harsh, but it is clear that this roadmap needs an update. Thank you. SMB99thx XD (contribs) 06:46, 20 November 2019 (UTC)Reply

I don't quite understand the reasoning, but if it's about how to represent Unicode's Roadmap to the BMP in the graphic commons:File:Roadmap to Unicode BMP.svg the discussion should take place on the graphic's talk page at commons:File talk:Roadmap to Unicode BMP.svg.
Also note that the graphic is currently used on eight wiki projects (not counting the Commons; and commons:File:Roadmap to Unicode BMP multilingual.svg on another ten, commons:File:Roadmap to Unicode BMP.png on another five wikis), so any changes to it have an impact on all those wikis. Love —LiliCharlie (talk) 08:14, 20 November 2019 (UTC)Reply
Thank you. SMB99thx XD (contribs) 11:42, 20 November 2019 (UTC)Reply
I've moved this discussion to commons:File talk:Roadmap to Unicode BMP multilingual.svg. I'll address the various requested changes when I have some time to evaluate them. Hopefully later today. DRMcCreedy (talk) 17:29, 20 November 2019 (UTC)Reply

Reverted note. It's useful but misplaced. It should be in one of the articles that includes that chart. Individual characters aren't footnoted and if they were, there would be hundreds of footnotes in the charts.

Moved discussion to Template talk:Unicode chart Halfwidth and Fullwidth Forms DRMcCreedy (talk) 19:50, 9 December 2019 (UTC)Reply

My edit was reverted (Mathematical Alphanumeric Symbols)

Hello! You reverted my edit on Mathematical Alphanumeric Symbols as "reverting vandalism." I was attempting to fix an error (it seems that two pairs of symbols had been swapped). TdanTce (talk) 04:11, 2 January 2020 (UTC)Reply

My apologies. Vandalism wasn't accurate. However, the table wasn't wrong. You changed U+1D766 to show U+1D767, U+1D7A0 to U+1D7A1, U+1D767 to U+1D766, and U+1D7A1 to U+1D7A0... swapping them yourself. Maybe your default font isn't showing them correctly but I've double-checked them in a Unicode-specific editor and they were correct prior to your edit. DRMcCreedy (talk) 04:22, 2 January 2020 (UTC)Reply
This is a known font bug, see Talk:Mathematical Alphanumeric Symbols#Rho and Theta-symbol in Maths sans-serif bold. BabelStone (talk) 17:21, 2 January 2020 (UTC)Reply
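
A font-independent way to check which character sits at each of the four code points discussed above is to look up the character names. A minimal sketch in Python; the helper name `character_names` is hypothetical, and the names returned depend on the Unicode database bundled with the local Python.

```python
import unicodedata

# The four code points discussed in this thread.
CODE_POINTS = [0x1D766, 0x1D767, 0x1D7A0, 0x1D7A1]

def character_names(code_points):
    """Map 'U+XXXX' labels to the character names in the local Unicode database."""
    return {f"U+{cp:04X}": unicodedata.name(chr(cp)) for cp in code_points}

for label, name in character_names(CODE_POINTS).items():
    print(label, name)
```

If the names and the glyphs a font shows disagree, the font is the likelier culprit, which matches the known font bug linked above.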

Unicode superscripts

Hi. Yeah, clarified. Unicode only accepts characters that are semantically distinct, and these small caps are graphically basically the same as l.c. and so couldn't be used as distinct symbols. Most of the remaining are found in the lit e.g. as phonetic symbols, at least in tables of theoretically expected symbols. — kwami (talk) 02:48, 25 January 2020 (UTC)Reply

Thanks for the clarification. DRMcCreedy (talk) 04:07, 25 January 2020 (UTC)Reply

Edit to ISO_639:b

Hi, Drmccreedy. I thought your edit comment here would be quite correct if the table on that page listed both languages, but in fact it only lists one. It seems to me it makes more sense that the link should take readers to the article about the language that is listed in the table. --R'n'B (call me Russ) 14:05, 15 February 2020 (UTC)Reply

My issue is that I don't trust the infobox comments on the Bonerif and Edwas pages. Neither is sourced so I think it was just someone's preference to say bnv applies to one and not the other. I should fix the comments in the language articles to say something like "ISO 639 treats Bonerif and Edwas as a single language" but I didn't feel like getting into an edit war with the person who feels ISO 639 is "confused". Does the single language comment sound right? Then I think the item in the table should be "Beneraf, Bonerif, Edwas" like ISO 639 shows. If those changes are made, I think the disambiguation link should stay. What do you think? DRMcCreedy (talk) 16:16, 15 February 2020 (UTC)Reply
Well, you're getting over my head now. If ISO in fact lists all three names, then I suppose our table should do the same. In that case, the link to the disambiguation page should conform to WP:D#HOWTODAB. --R'n'B (call me Russ) 21:24, 15 February 2020 (UTC)Reply
I've updated the language name(s) and the two articles. DRMcCreedy (talk) 22:43, 15 February 2020 (UTC)Reply

Nomination of EBCDIC 389 for deletion

A discussion is taking place as to whether the article EBCDIC 389 is suitable for inclusion in Wikipedia according to Wikipedia's policies and guidelines or whether it should be deleted.

The article will be discussed at Wikipedia:Articles for deletion/EBCDIC 389 until a consensus is reached, and anyone, including you, is welcome to contribute to the discussion. The nomination will explain the policies and guidelines which are of concern. The discussion focuses on high-quality evidence and our policies and guidelines.

Users may edit the article during the discussion, including to improve the article to address concerns raised in the discussion. However, do not remove the article-for-deletion notice from the top of the article. Fram (talk) 07:04, 23 June 2020 (UTC)Reply

Unichar

Thank you for finding and fixing incorrect names. But I don't think you need to change lower case names to upper case because {{Unichar}} does that (or more precisely, applies style SMALLCAPS) anyway. Or is there a subtlety I've missed? --John Maynard Friedman (talk) 18:12, 30 June 2020 (UTC)Reply

I don't change names based on case alone. At least not intentionally. Which specific edit(s) are you referring to? DRMcCreedy (talk) 18:15, 30 June 2020 (UTC)Reply
This: title=Lozenge&diff=prev&oldid=965330469 but you are correct. You removed an 'a'. --John Maynard Friedman (talk) 19:14, 30 June 2020 (UTC)Reply
And "... small black ..." became Unicode's "... black small...". No worries. It's nice to have another pair of eyes checking these things for when I do make mistakes. DRMcCreedy (talk) 19:27, 30 June 2020 (UTC)Reply

Weird behaviour of unichar template

Just on the off-chance, perhaps you may have the solution to this conundrum? (Why does Unichar sometimes fail to render in small caps). If there is a solution, it is not at all obvious. --John Maynard Friedman (talk) 10:05, 1 July 2020 (UTC)Reply

I've updated the discussion on DePiep's talk page with my findings. DRMcCreedy (talk) 19:43, 1 July 2020 (UTC)Reply
TYVM. I've had a look. I wouldn't expect DePiep to be satisfied with the quick'n'dirty hack. Fancy your chances at fixing the underlying fault in {{sc}}? If it was easy, anybody could do it, that's why we pay you so much! --John Maynard Friedman (talk) 23:14, 1 July 2020 (UTC)Reply
Very unlikely. Sorry. DRMcCreedy (talk) 20:19, 3 July 2020 (UTC)Reply

Don't !vote twice

In Wikipedia:Articles for deletion/Code page 875 I think you may have voted !keep twice by mistake. If so, please change one of them to a comment. Thank you. Djm-leighpark (talk) 02:54, 24 July 2020 (UTC)Reply

Unintentional. Thanks for pointing that out. DRMcCreedy (talk) 03:08, 24 July 2020 (UTC)Reply
Thanks - I need to point out you need to sign that change as well. Thank you. Djm-leighpark (talk) 03:22, 24 July 2020 (UTC)Reply

Unicode chart footnotes

Hi Drmccreedy, what would be the correct location to raise changing the footnotes to letters? Category talk pages tend to be rarely used in my experience. CMD (talk) 16:37, 14 August 2020 (UTC)Reply

@Chipmunkdavis: That's a tough question. And one I should be able to answer considering I raised the requirement in the first place. Both the category talk page and various Unicode templates I checked have fewer than 30 watchers. So there's probably no high visibility spot to put it. I'd say propose it either on the Category:Unicode charts talk page or Template:Unicode chart Tagalog and ping user BabelStone (the only other user that might care). I don't think it will be a controversial change unless there's some technical issue I'm overlooking. DRMcCreedy (talk) 17:25, 14 August 2020 (UTC)Reply
I put it on Wikiproject Linguistics so it has at least some visibility. Best, CMD (talk) 14:53, 19 August 2020 (UTC)Reply

Old Turkic script

Sorry if I sound rough, but the written sources are actually from the 8th century, while another source says it has been in use since the 6th century. So, I hope this helps. Thanks. Beshogur (talk) 20:40, 15 September 2020 (UTC)Reply

No worries. I never read past the first sentence, noticed the discrepancy, and hit undo too fast. DRMcCreedy (talk) 20:51, 15 September 2020 (UTC)Reply

EBCDIC Code page numbers without articles that need to transwiki

The code pages that are missing articles (and have never had one created) are Code pages 390, 391, 392, 393, 394, 395, 435, 829, 834, 835, 837, 839, 881, 882, 883, 884, 885, 886, 887, 888, 889, 890, 931, 933/1364, 935/1388, 937/1371, 939/1399, 1001, 1003, 1005, 1007, 1024, 1027, 1028, 1030, 1031, 1032, 1033, 1037, 1068, 1071, 1073, 1074, 1075, 1076, 1077, 1078, 1080, 1082, 1083, 1085, 1087, 1091, 1136, 1150, 1151, 1152, 1278, 1303, 1364, 1376, and 1377. Code page 1279 is also missing an article, but I haven't found any sources for the code table for that code page. You should also look at the reply on Scottywong's talk page. Alexlatham96 (talk) 00:38, 30 September 2020 (UTC)Reply

Missing cite in Iggeret of Rabbi Sherira Gaon

The article cites "Mayence 1873" but no such source is listed in bibliography. Can you please add? Also, suggest installing a script to highlight such errors in the future. All you need to do is copy and paste importScript('User:Svick/HarvErrors.js'); // Backlink: [[User:Svick/HarvErrors.js]] to your common.js page. Thanks, Renata (talk) 14:38, 4 October 2020 (UTC)Reply

I suggest you contact other editors of that page as I'm not familiar with the topic. My sole contribution was to fix a language parameter of a template. Sorry. DRMcCreedy (talk) 21:56, 4 October 2020 (UTC)Reply

ISO 3166-2:GB

ISO 3166-2:GB begins with the dreaded C word. Do you have access to the standard to give an actual date (and add a footnote to say that it has been overtaken by events since then)? --John Maynard Friedman (talk) 00:41, 30 October 2020 (UTC)Reply

I only have access through their online browsing platform (OBP). I'm also signed up for notifications of changes, which come fairly soon after the standard is updated. I've seen nothing on Buckinghamshire. Based on other country changes it can take some time (months) for the standard to adjust to reality. I don't really worry about it much because these articles on ISO 3166 are describing the standard itself while an article like Subdivisions of England should be more up-to-date. If you want, you could add a footnote citing a source and saying something like Buckinghamshire's change to a unitary authority is not yet reflected in the ISO 3166-2 standard. DRMcCreedy (talk) 01:24, 30 October 2020 (UTC)Reply

The Glottostar

  The Original Barnstar
For your work over the years in keeping language articles in sync with Glottolog's ever-changing database. – Uanfala (talk) 20:12, 14 January 2021 (UTC)Reply

Thank you. DRMcCreedy (talk) 23:11, 14 January 2021 (UTC)Reply

Nomination for deletion of Template:ISO 15924/footer

 Template:ISO 15924/footer has been nominated for deletion. You are invited to comment on the discussion at the entry on the Templates for discussion page. DePiep (talk) 20:36, 9 February 2021 (UTC)Reply

"ꭟ" listed at Redirects for discussion

  A discussion is taking place to address the redirect ꭟ. The discussion will occur at Wikipedia:Redirects for discussion/Log/2021 March 19#ꭟ until a consensus is reached, and anyone, including you, is welcome to contribute to the discussion. 𝟙𝟤𝟯𝟺𝐪𝑤𝒆𝓇𝟷𝟮𝟥𝟜𝓺𝔴𝕖𝖗𝟰 (𝗍𝗮𝘭𝙠) 08:19, 19 March 2021 (UTC)Reply

'LC'

Hi. It's at [5]. LC is narrower than L: it's just {Lu, Ll, Lt}. L is {LC, Lm, Lo}. — kwami (talk) 09:02, 31 August 2021 (UTC)Reply

@Kwamikagami: I see two issues: First, tr44 (Unicode Standard Annex #44) should be added as a reference. Second, we need a way to clarify that LC doesn't include all the non-bold L rows under it. Maybe something like "L, Letter; LC, Cased Letter (Lu, Ll, Lt only)<ref>". What do you think? DRMcCreedy (talk) 15:30, 31 August 2021 (UTC)Reply

Sounds good. I don't know how important it is, was just trying to be complete. — kwami (talk) 19:45, 31 August 2021 (UTC)Reply

I've made the update. DRMcCreedy (talk) 21:57, 31 August 2021 (UTC)Reply
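
The relationship between L and LC described in this thread can be checked against the General_Category values that Python's standard `unicodedata` module reports. A minimal sketch; the set and function names are hypothetical and chosen only for illustration.

```python
import unicodedata

# LC (Cased Letter) groups only Lu, Ll and Lt; L (Letter) adds Lm and Lo.
CASED_LETTER = {"Lu", "Ll", "Lt"}
LETTER = CASED_LETTER | {"Lm", "Lo"}

def is_cased_letter(ch):
    """True if ch has a General_Category in LC."""
    return unicodedata.category(ch) in CASED_LETTER

def is_letter(ch):
    """True if ch has a General_Category in L."""
    return unicodedata.category(ch) in LETTER

# One sample per category: Lu, Ll, Lt, Lm, Lo.
for ch in ["A", "a", "\u01C5", "\u02B0", "\u05D0"]:
    print(f"U+{ord(ch):04X}", unicodedata.category(ch), is_cased_letter(ch), is_letter(ch))
```

Every character for which `is_cased_letter` is true also satisfies `is_letter`, but not the other way round, which is the point the "(Lu, Ll, Lt only)" wording is meant to make.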

14

Like it how you update for the 14th. Consistently & completely! Even wd is (prohibitively) late. -DePiep (talk) 21:07, 15 September 2021 (UTC)Reply

Thanks. I forget how much work it is! DRMcCreedy (talk) 21:17, 15 September 2021 (UTC)Reply

Regarding your undo

Hi, I would like to talk regarding your undo here. I am new to contributing and I am trying to understand what the difference is between this source, referenced as 32 in the Letter frequency page, and my reference to this blog. Both references are doing the exact same thing, yet the first one can get multiple references and the second one was rejected. I don't think there was a scientific claim there. They just took a well-known text and counted letter frequency. Does it make a difference whether that counting is done in a blog published on practicalcryptography.com or in a blog published on armenianchat.net? Thanks Inaxmo (talk) 21:01, 5 January 2022 (UTC)Reply

Thanks for reaching out. The armenianchat.net source clearly falls under "Questionable sources" per Wikipedia guidelines as a group blog. Maybe practicalcryptography.com is also questionable as it could be a personal website but that's a little less obvious, at least to me. The other reason your edit caught my attention was the reference placement in what you added:

The phonetic layout is not very performant due to the letter frequency difference between the Armenian and English languages, although it is easier to learn and use.<ref>

The reference supports frequency differences but was placed after "easier to learn and use". If you find an acceptable reference to re-add this, I'd suggest placing the reference after the comma so it's clear it supports the assertion about letter frequency. Ideally, you would have a citation to support the "easier to learn and use", which would go after the period. I hope this helps explain my thought process even though it doesn't lay out black-and-white criteria for references. DRMcCreedy (talk) 21:40, 5 January 2022 (UTC)Reply

Arabic unicode

Hello. I have been wondering about the reason Wikipedians use Arabic Unicode characters of the Presentation Forms (U+FB5x-U+FEFx) rather than the regular ones (U+060x-U+089x). As an example, you used the presentation forms too in one of your edits.

The presentation forms are never needed and their glyphs are not always supported by all Arabic typefaces. I keep correcting them all the time. Thanks. --Mahmudmasri (talk) 13:27, 10 February 2022 (UTC)Reply

That edit was mainly a copy from French and German wiki pages and I wasn't aware the original text was using presentation forms. I see no reason for presentation forms to be used and support your effort to correct them. DRMcCreedy (talk) 22:25, 10 February 2022 (UTC)Reply
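
For anyone cleaning these up in bulk, here is a minimal sketch in Python of one possible approach: apply compatibility normalization (NFKC) only to characters in the two presentation-form blocks, leaving everything else untouched. The function names are hypothetical, and the block ranges (U+FB50 to U+FDFF and U+FE70 to U+FEFF) are hard-coded assumptions taken from the Unicode block list; the result should still be reviewed by eye.

```python
import unicodedata

# Arabic Presentation Forms-A and Arabic Presentation Forms-B.
PRESENTATION_RANGES = [(0xFB50, 0xFDFF), (0xFE70, 0xFEFF)]

def is_presentation_form(ch):
    """True if ch lies in one of the two Arabic presentation-form blocks."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in PRESENTATION_RANGES)

def replace_presentation_forms(text):
    """Replace presentation forms with their compatibility equivalents."""
    return "".join(
        unicodedata.normalize("NFKC", ch) if is_presentation_form(ch) else ch
        for ch in text
    )

# U+FE8D (alef, isolated form) becomes the regular U+0627.
cleaned = replace_presentation_forms("\uFE8D")
print(f"U+{ord(cleaned):04X}")
```

Restricting NFKC to those two ranges avoids changing unrelated compatibility characters elsewhere in the text; characters in the blocks that have no decomposition are left as they are.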

Unicode version template

Hi Drmccreedy,

I propose to use {{Unicode version}} template everywhere where the version number is to be followed & checked. The template must be coded, later today. (I have freed its name, old usage more specific name).

Then, when you do the version bumping this week ;-), you can replace a version number with the template: {{Unicode version|15}} → 15. In the future, we only have to check the template instead of searching for literal version wikitext code. OK? DePiep (talk) 07:25, 13 September 2022 (UTC)Reply

Sounds good. DRMcCreedy (talk) 14:41, 13 September 2022 (UTC)Reply

Problems with your edit notes for Arabic Extended-C and Arabic Mathematical Alphabetic Symbols

Hi Drmccreedy, I have looked at your edit notes for Arabic Extended-C and Arabic Mathematical Alphabetic Symbols:

"I don't think it's notable if there's more than one"

I looked at Plane (Unicode), and it lists more Arabic blocks in the SMP. Admittedly, even I overlooked that there is another Arabic block in the SMP, namely Rumi Numeral Symbols, a set of numeric symbols used in Fez, Morocco, and elsewhere in North Africa and the Iberian peninsula between the tenth and seventeenth centuries. You overlooked the recently added Arabic Extended-C block, so there are three Arabic blocks in the SMP now.

That means the sentence "The block and Arabic Extended-C are only Arabic block defined in Supplementary Multilingual Plane (SMP)." is superfluous, and I don't need to add it again.

Also, the edit note in Ahom (Unicode block), "Not remarkable Ahom block was expanded. Tangut Supplement was also expanded in v14.0.", is plainly wrong: Tangut Supplement was shortened to fix the erroneous block end point. As for the Ahom block, I maintain that it is the first block to be extended in a very long time. I have never seen any block expanded before 14.0, so Ahom is the first block to have been expanded since 1.1. Weather Top Wizard (talk) 10:31, 16 September 2022 (UTC)Reply

My apologies... You are correct. I saw the block change for Tangut Supplement but didn't notice it got smaller. While it's interesting that Ahom is the first block expanded since 1.1, I still don't know that it's notable. Unicode itself doesn't point this out at http://www.unicode.org/reports/tr44/#Unicode_14.0.0 but instead just states the change happened. If Unicode itself doesn't note "it's been a long time", I lean towards leaving it out. But if you want to add it back with the "citation needed" note I won't remove it again. DRMcCreedy (talk) 17:32, 16 September 2022 (UTC)Reply

U+E001

The SHADOWED UNICODE BARNSTAR for —once again— updating our Unicode meticulously, this time into version 15.0
U★E001
SHADOWED UNICODE BARNSTAR
DePiep 24 September 2022 (UTC)

DePiep (talk) 10:49, 24 September 2022 (UTC)Reply

Thanks! DRMcCreedy (talk) 19:36, 24 September 2022 (UTC)Reply

Cite report

"Also, cite report isn't accurate because Unicode Consortium isn't a government body." Eh? Since when was report production a government monopoly? A slip of the pen, I assume? --𝕁𝕄𝔽 (talk) 10:24, 14 October 2022 (UTC)Reply

I shouldn't update Wikipedia after midnight... I'm not at my best then. Re-reading Template:Cite report I suppose Unicode itself could fall under "major semi-governmental instrumentalities" but I'm not sure every proposal would be seen that way. I guess the real question would be is Template:Cite report better than plain, old Template:Citation used for the thousands of Unicode document citations on hundreds of Wikipedia articles? DRMcCreedy (talk) 15:23, 14 October 2022 (UTC)Reply
Good grief, it never occurred to me that such a strange restriction would be in the template. Many NGOs publish reports and I have cited them using cite report. Maybe I should have used {{cite document}}. Oh wait... --𝕁𝕄𝔽 (talk) 18:19, 14 October 2022 (UTC)Reply

Kannada script

Comparing this version to this version, you will see that your revert didn't remove the table; you just moved it to a template which is being deleted. A bot will probably undo your edit in the next 24 hours, so you will have to actually remove the table, not just undo the edit. Frietjes (talk) 21:20, 25 January 2023 (UTC)Reply

Doh! Thanks for pointing that out. I've removed the template from the article. DRMcCreedy (talk) 21:34, 25 January 2023 (UTC)Reply

Unicode chart Bengali

I ummed and ahhed about that edit and decided to be bold and see what happened. For background, see this diff. 𝕁𝕄𝔽 (talk) 19:39, 12 February 2023 (UTC)Reply

This is a political issue with some very passionate people wanting Unicode to change the name of the "Bengali" block to include "Assamese" or to re-encode all Assamese characters in a new "Assamese" block. Both of these are impossible for Unicode to do technically. The compromise included an FAQ and adding text to the Unicode PDF: In Assam, the preferred name of the script is Asamiya or Assamese.
I take your edit on good faith, and not another attempt to bring up the Bengali/Assamese debate, but I still can't support adding language information into the block templates. For example, the Myanmar block would then need footnotes for over a dozen languages: Aiton, Eastern Pwo Karen, Geba Karen, Kayah, Khamti Shan, Mon, Pali, Phake, Rumai Palaung, S'gaw Karen, Sanskrit, Shan, and Western Pwo Karen. DRMcCreedy (talk) 20:21, 12 February 2023 (UTC)Reply

Code page drafts for review

Because I have finished the EBCDIC code tables for the wikibook, I would like you to check my drafts for the remaining code pages:

I'm currently traveling but may be able to review these around the end of the month. DRMcCreedy (talk) 03:08, 18 May 2023 (UTC)Reply
@Alexlatham96: I've reviewed your draft charts and made various changes while verifying the Unicode mapping/names. DRMcCreedy (talk) 04:15, 1 June 2023 (UTC)Reply
Also Draft: Code page 898, which it was decided is too different for a redirect. I would also prefer Draft: Code page 906 to have its own article ready.
@Alexlatham96: I've reviewed and updated both those pages. DRMcCreedy (talk) 23:35, 12 June 2023 (UTC)Reply

Thanks Drmccreedy

  Thanks a bunch
Hello,

Thank you for correcting my edit and redirecting me to here: https://en.wikipedia.org/wiki/List_of_circulating_currencies

You made me learn about a different article. I appreciate your kind and nice answer! <3 Mavreju (talk) 16:44, 4 October 2023 (UTC)Reply

Code page 220

This now-deleted article is on Wikibooks. See wikibooks:Character Encodings/Code Tables/MS-DOS, where other DOS code pages can be added (like the FreeDOS code pages). Anyone else working on these types of articles (HarJIT, Spitzak, Gschizas, Matthiaspaul, etc.) should also be informed. Alexlatham96 (talk) 23:01, 19 October 2023 (UTC)Reply

Loma language, etc.

Hi,

Thanks for the reversion you made here. I was on the point of doing the same thing, but dithered while I tried to make sense of the IP editor's other edits. Would you mind taking a look at their edit history? I've reverted a number of their edits that made no sense to me, but I'm quite out of my depth in the subject area, and not all of them seem completely unhelpful.

Best wishes, Jean-de-Nivelle (talk) 11:24, 6 November 2023 (UTC)Reply

@Jean-de-Nivelle: I reviewed the edits I hadn't looked at yet and reverted many of them. Best regards. DRMcCreedy (talk) 05:44, 7 November 2023 (UTC)Reply

Unicode and image maps

Are you familiar with the concept of image maps that define clickable areas on images used on web pages? This is the feature that allows you, for example, to view an image of the United States on a web page, and when you hover over each individual U.S. state, you can click it and go to an article about just that one state. So, for example, the US image might have 50 predefined clickable areas roughly corresponding to the state boundaries. You probably see where I'm going with this: if the little boxes (some of them, at least), on the Unicode images you are designing cover a set of code points for which we already have an article (or might have one in the future), you can define an image map so that clicking your image in the right square jumps to the Devanagari article, or whatever. You don't have to be a web designer yourself to do this, there are teams that can help; one is at WP:GL/I, but there may be others. That would be a really nice addition to your images. Mathglot (talk) 02:17, 19 November 2023 (UTC)Reply

I've heard of this but hadn't considered it in this case. My first thought is that the categories match the Unicode Standard table of contents well but don't always map nicely to Wikipedia articles. My second thought is that, because these SVGs are used across different wikis, the links would have to vary based on the Wikipedia using the image. I'm not sure how to get around that unless each wiki has its own set of SVGs, which defeats the purpose of having shared, multilingual SVGs. DRMcCreedy (talk) 04:31, 19 November 2023 (UTC)
Two thoughts: I'm not an expert on this, but the image map doesn't reside in the image; it is separate, so I believe other wikis could ignore the map, use it, or create a different one, as they choose. But don't quote me on that. Another thing that occurs to me is that you don't need to create different images just so the text in the legend (or in the image) can be rendered in French, Spanish, or whatever. If you create your image as SVG, then you only need to have that one image for all 300 Wikipedias; you can translate just the labels with a tool here, without touching the image itself, and then the right label text will come up in the right language depending on which Wikipedia you are looking at. I've used the SVG translate tool myself to create multiple versions of some SVGs for different languages. Once you've created one of them, you can crank them out in multiple languages, pasting the labels into a web form each time. See, for example, c:File:Chronologie constitutions françaises.svg, which now has nine languages (click the dropdown to see). Mathglot (talk) 05:40, 19 November 2023 (UTC)
That sounds technically possible. Not sure I want to pursue it tho, at least not for the roadmaps. DRMcCreedy (talk) 18:29, 20 November 2023 (UTC)
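For reference, here is a minimal sketch of how such a map could be generated, assuming the wiki has the ImageMap extension (the `<imagemap>` tag) and that the map is kept in the page wikitext rather than in the SVG file. The file name, the pixel coordinates, and the `Region` and `build_imagemap` names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Region:
    """One clickable rectangle, in pixel coordinates of the displayed image."""
    left: int
    top: int
    right: int
    bottom: int
    target: str  # title of the article on the local wiki (hypothetical values below)


def build_imagemap(file_name: str, width_px: int, regions: List[Region]) -> str:
    """Return wikitext for an <imagemap> block covering the given regions."""
    lines = ["<imagemap>", f"File:{file_name}|{width_px}px"]
    for r in regions:
        # "rect" takes the top-left and bottom-right corners, then a wikilink.
        lines.append(f"rect {r.left} {r.top} {r.right} {r.bottom} [[{r.target}]]")
    lines.append("desc none")  # hide the default "about this image" icon
    lines.append("</imagemap>")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical file name and coordinates, for illustration only.
    regions = [
        Region(0, 0, 100, 50, "Devanagari"),
        Region(100, 0, 200, 50, "Bengali alphabet"),
    ]
    print(build_imagemap("Example roadmap.svg", 600, regions))
```

Because the targets are plain strings, each wiki could pass its own article titles to the same function while reusing the one shared image.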

fr/fre/fra

I should've clarified that I added French codes to the list because I often see "fre" (from "French") used incorrectly in ISO 639-3 contexts instead of "fra" (from "Français"). For that reason, I think it's pertinent to keep it, but I wouldn't add other languages blindly. Iketsi (talk) 00:19, 6 January 2024 (UTC)

OK. I won't oppose if you add it back. DRMcCreedy (talk) 00:24, 6 January 2024 (UTC)
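To illustrate the distinction discussed above, here is a small sketch of the French codes in each part of ISO 639. The code values are the standard ones; the `FRENCH_CODES` table and the `code_is_valid_for` helper are hypothetical names used only for this example.

```python
# French in each part of ISO 639. "fre" is the ISO 639-2 bibliographic (B) code;
# "fra" is the ISO 639-2 terminologic (T) code and also the ISO 639-3 code.
FRENCH_CODES = {
    "ISO 639-1": "fr",
    "ISO 639-2/B": "fre",
    "ISO 639-2/T": "fra",
    "ISO 639-3": "fra",
}


def code_is_valid_for(code: str, standard: str) -> bool:
    """Return True if `code` is the French code in the named standard."""
    return FRENCH_CODES.get(standard) == code


if __name__ == "__main__":
    for candidate in ("fr", "fre", "fra"):
        print(candidate, code_is_valid_for(candidate, "ISO 639-3"))
```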

Asomtavruli capitals

Hi!

I noticed you reverted my edit (on the Georgian block page) where I changed "Asomtavruli capitals, known as Mtavruli..." to "Mkhedruli capitals, known as Mtavruli". I think you misread the context? Asomtavruli letters are sometimes considered capital letters (as you mentioned in your explanation), but this passage suggests that Mtavruli letters are capitals of the Asomtavruli letters themselves. As far as I know, they're two different (both nonstandard) ways of capitalizing Georgian text. That is, unless I'm misreading the sentence. I just wanted to check before I considered changing anything. Maybe the sentence could be written more clearly? Spaceexplorerer (talk) 02:06, 16 May 2024 (UTC)

Unfortunately, I'm away from home so I can't review my reference material, but I think the issue is that Asomtavruli literally means "capital letters". So while it's true that both are used for emphasis, it's more accurate to refer to Asomtavruli, not Mkhedruli, as capitals. But you pointed out the real issue, which is that the script is unicameral, so the sentence needs to be written more clearly. It's probably best to scratch that sentence entirely or to note that the two are both used for emphasis, and how. DRMcCreedy (talk) 14:22, 16 May 2024 (UTC)

Noto fonts and unichar

Just FYI, {{unichar}} no longer requires a text description as the canonical text is now picked up from Wikidata. So {{unichar|0031}}, {{unichar|0031|digit 1}} and {{unichar|0031|digit one}} and even {{unichar|0031|DRMcCreedy}} should all give the same result: U+0031 1 DIGIT ONE, U+0031 1 DIGIT ONE and U+0031 1 DIGIT ONE (and even U+0031 1 DIGIT ONE). QED. 𝕁𝕄𝔽 (talk) 09:40, 15 July 2024 (UTC)

Thanks @JMF: for clarifying this. I knew that it was programmatically populated if omitted but didn't realize it was ignored entirely. The name is specified for the unichar template in around 600 articles. Is there any benefit in eventually removing it or should it be left to be treated as a comment? I'm guessing other editors are also confused by this, especially if they just copy from existing unichar examples. DRMcCreedy (talk) 17:16, 15 July 2024 (UTC)
Yes, we discussed that issue at template talk:unichar at the time the change was being made (as a result of some insidious vandalism and too many simple errors, plus the few cases of the Consortium correcting errors in their "we never touched nuffink, Guv, honest oh look a squirell" way). The consensus was that only the canonical name should ever be shown. I think the conclusion was that removing the existing descriptions would be a huge amount of gnomic work for no evident return. So I (at least) have taken opportunities as they arise to simplify the calls when I'm doing a substantive edit. Over time, it will fade away. I hope that if someone hammers away trying to get 1 digit 1 but "the system" insists on returning 1 digit one, they will go and read the template doc. Meanwhile, it does no harm as a sanity check and corrects editor typos for free. --𝕁𝕄𝔽 (talk) 20:00, 15 July 2024 (UTC)
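The behaviour described above, where the canonical name is looked up from the code point and any supplied description is ignored, can be sketched as follows. This is only an analogy: it uses Python's built-in Unicode Character Database rather than Wikidata, and the `unichar` function and its `ignored_description` parameter are hypothetical stand-ins, not the template's actual implementation or exact output format.

```python
import unicodedata
from typing import Optional


def unichar(hex_code: str, ignored_description: Optional[str] = None) -> str:
    """Format a code point as 'U+XXXX <char> <CANONICAL NAME>'.

    The second argument is accepted but deliberately unused, mirroring the way
    a supplied description no longer affects the displayed name.
    """
    code_point = int(hex_code, 16)
    char = chr(code_point)
    name = unicodedata.name(char)  # canonical name from Python's Unicode data
    return f"U+{code_point:04X} {char} {name}"


if __name__ == "__main__":
    print(unichar("0031"))
    print(unichar("0031", "digit 1"))
    print(unichar("0031", "DRMcCreedy"))
```

All three calls produce the same string, because the name is derived solely from the code point.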

Oh my!

It's barely been a day since Unicode 16.0 released, and there's already a Myanmar Extended-C page! Wow! You are fast! Logan1spyker (talk) 03:57, 11 September 2024 (UTC)

Thanks. I've done my homework. Still so much more to update! DRMcCreedy (talk) 04:02, 11 September 2024 (UTC)
+1 Double sharp (talk) 09:21, 11 September 2024 (UTC)

Your reversion of my edit to "ISO 3166-2:GB"

Greetings and felicitations. I noticed that you reverted my edit to ISO 3166-2:GB. I made that edit because it was nearly impossible to find the Outer Hebrides' code, and Welsh locations already have alternate names in square brackets. (I only found the code in the Outer Hebrides article, having missed it previously.) Is there any proper way to add the same information that I did? —DocWatson42 (talk) 05:36, 26 October 2024 (UTC)

The alternate names in square brackets in the GB article are actually part of the Standard. For example, https://www.iso.org/obp/ui/#iso:code:3166:GB lists GB-CAY as "Caerphilly [Caerffili GB-CAF]". That isn't the case for GB-ELS. A note about it being the Outer Hebrides would be useful. Either add it to the text of the article or add a "[note 1]" after "Eilean Siar", with the text of the note after the table, similar to how ISO 3166-2:AM does it (but just after "Eilean Siar" instead of on a column). Cheers. DRMcCreedy (talk)
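As a rough illustration of the bracket convention quoted above, the sketch below splits an entry such as "Caerphilly [Caerffili GB-CAF]" into its name, alternate name, and alternate code. The regular expression and the `parse_subdivision_name` helper are assumptions made for this example; they are not taken from the Standard, which may use other bracket forms that this does not cover.

```python
import re
from typing import Optional, Tuple

# Assumed shape: "<name> [<alternate name> <alternate code>]", where the code
# looks like "GB-CAF". Entries without brackets are returned unchanged.
BRACKET_PATTERN = re.compile(
    r"^(?P<name>.+?)\s*\[(?P<alt_name>.+?)\s+(?P<alt_code>[A-Z]{2}-[A-Z0-9]{1,3})\]$"
)


def parse_subdivision_name(text: str) -> Tuple[str, Optional[str], Optional[str]]:
    """Return (name, alternate name, alternate code); the last two may be None."""
    match = BRACKET_PATTERN.match(text.strip())
    if match is None:
        return text.strip(), None, None
    return match["name"], match["alt_name"], match["alt_code"]


if __name__ == "__main__":
    print(parse_subdivision_name("Caerphilly [Caerffili GB-CAF]"))
    print(parse_subdivision_name("Eilean Siar"))
```

An entry like "Eilean Siar" comes back with no alternate name, which is why a separate note is needed to connect it to the Outer Hebrides.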