Meetups/Wikisource/Notes
Agenda
- Backlog clearance of Phabricator tickets
- Update Tesseract package on Tools Labs regularly
See also:
Several items in the 2015 Community Wishlist Survey results directly concern Wikisource: https://meta.wikimedia.org/wiki/2015_Community_Wishlist_Survey/Results -- User:Astinson (WMF) will be
https://lists.wikimedia.org/pipermail/wikisource-l/2016-May/002807.html
https://etherpad.wikimedia.org/p/wscon2015
Wikimania 2016 hackathon notes: https://etherpad.wikimedia.org/p/wikisource-hackathon
23.06.2016 (Museum)
Google OCR: Bodhisattwa gave a short history of the OCR problem in Indic languages and of the development of the OCR4wikisource tool (https://phabricator.wikimedia.org/T120788). Andrea showed recent statistics for Bengali Wikisource.
Alex: The Community Tech team is here
Niharika: We need to prioritize Wikisource in the community wishlist
Andreas
Follow the Wikisource mailing list if you do not use Phabricator
Tpt is showing VE in NS:Page (https://phabricator.wikimedia.org/T48580). Please test it and get back to Tpt.
Bugs filed: https://phabricator.wikimedia.org/tag/proofreadpage/ ; file new bugs at https://phabricator.wikimedia.org/project/view/276/
Ideas for Community Tech backlog:
* Let users review pages of a book without having to load the VE interface for every single page (contact Tpt for more info)
* Help with code review & deployment of the extension for handling different characters in Chinese Wikisource
* RTL and vertical script integration (https://phabricator.wikimedia.org/T11436)
Stories of outreach
- Taiwanese to Chinese Wikisource: focused on transcription of an entire dictionary which was released by the authors' heirs.
- Main community is the Chinese Wikisource.
- discussion string on Chinese Wikisource https://zh.wikisource.org/wiki/Wikisource:写字间#.E7.BC.BA.E5.AD.97.E8.99.95.E7.90.86.E8.AD.B0.E9.A1.8C.EF.BC.9A.E5.BC.95.E9.80.B2.E5.8B.95.E6.85.8B.E7.B5.84.E5.AD.97.E8.99.95.E7.90.86.E6.8A.80.E8.A1.93
- discussion string on Chinese Wikipedia https://zh.wikipedia.org/wiki/Wikipedia:互助客栈/技术#.E7.BC.BA.E5.AD.97.E8.99.95.E7.90.86.E8.AD.B0.E9.A1.8C.EF.BC.9A.E5.BC.95.E9.80.B2.E5.8B.95.E6.85.8B.E7.B5.84.E5.AD.97.E8.99.95.E7.90.86.E6.8A.80.E8.A1.93
- A Taiwanese user developed a new extension for the characters in Chinese/Taiwanese Wikisource which are not yet supported in Unicode. https://www.mediawiki.org/wiki/Extension:Ids, https://github.com/Wikimedia-TW/Mediawiki-IDSextension (non-internationalized version), https://github.com/Wikimedia-TW/han3_ji7_tsoo1_kian3_WM (the internationalized version; this is the one that needs code review)
- The extension is already developed and has had some code review; it would be better to move the hosting to Gerrit. Contact Niharika. AWight (Adam Wight, fundraising technical team) is also interested in reviewing.
- WMF Labs demo: https://tools.wmflabs.org/idsgen/⿲☺rz.png?字體=楷體 <-- What does this do? It is an example of what the extension can do: it combines the ideograph roots of Chinese into one character. In wiki pages it looks like this: https://upload.wikimedia.org/wikipedia/commons/c/c7/Idsrender_test.jpg
- Sounds comparable to https://www.mediawiki.org/wiki/Extension:Josa which was recently enabled on the Korean Wikipedia, although that doesn't have to deal with missing Unicode characters. To add new Unicode characters you may use ULS to deliver a webfont, too. See also https://www.mediawiki.org/wiki/Extension:UniversalLanguageSelector/Fonts_for_Chinese_wikis
- phabricator task https://phabricator.wikimedia.org/T137786
- Using one object as a storytelling strategy to make Wikisource capable and usable for these other projects.
- Any relation to the Wikimedia Indonesia transcription project? http://www.wikimedia.or.id/wiki/Digitalisasi_Konten
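A minimal sketch of how an ideographic description sequence such as the ⿲☺rz in the demo URL above can be read as a tree before anything is drawn. This is not the extension's code. It assumes only the twelve Unicode description operators U+2FF0..U+2FFB, of which U+2FF2 and U+2FF3 take three parts and the rest take two; the names parse_ids and _parse are made up for illustration.

```python
# Ideographic Description Characters, Unicode block U+2FF0..U+2FFB.
# Assumption for this sketch: only these twelve operators are handled.
TERNARY = {"\u2ff2", "\u2ff3"}  # three operands each
BINARY = {chr(cp) for cp in range(0x2FF0, 0x2FFC)} - TERNARY  # two operands each


def parse_ids(text):
    """Parse a description sequence into a nested tuple (operator, part, ...)."""
    node, end = _parse(text, 0)
    if end != len(text):
        raise ValueError("trailing characters after a complete sequence")
    return node


def _parse(text, pos):
    """Return (node, next_position) for the sequence starting at pos."""
    if pos >= len(text):
        raise ValueError("sequence ended before all operands were given")
    ch = text[pos]
    if ch in TERNARY:
        arity = 3
    elif ch in BINARY:
        arity = 2
    else:
        return ch, pos + 1  # a plain component, not an operator
    parts = []
    pos += 1
    for _ in range(arity):
        part, pos = _parse(text, pos)
        parts.append(part)
    return (ch, *parts), pos


if __name__ == "__main__":
    print(parse_ids("⿲☺rz"))
```

A renderer would then walk this tree and place each part in the region its operator describes (left/middle/right for ⿲); that layout step is outside this sketch.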
- Felix from Wikimedia Ghana -- English Wikisource; there is no Wikisource community in Ghana. Learned some basics yesterday and will try it soon.
- Languages of Ghana: http://www.unicode.org/cldr/charts/latest/supplemental/territory_language_information.html#GH
- LOC knows about 20 books in Ghana's main language, Akan: https://www.loc.gov/books/?q=&fa=language:akan&all=true&st=list
- Send representatives to regional conferences like WCI, WikiIndaba etc. to help people learn about Wikisource? [But what sort of people?] https://en.wikisource.org/wiki/Portal:Ghana
- Spam people!
- Facebook page - https://www.facebook.com/wikisource
- Facebook group - https://www.facebook.com/groups/210444209286893/?fref=nf
- Discourse: https://discourse.wmflabs.org/c/wikisource-d
- Awight asks what Wikisource actually does
- Andrea: the mission was elaborated at the previous Wikisource conference (https://etherpad.wikimedia.org/p/wscon2015weekend)
- take a scan, put it online, OCR, transcribe, make it HTML, make EPUB, link to other wiki projects
- Which Wikisources allow unpublished, community-made translations? About 20 are known: https://wikisource.org/wiki/Wikisource:Subdomain_coordination For English, translations are allowed on Wikisource but can (probably) also be contributed to Wikiversity.
- RTL issues: Tpt fixed everything that was reported, please report more. Vertical text support is also needed (https://phabricator.wikimedia.org/T11436), but for what exactly? The feature request would benefit from a list of use cases which have an active community behind them.
- Know what the users do with Wikisource:
- Site views tool: http://tools.wmflabs.org/siteviews/ (the most viewed book, the most required authors...)
- https://tools.wmflabs.org/ws-search/ potentially useful for new users to browse stuff?
- Every library user is looking for works by copyright status! https://tools.wmflabs.org/ws-cat-browser/ Or maybe by awards won and so on. Next step: fetching statements from Wikidata and browsing Wikisource books by them (e.g. information on authors, which are usually linked to Wikidata). The header templates might use such data to categorise as well, but nobody has bothered so far.
- As for digital library alliances, it would be nice to improve the format support on translatewiki.net and make DSpace join https://translatewiki.net/wiki/Thread:Support/Adding_DSpace
- Asaf gives Wikidata training at the collaboration space next to the town hall
- On the German Wikisource, before adding a book you apparently need to "promise" that you will proofread it within a month.
- How to involve unregistered users who just stumble upon a text? We should do more of that (ovation)!!
- A voting system for evaluating proofreading caused a very lively discussion on Wikisource-l
- Asaf set up some simple JavaScript for visitors to report corrections to an email address (he will translate it into English, Andrea will try to use it on Wikisource)
- LA2 shows usage of parallel text corpora (Anna Karenina) to learn languages on Wiktionary, by adding quotations in source and target language (Russian original, PD translation to Swedish; a Swedish learner of Russian can learn Russian better and improve entries) https://ru.wiktionary.org/wiki/egenhet , https://ru.wiktionary.org/?oldid=6050693
Wikidata graph builder:
- citation network for Zika research papers: https://angryloki.github.io/wikidata-graph-builder/?property=P2860&item=Q23906890&iterations=5&mode=undirected
- SCOTUS decision on English Wikisource (data still needs to be added in Wikidata): https://angryloki.github.io/wikidata-graph-builder/?property=P2860&item=Q300950&iterations=3&mode=undirected
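A rough sketch of what the URL parameters above seem to ask for: start from one item, follow one property (P2860, "cites work"), and stop after a fixed number of steps. How the graph builder itself works is not covered in these notes. The sketch assumes "iterations" means a hop limit and "undirected" means citations are followed both ways. The edge list is made-up sample data and the function names are hypothetical.

```python
from collections import deque
from urllib.parse import parse_qs, urlparse


def read_params(url):
    """Pull the four graph-builder parameters out of the query string."""
    query = parse_qs(urlparse(url).query)
    return {
        "property": query["property"][0],
        "item": query["item"][0],
        "iterations": int(query["iterations"][0]),
        "mode": query["mode"][0],
    }


def expand(edges, start, iterations, undirected=True):
    """Breadth-first expansion from start, at most `iterations` hops.

    edges is a list of (citing, cited) pairs. Returns {item: hop distance}.
    """
    neighbours = {}
    for src, dst in edges:
        neighbours.setdefault(src, set()).add(dst)
        if undirected:
            neighbours.setdefault(dst, set()).add(src)
    depth = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if depth[node] == iterations:
            continue  # hop limit reached, do not expand further
        for nxt in sorted(neighbours.get(node, ())):
            if nxt not in depth:
                depth[nxt] = depth[node] + 1
                queue.append(nxt)
    return depth


if __name__ == "__main__":
    # Hypothetical sample data: A cites B, C cites A, B cites D, D cites E.
    sample = [("A", "B"), ("C", "A"), ("B", "D"), ("D", "E")]
    print(expand(sample, "A", 2))
```

With real data the edge list would come from Wikidata statements for the chosen property; fetching them is left out here so the sketch runs without network access.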