User talk:Richard Nevell (WMUK)/2016–2018



Senghenydd

Hi Richard, I hope all is well with you. Just a brief note that the Senghenydd colliery disaster article is now an FA, thanks to several people, not least you, Jason.nlw and Dr. Blofeld's 'Dragon' project. Cheers – SchroCat (talk) 16:43, 8 April 2016 (UTC)

That's great news! Thanks SchroCat for all your hard work on this. Congratulations. Jason.nlw (talk) 16:04, 11 April 2016 (UTC)
That is tremendous work, well done SchroCat! Richard Nevell (WMUK) (talk) 09:34, 12 April 2016 (UTC)

Hm

About this ... would you please explain your role here? (You can reply here) Thanks. Jytdog (talk) 17:20, 27 April 2016 (UTC)

Hello Jytdog. I have been helping the course leader with getting to grips with Wikipedia. My comment was made with the intention of starting a dialogue. Editors have been advising the students to use secondary sources and to consult WP:MEDRS, and I think discussing with the students which sources in particular pose issues and which are acceptable would be a very useful next step. That would help them identify which areas are fine and which need more work. Richard Nevell (WMUK) (talk) 10:23, 28 April 2016 (UTC)
Thanks for replying! Your requests there and here are not reasonable to me. I spend a lot of time helping new users get oriented. Don't get me wrong there. Just this request in particular. Jytdog (talk) 10:40, 28 April 2016 (UTC)
@Jytdog: I understand it's easy for me to say 'can you please help' and helping new editors is time consuming. What is it about this request you object to? Richard Nevell (WMUK) (talk) 11:08, 28 April 2016 (UTC)
Thanks for asking! I went back and looked at their interactions with the editing community before I reverted that, and two of our most experienced health/med editors had already advised them they were painting with way too broad a brush. And the content they wrote was little changed and is ... unusable. The more I read the deeper into "oh no" I got. (I can talk about that more, but basically the content has no nuance and assumes that women have something very like estrus that determines their behavior, which is just.... ack. double ack. really the worst kind of evolutionary psychology) And when I started checking their refs I found they didn't cite page numbers for books, used popular media, and most aggravatingly, didn't use any PMIDs as they are instructed to, in the tutorial. (The latter may sound petty if you don't know how we use PMID in citations, but it saves so much time and when there are 55 refs the inefficiency of working without PMIDs is too much)
In general it is really unwise for students to try to work on FAs, especially in health/medicine. It is really hard to improve them. It is possible (of course), but it takes sophistication in the subject matter not to mention WP skills.
While I am talking. There is something I wish the Education program would help better frame for students, just on a high conceptual level. Students spend their whole lives producing work that is very much meant to be their own, that they present as a whole for grading. That makes complete sense in school. It makes no sense in Wikipedia. Around this time we get scads of students who have created some chunk of content that they want to plop into a Wikipedia article, that takes little to no account of issues like WEIGHT or a bunch of other things. Real editing is nothing like that.
I don't know if you have done any translation work (one language to another) but there is always a tension when you do that, between hewing close to the syntax and feel of the source language or writing beautifully in the target language. (For example, the Hebrew expression that we usually translate as "he became angry" is literally translatable as "his nose grew long"; you get a whole different feel if you hew closer to the source language. Someone has actually done a whole translation of the Hebrew Bible that way.).
I think sometimes the Education program allows/supports/enables students working in Wikipedia to stay too close to what they know, in terms of what "schoolwork" is. I have wondered if a better model wouldn't be to clone off articles and have students improve whole articles as part of their classwork. And leave it to the students if they want to try to implement any of the individual edits making up those improvements in Wikipedia, once they have been worked over in class. I don't know if that has ever been kicked around. But that is what I mean about a different notion of "schoolwork". As it is, the editing community has to deal with these "nose grew long" kind of things that are just really abnormal - almost impossible to deal with - for us. And so many at once! Anyway, enough from me. Thank you again for asking and this was I am sure way more than what you wanted. :) Jytdog (talk) 11:36, 28 April 2016 (UTC)
Hi Jytdog, rather than being too much that definitely helps me understand the situation. Richard Nevell (WMUK) (talk) 15:25, 3 May 2016 (UTC)
Thanks for your gracious reply. Jytdog (talk) 15:42, 3 May 2016 (UTC)
Just want to add here that as part of the training, getting teachers and students to deal with a) whole articles and b) what the most recent and best sources say about the whole topic, and whether the article is actually NPOV with regard to WEIGHT, and whether there is OR or everything is actually well supported... can be very engaging and high level scholarship. To do that work you need to really engage with the subject matter and the sources (and go out and find sources), weigh strength of sources and judge which are actually "best" and for what, and think critically, not only about what is in the sources but what is already presented in the article. This is all good meaty scholarly stuff and students would learn all kinds of things that way, along with getting more deeply trained in WP's mission and the policies and guidelines. Jytdog (talk) 19:34, 3 May 2016 (UTC)

WMUK AGM

I assume that this is the email you were pointing me towards, and while I'd normally be happy to lead a wikitakes I don't know yet whether I will be able to make this year's AGM in person. If I do make it, it may not be for the whole day - I have an engagement in Loughton that evening that may also have an afternoon component. Until I know details of the WMUK AGM programme I am unable to see what portion(s) of the day I'll benefit from the most, how practical and affordable travel will be, and thus how I will divide my day. Sorry I can't be more helpful. Thryduulf (talk) 16:51, 25 May 2016 (UTC)

@Thryduulf: No problem, I hope you can make the AGM but if you can't make sure to send in your proxy vote!
What I had in mind was the email I sent (20th May) about the training module we'd like to give a trial run. Richard Nevell (WMUK) (talk) 09:57, 26 May 2016 (UTC)
I don't seem to have received that email... Thryduulf (talk) 10:13, 26 May 2016 (UTC)
@Thryduulf: That's peculiar, I've tried re-sending it. Richard Nevell (WMUK) (talk) 10:20, 26 May 2016 (UTC)

Funding for a photography

Dear Richard Nevell, I was advised by Dr. Blofeld to contact you regarding an enquiry for photographs from the Ashmolean Museum to improve content. Let me clarify my request: members of the WikiProject Ancient Egypt, most importantly user Khruner and I, set ourselves an ambitious goal a couple of years back: to acquire and post on Wikicommons and Wikipedia at least one image per pharaoh article for pharaohs known from archaeological evidence, especially for those of the shadowy Second Intermediate Period. Thanks to Khruner's impressive illustrations (e.g. [1], [2]), some contacts with itinerant Egyptologists whom we convinced to release the copyrights on their photos (e.g. [3]) and materials with expired copyrights in libraries (e.g. [4]) we managed to get pretty close to our goal. There are however a few stubborn cases that are nearly impossible to solve and this is the object of my request. Pharaoh Sekheperenre is known from a single scarab seal bearing his name, housed but not displayed in the Ashmolean Museum. The curator of the Egyptian collection gave me the exact reference of the seal in the museum catalog, from which I can order a photograph to be made by the museum, at a fee of circa 50 GBP. Provided the museum agrees to release all copyrights of the photograph once taken, I wonder if it would be possible to have this fee paid by the Wikimedia Foundation, so as to finally secure the image of this rare seal? Iry-Hor (talk) 16:00, 3 January 2017 (UTC)

The problem here is licensing I think, when you contacted me I thought you meant they charged a fee for you to take photographs in the museum yourself which you wanted help on. If other people take photographs and you pay them for it, that'll be difficult as they would be unlikely to put it in CC non attribut licensing, you'd need OTRS ticketing and full confirmation on that. That's something which WMUK might be less likely to support, paying people for photographs, but perhaps Richard can help you!♦ Dr. Blofeld 20:04, 3 January 2017 (UTC)

Iry-Hor, thanks for getting in touch, and thank you Dr. Blofeld for making the suggestion. It sounds like a really interesting project, and those are some really useful photos. The first hurdle is working out whether the museum would release an image under a Creative Commons licence, and if that's clear WMUK would decide if £50 for an otherwise unreachable photo is something we'd fund. Looking at the museum's contact list, was it Liam McNamara you talked to? We would need to talk to the Publications, Filming & Licensing department. I'm happy to get in contact with them and copy you in. Richard Nevell (WMUK) (talk) 12:16, 5 January 2017 (UTC)
As a bit of background, in the past image releases from the Ashmolean have required committee approval (see Wikipedia:GLAM/Bodleian/4th month report#Institutional policy). Richard Nevell (WMUK) (talk) 12:28, 5 January 2017 (UTC)
Dear Richard, yes I had an email exchange with Liam McNamara, he told me, among other things, that "you should contact my colleagues in museum’s Picture Library ([email protected]) to enquire about existing photographs or to commission new photography. They will also be able to advise about copyright as it applies to images of Ashmolean objects used on Wikipedia." Given the nature of the object (by the way, the exact accession number is AN1935.100a and we need a photo of the side of the scarab with the hieroglyphic inscriptions with the king's name) it is almost certain that no photograph of it already exists, hence according to this, we are looking at a 50 GBP cost. The photography enquiry form mentions the following options for the "Reproduction Rights required": Study only, Publication, Media, Online, Commercial and Display. I am not sure which applies best (online or display?), but we can state in the "additional information" that this is for Wikipedia and hence they can decide if they agree. Should I or you complete the form and see what comes of it? Iry-Hor (talk) 13:46, 5 January 2017 (UTC)
My feeling is our request wouldn't quite fit with the form so I'll email them directly. Richard Nevell (WMUK) (talk) 14:30, 5 January 2017 (UTC)
Dear Richard, I am very very surprised that Amy states the seal is on display as I have visited the museum numerous times until early 2015 and I have never seen it. Unfortunately I am not in Oxford anymore and cannot go back to the museum to find it. Let me know if the copyright she proposes is acceptable, if not I will try to find a wikipedian who lives in Oxford so that he/she can visit the museum and take the photograph. In this case, I will ask Amy to clarify where the seal is on display as I am absolutely certain to have never seen it. Iry-Hor (talk) 13:07, 9 January 2017 (UTC)
My understanding of the licence is that the restrictions would make the file effectively fair use. Wikipedia does of course have fair use files, but I think we might be best off finding an Oxford Wikipedian who can spot it in the exhibition. I'll reply to Amy that the image would really need to be CC and then you can ask about where the scarab is on display. Richard Nevell (WMUK) (talk) 15:06, 9 January 2017 (UTC)
So I have found a number of Wikipedians in Oxford willing to go take the picture. I asked Amy to specify where the scarab is on display as you suggested. As soon as I get this precision, I will pass it on to the interested people. Iry-Hor (talk) 09:32, 10 January 2017 (UTC)
Dear Richard, I hope all is well with you. I would like to enquire about what to do regarding Amy's offer: would the foundation be willing to pay 85 pounds to get a copyright-free photograph of the scarab? If not, there is the possibility of paying only 20 pounds for a copyrighted low-resolution photograph. From this photograph we should be able to make a very good drawing of the scarab, a drawing which would be copyright-free and hence can be posted. The last alternative is to wait for Liam McNamara to be back from the field and ask him to accompany someone to the storage for a photo to be taken, although as Amy stated, this is unlikely to happen given the museum's desire to get a fee out of the whole deal. Anyway, can we have WMUK's last word on the matter? Iry-Hor (talk) 08:52, 19 January 2017 (UTC)
Hi Iry-Hor, sorry for the delay, I've got your email and will reply later today. Richard Nevell (WMUK) (talk) 10:07, 20 January 2017 (UTC)

Classics editathon messages

Hello!! Thanks for all the help and patience! Great event. Leanwa13 (talk) 12:42, 23 January 2017 (UTC)

Mmmm hope I'm doing this right! Can't quite get used to the idea that EVERYTHING is editable! Thanks for the info and support Srsval (talk)

Citation templates

I can't find the "Reply to" template on the Gaelic wikipedia, so I'm pinging you here. I left a question for you on my page. GunChleoc (talk) 16:03, 27 June 2017 (UTC)

Ping GunChleoc (talk) 15:48, 28 June 2017 (UTC)

Thanks for the extra info, I'll put it on my todo-list. GunChleoc (talk) 16:12, 28 June 2017 (UTC)

Thanks for your help today

Thanks for sorting us out to translate from Finnish at UCL TrabiMechanic 23:10, 2 August 2017 (UTC) (Preceding unsigned comment added by TheTrabiMechanic (talk · contribs))

@TheTrabiMechanic: Glad I could help, let me know if you need a hand with anything else! Richard Nevell (WMUK) (talk) 11:55, 7 August 2017 (UTC)

You've got mail

 
Hello, Richard Nevell (WMUK). Please check your email; you've got mail!
It may take a few minutes from the time the email is sent for it to show up in your inbox. You can remove this notice at any time by removing the {{You've got mail}} or {{ygm}} template.

Regarding WMUK 2014 Stub contest's prize. --Skr15081997 (talk) 12:48, 13 August 2017 (UTC)

Facto Post – Issue 4 – 18 September 2017


Editorial: Conservation data

The IUCN Red List update of 14 September led with a threat to North American ash trees. The International Union for Conservation of Nature produces authoritative species listings that are peer-reviewed. Examples used as metonyms for loss of species and biodiversity, and discussion of extinction rates, are the usual topics covered in the media to inform us about this area. But actual data matters.

 
Dorstenia elata, a critically endangered South American herb, contained in Moraceae, the family of figs and mulberries

Clearly, conservation work depends on decisions about what should be done, and where. While animals, particularly mammals, are photogenic, species numbers run into millions. Plant species lie at the base of typical land-based food chains, and vegetation is key to the habitats of most animals.

ContentMine dictionaries, for example as tabulated at d:Wikidata:WikiFactMine/Dictionary list, enable detailed control of queries about endangered species, in their taxonomic context. To target conservation measures properly, species listings running into the thousands are not what is needed: range maps showing current distribution are. Between the will to act, and effective steps taken, the services of data handling are required. There is now no reason at all why Wikidata should not take up the burden.

Editor Charles Matthews. Please leave feedback for him.

If you wish to receive no further issues of Facto Post, please remove your name from our mailing list. Alternatively, to opt out of all massmessage mailings, you may add Category:Opted-out of message delivery to your user talk page.
Newsletter delivered by MediaWiki message delivery

MediaWiki message delivery (talk) 14:46, 18 September 2017 (UTC)

OK to add you to list?

Hi! Would you mind if I add your username to the attendee list at Wikipedia:GLAM/NHSF_Project#September_2017_event? No worries if you'd rather not. Please WP:PING me if you reply. Thanks! zazpot (talk) 12:47, 4 October 2017 (UTC)

@Zazpot: sure, go right ahead! Richard Nevell (WMUK) (talk) 13:21, 4 October 2017 (UTC)
  Done zazpot (talk) 14:41, 4 October 2017 (UTC)

Facto Post – Issue 5 – 17 October 2017


Editorial: Annotations

Annotation is nothing new. The glossators of medieval Europe annotated between the lines, or in the margins of legal manuscripts of texts going back to Roman times, and created a new discipline. In the form of web annotation, the idea is back, with texts being marked up inline, or with a stand-off system. Where could it lead?

 
1495 print version of the Digesta of Justinian, with the annotations of the glossator Accursius from the 13th century

ContentMine operates in the field of text and data mining (TDM), where annotation, simply put, can add value to mined text. It now sees annotation as a possible advance in semi-automation, the use of human judgement assisted by bot editing, which now plays a large part in Wikidata tools. While a human judgement call of yes/no, on the addition of a statement to Wikidata, is usually taken as decisive, it need not be. The human assent may be passed into an annotation system, and stored: this idea is standard on Wikisource, for example, where text is considered "validated" only when two different accounts have stated that the proof-reading is correct. A typical application would be to require more than one person to agree that what is said in the reference translates correctly into the formal Wikidata statement. Rejections are also potentially useful to record, for machine learning.
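The two-account rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `ProposedStatement`, its fields, and the threshold of two approvals are hypothetical, and nothing here reflects an actual ContentMine or Wikisource implementation.

```python
from dataclasses import dataclass, field


@dataclass
class ProposedStatement:
    """A candidate statement awaiting human review (hypothetical structure)."""
    subject: str
    prop: str
    value: str
    approvals: set = field(default_factory=set)   # accounts that said "yes"
    rejections: set = field(default_factory=set)  # accounts that said "no"

    def record(self, account: str, accept: bool) -> None:
        # A later judgement from the same account replaces its earlier one,
        # so one account can never count twice.
        self.approvals.discard(account)
        self.rejections.discard(account)
        (self.approvals if accept else self.rejections).add(account)

    def is_validated(self, required: int = 2) -> bool:
        # Validated only once enough *different* accounts have approved.
        return len(self.approvals) >= required


# Illustrative values only.
stmt = ProposedStatement("Q42", "P31", "Q5")
stmt.record("Alice", True)
stmt.record("Alice", True)   # same account again: still one approval
stmt.record("Bob", True)     # second distinct account
stmt.record("Carol", False)  # rejections are kept, e.g. for later analysis
print(stmt.is_validated())
```

Storing rejections alongside approvals keeps the record that the text suggests could later be useful for machine learning.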

As a contribution to data integrity on Wikidata, annotation has much to offer. Some "hard cases" on importing data are much more difficult than average. There are for example biographical puzzles: whether person A in one context is really identical with person B, of the same name, in another context. In science, clinical medicine requires special attention to sourcing (WP:MEDRS), and is challenging in terms of connecting findings with the methodology employed. Currently decisions in areas such as these, on Wikipedia and Wikidata, are often made ad hoc. In particular there may be no audit trail for those who want to check what is decided.

Annotations are subject to a World Wide Web Consortium standard, and behind the terminology constitute a simple JSON data structure. What WikiFactMine proposes to do with them is to implement the MEDRS guideline, as a formal algorithm, on bibliographical and methodological data. The structure will integrate with those inputs the human decisions on the interpretation of scientific papers that underlie claims on Wikidata. What is added to Wikidata will therefore be supported by a transparent and rigorous system that documents decisions.
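To give a sense of how simple that JSON structure is, here is a minimal sketch in Python of an annotation following the W3C Web Annotation Data Model. The function name `make_annotation`, the example URL, and the example texts are assumptions made for illustration; this is not the format WikiFactMine itself settled on.

```python
import json


def make_annotation(target_url: str, exact_text: str, comment: str) -> dict:
    """Build a minimal annotation: a textual comment on a quoted passage."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "type": "Annotation",
        # The body carries what the annotator says.
        "body": {
            "type": "TextualBody",
            "value": comment,
            "format": "text/plain",
        },
        # The target identifies the document and the passage being annotated.
        "target": {
            "source": target_url,
            "selector": {
                "type": "TextQuoteSelector",
                "exact": exact_text,
            },
        },
    }


# Hypothetical example values.
annotation = make_annotation(
    "https://example.org/paper",
    "a randomised controlled trial",
    "Methodology stated; supports the formal statement.",
)
print(json.dumps(annotation, indent=2))
```

This is the stand-off style mentioned earlier: the annotation lives apart from the text and points into it through the selector.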

An example of the possible future scope of annotation, for medical content, is in the first link below. That sort of detailed abstract of a publication can be a target for TDM, adds great value, and could be presented in machine-readable form. You are invited to discuss the detailed proposal on Wikidata, via its talk page.

Editor Charles Matthews. Please leave feedback for him.


MediaWiki message delivery (talk) 08:46, 17 October 2017 (UTC)

Template for Finland Centenary Translatathon, 29th Nov 2017

Hi Richard, as you recommended I created a template Template:Finland_Centenary_Translatathon. It's still Draft so I'm not actually sure what anyone can see of it. Here's the contents, just in case:

Category:Wikipedia templates

Could you approve it when you're happy? Thanks, TrabiMechanic 11:09, 31 October 2017 (UTC)

@TheTrabiMechanic: The template looks good and makes it very clear what the person displaying it would be doing. I've moved it to Template:Finland Centenary Translatathon. Richard Nevell (WMUK) (talk) 12:34, 31 October 2017 (UTC)
@Richard Nevell (WMUK): Thanks - I've created some guidance for our participants (happy to move this onto Wikipedia if it's useful). https://wiki.ucl.ac.uk/x/jgy9B --TrabiMechanic (talk) 14:54, 2 November 2017 (UTC)

Salford Wikipedia workshop

Hi Richard, in preparation for the Salford Wikipedia workshop next Sunday, could you arrange for odder (talk · contribs · deleted contribs · page moves · block user · block log) to have the accountcreator right (and anything else that you think might be useful). He's an admin on Commons, as you probably know, and is a long-term editor, even though he doesn't edit as much here as he does on Commons. Cheers --RexxS (talk) 01:02, 13 November 2017 (UTC)

Facto Post – Issue 6 – 15 November 2017


WikidataCon Berlin 28–9 October 2017

 
WikidataCon 2017 group photo

Under the heading rerum causas cognoscere, the first ever Wikidata conference got under way in the Tagesspiegel building with two keynotes. One was on YAGO, about how a knowledge base was conceived ten years ago on the assumption of automatic compilation from Wikipedia. The other was from manager Lydia Pintscher, on the "state of the data". Interesting rumours flourished: the mix'n'match tool and its 600 datasets, mostly in digital humanities, is to be taken off the hands of its author Magnus Manske by the WMF; a Wikibase incubator site is on its way. Announcements came in talks: structured data on Wikimedia Commons is scheduled to make substantive progress by 2019. The lexeme development on Wikidata is now not expected to make the Wiktionary sites redundant, but may facilitate automated compilation of dictionaries.

 
WD-FIST explained

And so it went, with five strands of talks and workshops, through to 11 pm on Saturday. Wikidata applies to GLAM work via metadata. It may be used in education, raises issues such as author disambiguation, and lends itself to different types of graphical display and reuse. Many millions of SPARQL queries are run on the site every day. Over the summer a large open science bibliography has come into existence there.

Wikidata's fifth birthday party on the Sunday brought matters to a close. See a dozen and more reports by other hands.

Editor Charles Matthews. Please leave feedback for him.

If you wish to receive no further issues of Facto Post, please remove your name from our mailing list. Alternatively, to opt out of all massmessage mailings, you may add Category:Wikipedians who opt out of message delivery to your user talk page.
Newsletter delivered by MediaWiki message delivery

MediaWiki message delivery (talk) 10:02, 15 November 2017 (UTC)

ArbCom 2017 election voter message

Hello, Richard Nevell (WMUK). Voting in the 2017 Arbitration Committee elections is now open until 23.59 on Sunday, 10 December. All users who registered an account before Saturday, 28 October 2017, made at least 150 mainspace edits before Wednesday, 1 November 2017 and are not currently blocked are eligible to vote. Users with alternate accounts may only vote once.

The Arbitration Committee is the panel of editors responsible for conducting the Wikipedia arbitration process. It has the authority to impose binding solutions to disputes between editors, primarily for serious conduct disputes the community has been unable to resolve. This includes the authority to impose site bans, topic bans, editing restrictions, and other measures needed to maintain our editing environment. The arbitration policy describes the Committee's roles and responsibilities in greater detail.

If you wish to participate in the 2017 election, please review the candidates and submit your choices on the voting page. MediaWiki message delivery (talk) 18:42, 3 December 2017 (UTC)

Facto Post – Issue 7 – 15 December 2017


A new bibliographical landscape

At the beginning of December, Wikidata items on individual scientific articles passed the 10 million mark. This figure contrasts with the state of play in early summer, when there were around half a million. In the big picture, Wikidata is now documenting the scientific literature at a rate that is about eight times as fast as papers are published. As 2017 ends, progress is quite evident.

Behind this achievement are a technical advance (fatameh), and bots that do the lifting. Much more than dry migration of metadata is potentially involved, however. If paper A cites paper B, both papers having an item, a link can be created on Wikidata, and the information presented to both human readers, and machines. This cross-linking is one of the most significant aspects of the scientific literature, and now a long-sought open version is rapidly being built up.

 

The effort for the lifting of copyright restrictions on citation data of this kind has had real momentum behind it during 2017. WikiCite and the I4OC have been pushing hard, with the result that on CrossRef over 50% of the citation data is open. Now the holdout publishers are being lobbied to release rights on citations.

But all that is just the beginning. Topics of papers are identified, authors disambiguated, with significant progress on the use of the four million ORCID IDs for researchers, and proposals formulated to identify methodology in a machine-readable way. P4510 on Wikidata has been introduced so that methodology can sit comfortably on items about papers.

More is on the way. OABot applies the unpaywall principle to Wikipedia referencing. It has been proposed that Wikidata could assist WorldCat in compiling the global history of book translation. Watch this space.

And make promoting #1lib1ref one of your New Year's resolutions. Happy holidays, all!

 
November 2017 map of geolocated Wikidata items, made by Addshore

To subscribe to Facto Post go to Wikipedia:Facto Post mailing list. For the ways to unsubscribe, see below.
Editor Charles Matthews, for ContentMine. Please leave feedback for him. Back numbers are here.
Reminder: WikiFactMine pages on Wikidata are at WD:WFM.

If you wish to receive no further issues of Facto Post, please remove your name from our mailing list. Alternatively, to opt out of all massmessage mailings, you may add Category:Wikipedians who opt out of message delivery to your user talk page.
Newsletter delivered by MediaWiki message delivery

MediaWiki message delivery (talk) 14:54, 15 December 2017 (UTC)

Facto Post – Issue 8 – 15 January 2018


Metadata on the March

From the days of hard-copy liner notes on music albums, metadata have stood outside a piece or file, while adding to understanding of where it comes from, and some of what needs to be appreciated about its content. In the GLAM sector, the accumulation of accurate metadata for objects is key to the mission of an institution, and its presentation in cataloguing.

Today Wikipedia turns 17, with worlds still to conquer. Zooming out from the individual GLAM object to the ontology in which it is set, one such world becomes apparent: GLAMs use custom ontologies, and those introduce massive incompatibilities. From a recent article by sadads, we quote the observation that "vocabularies needed for many collections, topics and intellectual spaces defy the expectations of the larger professional communities." A job for the encyclopedist, certainly. But the data-minded Wikimedian has the advantages of Wikidata, starting with its multilingual data, and facility with aliases. The controlled vocabulary (sometimes referred to as a "thesaurus", as a term of art) simplifies search: if a "spade" must be called that, rather than "shovel", it is easier to find all spade references. That control comes at a cost.

 
SVG pedestrian crosses road
 
Zebra crossing/crosswalk, Singapore

Case studies in that article show what can lie ahead. The schema crosswalk, in jargon, is a potential answer to the GLAM Babel of proliferating and expanding vocabularies. Even if you have no interest in Wikidata as such, simply in vocabularies V and W, if both V and W are matched to Wikidata, then a "crosswalk" arises from term v in V to w in W, whenever v and w both match to the same item d in Wikidata.
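The crosswalk construction can be written down directly. The following Python sketch assumes each vocabulary has already been matched to Wikidata and is available as a plain mapping from term to item identifier; the function name `build_crosswalk`, the sample terms, and the placeholder item identifiers are all made up for illustration.

```python
from collections import defaultdict


def build_crosswalk(v_to_item: dict, w_to_item: dict) -> dict:
    """Map each term v in V to the terms w in W matched to the same item d."""
    # Invert W: for each Wikidata item, which W terms point at it?
    w_terms_by_item = defaultdict(list)
    for w_term, item in w_to_item.items():
        w_terms_by_item[item].append(w_term)

    crosswalk = {}
    for v_term, item in v_to_item.items():
        if item in w_terms_by_item:
            crosswalk[v_term] = sorted(w_terms_by_item[item])
    return crosswalk


# Placeholder identifiers, not real Wikidata items.
vocab_v = {"spade": "item-A", "trowel": "item-B"}
vocab_w = {"shovel": "item-A", "digging tool": "item-A", "rake": "item-C"}

print(build_crosswalk(vocab_v, vocab_w))
# → {'spade': ['digging tool', 'shovel']}
```

Neither vocabulary needs to know about the other: each is matched to Wikidata once, and the pairwise crosswalks fall out of the shared items.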

For metadata mobility, match to Wikidata. It's apparently that simple: infrastructure requirements have turned out, so far, to be challenges that can be met.


Editor Charles Matthews, for ContentMine. Please leave feedback for him. Back numbers are here.

MediaWiki message delivery (talk) 12:38, 15 January 2018 (UTC)

Facto Post – Issue 9 – 5 February 2018


m:Grants:Project/ScienceSource is the new ContentMine proposal: please take a look.

Wikidata as Hub

One way of looking at Wikidata relates it to the semantic web concept, around for about as long as Wikipedia, and realised in dozens of distributed Web institutions. It sees Wikidata as supplying central, encyclopedic coverage of linked structured data, and looks ahead to greater support for "federated queries" that draw together information from all parts of the emerging network of websites.

 

Another perspective might be likened to a photographic negative of that one: Wikidata as an already-functioning Web hub. Over half of its properties are identifiers on other websites. These are Wikidata's "external links", to use Wikipedia terminology: one type for the DOI of a publication, another for the VIAF page of an author, with thousands more such. Wikidata links out to sites that are not nominally part of the semantic web, effectively drawing them into a larger system. The crosswalk possibilities of the systematic construction of these links were covered in Issue 8.

Wikipedia:External links speaks of them as kept "minimal, meritable, and directly relevant to the article." Here Wikidata finds more of a function. On viaf.org one can type a VIAF author identifier into the search box, and find the author page. The Wikidata Resolver tool, these days including Open Street Map, Scholia etc., allows this kind of lookup. The hub tool by maxlath takes a major step further, allowing both lookup and crosswalk to be encoded in a single URL.


Editor Charles Matthews, for ContentMine. Please leave feedback for him. Back numbers are here.

MediaWiki message delivery (talk) 11:50, 5 February 2018 (UTC)

Courses Modules are being deprecated

Hello,

Your account is currently configured with an education program flag. This system (the Courses system) is being deprecated. As such, your account will soon be updated to remove these no longer supported flags. For details on the changes, and how to migrate to using the replacement system (the Programs and Events Dashboard) please see Wikipedia:Education noticeboard/Archive 18#NOTICE: EducationProgram extension is being deprecated.

Thank you! Sent by: xaosflux 20:28, 8 March 2018 (UTC)

Facto Post – Issue 10 – 12 March 2018


Milestone for mix'n'match

Around the time in February when Wikidata clicked past item Q50000000, another milestone was reached: the mix'n'match tool uploaded its 1000th dataset. Concisely defined by its author, Magnus Manske, it works "to match entries in external catalogs to Wikidata". The total number of entries is now well into eight figures, and more are constantly being added: a couple of new catalogs each day is normal.

Since the end of 2013, mix'n'match has gradually come to play a significant part in adding statements to Wikidata. This is particularly so in areas with the flavour of digital humanities, but datasets can of course be about practically anything. There is a catalog on skyscrapers, and two on spiders.

These days mix'n'match can be used in numerous modes, from the relaxed gamified click through a catalog looking for matches, with prompts, to the fantastically useful and often demanding search across all catalogs. I'll type that again: you can search 1000 datasets from the simple box at the top right. The drop-down menu top left offers "creation candidates", Magnus's personal favourite. See m:Mix'n'match/Manual for more.

For the Wikidatan, a key point is that these matches, however carried out, add statements to Wikidata if, and naturally only if, there is a Wikidata property associated with the catalog. For everyone, however, the hands-on experience of deciding what is a good match is an education in a scholarly area, biographical catalogs being particularly fraught. Underpinning recent rapid progress is an open infrastructure for scraping and uploading.
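The "if, and only if, there is a property" rule can be sketched in a few lines of Python. This is an assumption-laden illustration, not mix'n'match's own code: it turns confirmed matches into tab-separated rows in the QuickStatements V1 text format, and produces nothing when the catalog has no associated property. The function name, the item ID and the external ID value are placeholders.

```python
from typing import Iterable, List, Optional, Tuple


def matches_to_quickstatements(
    matches: Iterable[Tuple[str, str]],
    catalog_property: Optional[str],
) -> List[str]:
    """Turn confirmed (item ID, external ID) matches into QuickStatements V1 rows."""
    # Without a Wikidata property for the catalog there is nothing to state.
    if catalog_property is None:
        return []
    rows = []
    for qid, external_id in matches:
        # V1 rows are tab-separated: item, property, then a quoted string value.
        rows.append(f'{qid}\t{catalog_property}\t"{external_id}"')
    return rows


# Placeholder match, for illustration only.
confirmed = [("Q4115189", "0000")]
with_property = matches_to_quickstatements(confirmed, "P214")
without_property = matches_to_quickstatements(confirmed, None)
```

Here `with_property` holds one row ready for pasting into QuickStatements, while `without_property` is empty: the match may still be useful to the catalog, but it adds no statement.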

Congratulations to Magnus, our data Stakhanovite!

 
3D printing

To subscribe to Facto Post go to Wikipedia:Facto Post mailing list. For the ways to unsubscribe, see below.
Editor Charles Matthews, for ContentMine. Please leave feedback for him. Back numbers are here.
Reminder: WikiFactMine pages on Wikidata are at WD:WFM.

If you wish to receive no further issues of Facto Post, please remove your name from our mailing list. Alternatively, to opt out of all massmessage mailings, you may add Category:Wikipedians who opt out of message delivery to your user talk page.
Newsletter delivered by MediaWiki message delivery

MediaWiki message delivery (talk) 12:26, 12 March 2018 (UTC)

Facto Post – Issue 11 – 9 April 2018


The 100 Skins of the Onion

Open Citations Month, with its eminently guessable hashtag, is upon us. We should be utterly grateful that in the past 12 months, so much data on which papers cite which other papers has been made open, and that Wikidata is playing its part in hosting it as "cites" statements. At the time of writing, there are 15.3M Wikidata items that can do that.

Pulling back to look at open access papers in the large, though, there is less reason for celebration. Access in theory does not yet equate to practical access. A recent LSE IMPACT blogpost puts that issue down to "heterogeneity". A useful euphemism to save us from thinking that the whole concept doesn't fall into the realm of the oxymoron.

Some home truths: aggregation is not content management, if it falls short on reusability. The PDF file format is wedded to how humans read documents, not how machines ingest them. The salami-slicer is our friend in the current downloading of open access papers, but for a better metaphor, think about skinning an onion, laboriously, 100 times with diminishing returns. There are of the order of 100 major publisher sites hosting open access papers, and the predominant offer there is still a PDF.

 
Red onion cross section

From the discoverability angle, Wikidata's bibliographic resources combined with the SPARQL query are superior in principle, by far, to existing keyword searches run over papers. Open access content should be managed into consistent HTML, something that is currently strenuous. The good news, such as it is, would be that much of it is already in XML. The organisational problem of removing further skins from the onion, with sensible prioritisation, is certainly not insuperable. The CORE group (the bloggers in the LSE posting) has some answers, but actually not all that is needed for the text and data mining purposes they highlight. The long tail, or in other words the onion heart when it has become fiddly beyond patience to skin, does call for a pis aller. But the real knack is to do more between the XML and the heart.


To subscribe to Facto Post go to Wikipedia:Facto Post mailing list. For the ways to unsubscribe, see below.
Editor Charles Matthews, for ContentMine. Please leave feedback for him. Back numbers are here.
Reminder: WikiFactMine pages on Wikidata are at WD:WFM.

If you wish to receive no further issues of Facto Post, please remove your name from our mailing list. Alternatively, to opt out of all massmessage mailings, you may add Category:Wikipedians who opt out of message delivery to your user talk page.
Newsletter delivered by MediaWiki message delivery

MediaWiki message delivery (talk) 16:25, 9 April 2018 (UTC)

Facto Post – Issue 12 – 28 May 2018


ScienceSource funded

The Wikimedia Foundation announced full funding of the ScienceSource grant proposal from ContentMine on May 18. See the ScienceSource Twitter announcement and 60 second video.

A medical canon?

The proposal includes downloading 30,000 open access papers, aiming (roughly speaking) to create a baseline for medical referencing on Wikipedia. It leaves open the question of how these are to be chosen.

The basic criteria of WP:MEDRS include a concentration on secondary literature. Attention has to be given to the long tail of diseases that receive less current research. The MEDRS guideline supposes that edge cases will have to be handled, and the premature exclusion of publications that would be in those marginal positions would reduce the value of the collection. Prophylaxis misses the point that gate-keeping will be done by an algorithm.

Two well-known but rather different areas where such considerations apply are tropical diseases and alternative medicine. There are also a number of potential downloading troubles, and these were mentioned in Issue 11. There is likely to be a gap, even with the guideline, between conditions taken to be necessary but not sufficient, and conditions sufficient but not necessary, for candidate papers to be included. With around 10,000 recognised medical conditions in standard lists, being comprehensive is demanding. With all of these aspects of the task, ScienceSource will seek community help.

 
OpenRefine logo, courtesy of Google

To subscribe to Facto Post go to Wikipedia:Facto Post mailing list. For the ways to unsubscribe, see below.
Editor Charles Matthews, for ContentMine. Please leave feedback for him. Back numbers are here.
Reminder: WikiFactMine pages on Wikidata are at WD:WFM. ScienceSource pages will be announced there, and in this mass message.

If you wish to receive no further issues of Facto Post, please remove your name from our mailing list. Alternatively, to opt out of all massmessage mailings, you may add Category:Wikipedians who opt out of message delivery to your user talk page.
Newsletter delivered by MediaWiki message delivery

MediaWiki message delivery (talk) 10:16, 28 May 2018 (UTC)

Facto Post – Issue 13 – 29 June 2018


The Editor is Charles Matthews, for ContentMine. Please leave feedback for him, on his User talk page.
To subscribe to Facto Post go to Wikipedia:Facto Post mailing list. For the ways to unsubscribe, see the footer.

Respecting MEDRS

Facto Post enters its second year, with a Cambridge Blue (OK, Aquamarine) background, a new logo, but no Cambridge blues. On-topic for the ScienceSource project is a project page here. It contains some case studies on how the WP:MEDRS guideline, for the referencing of articles at all related to human health, is applied in typical discussions.

Close to home also, a template, called {{medrs}} for short, is used to express dissatisfaction with particular references. Technology can help with patrolling, and this Petscan query finds over 450 articles where there is at least one use of the template. Of course the template is merely suggesting there is a possible issue with the reliability of a reference. Deciding the truth of the allegation is another matter.

This maintenance issue is one example of where ScienceSource aims to help. Where the reference is to a scientific paper, its type of algorithm could give a pass/fail opinion on such references. It could assist patrollers of medical articles, therefore, with the templated references and more generally. There may be more to proper referencing than that, indeed: context, quite what the statement supported by the reference expresses, prominence and weight. For that kind of consideration, case studies can help. But an algorithm might help to clear the backlog.

 
Evidence pyramid leading up to clinical guidelines, from WP:MEDRS

If you wish to receive no further issues of Facto Post, please remove your name from our mailing list. Alternatively, to opt out of all massmessage mailings, you may add Category:Wikipedians who opt out of message delivery to your user talk page.
Newsletter delivered by MediaWiki message delivery

MediaWiki message delivery (talk) 18:19, 29 June 2018 (UTC)

Facto Post – Issue 14 – 21 July 2018


The Editor is Charles Matthews, for ContentMine. Please leave feedback for him, on his User talk page.
To subscribe to Facto Post go to Wikipedia:Facto Post mailing list. For the ways to unsubscribe, see the footer.

Plugging the gaps – Wikimania report

Officially it is "bridging the gaps in knowledge", with Wikimania 2018 in Cape Town paying tribute to the southern African concept of ubuntu to implement it. Besides face-to-face interactions, Wikimedians do need their power sources.

 
Hackathon mentoring table wiring

Facto Post interviewed Jdforrester, who has attended every Wikimania, and now works as Senior Product Manager for the Wikimedia Foundation. His take on tackling the gaps in the Wikimedia movement is that "if we were an army, we could march in a column and close up all the gaps". In his view though, that is a faulty metaphor, and it leads to a complete misunderstanding of the movement, its diversity and different aspirations, and the nature of the work as "fighting" to be done in the open sector. There are many fronts, and as an eventualist he feels the gaps experienced both by editors and by users of Wikimedia content are inevitable. He would like to see a greater emphasis on reuse of content, not simply its volume.

If that may not sound like radicalism, the Decolonizing the Internet conference here organized jointly with Whose Knowledge? can redress the picture. It comes with the claim to be "the first ever conference about centering marginalized knowledge online".

 
Plugbar buildup at the Hackathon

If you wish to receive no further issues of Facto Post, please remove your name from our mailing list. Alternatively, to opt out of all massmessage mailings, you may add Category:Wikipedians who opt out of message delivery to your user talk page.
Newsletter delivered by MediaWiki message delivery

MediaWiki message delivery (talk) 06:10, 21 July 2018 (UTC)

Facto Post – Issue 15 – 21 August 2018


The Editor is Charles Matthews, for ContentMine. Please leave feedback for him, on his User talk page.
To subscribe to Facto Post go to Wikipedia:Facto Post mailing list. For the ways to unsubscribe, see the footer.

Neglected diseases
 
Anti-parasitic drugs being distributed in Côte d'Ivoire
What's a Neglected Disease?, ScienceSource video

To grasp the nettle, there are rare diseases, there are tropical diseases and then there are "neglected diseases". Evidently a rare enough disease is likely to be neglected, but neglected disease these days means a disease not rare, but tropical, and most often infectious or parasitic. Rare diseases as a group are dominated, in contrast, by genetic diseases.

A major aspect of neglect is found in tracking drug discovery. Orphan drugs are those developed to treat rare diseases (rare enough not to have market-driven research), but there is some overlap in practice with the WHO's neglected diseases, where snakebite, a "neglected public health issue", is on the list.

From an encyclopedic point of view, lack of research also may mean lack of high-quality references: the core medical literature differs from primary research, since it operates by aggregating trials. This bibliographic deficit clearly hinders Wikipedia's mission. The ScienceSource project is currently addressing this issue, on Wikidata. Its Wikidata focus list at WD:SSFL is trying to ensure that neglect does not turn into bias in its selection of science papers.

If you wish to receive no further issues of Facto Post, please remove your name from our mailing list. Alternatively, to opt out of all massmessage mailings, you may add Category:Wikipedians who opt out of message delivery to your user talk page.
Newsletter delivered by MediaWiki message delivery

MediaWiki message delivery (talk) 13:23, 21 August 2018 (UTC)

Facto Post – Issue 16 – 30 September 2018


The Editor is Charles Matthews, for ContentMine. Please leave feedback for him, on his User talk page.
To subscribe to Facto Post go to Wikipedia:Facto Post mailing list. For the ways to unsubscribe, see the footer.

The science publishing landscape
 

In an ideal world ... no, bear with your editor for just a minute ... there would be a format for scientific publishing online that was as much a standard as SI units are for the content. Likewise cataloguing publications would not be onerous, because part of the process would be to generate uniform metadata. Without claiming it could be the mythical free lunch, it might reasonably be argued that sandwiches can be packaged much alike and have barcodes, whatever the fillings.

The best on offer, to stretch the metaphor, is the meal kit option, in the form of XML. Where scientific papers are delivered as XML downloads, you get all the ingredients ready to cook. But you have to prepare the actual meal of slow food yourself. See Scholarly HTML for a recent pass at heading off XML with HTML, in other words in the native language of the Web.

The argument from real life is a traditional mixture of frictional forces, vested interests, and the classic irony of the principle of unripe time. On the other hand, discoverability actually diminishes with the prolific progress of science publishing. No, it really doesn't scale. Wikimedia as a movement can do something in such cases. We know from open access, we grok the Web, we have our own horse in the HTML race, we have Wikidata and WikiJournal, and we have the chops to act.


If you wish to receive no further issues of Facto Post, please remove your name from our mailing list. Alternatively, to opt out of all massmessage mailings, you may add Category:Wikipedians who opt out of message delivery to your user talk page.
Newsletter delivered by MediaWiki message delivery

MediaWiki message delivery (talk) 17:57, 30 September 2018 (UTC)

Facto Post – Issue 17 – 29 October 2018


The Editor is Charles Matthews, for ContentMine. Please leave feedback for him, on his User talk page.
To subscribe to Facto Post go to Wikipedia:Facto Post mailing list. For the ways to unsubscribe, see the footer.

Wikidata imaged

Around 2.7 million Wikidata items have an illustrative image. These files, you might say, are Wikimedia's stock images, and if the number is large, it is still only 5% or so of items that have one. All such images are taken from Wikimedia Commons, which has 50 million media files. One key issue is how to expand the stock.

Indeed, there is a tool. WD-FIST exploits the fact that each Wikipedia is differently illustrated, mostly with images from Commons but also with fair use images. An item that has sitelinks but no illustrative image can be tested to see if the linked wikis have a suitable one. This works well for a volunteer who wants to add images at a reasonable scale, and a small amount of SPARQL knowledge goes a long way in producing checklists.
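To make the "small amount of SPARQL" concrete, here is a minimal Python sketch that builds a checklist query for the Wikidata Query Service: items of a given class that have sitelinks but no P18 image. It is not part of WD-FIST, and the function name, the default limit and the choice of Q5 (human) as the example class are assumptions made for illustration.

```python
def missing_image_query(class_qid: str, limit: int = 100) -> str:
    """Build SPARQL listing items of one class with sitelinks but no P18 image."""
    # Guard against malformed input, since the ID is spliced into the query text.
    if not (class_qid.startswith("Q") and class_qid[1:].isdigit()):
        raise ValueError(f"not an item ID: {class_qid!r}")
    return (
        "SELECT ?item ?itemLabel ?sitelinks WHERE {\n"
        f"  ?item wdt:P31 wd:{class_qid} ;\n"
        "        wikibase:sitelinks ?sitelinks .\n"
        "  FILTER(?sitelinks > 0)\n"
        "  FILTER NOT EXISTS { ?item wdt:P18 ?image . }\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }\n'
        "}\n"
        "ORDER BY DESC(?sitelinks)\n"
        f"LIMIT {int(limit)}"
    )


# Q5 (human) is used only as a familiar example class.
checklist_query = missing_image_query("Q5", limit=50)
```

Sorting by sitelink count puts the items most likely to have an image somewhere on a linked wiki at the top of the checklist. A very large class may need a narrower pattern to avoid a query timeout.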

 
Gran Teatro, Cáceres, Spain, at night

It should be noted, though, that there are currently 53 Wikidata properties that link to Commons, of which P18 for the basic image is just one. WD-FIST prompts the user to add signatures, plaques, pictures of graves and so on. There are a couple of hundred monograms, mostly of historical figures, and this query allows you to view all of them. commons:Category:Monograms and its subcategories provide rich scope for adding more.

And so it is generally. The list of properties linking to Commons does contain a few that concern video and audio files, and rather more for maps. But it contains gems such as P3451 for "nighttime view". Over 1000 of those on Wikidata, but as for so much else, there could be yet more.
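A gallery of the existing "nighttime view" files can be sketched in the same spirit. The Python below only builds the query text. It assumes the Wikidata Query Service's `#defaultView:ImageGrid` directive for display, and the function name and limit are illustrative rather than taken from any tool mentioned here.

```python
def commons_property_gallery_query(prop: str = "P3451", limit: int = 50) -> str:
    """Build SPARQL showing files linked through one Commons-linking property."""
    return (
        "#defaultView:ImageGrid\n"
        "SELECT ?item ?itemLabel ?file WHERE {\n"
        f"  ?item wdt:{prop} ?file .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


# P3451 is the "nighttime view" property mentioned above.
nighttime_query = commons_property_gallery_query()
```

Passing a different property ID gives the same kind of gallery for any other property that links to Commons image files.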

Go on. Today is Wikidata's birthday. An illustrative image is always an acceptable gift, so why not add one? You can follow these easy steps: (i) log in at https://tools.wmflabs.org/widar/, (ii) paste the Petscan ID 6263583 into https://tools.wmflabs.org/fist/wdfist/ and click run, and (iii) just add cake.

 
Birthday logo

If you wish to receive no further issues of Facto Post, please remove your name from our mailing list. Alternatively, to opt out of all massmessage mailings, you may add Category:Wikipedians who opt out of message delivery to your user talk page.
Newsletter delivered by MediaWiki message delivery

MediaWiki message delivery (talk) 15:01, 29 October 2018 (UTC)

ArbCom 2018 election voter message

Hello, Richard Nevell (WMUK). Voting in the 2018 Arbitration Committee elections is now open until 23.59 on Sunday, 3 December. All users who registered an account before Sunday, 28 October 2018, made at least 150 mainspace edits before Thursday, 1 November 2018 and are not currently blocked are eligible to vote. Users with alternate accounts may only vote once.

The Arbitration Committee is the panel of editors responsible for conducting the Wikipedia arbitration process. It has the authority to impose binding solutions to disputes between editors, primarily for serious conduct disputes the community has been unable to resolve. This includes the authority to impose site bans, topic bans, editing restrictions, and other measures needed to maintain our editing environment. The arbitration policy describes the Committee's roles and responsibilities in greater detail.

If you wish to participate in the 2018 election, please review the candidates and submit your choices on the voting page. MediaWiki message delivery (talk) 18:42, 19 November 2018 (UTC)

Facto Post – Issue 18 – 30 November 2018


The Editor is Charles Matthews, for ContentMine. Please leave feedback for him, on his User talk page.
To subscribe to Facto Post go to Wikipedia:Facto Post mailing list. For the ways to unsubscribe, see the footer.

WikiCite issue

GLAM ♥ data — what is a gallery, library, archive or museum without a catalogue? It follows that Wikidata must love librarians. Bibliography supports students and researchers in any topic, but open and machine-readable bibliographic data even more so, outside the silo. Cue the WikiCite initiative, which was meeting in conference this week, in the Bay Area of California.

 
Wikidata training for librarians at WikiCite 2018

In fact there is a broad scope: "Open Knowledge Maps via SPARQL" and the "Sum of All Welsh Literature", identification of research outputs, Library.Link Network and Bibframe 2.0, OSCAR and LUCINDA (who they?), OCLC and Scholia, all these co-exist on the agenda. Certainly more library science is coming Wikidata's way. That poses the question about the other direction: is more Wikimedia technology advancing on libraries? Good point.

Wikimedians generally are not aware of the tech background that can be assumed, unless they are close to current training for librarians. A baseline definition is useful here: "bash, git and OpenRefine". Compare and contrast with pywikibot, GitHub and mix'n'match. Translation: scripting for automation, version control, data set matching and wrangling in the large, are on the agenda also for contemporary library work. Certainly there is some possible common ground here. Time to understand rather more about the motivations that operate in the library sector.

Links

Account creation is now open on the ScienceSource wiki, where you can see SPARQL visualisations of text mining.

If you wish to receive no further issues of Facto Post, please remove your name from our mailing list. Alternatively, to opt out of all massmessage mailings, you may add Category:Wikipedians who opt out of message delivery to your user talk page.
Newsletter delivered by MediaWiki message delivery

MediaWiki message delivery (talk) 11:20, 30 November 2018 (UTC)

Facto Post – Issue 19 – 27 December 2018


The Editor is Charles Matthews, for ContentMine. Please leave feedback for him, on his User talk page.
To subscribe to Facto Post go to Wikipedia:Facto Post mailing list. For the ways to unsubscribe, see the footer.

Learning from Zotero

Zotero is free software for reference management by the Center for History and New Media: see Wikipedia:Citing sources with Zotero. It is also an active user community, and has broad-based language support.

 
Zotero logo

Besides the handiness of Zotero's warehousing of personal citation collections, the Zotero translator underlies the citoid service, at work behind the VisualEditor. Metadata from Wikidata can be imported into Zotero; and in the other direction the zotkat tool from the University of Mannheim allows Zotero bibliographies to be exported to Wikidata, by item creation. With an extra feature to add statements, that route could lead to much development of the focus list (P5008) tagging on Wikidata, by WikiProjects.

Zotero demo video

There is also a large-scale encyclopedic dimension here. The construction of Zotero translators is one facet of Web scraping that has a strong community and open source basis. In that it resembles the less formal mix'n'match import community, and growing networks around other approaches that can integrate datasets into Wikidata, such as the use of OpenRefine.

Looking ahead, the thirtieth birthday of the World Wide Web falls in 2019, and yet the ambition to make webpages routinely readable by machines can still seem an ever-retreating mirage. Wikidata should not only be helping Wikimedia integrate its projects, an ongoing process represented by Structured Data on Commons and lexemes. It should also be acting as a catalyst to bring scraping in from the cold, with institutional strengths as well as resourceful code.

Links

Diversitech, the latest ContentMine grant application to the Wikimedia Foundation, is in its community review stage until January 2.

If you wish to receive no further issues of Facto Post, please remove your name from our mailing list. Alternatively, to opt out of all massmessage mailings, you may add Category:Wikipedians who opt out of message delivery to your user talk page.
Newsletter delivered by MediaWiki message delivery

MediaWiki message delivery (talk) 19:08, 27 December 2018 (UTC)