IRC office hours/Office hours 2014-09-03
Structured Data
Log
Time: 18:00-19:00 UTC
Channel: #wikimedia-office
Timestamps are in UTC.
18:00:23 <Keegan> #startmeeting Structured Data
18:00:23 <wm-labs-meetbot`> Meeting started Wed Sep 3 18:00:23 2014 UTC and is due to finish in 60 minutes. The chair is Keegan. Information about MeetBot at http://wiki.debian.org/MeetBot.
18:00:23 <wm-labs-meetbot`> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
18:00:23 <wm-labs-meetbot`> The meeting name has been set to 'structured_data'
18:00:39 <multichill> Welcome!
18:00:47 <fabriceflorin> Welcome to our discussion about Structured Data everyone!
18:00:53 <Steinsplitter> hi all!
18:00:56 <Keegan> Fabrice, what do you have for us?
18:00:57 <moogsi> hi :)
18:01:03 <multichill> We'll be talking about https://commons.wikimedia.org/wiki/Commons:Structured_data so open that!
18:01:33 <fabriceflorin> The Structured Data initiative proposes to store and retrieve information for media files in machine-readable data on Wikimedia Commons, using Wikidata tools and practices, as described on this project page:
18:01:40 <fabriceflorin> https://commons.wikimedia.org/wiki/Commons:Structured_data
18:01:54 <fabriceflorin> The Multimedia team and the Wikidata team are starting to plan this project together, in collaboration with many community volunteers active on Wikimedia Commons and other wikis.
18:01:56 <thedjNotWMF> hi !
18:02:05 <Keegan> hi thedjNotWMF
18:02:14 <fabriceflorin> These include the illustrious multichill and thedjNotWMF ...
18:02:29 <marktraceur> He's NOT WMF. We swear.
18:02:35 <fabriceflorin> As well as Lydia_WMDE, tgr, mark and many more …
18:02:41 <Keegan> and Jheald!
18:02:53 <guillom> marktraceur: Well, he does know the sikrit handshake.
18:03:04 <fabriceflorin> The purpose of this project is to make it easier for users to read and write file information, and to enable developers to build better tools to view, search, edit, curate and use media files. To that end, we propose to investigate this opportunity together through community discussions and small experiments. If these initial tests are successful, we would develop new tools and practices for structured data, then work with our communities
18:03:07 <fabriceflorin> to gradually migrate unstructured data into a machine-readable format over time.
18:04:04 <fabriceflorin> We are really happy to have such a great group of folks, and look forward to developing this initiative as a collaboration between community, WMF and WMDE.
18:04:26 <fabriceflorin> So let’s open the floor for comments, questions, suggestions from everyone. Who wants to be first?
18:04:26 <Keegan> So now...
18:04:39 <Steinsplitter> Are commons users able to edit filedescription on commons directly?
18:04:56 <legoktm> o/
18:05:12 <fabriceflorin> Steinsplitter: Yes, commons users will be able to edit filedescription on Commons.
18:05:33 * DanielK_WMDE wibbles
18:05:47 <darkweasel> will it still be possible to have templates like https://commons.wikimedia.org/wiki/User:Darkweasel94/copyright2 ?
18:06:06 <Steinsplitter> and we are able to protect filedescriptions on commons?
18:06:10 <thedjNotWMF> although i would say that that might not necessarily be straight from the File description page, but possibly from an alternate namespace most likely.
18:06:22 <fabriceflorin> Here is a nice overview of some first ideas we discussed at Wikimania - our Structured Data Slides:
18:06:23 <fabriceflorin> https://commons.wikimedia.org/wiki/File:Structured_Data_-_Slides.pdf
18:06:26 <thedjNotWMF> am I correct in that assessment
18:06:53 <Steinsplitter> new namespace on commons? This sounds interesting :)
18:06:54 <multichill> darkweasel: Yes, that is still possible
18:06:57 <dennyvrandecic> Steinsplitter: yes, the pages containing the filedescriptions on Commons can be protected on commons, just as today
18:07:26 <DanielK_WMDE> hey dennyvrandecic!
18:07:28 <Scott_WUaS> great Fabrice
18:07:30 <matanya> can it be edited directly on commons?
18:07:31 <tgr> Steinsplitter: darkweasel: think of this as the Commons version of Wikidata-powered infobox templates
18:07:43 <fabriceflorin> Hey dennyvrandecic, so nice to see you here :)
18:07:53 <tgr> you can just put {{infobox}} in the wikitext and all the information appears magically
18:07:58 <DanielK_WMDE> matanya: yes. actually, only on commons.
18:07:58 <dennyvrandecic> fabriceflorin: thanks :) so nice to see this happen :)
18:08:08 <tgr> but there is no fundamental change in how the article or wikitext behaves
18:08:20 <Jheald> But denny if the descriptions are being partly pulled from Wikidata items, surely those WD items can be modified, even if the Commons File page is protected ?
18:08:20 <dschwen> Will protecting the page protect the underlying data from being changed?
18:08:32 <multichill> See some of the magic at https://www.wikidata.org/wiki/Talk:Q17616737
18:08:32 <Steinsplitter> hi dschwen :)
18:08:38 <dschwen> Hi Steinsplitter !
18:08:45 <Dereckson> darkweasel: less abstract and more concrete: the metadata information is a set of properties and value. If we choose to represent licensing information values by custom templates, you will have to add a Darkweasel94/copyright2 to the relevant property value, and it will be printed.
18:09:03 <Steinsplitter> commons admins should be able to fully administrate every content without asking wd folks
18:09:04 <darkweasel> ah, ok, i understand
18:09:11 <Steinsplitter> this is also important for our NPOV policy.
18:09:12 <Dereckson> Jheald: the point is not to use Wikidata to store metadata items, but to install the Wikibase extension on Commons
18:09:26 <thedjNotWMF> exactly
18:09:49 <Steinsplitter> Do you plan to start a RFC on commons about the new extension?
18:09:49 <dschwen> As I understand it this whole endeavor is a multi-year thing. Maybe a slightly provocative statement: I'm not even sure the concepts of a "file description page" or even a "commons user" would have to stay the same, once we have structured data.
18:10:00 <darkweasel> will this bring us localized category names? in a way that's transparent to readers and editors?
18:10:03 <fabriceflorin> Here is a handy FAQ for Structured Data, which can help clarify some of your questions: https://commons.wikimedia.org/wiki/Commons:Structured_data#FAQ
18:10:16 <Jheald> Dereckson: the translation of the name of a painter in each language will not be stored on CommonsData, it will be pulled from a Q-item on Wikidata
18:10:34 <Steinsplitter> Do you start a RFC about wikibase. Asking the community?
18:10:36 <DanielK_WMDE> Steinsplitter: would you say that is also true for wikipedia admins, wrt wikipedia content?
18:10:47 <Jheald> But this is something that does need more clarity: what will be stored where
18:10:51 <Keegan> One sec Steinsplitter
18:10:55 <fabriceflorin> Steinsplitter: Of course, we will get community consensus before deploying something that huge. For now, we recommend that you all participate in this discussion page: https://commons.wikimedia.org/wiki/Commons_talk:Structured_data
18:10:56 <Keegan> fabriceflorin is typing
18:11:20 <Dereckson> Jheald: I don't see the issue: if we consider the Wikidata not pertinent for one of these painter, we can remove the item reference and use our own data instead
18:11:21 <Steinsplitter> i am referring to [[COM:RFC]] with a closure by a local crat.
18:11:30 <fabriceflorin> Let’s first discuss this, then we can make decisions. This is going to be a long project, because it has many complexities.
18:12:05 <Steinsplitter> a project without asking the community via RFC (regular process) ?
18:12:28 <Jan_Ainali> Steinsplitter: why is it important for part of the community to be able to not ask other parts of the community what affects the entire community?
18:12:29 <fabriceflorin> dschwen: You are absolutely right that this needs to be viewed as a multiyear project. Right now, we are proposing that we start with an experimentation period.
18:12:29 <multichill> Steinsplitter: First discussion, then community consensus to do it, then deploy
18:12:49 <Steinsplitter> RFC , not discussion.
18:12:59 <Lydia_WMDE> darkweasel: yes it will give us that! \o/
18:13:00 <marktraceur> Yeah, not much point in deciding whether to do something when we don't know what it will look like
18:13:01 <dschwen> The RFC would be the consensus part
18:13:07 <Steinsplitter> yes
18:13:24 <Steinsplitter> no rfc, no deployment ;)
18:13:25 <Lydia_WMDE> darkweasel: well or at least something like it. we're unsure if categories will be done in that way in the future
18:13:44 <Lydia_WMDE> but that's a big topic that needs a lot of thinking
18:13:45 <dschwen> but right now I think we are in a phase where we should discuss _what_ is going to be happening
18:13:49 <fabriceflorin> Let’s also keep in mind that this project involves many different communities besides Commons, so we hope to engage them in this conversation as well in coming weeks.
18:13:52 <Lydia_WMDE> and input from the community
18:14:10 <guillom> Steinsplitter: It wouldn't make sense to do an RFC at the beginning (when no code has been written and the behavior of the software still needs to evolve based on user feedback) and it wouldn't make sense to have an RFC at the end (when it's too late to change the code). Instead, I understand that the plan is to communicate widely, through talk pages, watchlist notices, maybe even site notices, so that the community constantly evaluates the
18:14:10 <guillom> requirements and the prototypes, and builds the tool *with* the developers.
18:14:12 * Keegan agrees with dschwen
18:14:40 <Steinsplitter> this looks like a done deal to be honest. And it is difficult for non-English speakers and non-wikidatians to understand what's happening exactly.
18:14:49 <Steinsplitter> Which edits will be on wikidata exactly?
18:14:49 <guillom> Basically, it would be a 2-year-long constant RFC.
18:14:56 <darkweasel> Steinsplitter, nothing would be on wikidata
18:14:59 <darkweasel> as i understand it
18:15:09 <darkweasel> things would be in a separate namespace on commons
18:15:12 <darkweasel> that looks like wikidata
18:15:13 <marktraceur> Yeah, all image metadata would be on Commons, in the Wikibase extension
18:15:21 <Jheald> Seems to me it is actually two projects: one is using data from Wikidata in templates on Commons. The other is what to store on CommonsData. The two are fairly independent.
18:15:24 <Steinsplitter> oky :)
18:15:28 <marktraceur> You could *link* to Wikidata.
18:15:39 <Steinsplitter> ah okay, this sounds ok for me.
18:15:41 <thedjNotWMF> darkweasel: at most, info that is currently in a Creator or Artwork or Institution template, by way of a link.
18:15:44 <multichill> Wikibase is the software, Wikidata is the site. Wikibase will also be on Commons
18:15:44 <Dereckson> doublespeak for no RFC, feedback gathered among people giving feedback, unconditional deployment at the end
18:15:48 <Jan_Ainali> Does it matter on which project the edits are?
18:16:00 <Steinsplitter> yes
18:16:02 <fabriceflorin> BTW, here is the preliminary roadmap we are considering for our next incremental steps in this initiative - to be adjusted based on community responses: https://commons.wikimedia.org/wiki/Commons:Structured_data#Roadmap
18:16:09 <Jan_Ainali> Why?
18:16:22 <Steinsplitter> because i can't investigate if there is vandalism or trolling.
18:16:27 <Steinsplitter> i can only do so on commons.
18:16:43 <DanielK_WMDE> Steinsplitter: why?
18:16:45 <marktraceur> But it's a non-issue, because Commons is where the data will be.
18:16:51 <Steinsplitter> it is annoying to ask other people
18:16:57 <Jan_Ainali> But there are admins on Wikidata to take care of vandalism on Wikidata
18:17:04 <Keegan> Yes, as marktraceur says it's a non-issue
18:17:09 <Steinsplitter> marktraceur, thanks. this should be added to the faq :)
18:17:15 <guillom> Gosh; working collaboratively with other people. How unwikimedian :P
18:17:17 <gi11es> commons file pages are already linking/referring to wikidata items through templates at the moment
18:17:20 <marktraceur> Steinsplitter: It sounds like we're already planning on that
18:17:26 <Steinsplitter> Jan_Ainali: i don't like to ask admins there.
18:17:33 <Jan_Ainali> Well...
18:17:40 <marktraceur> Are there other questions?
18:17:45 <Lydia_WMDE> we are working on better tools to track wikidata changes on other projects btw
18:17:49 <guillom> Can I have a pony?
18:17:55 <moogsi> guillom: no.
18:17:56 <Keegan> Back to the topic: Discuss structured data!
18:17:58 <Steinsplitter> yes, how much does this project cost the WMF/WMDE :)
18:18:00 <Dereckson> Steinsplitter: look, if you see you spend a lot of time to fix stuff on Wikidata, you'll become a Wikidata contributor, and be able to get sysop rights there too.
18:18:01 <guillom> But a structured pony!
18:18:04 <dschwen> Yeah: can we get decent documentation for WikiBase/WikiData? :-)
18:18:05 <Jheald> marktraceur: No. The label for a Q-item will be on Wikidata, not Commons. Yes, CommonsData will point to a Q-number; but what that Q-number says will be on Wikidata
18:18:17 <moogsi> guillom: maybe.
18:18:35 <Keegan> Steinsplitter: the topic is structured data, please stick to the topic :)
18:18:37 <Lydia_WMDE> dschwen: we had an intern work on that this summer
18:18:43 <dschwen> nice
18:18:46 <Steinsplitter> Dereckson: no.
18:18:50 <Lydia_WMDE> dschwen: so i hope the user documentation improved considerably
18:18:59 <Lydia_WMDE> it is probably not perfect yet but much better
18:19:04 <Steinsplitter> Keegan: i stick on the topic. please read my questions again. thanks.
18:19:29 <Lydia_WMDE> more help on improving documentation is always welcome
18:19:30 <Steinsplitter> i asked on AN and there was ZERO work with the community. only fyi. :)
18:19:30 <Jheald> Fabrice: Do you have an agenda for this mtg that you want to move through ?
18:19:44 <gi11es> call me naive, but I would assume each project fights equally against vandalism, trolling, etc. wikidata admins would take it as seriously as commons'
18:19:56 <fabriceflorin> Jheald: Here is one way to visualize where data might be stored across our sites: https://commons.wikimedia.org/w/index.php?title=File:Structured_Data_-_Slides.pdf&page=17
18:19:59 <guillom> How much of a dream is it to hope that this project will solve category intersection and make Commons finally searchable? :)
18:19:59 <Keegan> This is community work right here, so let's work :)
18:20:05 <Dereckson> and as usual, we'll have a set of people with sysop rights on both projects
18:20:40 <Steinsplitter> commons is commons and not wd.
18:20:48 <darkweasel> if you're going to change the whole category system – i hope you're considering that not everything with a commons category has a wikipedia article (and thus a wikidata item)
18:20:58 <Keegan> commons will stay commons, Steinsplitter.
18:21:03 <multichill> Steinsplitter: Please move on to the next question. We want to leave some room for other people to ask questions
18:21:19 <DanielK_WMDE> guillom: it's not much of a dream, i think. with "topics" from wikidata describing media on commons, you can search in any language, and combine topics.
18:21:26 <Steinsplitter> multichill: they can ask. i can't say that this channel is flooded by questions.
18:21:48 <Lydia_WMDE> darkweasel: we will not get rid of categories. we will add better tools next to them. also on wikidata not everything needs a wikipedia article. much more is allowed there
18:21:51 <DanielK_WMDE> guillom: including "sub-topics" in the search is still not that easy, but it's a problem we are going to solve (for categories too, btw)
18:21:55 <guillom> DanielK_WMDE: So in the end I'll be able to search for Pictures of the Mona Lisa taken in October 2009 with a Nikon D90?
18:22:19 <DanielK_WMDE> guillom: if the picture is annotated correctly, then yes.
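The "combine topics" search described in this exchange can be pictured with a small sketch. Everything here is an assumption made for illustration: the `MediaFile` record, its field names, and the placeholder topic identifiers are hypothetical and are not part of Wikibase or any planned Commons API. The point is only that once topics, camera and date are stored as separate values, the search guillom asks for reduces to an intersection of simple filters.

```python
from dataclasses import dataclass, field


@dataclass
class MediaFile:
    """Hypothetical structured record for one file (illustration only)."""
    name: str
    topics: set = field(default_factory=set)  # placeholder topic IDs, not real Q-numbers
    camera: str = ""
    taken: str = ""  # "YYYY-MM"


def search(files, topics=(), camera=None, taken=None):
    """Return names of files that carry every requested topic and match the optional filters."""
    wanted = set(topics)
    return [
        f.name
        for f in files
        if wanted <= f.topics
        and (camera is None or f.camera == camera)
        and (taken is None or f.taken == taken)
    ]


files = [
    MediaFile("A.jpg", {"TOPIC_MONA_LISA"}, "Nikon D90", "2009-10"),
    MediaFile("B.jpg", {"TOPIC_MONA_LISA"}, "Canon EOS 5D", "2009-10"),
    MediaFile("C.jpg", {"TOPIC_LOUVRE"}, "Nikon D90", "2009-10"),
]

result = search(files, topics={"TOPIC_MONA_LISA"}, camera="Nikon D90", taken="2009-10")
```

As DanielK_WMDE notes, this only works if the picture is annotated correctly; a file with a missing topic simply does not match.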
18:22:27 <Steinsplitter> is it possible to edit cats in the filedescription?
18:22:29 <susannaanas> I have read descriptions of how properties will be divided between Wikidata and Commons for items such as Mona Lisa. How will they?
18:22:31 * guillom hugs DanielK_WMDE.
18:22:33 <Steinsplitter> *directly
18:22:40 <guillom> \o/
18:22:47 <darkweasel> will things like "october 2009", "nikon d90" be pulled automatically from the EXIF in general?
18:22:55 <DanielK_WMDE> Steinsplitter: exactly like now. no change.
18:22:57 <darkweasel> (even without uploadwizard)
18:23:07 <guillom> darkweasel: yes but at the moment you can't search using those; or the resolution
18:23:12 * DanielK_WMDE hugs guillom back
18:23:13 <Steinsplitter> DanielK_WMDE: great! so we don't need to rewrite bots.
18:23:20 <dan-nl> i'm interested to know how the pages actually work. for example, if i edit https://www.wikidata.org/wiki/Talk:Q17616737 the wikitext is no longer human readable. will a file page have both a wikidata subpage and a main wikitext page?
18:23:37 <Dereckson> darkweasel: the current state of the project is to have our own set of wikidata, how to handle EXIF tags is a separate issue
18:23:38 * JeanFred just caught back
18:23:47 <marktraceur> Welcome to the jungle, JeanFred
18:23:49 <Keegan> Hi JeanFred
18:23:53 * guillom hugs JeanFred too, for good measure.
18:23:56 <dan-nl> and can i store metadata that may not be in wikidata yet and i would not want displayed on the file page?
18:23:57 <DanielK_WMDE> darkweasel: possibly. not sure yet. it will definitely be possible to override what was taken from the exif
18:23:59 <Dereckson> darkweasel: you're suggesting a feature request: import from EXIF tags metadata not yet defined
18:24:00 <fabriceflorin> Jheald: We don’t have a specific agenda for this Q&A, as we would rather hear what questions the community has (and IRC chats are notoriously hard to keep focused on an agenda). That said, we would invite folks to consider the questions posed on this discussion page (and perhaps add new ones, as well as chime in after this chat): https://commons.wikimedia.org/wiki/Commons_talk:Structured_data
18:24:05 <Steinsplitter> hi JeanFred :)
18:24:24 <darkweasel> DanielK_WMDE, that sounds good :)
18:24:32 <gi11es> susannaanas: at the moment if you look at a file as an example https://commons.wikimedia.org/wiki/File:Mona_Lisa.jpg you can see that already some information is linked to metadata (artist, current location)
18:24:36 <Jheald> It would be nice to be able to edit topics and other CommonsData items through some of the wikitext of the File page
18:24:50 <dan-nl> if there are 2 pages, a wikidata page and wikitext page, how is the file page display determined?
18:24:51 <Steinsplitter> edit topics?
18:24:55 <Dereckson> darkweasel: if you're interested in this idea, please note it on a roadmap and at the relevant time open a bug on Bugzilla to describe the feature request
18:25:03 <multichill> dan-nl: This is an example of us going overboard with LUA. Our actual templates on Commons should be better readable
18:25:26 <DanielK_WMDE> dan-nl: by templates on the wikitext page. just like infoboxes on wikipedia pull data in from wikidata, the file description page will pull in data from the media info page
18:25:29 <gi11es> susannaanas: the general idea, and this open discussion is a way to check that the separation makes sense to everyone, is that wikidata will still take care of storing general information about known artworks, artists, institutions, etc. and the commons wikibase information will be mainly about the photograph itself, not the artwork
18:25:36 <fabriceflorin> Hey JeanFred, glad you could join us :)
18:25:43 <gi11es> susannaanas: does my quick description make sense?
18:25:46 <multichill> So maybe we'll modify {{Artwork}} to pull the creator (if it's not set in the template) from the wikibase info
18:25:47 <Lydia_WMDE> Jheald: there are widgets being developed by the community for doing this on wikipedia atm. those are awesome experiments. i'd like to see them develop further and then see what we can integrate directly into the software. possibly also for commons
18:26:03 <Jheald> Question: I wrote (probably rather too much) on the talk page at Commons:Structured Data. Is this the kind of thing you were looking for? And will you have any comments coming back?
18:26:20 <susannaanas> It does, I just want to confirm, as I hear "All data will be stored in Commons" comments
18:26:59 <anomie> fabriceflorin: Is structured data for media not on Commons being considered too?
18:27:15 <dan-nl> thanks …. so the template might change so that it could accept a wikidata value or regular text value?
18:27:37 <dan-nl> and then if there are both values it would prefer the wikidata value?
18:27:38 <susannaanas> gi11es: Where will these divisions be discussed for items like maps?
18:27:39 <Lydia_WMDE> anomie: yes we are considering this but it'll not happen at the beginning
18:27:42 <Lydia_WMDE> we will start with commons
18:27:43 <Jheald> Lydia: Widgets being able to edit through templates on the file page would be very nice. So also would be presenting the metadata as 'fake' wikitext and picking up attempts to change that wikitext, eg by bots
18:27:45 <Lydia_WMDE> but keep others in mind
18:28:06 <fabriceflorin> anomie: We have to start somewhere, and beginning with Commons seems like the natural first step.
18:28:14 <multichill> dan-nl: Whatever we (the community) decide to give more priority.
18:28:14 <thedjNotWMF> dan-nl: actually, if the value is on commons, it would prefer the information from commons most likely.
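The precedence question raised here (a value set in the template versus a value pulled from the structured data) can be sketched as a small fallback function. This is a minimal illustration under stated assumptions: `resolve_field` and its parameters are hypothetical names, not an existing template or Lua API, and the default of preferring the local value only mirrors thedjNotWMF's "most likely"; multichill's answer leaves the actual priority to the community.

```python
def resolve_field(local_value, linked_value, prefer_local=True):
    """Pick the value a template would display (hypothetical helper).

    local_value:  value written directly in the template on Commons
    linked_value: value pulled from the structured data record
    """
    if prefer_local:
        candidates = (local_value, linked_value)
    else:
        candidates = (linked_value, local_value)
    for value in candidates:
        if value:  # skip None and empty strings
            return value
    return None
```

For example, `resolve_field("Frans Hals", "label from the linked item")` returns `"Frans Hals"`, while `resolve_field(None, "label from the linked item")` falls back to the linked label.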
18:28:39 <gi11es> susannaanas: do you have expertise regarding maps? it would be good to have someone on board with a focus on that type of media, if you feel like getting more involved
18:29:06 <thedjNotWMF> gi11es: you just asked that to the driving force behind wikimaps :)
18:29:11 <anomie> fabriceflorin, Lydia_WMDE: As long as it's on the roadmap, ideally in the "just add additional wikis to the config to enable it" sense
18:29:12 <gi11es> hah
18:29:23 <dan-nl> so, {{tl|Information | author = Q17616737 | Frans Hals }} would prefer Frans Hals?
18:29:26 <Steinsplitter> (DanielK_WMDE, briefly off-topic since you're here: do you have a script that can merge interwikis from the com: namespace into WD and remove them locally?)
18:29:35 <Jheald> "All data stored on commons" simply isn't true, as https://commons.wikimedia.org/w/index.php?title=File:Structured_Data_-_Slides.pdf&page=17 shows -- instead what is stored on Commons will often be links to items stored on Wikidata
18:29:40 <dan-nl> for example
18:29:42 <susannaanas> gi11es: We do the Wikimaps project, it's not only me, we are a community
18:29:48 <darkweasel> i'm definitely looking forward to structured data, i hope it will be as nice in practice as it sounds now :)
18:29:50 <Lydia_WMDE> Jheald: noted. it does seem like a pretty niche feature to me atm that'd be a huge effort to implement. but we can see if we can do some things that make this easier
18:30:02 * Jan_Ainali ponders OT when local uploads will be stopped and Commons be the only media database. Well, one can dream...
18:30:25 <marktraceur> Jan_Ainali: When we abolish copyright, I guess. So maybe 3 months?
18:30:26 <marktraceur> :)
18:30:35 <DanielK_WMDE> Steinsplitter: generally we don't do import scripts, we leave that to the community. perhaps ask multichill or amir or other folks who deal with bots a lot.
18:30:49 <gi11es> susannaanas: then I would advise keeping track of this project closely, so that you can comment on the data storage division as it evolves from the drafts we have now to something more solidified
18:30:50 <Lydia_WMDE> darkweasel: at the beginning it will definitely not be as nice. but if we all pull together it'll be awesome in the future :)
18:31:10 <thedjNotWMF> dan-nl: assuming bogus syntax, then likely...
18:31:22 <gi11es> susannaanas: we do our best to keep in mind the very different use cases, but it's much better to have an expert look as a sanity check
18:31:24 <susannaanas> gi11es: More than keen to, dependent on it
18:31:43 <multichill> Steinsplitter: More scripts would be written like https://www.mediawiki.org/wiki/Manual:Pywikibot/Scripts#Wikidata to assist in converting things on Commons
18:31:58 <thedjNotWMF> actually... on that topic.
18:32:24 <thedjNotWMF> i know that dbpedia has some bots for that as well.
18:32:31 <Steinsplitter> ok
18:32:38 <fabriceflorin> One of the issues we have been discussing is what would be a good data structure for this project. For example, one suggestion is that every file could contain one or more works, with one or more contributors, and one or more licenses. What do you think of this approach, as roughly diagrammed here? https://commons.wikimedia.org/w/index.php?title=File:Structured_Data_-_Slides.pdf&page=15
18:32:54 <DanielK_WMDE> Jheald: things like "topics" will refer to wikidata items. information about the topics (e.g. the population and mayor of New York) will be managed on Wikidata. But even if several topics from Wikidata are linked, the important information is which topics are linked to which file, and that is stored on commons.
18:33:35 <Jan_Ainali> Are there any thoughts on doing annotations in a structured way?
18:33:44 <darkweasel> hmmm that sounds interesting, so you're intending to make derivative works less awkward?
18:34:11 <Lydia_WMDE> darkweasel: yes!
18:34:23 <moogsi> fabriceflorin: i have been concerned about commons' current inability to distinguish in metadata between a file and the work depicted in that file, so it's encouraging to see it considered already
18:34:24 <fabriceflorin> darkweasel: Yes, the idea is to have a more general purpose way to recognize that multiple works and multiple contributors can be involved in the same file.
18:34:45 <moogsi> files often contain more than one instance of IP and more than one author, and it's currently a bit of a mess
18:34:46 <Jheald> DanielK_WMDE: Up to a point. But it also matters how that information is presented to ppl reading the file descriptions. Which is something that I think worries Steinsplitter & vandal fighters.
18:34:46 <multichill> https://commons.wikimedia.org/wiki/File:The_Nightwatch_by_Rembrandt_-_Rijksmuseum.jpg <- example image with annotations
18:35:19 <Lydia_WMDE> Jheald: yes and it worries us too. and it is definitely on my agenda for 2015 to build better tools to fight vandalism
18:35:22 <darkweasel> that sounds good, but i hope it won't be too complicated for people who want to upload just one own photo or edit its properties :)
18:35:34 <Lydia_WMDE> Jheald: which is actually easier in structured data than in unstructured text for example
18:35:41 <multichill> Jan_Ainali: Not a focus at the moment, but the community might build something themselves
18:35:49 <Jan_Ainali> multichill: Thanks for the example! :)
18:35:49 <DanielK_WMDE> Jheald: you would see the edit on your watchlist on commons. so at least as many eyes will be there to watch as there are now.
18:36:22 <darkweasel> aside from that, i think "license" should be a property of "work" not "file" – if a file is multi-licensed, later contributors might in principle choose to release their changes under only one license
18:36:24 <Lydia_WMDE> plus the people watching it on other projects like wikipedia!
18:36:28 <fabriceflorin> darkweasel: Good point. One idea that has been proposed is to not burden the uploader with identifying every work and contributor during the upload process, but to invite them after upload to improve the metadata on their page.
18:36:41 <DanielK_WMDE> that's perhaps an important point to mention again: if data from wikidata is used on a page on commons, and you watch that page, you will see any edits to the data item, as if they had happened locally to the page
18:36:43 <Jan_Ainali> multichill: I imagine it could be a property with qualifiers to do it.
18:37:45 <Jheald> Lydia & Daniel: good points. What matters I guess is making sure the tools used by Commons vandal fighters can keep up
18:37:47 <DanielK_WMDE> it's a feature that needs improvement (still broken in the advanced watchlist), but it's there.
18:37:56 <Steinsplitter> It seems the important questions are answered, so we can all sleep well and chill. :) oh joy.
18:38:22 <tgr> darkweasel: that's the way it's planned, yes
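The structure discussed in this thread (a file containing one or more works, each with contributors, and, per darkweasel's suggestion confirmed by tgr, licenses attached to the work rather than the file) can be sketched as nested records. This is only an illustration under assumptions: the class names, field names, and the example contributor and license strings are hypothetical and do not reflect an actual Wikibase schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Contributor:
    name: str
    role: str  # e.g. "painter", "photographer"


@dataclass
class Work:
    title: str
    contributors: List[Contributor] = field(default_factory=list)
    licenses: List[str] = field(default_factory=list)  # licenses belong to the work


@dataclass
class File:
    filename: str
    works: List[Work] = field(default_factory=list)

    def all_licenses(self):
        """Collect the licenses of every contained work, in order, without duplicates."""
        seen = []
        for work in self.works:
            for lic in work.licenses:
                if lic not in seen:
                    seen.append(lic)
        return seen


# A photograph of a painting: two works, two contributors, two licenses.
painting = Work("Example painting", [Contributor("Example Painter", "painter")], ["PD-old"])
photo = Work("Photograph of the painting", [Contributor("Example Uploader", "photographer")], ["CC-BY-SA-3.0"])
scan = File("Example_painting_photo.jpg", [painting, photo])
```

With this shape, `scan.all_licenses()` lists every license a reuser has to consider, while each work keeps its own author and license, which is the file-versus-depicted-work distinction moogsi asks for.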
18:38:27 <thedjNotWMF> Jan_Ainali: it's probably not too hard, but it's not very high on the priority list probably, since if the core is there, then this is 'implementable'. and though it might not be easy then, it's also not out of the realm of possibilities.
18:38:55 <thedjNotWMF> Jan_Ainali: and until that time it can be on wikitext still
18:38:55 <fabriceflorin> Steinsplitter: Yes, I think we have a lot of great contributors focusing on this project, and we plan to move very carefully at each step of the way, in consultation with community members like you :)
18:39:10 <Jan_Ainali> Oh DanielK_WMDE that brings up another interesting question: If I watch a page on Wikipedia where an image is used, will I see in my watchlist edits on Wikidata that affects the image?
18:39:26 <Steinsplitter> fabriceflorin: :)
18:39:39 <Jan_Ainali> missed a comma between watchlist and edits
18:40:04 <DanielK_WMDE> Jan_Ainali: no, because the integration of commons with wikipedias isn't that sophisticated :)
18:40:26 <darkweasel> btw, an idea – you could also try to change the concept of filenames – right now they can be only in one language, and the unique identifier needs to be the same thing as the <title>
18:40:29 <Jan_Ainali> DanielK_WMDE: I might submit a feature request ;)
18:40:33 <DanielK_WMDE> but such edits shouldn't have an impact on what you see on wikipedia anyway
18:40:48 <marktraceur> darkweasel: There's a patch for the backend in core that will use a SHA1 for the unique ID on the backend
18:40:52 <darkweasel> it might be a good idea to have a separate "title" attribute that's presented to the user in most places (except when they want to insert the file somewhere)
18:40:53 <marktraceur> But that won't change the display
18:41:30 <DanielK_WMDE> darkweasel: moving away from filenames for *referring* to an image is going to be hard. it's something that would be nice, but it's not something i want to tie to the structured data project
18:41:49 <DanielK_WMDE> darkweasel: however, images will have localizable labels/titles, that can be used in listings, for captions, etc
18:41:54 <Jheald> Q: There are already quite sophisticated models on Wikidata for eg scans from books, distinguishing a "manifestation" (the scan) from an "edition" (a particular variant) from a "work" (the underlying book). Are these models on your radar?
18:42:14 <fabriceflorin> darkweasel: Glad you are thinking along the lines of a translateable “title”, which is high on the priority list. The actual file name will always be preserved, but we could surface the title more often.
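The idea of a localizable title shown in place of the file name can be sketched as a lookup with fallback. This is a minimal sketch under assumptions: `display_title`, the `labels` mapping, and the language-preference list are hypothetical and not an existing MediaWiki or Wikibase function. The file name stays the identifier and is used only when no label exists in any preferred language.

```python
def display_title(filename, labels, languages):
    """Return the first available label in the user's language order, else the file name.

    labels:    mapping of language code to localized title (hypothetical)
    languages: the reader's preferred language codes, most preferred first
    """
    for lang in languages:
        if lang in labels:
            return labels[lang]
    return filename


labels = {"en": "Example title", "de": "Beispieltitel"}
shown_de = display_title("IMG45504.jpg", labels, ["de", "en"])
shown_fr = display_title("IMG45504.jpg", labels, ["fr", "en"])
shown_none = display_title("IMG45504.jpg", {}, ["fr"])
```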
18:43:00 <marktraceur> fabriceflorin: I don't know about "always", it could be changed maybe in the future, but it's not currently on our roadmap I think
18:43:02 <darkweasel> ah, that's good, so basically the answer is "yes" – i hope that will reduce the need to rename files that some people feel far too often now on commons :)
18:43:58 <Lydia_WMDE> Jheald: it will be largely up to the community but there are some things we need to settle on to build the tools we'll be working on. we'll get into those detailed discussions over the next weeks and months
18:44:06 <Dereckson> darkweasel: the rename on Commons is limited to precise cases, like misspelling or non informative name like IMG45504.jpg / I'm not sure how a good title could eliminate these rename needs.
18:44:18 <Lydia_WMDE> with the community
18:44:56 <fabriceflorin> marktraceur: Good point. From a product standpoint, it would be good if we started using clear titles more, and have the file names become less important. It would seem useful if over time the file names could be simply alphanumeric, without burdening them with having to include content titles as well.
18:44:57 <darkweasel> Dereckson, well if no one ever sees the filename anymore except when inserting it into an article, then i can't see much of a reason why it shouldn't be IMG45504.jpg
18:45:32 <DanielK_WMDE> darkweasel: though it's still nice to have "sensible" file names when reading wikitext
18:45:36 <fabriceflorin> Are there some important questions we haven’t addressed yet?
18:45:38 <Dereckson> darkweasel: to be able to read the text of the article using the media and get meaningful information, for example
18:45:39 <dennyvrandecic> darkweasel: and the insertion into an article should be invisible thanks to VE
18:45:44 <marktraceur> darkweasel: It will help when we can figure out a way to add images on-demand in an edit window instead of having to use the file name - see e.g. VE's image insert dialog
18:45:45 <DanielK_WMDE> otoh, once we no longer need to read wikitext...
18:46:01 <dennyvrandecic> exactly
18:46:08 <darkweasel> yeah, that's more a matter of commons policy than of structured data
18:46:17 <darkweasel> so not really your concern :)
18:46:26 <DanielK_WMDE> it's our concern to make it possible :)
18:46:44 <Jheald> Lydia_WMDE: Probably important to think what different dates you want to hold on CommonsData -- and any other things you may want to sort a selection by
18:46:48 <moogsi> fabriceflorin: we haven't really talked about topics vs categories, but i think they are fuzzy enough that there may be no intelligible questions :)
18:46:48 <Jan_Ainali> Perhaps I missed, will "topics" be Wikidata items or Wikibase Items on Commons?
18:46:54 <Lydia_WMDE> Jheald: yeah totally
18:46:59 <moogsi> what is a 'topic'?
18:47:12 <DanielK_WMDE> moogsi: whatever has a Q-number on wikidata.
18:47:25 <multichill> For example Berlin is a topic.
18:47:28 <Lydia_WMDE> Jan_Ainali: the former
18:47:33 <Jan_Ainali> Awesome!
18:47:45 <Lydia_WMDE> because then we can reuse label translations and so on and so forth
18:47:55 <moogsi> DanielK_WMDE: that makes i18n much easier
18:47:58 <Lydia_WMDE> get additional data, links to wikipedia and so on
18:48:01 <DanielK_WMDE> moogsi: indeed
18:48:04 <fabriceflorin> moogsi: I think it is best to think of categories and topics as co-existing, rather than an either/or approach: both have their value, and can play a useful role in our multimedia ecosystem.
18:48:09 <dennyvrandecic> in my understanding, categories basically won't change. everything the community is doing now with categories, they will continue to be able to do.
18:48:24 <Lydia_WMDE> yeah
18:48:25 <Jheald> Will topics also have an (optional) property, to indicate how they relate to the file (eg painter, movement, depicted subject, etc) ?
18:48:30 <moogsi> i have no love for categories and have always found them disastrously unfit for purpose
18:48:42 <darkweasel> but i hope it will be possible to filter even categories by date, filesize, license etc.
18:48:52 <dennyvrandecic> darkweasel: just as now
18:49:13 <darkweasel> so no improvements planned there? because that's one thing i was looking forward to
18:49:13 <dennyvrandecic> darkweasel: this proposal is not about improving this part, in my understanding
18:49:19 <dschwen> if topics are generic wikidata items then the info Jheald wants is already there
18:49:29 <Lydia_WMDE> Jheald: possibly. or they should be using different properties maybe?
18:49:37 <multichill> And don't forget that categories and topics can be connected!
18:49:39 <dschwen> what we cannot do (i believe) is adding prepositions to topics
18:49:45 <fabriceflorin> I would encourage everyone to take a long-term perspective with this initiative: it will take some time for our ecosystem to evolve, and we will not be able to do everything we want all at once. But by starting with small experiments, we can measure and validate our assumptions, learn from any small mistakes, and gradually build a world-class multimedia system :)
18:50:08 <DanielK_WMDE> darkweasel: we have plans for querying/searching by those things. how these searches would integrate categories is an open question. but with the new cirrus search, that should not be a problem
18:50:13 * multichill looks forward to the day he can shutdown https://commons.wikimedia.org/wiki/User:CategorizationBot
18:50:15 <DanielK_WMDE> perhaps even including subcategories
18:50:26 <darkweasel> ok that sounds better
18:50:39 <Keegan> multichill: I'm looking forward to that day too. I hate that bot :)
18:50:40 <Jan_Ainali> This will enable a sort of a backwards lookup of {{objectlocation}} through coordinates on wikidata for images on commons without coordinates attached. That will make all map lovers happy :D
18:50:44 <Jheald> dschwen: it would need to be on CommonsData, because digging things out of Wikidata is hard, especially if you don't know a-priori whether or not they should be there
18:51:06 <thedjNotWMF> as I mentioned at Wikimania, let us not forget how long it took us to add an {{tl|information}} template to every file. Now we have 100x more files, but it is basically a similar problem. we can't do it in one go. too much work.
18:51:20 <Steinsplitter> *away* now, thanks again for replying to my questions. It is great to work with people like you on wikimedia.
18:51:24 <thedjNotWMF> so a gradual approach is a given. there is no other way
18:51:24 <manybubbles> cirrus will grow deep category knowledge at some point. I'm not sure when.
18:51:25 <multichill> Jan_Ainali: First step would be to move the coordinates from the template to CommonsData and just have {{objectlocation}} show those
18:51:33 <fabriceflorin> What experiments and prototypes do you think we should work on first? Would it be useful to start with a simple field, like location or creation date, and try to get it to work with structured data?
18:51:34 <darkweasel> there are still files without information templates – just saying
18:51:46 <Jheald> If you can refine selections, it should be equally possible to refine the contents of categories, just treating the category as an initial selection
18:51:55 <Keegan> thedjNotWMF: information templates are still missing! I added some the other day to really old files
18:52:01 <Jan_Ainali> multichill: yes, that is a nice task for a bot!
18:52:29 <DanielK_WMDE> manybubbles: johannes is eager to make it happen :)
18:52:33 <multichill> Jan_Ainali: Take a look at https://www.wikidata.org/wiki/Wikidata:Coordinates_tracking , we're already doing it for Wikipedia
18:52:42 <darkweasel> fabriceflorin, yeah, that sounds good – location is probably a good start, date= is a too versatile parameter
18:52:53 <Jan_Ainali> multichill: Or I think you misunderstood me. I was talking about files without any location at all
18:53:00 <multichill> Right
18:53:10 <fabriceflorin> We have 8 minutes left: any final topics you would like to discuss?
18:53:14 <Lydia_WMDE> alright. we have about 8 mins left
18:53:18 <Lydia_WMDE> any big remaining questions?
18:53:24 <Scott_WUaS> fabriceflorin: Where does Structured Data / Wikidata and possible future use cases stand in terms of planning - see https://meta.wikimedia.org/wiki/Wikidata/Notes/Future - vis-a-vis WUaS. (We're already seeing many interesting cases using Wikidata/Wikibase emerge such as http://wikiba.se/). In talking with you about this further, could I develop a weekly or biweekly office hour for projects interested in possible
18:53:25 <Scott_WUaS> future use cases? Does such an office hour already exist? May I possibly please email you about this?
18:53:28 <multichill> That should be possible Jan_Ainali, I'm already doing that for monuments
18:53:36 * JeanFred shouts out to dschwen FastCCI.
18:53:49 <Jan_Ainali> multichill: You are awesome!
18:54:15 <Scott_WUaS> and Lydia_WMDE
18:54:16 <thedjNotWMF> i'm also wondering what people would find a good starting point in terms of data...
18:54:16 <Jheald> Q that I asked above, re the Commons:Structured Data talk page. Is what I wrote there the kind of thing you are looking for ? Will you be commenting back ?
18:55:10 <susannaanas> I would like to invite anyone interested to advise and discuss map metadata https://docs.google.com/spreadsheets/d/1Hn8VQ1rBgXj3avkUktjychEhluLQQJl5v6WRlI0LJho/edit#gid=0 and move the discussion where it belongs
18:55:14 <Lydia_WMDE> Scott_WUaS: totally. you're always welcome to email me :)
18:55:27 <Keegan> Jheald: I found it useful, I know everyone read it. I'm sorry there aren't any comments back yet - things are still spinning in the air, as you know :)
18:55:28 <thedjNotWMF> Jheald: yes, it is very valuable
18:55:49 <Jan_Ainali> susannaanas: Make it a wiki page, that is where it belongs ;)
18:56:02 <Jheald> Oh and can we confirm that this will be on a Wikibase? I was confused by how that relates to DanielK's earlier proposal for a special File Info page, with its own ad-hoc syntax ?
18:56:05 <Keegan> Five minute warning!
18:56:06 <Lydia_WMDE> Jheald: yes that is very helpful
18:56:21 <Keegan> Of course the conversation can continue after that, but it'll be off hours
18:56:28 <susannaanas> Jan_Ainali: :)
18:56:33 <fabriceflorin> Jheald: Thanks for contributing to our discussion. We are reviewing your many comments and will respond in coming weeks. We also invite more people to add comments on that page: https://commons.wikimedia.org/wiki/Commons_talk:Structured_data
18:56:33 <thedjNotWMF> Jheald: if i have one thing to comment on that, it's that i would like to see a bit more of what you value in terms of priority and for which reasons perhaps.
18:56:48 <Jheald> Keegan: Besides, I imagine there are little whirlpools like the MediaViewer to attend to some of the time...
18:56:55 <marktraceur> "little"
18:56:58 <DanielK_WMDE> Jheald: ad hoc syntax? no. The wikibase software will be installed on commons. structured data about files will be in a special namespace. normal file description pages can access that data (probably mostly via Lua in templates)
18:57:08 * dschwen raises his arms 5 mins late. Thx JeanFred
18:57:13 <Keegan> ...>_>
18:57:16 <Keegan> Never heard of it.
18:57:53 <DanielK_WMDE> Jheald: i hope that clears up any confusion i might have created
18:57:56 <fabriceflorin> Hehe. Yes, MV has kept us busy. But we can’t wait to start on Structured Data, which many contributors have told us is the most important thing our team could be working on in coming years. :)
18:58:40 <Keegan> Well, thanks for coming all. Very useful, I believe
18:58:42 <Jheald> DanielK: sure, & the special namespace isn't editable wikitext, so for an end user like me, it shouldn't really matter how the data is physically arranged.
18:58:53 <DanielK_WMDE> indeed.
18:58:58 <DanielK_WMDE> same as on wikidata
18:58:58 <fabriceflorin> Thank you all so much for joining this chat. It’s such a pleasure to be working with smart, constructive collaborators like you. I look forward to more collaborations with you all in the future!
18:59:04 <Jan_Ainali> fabriceflorin: If I as a contributor can chip in, please don't make us wait years!
18:59:09 <Lydia_WMDE> yes indeed! thank you so much for coming
18:59:11 <Keegan> We'll have another one of these in the near future
18:59:28 <Keegan> #endmeeting Structured Data