EMWCon Spring 2016/Provenance working group
The Provenance working group (Yingjie, BlaueBlüte) at EMWCon Spring 2016 developed ideas on tracking the provenance of data stored in a Semantic MediaWiki (SMW) installation and on how to make use of such metadata.
Goals
- Enable users to assess the trustworthiness of data in SMW.
- Capture the Where–When–Who of data to facilitate (content) management.
- Enable analysis of the history (lineage) of semantic data.
- Track the connection between "classes" and "instances" over time: has the definition of a "class" changed after it was instantiated?
Use Facets
- Define provenance metadata along with, e.g., property values.
- View provenance data alongside page display, query results, etc.
- Use provenance data in queries, e.g. to restrict queries based on trustworthiness.
Sources of provenance data
- wiki-internal
  - contributors
  - edit timestamps
- external
  - editor-provided, like references
- hybrid (?)
  - external ratings of contributors
  - external ratings of individual pages (e.g., page rank?)
Implementation Ideas
Strategies
- Amending SMW syntax
- Subobjects
Example
This shows how external provenance data could be defined:
'''[[Name::John Doe]]''' was born on [[Has birthdate::1978-03-12|ref=personal website|refurl=http://www.johndoe.me/]] and currently works for [[Has employer::NASA|ref=Humanity's Quest to Reach Mars, The New York Times, May 27, 2016|refbrief=New York Times|refdate=2016-05-27]].
...or:
'''[[Name::John Doe]]''' was born on {{Provenance template|property=Has birthdate|value=1978-03-12|ref=personal website|refurl=http://www.johndoe.me/}} and currently works for {{Provenance template|property=Has employer|value=NASA|ref=Humanity's Quest to Reach Mars, The New York Times, May 27, 2016|refbrief=New York Times|refdate=2016-05-27}}.
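...or, following the subobject strategy, with SMW's existing #subobject parser function. This is only a sketch: the property names Provenance property, Provenance value, Has reference and Has reference URL are placeholders chosen for illustration and were not agreed on by the working group.

```wikitext
'''[[Name::John Doe]]''' was born on [[Has birthdate::1978-03-12]].
<!-- Hypothetical provenance record for the birthdate statement above;
     all property names inside the subobject are illustrative assumptions. -->
{{#subobject:
 |Provenance property=Has birthdate
 |Provenance value=1978-03-12
 |Has reference=personal website
 |Has reference URL=http://www.johndoe.me/
}}
```

Unlike the two variants above, this needs no change to the SMW annotation syntax, at the cost of stating the value twice.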
A query could then look like this:
{{#ask: [[Category:Person]] [[Has employer::NASA]] |?Name |?Has birthdate |newer_than=2015-01-01 |show_unreliable=yes |show_provenance=yes }}
This query might then return something like:
| | Name | Date of Birth | Ref | Last Edit |
|---|---|---|---|---|
| John Doe | John Doe | March 12, 1978 [old!] | New York Times | 2 days ago |
| Jane Doe | missing | | | before January 1, 2015 |
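
The parameters newer_than, show_unreliable and show_provenance used above are proposals, not existing #ask options. If provenance were instead stored in subobjects, a comparable restriction could be approximated with standard query syntax. A minimal sketch, assuming hypothetical subobject properties Provenance property, Provenance value, Has reference and Has reference date (the last of type Date):

```wikitext
<!-- Sketch only: every property name below is an illustrative assumption. -->
{{#ask: [[Provenance property::Has employer]] [[Provenance value::NASA]] [[Has reference date::>2015-01-01]]
 |?Provenance value=Employer
 |?Has reference=Ref
 |?Has reference date=Reference date
 |format=table
}}
```

The result rows would be the subobjects themselves rather than the person pages, and whether the date bound is inclusive depends on the wiki's comparator configuration.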