SDoC TTL dumps are different enough from the Wikidata dumps that we need to adapt the process. The exact adaptations needed have yet to be discovered.
Acceptance criteria:
- dumps are munged correctly and can be loaded into Blazegraph
The munger should exclude rdf:type statements by default:
SELECT ?o { wd:M19705716 a ?o . }
returns:
schema:ImageObject schema:MediaObject wikibase:Mediainfo
A similar query on Wikidata does not return such statements.
I think that schema:ImageObject should be kept since we may have AudioObject and VideoObject
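To make the proposal concrete, here is a minimal sketch of the filtering rule in Python. It is not the actual munger code from wikidata/query/rdf; the function name `filter_type_statements`, the `KEPT_TYPES` set, and the entity URI in the usage note are assumptions made for illustration. It drops rdf:type statements by default but keeps the schema.org media types.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SCHEMA = "http://schema.org/"
WIKIBASE = "http://wikiba.se/ontology#"

# Assumption: the media types worth keeping, per the comment above.
KEPT_TYPES = frozenset({
    SCHEMA + "ImageObject",
    SCHEMA + "AudioObject",
    SCHEMA + "VideoObject",
})


def filter_type_statements(triples, kept_types=KEPT_TYPES):
    """Yield (subject, predicate, object) triples, skipping rdf:type
    statements whose object is not in kept_types."""
    for s, p, o in triples:
        if p == RDF_TYPE and o not in kept_types:
            continue  # excluded by default, like other rdf:type statements
        yield (s, p, o)


if __name__ == "__main__":
    # Hypothetical entity URI, used only for this example.
    entity = "https://commons.wikimedia.org/entity/M19705716"
    sample = [
        (entity, RDF_TYPE, SCHEMA + "ImageObject"),
        (entity, RDF_TYPE, SCHEMA + "MediaObject"),
        (entity, RDF_TYPE, WIKIBASE + "Mediainfo"),
    ]
    for triple in filter_type_statements(sample):
        print(triple)
```

With this rule, only the schema:ImageObject statement for the sample entity would survive; schema:MediaObject and wikibase:Mediainfo would be dropped.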
Change 616104 had a related patch set uploaded (by ZPapierski; owner: ZPapierski):
[wikidata/query/rdf@master] Small fixes to sdoc data reload
Change 616105 had a related patch set uploaded (by ZPapierski; owner: ZPapierski):
[wikidata/query/rdf@master] Allow usage of sdc prefixes
Change 616110 had a related patch set uploaded (by ZPapierski; owner: ZPapierski):
[operations/puppet@production] Use correct UriScheme in Blazegraph
Change 616104 merged by jenkins-bot:
[wikidata/query/rdf@master] Small fixes to sdoc data reload
It would be helpful if at least one of the rdf:type statements were retained, as they make it easy to select a subset of M-IDs for a query to work on:
SELECT ... WITH { SELECT ?file WHERE { ?file a schema:MediaObject } LIMIT 5000 } AS %files ...
@Jheald perhaps we don't need to keep both schema:MediaObject and wikibase:Mediainfo?
Change 616110 merged by Ryan Kemper:
[operations/puppet@production] [wcqs] use correct UriScheme in blazegraph
Change 616105 merged by jenkins-bot:
[wikidata/query/rdf@master] Allow usage of sdc prefixes