Make `haswbstatement:` work for the EntitySchema property
Closed, Resolved | Public | 1 Estimated Story Points

Description

As a user I want to search what statements are being made using an EntitySchema value in order to see what class of Items are governed by an EntitySchema.

Problem:
Currently, users are unable to search for statements that are made using the EntitySchema property P12861. Searching for haswbstatement:P12861 returns zero results.

We need to make sure this works so that users can see which statements are being made using this property and, in turn, which classes of Items are governed by EntitySchemas.

This can currently be done using "What links here" on the property page, but ideally we would like both options to be available for this kind of search.

Acceptance criteria:

  • Searches for statements using P12861 can be made with haswbstatement:

Notes

While testing the deployment ^, I noticed that this doesn’t actually affect entities with EntitySchema-type statements (e.g. Q5); it only affects entities with other statements (e.g. Item-type statements, though probably some other types too) that had an EntitySchema qualifier. Which in practice was probably only P12861 (until I reproduced the situation on the sandbox item to verify the fix).

I think this is actually a hint that we ought to add entity-schema to the $wgWBRepoSettings['searchIndexTypes'] in the production config, so that EntitySchema statements are indexed? @Lydia_Pintscher or @Arian_Bozorg: I assume we want users to be able to search for haswbstatement:P12861=E10?
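To make the two search forms discussed here concrete, below is a minimal Python sketch that builds the corresponding search requests. It assumes the standard MediaWiki search API (`action=query&list=search`) on www.wikidata.org; the helper names `haswbstatement_keyword` and `search_url` are made up for illustration and are not part of any library.

```python
from urllib.parse import urlencode

# Assumption: the public Wikidata API endpoint.
API_URL = "https://www.wikidata.org/w/api.php"


def haswbstatement_keyword(property_id, value_id=None):
    """Build a haswbstatement: keyword, with or without a specific value."""
    if value_id is None:
        # Matches any entity that has a statement with this property.
        return f"haswbstatement:{property_id}"
    # Matches only statements with this exact value, e.g. P12861=E10.
    return f"haswbstatement:{property_id}={value_id}"


def search_url(keyword, limit=10):
    """Build a full-text search request URL for the given keyword."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": keyword,
        "srlimit": limit,
        "format": "json",
    }
    return f"{API_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(search_url(haswbstatement_keyword("P12861")))
    print(search_url(haswbstatement_keyword("P12861", "E10")))
```

The sketch only constructs the URLs; whether the second form returns results depends on EntitySchema values actually being indexed, which is the open question above.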

Event Timeline

Change #1052699 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[operations/mediawiki-config@master] Add entity-schema to $wgWBRepoSettings['searchIndexTypes']

https://gerrit.wikimedia.org/r/1052699

I think the above change should work, but I’d love for someone from Discovery-Search to take a look and see if it makes sense. I can only partially test it locally – I can see that EntitySchema IDs start to show up in the statement_keywords of action=query&prop=cirrusbuilddoc API output, but I don’t actually have Elasticsearch installed locally, so I don’t know if the search works.
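For anyone who wants to repeat the partial check described above, here is a rough Python sketch that pulls EntitySchema-valued entries out of a parsed `prop=cirrusbuilddoc` response. The response shape (a `pages` list as with `formatversion=2`, each page carrying a `cirrusbuilddoc` object with a `statement_keywords` list of `P…=value` strings) is an assumption on my part, and the sample data is hypothetical, not real API output.

```python
import re

# Assumption: EntitySchema statement keywords look like "P12861=E10".
ENTITY_SCHEMA_VALUE = re.compile(r"^P\d+=E\d+$")


def entity_schema_keywords(response):
    """Return statement keywords whose value is an EntitySchema ID."""
    found = []
    # Assumed shape: {"query": {"pages": [{"cirrusbuilddoc": {...}}, ...]}}
    for page in response.get("query", {}).get("pages", []):
        doc = page.get("cirrusbuilddoc", {})
        for keyword in doc.get("statement_keywords", []):
            if ENTITY_SCHEMA_VALUE.match(keyword):
                found.append(keyword)
    return found


# Hypothetical response for illustration only.
sample_response = {
    "query": {
        "pages": [
            {
                "title": "Q5",
                "cirrusbuilddoc": {
                    "statement_keywords": ["P31=Q55983715", "P12861=E10"],
                },
            }
        ]
    }
}

if __name__ == "__main__":
    print(entity_schema_keywords(sample_response))
```

An empty result before the config change and a non-empty one after it would match the local observation above, though it says nothing about whether the search backend itself picks the values up.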

Gehel set the point value for this task to 1. (Jul 8 2024, 3:50 PM)
Gehel subscribed.

The patch from @Lucas_Werkmeister_WMDE seems good, we'll merge it.

It will take ~8 weeks to reindex all entities; after that, the results will be fully available. New edits will be searchable after the usual ~10-minute delay. If needed, we could identify all entities having an EntitySchema property and reindex those manually, but this requires some work and instrumentation on our side. Let us know if that's important enough, or if we can live with new edits being processed correctly and an 8-week wait for full coherence.

If needed, we could identify all entities having an EntitySchema property and reindex those manually, but this requires some work and instrumentation on our side.

The data type is brand new, so this should be a pretty small set of entities. There’s only one property with this data type at the moment (list), and there are currently fewer than 500 entities linking to it; so if the hard part is just identifying the entities, that hopefully helps :) but if the hard part is reindexing a list of specific entities, then my guess would be that we can wait for the 8 weeks to pick up any stragglers that aren’t being edited anyway (unless @Arian_Bozorg disagrees).
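If identifying the entities is the part that matters, one possible approach is to page through the "What links here" data via the API. The sketch below assumes the standard MediaWiki `list=backlinks` module with its usual `continue` mechanism; `backlink_params`, `collect_backlinks`, and the `fetch` callable (anything that takes a params dict and returns the parsed JSON response) are hypothetical names for illustration.

```python
def backlink_params(property_id, limit="max"):
    """Base request parameters for pages linking to the property page."""
    return {
        "action": "query",
        "list": "backlinks",
        "bltitle": f"Property:{property_id}",
        "bllimit": limit,
        "format": "json",
    }


def collect_backlinks(property_id, fetch):
    """Collect titles of all pages linking to the property.

    `fetch` is a caller-supplied function (hypothetical here) that sends
    the params to the API and returns the decoded JSON response.
    """
    params = backlink_params(property_id)
    titles = []
    while True:
        response = fetch(params)
        titles.extend(link["title"] for link in response["query"]["backlinks"])
        if "continue" not in response:
            return titles
        # Merge the continuation values into the next request.
        params = {**backlink_params(property_id), **response["continue"]}
```

With fewer than 500 linking entities, this should finish in a single request or two. Note that backlinks include any page linking to the property, not only entities with matching statements, so the list is an upper bound.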

That sounds like a plan, thanks so much :)

Change #1052699 merged by jenkins-bot:

[operations/mediawiki-config@master] Add entity-schema to $wgWBRepoSettings['searchIndexTypes']

https://gerrit.wikimedia.org/r/1052699

Mentioned in SAL (#wikimedia-operations) [2024-07-15T13:02:33Z] <logmsgbot> lucaswerkmeister-wmde@deploy1002 Started scap sync-world: Backport for [[gerrit:1052699|Add entity-schema to $wgWBRepoSettings['searchIndexTypes'] (T369495)]]

Mentioned in SAL (#wikimedia-operations) [2024-07-15T13:15:39Z] <logmsgbot> lucaswerkmeister-wmde@deploy1002 lucaswerkmeister-wmde: Backport for [[gerrit:1052699|Add entity-schema to $wgWBRepoSettings['searchIndexTypes'] (T369495)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)

Mentioned in SAL (#wikimedia-operations) [2024-07-15T13:33:24Z] <logmsgbot> lucaswerkmeister-wmde@deploy1002 Finished scap: Backport for [[gerrit:1052699|Add entity-schema to $wgWBRepoSettings['searchIndexTypes'] (T369495)]] (duration: 30m 51s)

Here are all the potentially affected pages, if it helps: P66482

For the sandbox item, haswbstatement:P12886=E123 seems to work \o/

(Note that it’ll stop working as soon as someone resets the sandbox again ^^)

Gehel claimed this task.