Shortcuts: WD:RAQ, w.wiki/LX

Wikidata:Request a query

P549 duplication

Hi guys !

It's been a while since I last bugged you with my requests.

I've been linking astronomer items to the items for their scientific articles for a while now, and I noticed that several items with a Mathematics Genealogy Project ID (P549) are duplicates of existing items (example: 1 and 2). I've been trying to identify these duplicates automatically for some time but, of course, my query attempts time out (see below), with or without limits, with or without more identifiers for the non-P549 items, etc.

Do you have any ideas on what could be done?

SELECT DISTINCT ?item1 ?item2 ?l1 ?l2
WHERE
{
  ?item1 wdt:P31 wd:Q5 ;
         wdt:P549 [] ;
         wikibase:identifiers 1 ;
         rdfs:label ?l1 .
  ?item2 wdt:P31 wd:Q5 ;
         wdt:P496 [] ;
         rdfs:label ?l2 .
  FILTER(LANG(?l1) IN ("en")).
  FILTER(LANG(?l2) IN ("en")).
  MINUS { ?item2 wdt:P549 []. } .
  FILTER(?l1 = ?l2).
}

Simon Villeneuve (talk) 12:07, 5 December 2024 (UTC)

@Simon Villeneuve: It's simpler to proceed in two steps: start by listing the items that have homonyms, then find the homonyms for each of them. This finds about 8,000 names, some with dozens of homonyms:
SELECT distinct ?item1 ?l1 WHERE {
  ?item1 wdt:P31 wd:Q5;
         wdt:P549 _:b101;
         wikibase:identifiers 1 ;
         rdfs:label ?l1.

    ?itemHomonym wdt:P31 wd:Q5;
           wdt:P496 _:b102;
           rdfs:label ?l1.
    FILTER((STR(?itemHomonym)) > (STR(?item1)))
    MINUS { ?itemHomonym wdt:P549 _:b103. }

  FILTER(LANG(?l1) IN("en"))
}
There is no need to filter on "?l1 = ?l2"; you can simply reuse "?l1" in both graph patterns.
I'm preparing a notebook to help with the next steps if needed. author  TomT0m / talk page 15:26, 5 December 2024 (UTC)
Hi,
Thanks!
This is the first time I've seen a notation like _:b101. Why do you use it rather than square brackets ([])?
Also, by adding ?itemHomonym to the selected variables, I get 55,000 results.
Do you think these possible homonyms deserve to be listed somewhere? If so, where? Simon Villeneuve (talk) 16:24, 5 December 2024 (UTC)
Hi, I made a notebook: https://public-paws.wmcloud.org/User:TomT0m/Nobels/Duplicate_mathematicians.ipynb (not everything is listed: there are too many, and displaying everything doesn't work; you have to recompute it in your own user space and remove what I added to show only the beginning and the end). See the table at the end of the notebook; it might be necessary to
The labelled syntax for blank nodes is for naming the nodes, and it's only there because I clicked on the diamond to format the query in the editor; it has no particular significance. It may make it possible to identify a blank node in the query in order to reuse it, I think (?). author  TomT0m / talk page 19:22, 5 December 2024 (UTC)
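To illustrate the two blank-node notations discussed above: in a SPARQL pattern, [] and a labelled blank node such as _:b101 both behave like an unnamed variable that matches any value and cannot be selected. A minimal sketch, where the label _:anyOrcid is an arbitrary, made-up name:

```sparql
SELECT ?item WHERE {
  ?item wdt:P31 wd:Q5 ;
        wdt:P549 [] ;           # anonymous blank node: "has some P549 value"
        wdt:P496 _:anyOrcid .   # labelled blank node: same meaning, the label is arbitrary
}
LIMIT 10
```

A label only matters when the same blank node has to appear more than once within a single basic graph pattern; otherwise the two forms are interchangeable.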
@TomT0m: Hi,
I'm bothered by the line FILTER((STR(?itemHomonym)) > (STR(?item1))) in the query. The item numbers of the duplicate Mathematics Genealogy Project items are almost always higher than those of the itemHomonym items, whereas this line keeps the itemHomonym numbers that are higher than those of the MGP items. I rewrote the query to display the STR values and that is indeed what happens. Simon Villeneuve (talk) 13:07, 15 December 2024 (UTC)
P.S.: I imagine this is because the query reads the digits of the item number from left to right starting at the Q, and as soon as it finds a larger digit on one side, it ranks that identifier as larger than the other even though the number as a whole is smaller. For example, Q102441089 is considered smaller than Q39183791 because the first digit after the Q is 1 for the former and 3 for the latter. I deduce this from reversing the direction of the < in the query.
If that's it, then there seems to me to be a fundamental flaw in the function. Simon Villeneuve (talk) 13:13, 15 December 2024 (UTC)
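The behaviour described above is ordinary string comparison rather than a defect: STR() returns the full entity IRI as text, and text is compared character by character. A sketch of a numeric comparison, assuming the standard http://www.wikidata.org/entity/Q… IRI form (the variable names ?stringOrder and ?numericOrder are made up for the illustration):

```sparql
SELECT ?a ?b ?stringOrder ?numericOrder WHERE {
  VALUES (?a ?b) { (wd:Q102441089 wd:Q39183791) }
  # Text comparison: decided by the first differing character after the "Q"
  BIND((STR(?a) < STR(?b)) AS ?stringOrder)
  # Numeric comparison: strip everything up to the "Q" and cast the rest to an integer
  BIND((xsd:integer(STRAFTER(STR(?a), "/entity/Q")) <
        xsd:integer(STRAFTER(STR(?b), "/entity/Q"))) AS ?numericOrder)
}
```

For this pair, ?stringOrder should come out true and ?numericOrder false. In the homonym query, the same xsd:integer(STRAFTER(...)) expression could replace the STR() calls inside the FILTER.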
@Simon Villeneuve Ah yes, I think I got completely mixed up about what you were expecting. In the notebook the "less than" comparison serves no purpose anyway (I removed it from the notebook). In any case it doesn't matter much for the notebook: the idea was to proceed in two steps and keep only a single "witness" item per name, to avoid a combinatorial explosion. If you join on the name, then for names with many homonyms (such as some Chinese names with 50 candidates on one side and 50 matching items on the other) you end up with 2,500 pairs, which blows up the cardinality. So the idea is to keep only one item per name (I now keep the candidate duplicate item; that may be where it went wrong, but it potentially didn't matter as long as the associated name was kept).
To be sure, you would like something like "For each name for which there is an item whose only identifier is a Mathematics Genealogy Project identifier, list all homonymous mathematicians to check whether there is a duplicate"?
If so, after one or two corrections I think (I'm not yet sure there are no remaining problems) that the notebook should now roughly match the need. If you click on a name in the table at the bottom, such as Kaushik Das (Q102362383), it has a single MGP identifier and 3 candidates are listed for "the original":
That gives:
Mustafa Cengiz 3 1 2 3
author  TomT0m / talk page 13:28, 21 December 2024 (UTC)
(It's nice that we can now copy and paste an HTML table and it works! What's less nice is that it's incompatible with the reply tool for now, so I had to reformat it as wikicode.) author  TomT0m / talk page 13:31, 21 December 2024 (UTC)
There is also a major limitation with the public notebook: if you try to display the whole table in the notebook, it is too large and it becomes impossible to save the page to export it publicly. It is therefore currently displayed truncated (earlier it was the co
If the smallest ones get resolved and there are false positives, then to avoid cluttering the rows we will have to handle them, by adding identifiers where appropriate or, at worst, by handling "different from" statements in the queries to exclude them. We'll see with use and depending on what we want to do with all this. author  TomT0m / talk page 13:47, 21 December 2024 (UTC)
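The "one witness item per name" idea described above can be sketched directly in SPARQL by grouping on the label instead of returning every pair. This is only an illustration, not the notebook's code; the names ?name, ?witness and ?candidates are made up here, and on the public endpoint the query may still time out:

```sparql
SELECT ?name (SAMPLE(?item1) AS ?witness) (COUNT(DISTINCT ?itemHomonym) AS ?candidates)
WHERE {
  # Items whose only identifier is a Mathematics Genealogy Project ID
  ?item1 wdt:P31 wd:Q5 ;
         wdt:P549 [] ;
         wikibase:identifiers 1 ;
         rdfs:label ?name .
  FILTER(LANG(?name) = "en")
  # Humans with the same English label and an ORCID but no MGP ID
  ?itemHomonym wdt:P31 wd:Q5 ;
               wdt:P496 [] ;
               rdfs:label ?name .
  MINUS { ?itemHomonym wdt:P549 [] }
}
GROUP BY ?name
ORDER BY DESC(?candidates)
```

Each name then yields one row, however many homonym pairs it would otherwise produce; the candidates for a given name can be fetched in a second, much smaller query.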

Adding located in the administrative territorial entity in a query

I have this query https://w.wiki/CLJV but I also need located in the administrative territorial entity (P131). Pmt (talk) 17:24, 7 December 2024 (UTC)

SELECT DISTINCT ?item ?itemLabel ?administration ?admLabel WHERE {
  {
    SELECT DISTINCT * WHERE {
      ?item p:P31 ?statement0.
      ?statement0 (ps:P31/(wdt:P279*)) wd:Q11315.
      ?item wdt:P17 wd:Q20;
            wdt:P131/wdt:P279* ?administration.
    }
    LIMIT 1000
  }
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE], en, sv, nb" .
    ?item rdfs:label ?itemLabel . ?item schema:description ?description .
    ?administration rdfs:label ?admLabel .
  }
}

Here you go. Regards, Piastu (talk) 15:24, 8 December 2024 (UTC)

@Pmt:
#title: shopping centers in Norway
#defaultView:Map,Table
SELECT distinct ?item ?itemLabel ?itemDescription ?image ?coord ?verwaltungseinheitLabel ?streetLabel ?zip ?adress ?website
WITH {
 SELECT distinct ?region                   # subquery for all administrative territorial entities
 WHERE {
 hint:Query hint:optimizer "None" .         
 ?region wdt:P131* wd:Q20.                 # set the administrative territorial entity: Norway or another
 MINUS { ?region wdt:P576 _:b0. }          # no dissolution date for the region
 }
 }
} AS %region
WHERE {
  INCLUDE %region.
  ?item wdt:P131 ?region.      
  ?item (wdt:P31/wdt:P279*) wd:Q11315 .    # is a shopping center (or subcategory)
  MINUS { ?item wdt:P582  _:b1.}           # no end time
  MINUS { ?item wdt:P576  _:b2.}           # no dissolved/demolished date
  MINUS { ?item wdt:P3999 _:b3.}           # no date of official closure
  MINUS { ?item wdt:P31 wd:Q19860854 }     # no destroyed buildings
  MINUS { ?item wdt:P31 wd:Q15893266 }     # no former entities
  OPTIONAL { ?item wdt:P18 ?image . }     
  OPTIONAL { ?item wdt:P625 ?coord. }  
  OPTIONAL { ?item wdt:P669 ?street. }  
  OPTIONAL { ?item wdt:P281 ?zip. }  
  OPTIONAL { ?item wdt:P6375 ?adress. }  
  OPTIONAL { ?item wdt:P856 ?website. }  
  OPTIONAL { ?item wdt:P131 ?verwaltungseinheit. }  
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
}
This is my solution for you. You can set the administrative territorial entity yourself. You can also change the focus (currently shopping centers; it could be cinemas or something else). Best regards --sk (talk) 15:51, 20 December 2024 (UTC)

Filter by state not valid any more (P582)

SELECT distinct ?id ?end
    WHERE {
    ?id wdt:P131* ?target.
    values ?target {wd:Q1741}
    ?id p:P131* ?t2 .
    minus {?t2 pq:P582 ?end . } # check if still (qualifier P582 to enddate)

    values ?id {wd:Q116955599} # filter for simpler test
  }

How can I get rid of all matches where the assignment to located in the administrative territorial entity (P131) = Vienna (Q1741) is no longer valid (checked via end time (P582))? In the case above the relevant match is the filtered Two Apostles Saints (Q116955599). Best --Herzi Pinki (talk) 12:15, 16 December 2024 (UTC)

SELECT distinct ?id ?end
    WHERE {
    ?id p:P131* ?statement.
    ?statement ps:P131* ?target .
    values ?target {wd:Q1741}
    OPTIONAL { ?statement pq:P582 ?end . }
    FILTER ( !bound(?end) )

    values ?id {wd:Q116955599} # filter for simpler test
  }
Learned more: this will do the required thing. Thanks --Herzi Pinki (talk) 22:23, 18 December 2024 (UTC)
The solution above works to filter out items with end time (P582), but it ignores the path down located in the administrative territorial entity (P131) and matches only items directly in ?target, not in sub-paths thereof. Still need help. --Herzi Pinki (talk) 15:32, 19 December 2024 (UTC)
@Herzi Pinki I'm not sure this is actually possible with arbitrarily long paths. This is the kind of case where it would be very convenient to have all "non-historical" locations at the "best rank" statements; it would be much easier.
It's easy to get all the statements with an end date on any path from ?id to ?target; this query works, for example: https://w.wiki/CUUE.
From there, there might not be anything that really works. We would like to express "there exists a path from ?id to ?target such that no ?statement along the path is one of the statements from the previous query". But here we have a problem. Say there is a path
"?id" P131=> "ancient location (with end date)" P131=> "something actual (without end date)" P131=> "Vienna"
If we apply the approach I took above, with an intermediary (sketch of query) ?id P131* ?intermediary . ?intermediary P131* "Vienna", we get two matches for ?intermediary in our result set: "ancient location" and "something actual". Filtering out the "end date" statements removes only one of the two, "ancient location", so we still have a matching path from ?id to "Vienna" even though one edge has an "end date".
We could remove all pairs "?id" "?target" for which there exists a path with an edge that has an end date, but that would remove too many results, as there could also be a path without such an edge!
I think there is a way if we put a bound on the number of steps, however, because then we can check every step by checking its statement. I'll try to propose something in the next few days. author  TomT0m / talk page 17:43, 20 December 2024 (UTC)
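As an illustration of the "easy" part mentioned above (listing the statements along a path that carry an end date), here is a sketch. It is not necessarily what the short link contains, and the variable names ?intermediary and ?next are made up:

```sparql
SELECT ?id ?intermediary ?statement ?next ?end WHERE {
  VALUES ?id { wd:Q116955599 }
  VALUES ?target { wd:Q1741 }
  ?id wdt:P131* ?intermediary .          # any node reachable from ?id (including ?id itself)
  ?intermediary p:P131 ?statement .      # one P131 edge, taken as a full statement
  ?statement ps:P131 ?next ;
             pq:P582 ?end .              # the edge carries an end time qualifier
  ?next wdt:P131* ?target .              # and the path continues up to the target
}
```

Note that the wdt:P131* steps only follow best-rank statements, while the p:/ps: step also sees normal-rank ones, so the result is an approximation of "all dated edges on some path".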
Thanks @TomT0m: for caring. Full of hope. --Herzi Pinki (talk) 20:47, 20 December 2024 (UTC)

@Herzi Pinki: I think I got somewhere. I created a template, {{Now localized into}}, that will hopefully help. It should work for paths that are "not too long", I think 4 or 5 steps by default or something like that; I'll update the doc. If that is not enough, there is a parameter to increase the max depth.

SELECT distinct ?id ?end {
      {
	?id p:P131 ?id_to_end_1_P131_2_stmt .
	?id_to_end_1_P131_2_stmt ps:P131 ?id_to_end_1_P131_2_val .
} .
{
	?id_to_end_1_P131_2_stmt wikibase:rank ?id_to_end_1_P131_2_stmt_rank .
	values ?id_to_end_1_P131_2_stmt_rank { wikibase:PreferredRank wikibase:NormalRank } . minus {
		?id_to_end_1_P131_2_stmt pq:P582 []
	}
} .
{
	{
		?id_to_end_1_P131_2_stmt ps:P131 ?end
	} union {
		{
			{
				?id_to_end_1_P131_2_val p:P131 ?id_to_end_2_P131_3_stmt .
				?id_to_end_2_P131_3_stmt ps:P131 ?id_to_end_2_P131_3_val .
			} .
			filter(?id_to_end_2_P131_3_stmt!=?id_to_end_1_P131_2_stmt) .
		} .
		{
			?id_to_end_2_P131_3_stmt wikibase:rank ?id_to_end_2_P131_3_stmt_rank .
			values ?id_to_end_2_P131_3_stmt_rank { wikibase:PreferredRank wikibase:NormalRank } . minus {
				?id_to_end_2_P131_3_stmt pq:P582 []
			}
		} .
		{
			{
				?id_to_end_2_P131_3_stmt ps:P131 ?end
			} union {
				{
					{
						?id_to_end_2_P131_3_val p:P131 ?id_to_end_3_P131_4_stmt .
						?id_to_end_3_P131_4_stmt ps:P131 ?id_to_end_3_P131_4_val .
					} .
					filter(?id_to_end_3_P131_4_stmt!=?id_to_end_2_P131_3_stmt && ?id_to_end_3_P131_4_stmt!=?id_to_end_1_P131_2_stmt) .
				} .
				{
					?id_to_end_3_P131_4_stmt wikibase:rank ?id_to_end_3_P131_4_stmt_rank .
					values ?id_to_end_3_P131_4_stmt_rank { wikibase:PreferredRank wikibase:NormalRank } . minus {
						?id_to_end_3_P131_4_stmt pq:P582 []
					}
				} .
				{
					{
						?id_to_end_3_P131_4_stmt ps:P131 ?end
					} union {
						{
							{
								?id_to_end_3_P131_4_val p:P131 ?id_to_end_4_P131_5_stmt .
								?id_to_end_4_P131_5_stmt ps:P131 ?id_to_end_4_P131_5_val .
							} .
							filter(?id_to_end_4_P131_5_stmt!=?id_to_end_3_P131_4_stmt && ?id_to_end_4_P131_5_stmt!=?id_to_end_2_P131_3_stmt && ?id_to_end_4_P131_5_stmt!=?id_to_end_1_P131_2_stmt) .
						} .
						?id_to_end_4_P131_5_stmt ps:P131 ?end .
						{
							?id_to_end_4_P131_5_stmt wikibase:rank ?id_to_end_4_P131_5_stmt_rank .
							values ?id_to_end_4_P131_5_stmt_rank { wikibase:PreferredRank wikibase:NormalRank } . minus {
								?id_to_end_4_P131_5_stmt pq:P582 []
							}
						} .
					}
				} .
			}
		} .
	}
} .
      values ?target {wd:Q1741} 
      values ?id {wd:Q116955599}
   }
Hope this fits! Please tell me. author  TomT0m / talk page 22:15, 21 December 2024 (UTC)

Thanks @TomT0m:. The above should be:

SELECT distinct ?id ?target{
      {{Now localized into|subject=?id|location=?target}}
      values ?target {wd:Q1741} 
      values ?id {wd:Q116955599}
   }
which does not match anything, as wd:Q116955599 has an end date set. Replacing the last values clause with something that does not have an end date set delivers just that object (e.g. wd:Q254476). This is OK: items with an end date are not shown and items without one are shown. But I set the filter on ?id only to simplify validating the results returned; I actually need all of them. And
SELECT distinct ?id {
      {{Now localized into|subject=?id|location=?target}}
      values ?target {wd:Q1741} 
      ?id wdt:P31/wdt:P279* wd:Q33506; # museums
   }
as well as
SELECT distinct ?id {
      {{Now localized into|subject=?id|location=?target}}
      values ?target {wd:Q1741} 
      ####?id wdt:P31/wdt:P279* wd:Q33506; # all
   }
both run into timeouts. Now, Vienna is huge and there might be limits. But it also times out for values ?target {wd:Q687402} (a small municipality).
Another note: the expansion of the partial SPARQL is not done during SPARQL execution, but during wiki parsing. It took me some time to understand that. As I wanted to integrate the filter for ?end into Kartographer SPARQL queries like de:Benutzer:Herzi Pinki/Vorlage:Wikidata Karte/Nachbarschaft, which need wiki parsing anyhow, this is not a big issue. But for the latter I would also need P131 to be passed as an argument, as there it is shares border with (P47).
I think here we are beyond the scalability of WD. Best --Herzi Pinki (talk) 07:46, 28 December 2024 (UTC)
@Herzi Pinki: We need to be smarter than the query service here, and help it a bit. A few tricks make the query work. This can be seen, for example, if you don't try to retrieve the museum subclasses: removing the /wdt:P279* makes the query run fast. The trick is to make sure it checks the current administrative location after, and only after, getting the museum items. I think that, as the museum class has quite a number of subclasses, adding the subclasses makes the query planner choose a wrong plan, together with the weird structure of my template's output. So it might be useful to turn the planner off and set the ordering ourselves with a hint.
We first have to retrieve the museums, and there is a solution for that: use the "around" service to search for all items with coordinates within, say, 10 kilometres of the coordinates of Vienna, then for each of them check whether it is a museum, and lastly check whether it is currently located in Vienna. What I came up with is something like this:
select distinct ?id {
  
hint:Query hint:optimizer "None".    
    
      values ?target {wd:Q1741} 
      ?target  wdt:P625 ?target_coord.
  SERVICE wikibase:around {
    ?id wdt:P625 ?museum_loc .
    bd:serviceParam wikibase:center ?target_coord .
    bd:serviceParam wikibase:radius "20" .
    bd:serviceParam wikibase:distance ?dist.
  }
  
      ?id wdt:P31/wdt:P279* wd:Q33506 
  {{now localized into|subject=?id|location=?target|depth=6}}
 }
This runs in less than 10 seconds in the case of Paris, for example. We might want to adjust the "around" radius if we can find a trick to determine the city's radius. This also assumes that all the museums have a coordinate location (P625) statement.
For the locations bordering the city, I'm not sure I understand what you want. Do you also want the museums in the bordering cities?
I have plans to generalize the template's approach a bit and make it work for any property path, like {{PropertyPathCheckAlong|source=?id|target=?target|pathP131/P47|criteria=no enddate}}, but this might take some time.
Anyway, depending on what you want to do, it might be better to just go "full around" service. author  TomT0m / talk page 10:43, 28 December 2024 (UTC)
@Herzi Pinki: Good news!
I fixed an issue in the code generating the query and this makes things much better! If there were loops in the "P131" path, this seems to have been catastrophic. I added conditions to the generated code to avoid looping.
And I think I also have a solution for the second query, the open question, by being a bit redundant:
select distinct ?id {
  
hint:Query hint:optimizer "None".    
    
      values ?target {wd:Q1741} 
      ?id wdt:P131* ?target .
      {{now localized into|subject=?id|location=?target}}
 }
With the fix, the same approach finally works after all, and it makes the call to the around service unnecessary.
And you don't need the {{Now localized into}} template to use the query, I think; you can just copy and paste the resulting query as if it were a hand-written one! The only part that probably needs to be changed on the fly is the "values ?target" in the generated code, I think. This can be "templatized" on your target wiki. author  TomT0m / talk page 15:49, 28 December 2024 (UTC)

Federated query with IDREF

I tried this federated query in many ways but I always get an error "Status Code=500, Status Line=SPARQL Request Failed, Response=Virtuoso S0002 Error SQ200: No table DB.DBA.SPARQL_BINDINGS_VIEW_C_0". Any ideas? I started from User:Difool/queries#Query_IdRef which works well.

SELECT DISTINCT ?item ?itemLabel ?start ?end ?idref ?idrefurl ?pref ?name
WHERE {
  ?item p:P39 ?st . ?st ps:P39 wd:Q130613200 .
  OPTIONAL { ?st pq:P580 ?sd } . BIND(YEAR(?sd) AS ?start)
  OPTIONAL { ?st pq:P582 ?ed } . BIND(YEAR(?ed) AS ?end)
  OPTIONAL { ?item wdt:P269 ?idref } .
  BIND(URI(CONCAT("http://www.idref.fr/",?idref,"/id")) AS ?idrefurl)
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" . }
  SERVICE <https://data.idref.fr/sparql> {
    SELECT DISTINCT ?idrefurl ?pref ?name WHERE {
      ?idrefurl rdf:type foaf:Person; skos:prefLabel ?pref; foaf:name ?name.
    }
  }
}

Thanks, Dìdaxis (talk) 11:22, 19 December 2024 (UTC)

@Dìdaxis: This one seems to work:
SELECT DISTINCT ?item ?itemLabel ?start ?end ?idref ?idrefurl ?pref ?name
WHERE {
  ?item p:P39 ?st . ?st ps:P39 wd:Q130613200 .
  OPTIONAL { ?st pq:P580 ?sd } . BIND(YEAR(?sd) AS ?start)
  OPTIONAL { ?st pq:P582 ?ed } . BIND(YEAR(?ed) AS ?end)
  OPTIONAL { 
    ?item wdt:P269 ?idref .
      
    BIND(URI(CONCAT("http://www.idref.fr/",?idref,"/id")) AS ?idrefurl)
    SERVICE <https://data.idref.fr/sparql> {
      ?idrefurl rdf:type foaf:Person; skos:prefLabel ?pref; foaf:name ?name.
    }
  } .

  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" . }

}
I wrapped the binding and the service call in the OPTIONAL because there is no need to try to fetch anything if the idref is not bound. The subquery inside the call is probably also not needed: because of how subqueries work, they probably compute the whole dataset before filtering with the external ?idrefurl (the dataflow goes from the inside to the outside, so the outer variable values may not be taken into account at first), so this is probably wildly inefficient anyway. Here you have a limited number of idrefs to fetch, so it is better to pass them to the service, I guess. The limit is not really needed, I guess. author  TomT0m / talk page 11:57, 19 December 2024 (UTC)
Addendum:
  • A subquery without a limit, when the service is called inside the OPTIONAL, also works fine; it's unneeded, but the idref server seems to understand that it does not need to compute the whole result set.
  • A subquery with a limit, when the service is called inside the OPTIONAL, takes more time and does not return any results. This is because the semantics are different: first it takes 100 (the limit) results, then it checks whether they match the ?idrefurl values passed to the service call. As they are unlikely to be correlated, it fails.
  • If the service call is outside the OPTIONAL and the BIND is inside the OPTIONAL, it also works, with or without a subquery, without a limit.
  • If the BIND is outside the OPTIONAL, it fails. The problem seems to arise when ?idref or ?idrefurl may be unbound. In practice, in our case, they should not be if the call is made after the OPTIONAL, but we cannot guarantee that it is! The service actually seems to be called before the OPTIONALs are computed if it is outside, so unbound values may be passed to the Virtuoso service, which seems to be impossible. Adding a Blazegraph hint to run it last makes it work, which confirms the issue.
author  TomT0m / talk page 12:23, 19 December 2024 (UTC)
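The last bullet mentions a Blazegraph hint without showing it. A possible shape is sketched below, on the assumption that hint:Prior hint:runLast applies to the SERVICE clause written immediately before it; this is a guess at the variant described, not the exact query that was tested. Here P269 is made mandatory so that ?idref is always bound:

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT DISTINCT ?item ?idref ?idrefurl ?pref ?name WHERE {
  ?item p:P39 ?st . ?st ps:P39 wd:Q130613200 .
  ?item wdt:P269 ?idref .                      # required here, so ?idref is never unbound
  BIND(URI(CONCAT("http://www.idref.fr/", ?idref, "/id")) AS ?idrefurl)
  SERVICE <https://data.idref.fr/sparql> {
    ?idrefurl rdf:type foaf:Person ; skos:prefLabel ?pref ; foaf:name ?name .
  }
  hint:Prior hint:runLast true .               # assumption: asks Blazegraph to evaluate the SERVICE join last
}
```

If the hint is honoured, the remote endpoint only receives ?idrefurl values that were already computed locally, which matches the explanation given for the error.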

Query all images for an element

I want to query all images for an element. In the example here, Q129029 has 2 images, but the query result has only 1 line, and a single image.

SELECT
  ?item ?wdimage
WHERE {
  { ?item wdt:P31 wd:Q977367 }
  { ?item wdt:P18 ?wdimage. }
  FILTER (?item = wd:Q129029).
}

Any ideas on what's missing here? Thanks in advance. Pruna.ar (talk) 12:29, 19 December 2024 (UTC)[reply]

@Pruna.ar: Yes. You are using the "truthy" statements, which include only best-rank statements and only give access to the main statement value. To get all the values you have to use the "full" forms, which also give access to the qualifiers, sources, ranks, ...
In the RDF export documentation, search for "claims" properties for the former and "statement" properties for the latter if you want more information.
I included the rank in the query for explanation:
SELECT
  ?item ?wdimage ?rank
WHERE {
   ?item wdt:P31 wd:Q977367 ;
         p:P18 [
           ps:P18 ?wdimage 
           ; wikibase:rank ?rank 
         ] .
  values ?item { wd:Q129029 } .
}
You can remove or comment out the ; wikibase:rank ?rank to drop the ranks. Here you have one "preferred" and one "normal" rank. By default, if there are preferred-rank statements, those are the ones shown by the "wdt:" properties, not the "normal" ones. If there are no "preferred" ranks, the "normal" ones are shown. The "deprecated" ones are never shown by default. See Rank.
To remove the "deprecated" ones from the previous query we can use a filter or a values clause:
SELECT
  ?item ?wdimage ?rank
WHERE {
   ?item wdt:P31 wd:Q977367 ;
         p:P18 [
           ps:P18 ?wdimage 
           ; wikibase:rank ?rank 
         ] .
  values ?item { wd:Q129029 } . # allowed value for ?item
  values ?rank { wikibase:PreferredRank wikibase:NormalRank } # allowed values for rank
  # or this would also work:
  # filter (?rank != wikibase:DeprecatedRank)
}
author  TomT0m / talk page 13:00, 19 December 2024 (UTC)
Awesome explanation @TomT0m! Thanks a lot, it solved my problem and also taught me something new :-) Pruna.ar (talk) 18:27, 20 December 2024 (UTC)

Deaths in 2024

Hello. I have this query https://w.wiki/AaND which looks at Olympians who died during the year. However, it doesn't seem to return people for whom only the month or year of death is known. For example, someone who died on 1 July 2024 will be shown, but if their death date is simply "July 2024" or "2024" they aren't returned. How do I include all of these individuals? Thank you. HelplessChild (talk) 09:24, 23 December 2024 (UTC)

Do we have deceased Olympians whose death date is simply "July 2024" or "2024"? Please show us an example. Doc Taxon (talk) 09:29, 23 December 2024 (UTC)
Hi. Here's a couple that just have the month/year of death for 2024:
I can't find one that's just "2024", but there could be ones that just have a death year and nothing else. Thank you. HelplessChild (talk) 10:41, 23 December 2024 (UTC)
@HelplessChild I've noticed a couple of problems with your SPARQL query. Firstly, since you're checking the date precision, the two FILTERs checking the precision would need to be FILTER(?precision_0 >= 9 ) and FILTER(?precision_1 >= 9 ) to include dates with only MM-YYYY or YYYY; see https://www.wikidata.org/wiki/Help:Dates for a list of precision values. Also, the FILTER checking the start and end dates should use >= and <= to catch the boundary values. Piecesofuk (talk) 12:56, 23 December 2024 (UTC)
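For readers without access to the original short link, a sketch of what a precision-aware version could look like follows. It is an illustration of the advice above and not the original query; the variable names ?node, ?date and ?precision are assumptions:

```sparql
SELECT DISTINCT ?item ?date ?precision WHERE {
  ?item wdt:P8286 [] ;                  # has an Olympedia people ID
        p:P570/psv:P570 ?node .         # full date-of-death value node
  ?node wikibase:timeValue ?date ;
        wikibase:timePrecision ?precision .
  FILTER(?precision >= 9)               # 9 = year, 10 = month, 11 = day
  FILTER(?date >= "2024-01-01T00:00:00Z"^^xsd:dateTime &&
         ?date <= "2024-12-31T23:59:59Z"^^xsd:dateTime)
}
```

Because p:P570 reads full statements, deprecated values are included as well; a rank filter could be added if that matters.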
I think a simpler way to write the query would be:
SELECT DISTINCT ?item ?itemLabel ?DOD ?OlympediaID WHERE {
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
  {
    SELECT DISTINCT ?item ?DOD ?OlympediaID WHERE {
      ?item wdt:P8286 ?OlympediaID.
      ?item wdt:P570 ?DOD. 
      FILTER (YEAR(?DOD)=2024)
    }
  }
}

Piecesofuk (talk) 12:58, 23 December 2024 (UTC)

Oh wow - that's much better! Thank you. :D HelplessChild (talk) 13:03, 23 December 2024 (UTC)
@Piecesofuk @HelplessChild Approaches like this can work well, but they fail in other cases, for example if you want to get everyone born on a certain date (month/day). In that kind of case the more effective approach is to enumerate all the relevant dates in a "values" clause, as in this query: all persons with an article on frwiki born on the 10th of May. Quite a long, but efficient, query. author  TomT0m / talk page 15:56, 23 December 2024 (UTC)
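The enumeration technique mentioned above can be sketched as follows. Only three arbitrarily chosen years are listed to keep the example short; a real query would list one value per year of interest, and this is not the linked query itself:

```sparql
SELECT ?item ?dob WHERE {
  # One explicit value per year: the service can look each date up directly
  VALUES ?dob {
    "1950-05-10T00:00:00Z"^^xsd:dateTime
    "1951-05-10T00:00:00Z"^^xsd:dateTime
    "1952-05-10T00:00:00Z"^^xsd:dateTime
  }
  ?item wdt:P569 ?dob .
}
LIMIT 100
```

The design choice is to trade query length for speed: exact-value lookups avoid computing MONTH() and DAY() over every date of birth in the database.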
Thanks Tom, but that sounds way too complex for my mind! HelplessChild (talk) 18:05, 23 December 2024 (UTC)

items of people who died before 2004 that has P2013 (Facebook username)

Hello, I recently stumbled upon the item of Friedrich Engels, who died in 1895, long before Zuckerberg was even born. To my surprise that item had P2013 (Facebook username), which of course can't be an official Facebook account. So I removed the claim from the item, and it got me wondering:

Is it possible to create a query that will find all the items of people who died before 2004 (the year Facebook launched) that have P2013 (Facebook username)?

If it is possible, I would be happy to have that kind of query built for me.

Benbaruch (talk) 10:23, 27 December 2024 (UTC)

@Benbaruch:
I think artists' rights holders may create a Facebook page for artists after their death, so it might not be appropriate to delete everything? Anyway, two options for you:
With the Wikidata query service:
select ?item ?id {
  ?item wdt:P570 ?date filter(?date < "2004-01-01"^^xsd:dateTime )
  ?item wdt:P2013 ?id . hint:Prior hint:runFirst true .
}
And with QLever, much faster: https://qlever.cs.uni-freiburg.de/wikidata/OKtL65 author  TomT0m / talk page 15:32, 27 December 2024 (UTC)
TomT0m, thanks a lot for the help. I certainly won't remove all of them. There are even some verified ones, but there are certainly some unofficial fan pages, and even pure nonsense, that in my opinion should be removed. Benbaruch (talk) 19:31, 27 December 2024 (UTC)

Search and replace descriptions for Toki Pona

I am very inexperienced in SPARQL. I wish to update several descriptions for the language Toki Pona (tok). For this, the query should select all items with a description in Toki Pona, filter for those that match a certain regex, and replace them with another description.

Below is a made-up example in English to demonstrate. This table represents Wikidata items with labels and descriptions; the regex to match for countries would be territory, (.*$). The query needs to match it and replace every such description with country in $1.

Label | Description
foo | bar
banana | fruit
United States | territory, North America
United Kingdom | territory, Europe
United Arab Emirates | territory, Asia

Please ask if my explanation made no sense. I hope someone can help. Juwan (talk) 23:37, 27 December 2024 (UTC)
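A sketch of the selection half of this request follows. The query service is read-only, so a query can only list the current and proposed descriptions; writing them back needs a separate editing tool. The pattern is the made-up English one from the example above and would have to be replaced by the real Toki Pona wording, and the restriction to instances of country (Q6256) is an assumption added to keep the query small:

```sparql
SELECT ?item ?old ?new WHERE {
  ?item wdt:P31 wd:Q6256 ;               # assumption: only look at countries
        schema:description ?old .
  FILTER(LANG(?old) = "tok")             # Toki Pona descriptions only
  FILTER(REGEX(STR(?old), "^territory, .*$"))
  # $1 refers to the first capture group, as in the request
  BIND(STRLANG(REPLACE(STR(?old), "^territory, (.*)$", "country in $1"), "tok") AS ?new)
}
LIMIT 100
```

Without some restriction on ?item, scanning every description in the database would very likely time out.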

Legislative body members by occupation

I wondered whether it would be possible to aggregate the level of education of the members of a legislative body by their occupation. I was thinking about this because, when I was compiling data on politicians, I noticed that the more members with a low level of education there are in a legislative body, the more likely it is that the country's system is dictatorial or moving towards dictatorship. One could run queries on this for historical legislatures (e.g. Nazi Germany /Q17856046/, Soviet Union /Q15628644/) and compare the results with countries today. Do you think there is a way?
Is it even conceivable to tell from an occupation what level of education a person has? (Examples: digger (Q1121576), chemist (Q593644), tailor (Q242468), lathe operator (Q28933489), literary historian (Q13570226), etc.) Pallor (talk) 15:46, 28 December 2024 (UTC)

Query for photographers who are not present in WP FR

Hello,

I would like to get the items in WD which have P31 = Q5 (human), P106 = Q33231 (photographer) and P569 before 1870 (date of birth), and which are not present in WP FR. Best regards. Jatayou (talk) 15:24, 1 January 2025 (UTC)
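One way this request could be written is sketched below, using the usual sitelink pattern (schema:about together with schema:isPartOf) to test for a French Wikipedia article. The variable names ?birth and ?article are arbitrary:

```sparql
SELECT ?item ?itemLabel ?birth WHERE {
  ?item wdt:P31 wd:Q5 ;                  # human
        wdt:P106 wd:Q33231 ;             # occupation: photographer
        wdt:P569 ?birth .                # date of birth
  FILTER(?birth < "1870-01-01T00:00:00Z"^^xsd:dateTime)
  # Keep only items without a sitelink to French Wikipedia
  FILTER NOT EXISTS {
    ?article schema:about ?item ;
             schema:isPartOf <https://fr.wikipedia.org/> .
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "fr,en". }
}
```

Items with several birth date statements appear once per date; adding DISTINCT or grouping by ?item would collapse them.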