Wikidata talk:WikiCite
Some scholarly article (Q13442814) statistics
As of 4 April 2023
- 38,911,011 – scholarly article items in all [1]
- 26,581,931 – scholarly article items without author (P50) and with author name string (P2093) = author name strings only [2]
- 1,111,985 – scholarly article items with author (P50) and without author name string (P2093) = all author name strings disambiguated to author Qids [3]
- 32,030,055 – scholarly article items with PubMed publication ID (P698) [4]
- 28,392,394 – scholarly article items with DOI (P356) [5]
- 22,720,690 – scholarly article items without main subject (P921) [6]
- 25,279,923 – scholarly article items without language of work or name (P407) [7]
- TODO – scholarly article items without title (P1476)
Kpjas (talk) 20:19, 4 April 2023 (UTC)
- Interesting observations. I think that "Articles without main subject" is especially important because:
- "Main subject" is the main reason to have articles in WD, since WD is not an authoritative article source, and doesn't have Abstracts.
- WD is flooded with articles about X but the item X itself is missing. Example:
- sliding window protocol (Q592860) was made in 2012 from Wikipedias
- 300 articles mentioning "sliding window" were imported from various sources
- I made sliding window (Q80681012) only in 2020: but that's the archetypical item, after which sliding window protocol (Q592860) is named; and all those articles could benefit from having "main subject" set to one of the two items.
- From your statistics, one might think that 1/3 of the articles have main subject: not so bad, right? However, an article should typically have at least 5-10 subjects, and there's no assessment whether those that have at least one, have adequate subjects
- Vladimir Alexiev (talk) 07:01, 30 April 2023 (UTC)
- @Vladimir Alexiev AFAIK the options for adding main subject (P921) to scientific article items are:
- by hand -- rather impractical
- with a tool such as QuickStatements (QS) -- carefully select scientific article items with phrases in their titles that we assume would make an adequate main subject, e.g. "BRCA1 mutation" or "Huntington's disease"
- provided by other tools or bots like SourceMD [8]
- Pubmed metadata contain keywords and MeSH -- why not pull these (copyright issue?)
- lo and behold ChatGPT is quite good at summarizing, perhaps also at providing main subjects for scientific articles, huh ?
- Kpjas (talk) 19:38, 30 April 2023 (UTC)
- I think there's a tagger by James Hare
- MESH is a great taxonomy system but to my unpleasant surprise PubMed articles don't have nearly enough MESH. E.g. search https://pubmed.ncbi.nlm.nih.gov/?term=semantic web:
- https://pubmed.ncbi.nlm.nih.gov/31213735/ has no MESH
- https://pubmed.ncbi.nlm.nih.gov/36320897/#mesh-terms has some MESH but no "systematic review" (D000078202, D000078182)
- I've done some GPT trials and it's great at classifying papers (just from title, not even abstract) with MESH words, but terrible at picking the precise MESH code. So a secondary lookup will be needed. Examples below
- I've tagged the WikiCite Telegram group, let's see if they react https://t.me/c/1497612692/4268
- I asked GPT-4:
- classify these PubMed papers with MESH descriptors and qualifiers.
- (I provided title and abstract, but showing only title below)
- 1. Semantic Web in Healthcare: A Systematic Literature Review of Application, Research Gap, and Future Research Avenues
- MESH Descriptors:
- Semantic Web
- Health Care
- Systematic Reviews
- Electronic Health Records
- Medical Informatics
- Interoperability
- Decision Support Systems, Clinical
- Information Management
- Medical Informatics Applications
- MESH Qualifiers:
- utilization
- trends
- organization & administration
- methods
- standards
- 2. Semantic Web Technologies for Sharing Clinical Information in Health Care Systems
- MESH Descriptors:
- Semantic Web
- Health Care
- Medical Informatics
- Interoperability
- Information Management
- Ontologies (Information Science)
- Electronic Health Records
- Clinical Information Systems
- Information Storage and Retrieval
- MESH Qualifiers:
- methods
- trends
- organization & administration
- standards
- utilization
- -- Vladimir Alexiev (talk) 07:17, 1 May 2023 (UTC)
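The "secondary lookup" mentioned above could be sketched roughly like this in Python. This is only an illustration under stated assumptions: the label table is a tiny hand-made stand-in (a real one would be built from the official MeSH data, which is not shown here), the only descriptor IDs in it are the two quoted in this thread for "systematic review", and the function names `normalize`, `lookup` and `resolve` are made up for the example.

```python
import re

# Illustrative label -> descriptor ID table (an assumption, not real MeSH data).
# The two IDs are the ones quoted in this thread for "systematic review".
MESH_INDEX = {
    "systematic review": ["D000078202", "D000078182"],
}


def normalize(label):
    """Lower-case the label, trim it and collapse runs of whitespace."""
    return re.sub(r"\s+", " ", label.strip().lower())


def lookup(label, index=MESH_INDEX):
    """Return candidate descriptor IDs for a label, or an empty list."""
    key = normalize(label)
    candidates = [key]
    # Naive plural handling: also try the label without a trailing "s".
    if key.endswith("s"):
        candidates.append(key[:-1])
    for candidate in candidates:
        if candidate in index:
            return list(index[candidate])
    return []


def resolve(labels, index=MESH_INDEX):
    """Split model-suggested labels into resolved and unresolved ones."""
    resolved, unresolved = {}, []
    for label in labels:
        ids = lookup(label, index)
        if ids:
            resolved[label] = ids
        else:
            unresolved.append(label)
    return resolved, unresolved
```

Labels that come back unresolved (or with more than one candidate ID, as here) would still need a human or a fuzzier match against the full vocabulary, so this only covers the easy exact-match part.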
- I've created a subpage for statistics at /Statistics. I'm gonna make a little bot to keep this updated, will reply with the repository when I do this. I'll add the metrics you mention, let me know if you have other ideas :) Carlinmack (talk) 11:06, 16 June 2023 (UTC)
- @Kpjas @Vladimir Alexiev I've created a bot User:UpdateWikiprojectBot to update these statistics weekly and added a summary like you mentioned above to the project page Wikidata:WikiProject_Source_MetaData#Statistics. Let me know if you have any suggestions or improvements Carlinmack (talk) 17:09, 30 June 2023 (UTC)
- Thanks a lot for this great resource! Re https://www.wikidata.org/wiki/Wikidata:WikiProject_Source_MetaData/Statistics#External_Identifiers
- Please add Arnet Miner publication ID (P7292), the main Chinese science knowledge graph. Back in Jan they had 332,899,739 PUBLICATIONS
- Where do you get "Total in source" from? I wonder about DOI=318,158,353 because the main DOI source CrossRef has 147,086,458
- --Vladimir Alexiev (talk) 07:26, 3 July 2023 (UTC)
- I use the following query to get the list of identifiers: https://w.wiki/6xjm
- I've added Wikidata property for items about scholarly articles (Q29548341) to Arnet Miner publication ID (P7292) so it will be added to the list when I run the script next
- I use number of records (P4876) to fill in the "Total in source" and if you go to DOI (P356) you can see the reference URL for this number :) I'll add a link to this property to the table
- Carlinmack (talk) 17:36, 4 July 2023 (UTC)
Source reliability assessment
I recently drafted Wikidata:WikiProject Source Reliability to capture efforts to annotate source entities with information related to their reliability. I'd love feedback on how to make that a useful complement to SourceMD. Sj (talk) 17:07, 25 April 2023 (UTC)
- Please see property proposal: Wikidata:Property proposal/assessed source reliability. Harej (talk) 03:01, 26 April 2023 (UTC)
- Nice, that seems the minimal property that could capture a range of different evaluations. Sj (talk) 16:38, 28 April 2023 (UTC)
Finalize a rename to WikiProject WikiCite
Move Wikidata:WikiProject Source MetaData -> Wikidata:WikiProject WikiCite
Previously discussed and approved in 2017 with mostly approvals, some abstention, and no opposition.
I called for comment, got it, then neglected to execute the move.
I am posting again to express intent to do the rename soon based on past approval. I do not expect objections, and in my view, over the years the scope of activity here has always overlapped with WikiCite activities.
Comments from anyone? Bluerasberry (talk) 17:59, 5 May 2023 (UTC)
- Support ArthurPSmith (talk) 16:59, 8 May 2023 (UTC)
- Comment Why not just Wikidata:WikiCite? Harej (talk) 04:03, 11 May 2023 (UTC)
- I like that, @JakobVoss: already redirected that here a while back. Sj (talk) 20:49, 15 May 2023 (UTC)
- Ok with me too. ArthurPSmith (talk) 16:44, 17 May 2023 (UTC)
- Support for Wikidata:WikiCite. --Daniel Mietchen (talk) 22:07, 16 May 2023 (UTC)
- Support for Wikidata:WikiCite -- Kpjas (talk) 11:40, 16 June 2023 (UTC)
- Support for Wikidata:WikiCite. —Ismael Olea (talk) 09:52, 3 July 2023 (UTC)
- @Phoebe: You are hosting a WikiCite chat at Wikimania in a couple of weeks. I expect to join. Can I add to the agenda a last check and possibly live move of this project from Wikidata:WikiProject Source MetaData to Wikidata:WikiCite? Thanks. Bluerasberry (talk) 13:59, 4 August 2023 (UTC)
@Bluerasberry, ArthurPSmith, Sj, Daniel Mietchen, Kpjas, Olea: WikiProject Source MetaData has now been renamed to WikiCite. Harej (talk) 19:29, 8 August 2023 (UTC)
- ok "my" page was moved while working on it. I take note.--Alessandro Marchetti (WMCH) (talk) 20:16, 8 August 2023 (UTC)
Other research resources
There is a concept of "research resources", which includes anything that a research project uses. This is relevant here because we index scholarly papers, and those papers name the resources which research projects use.
This month a recommendation came out from Research Data Alliance which instructs on making persistent identifiers for scientific instruments.
This paper is one of many. This paper alone may not be so remarkable, but this is part of a trend for identifying all resources. Eventually we should delineate all these resources in Wikidata, connected to this project. Bluerasberry (talk) 16:03, 4 August 2023 (UTC)
Deletion of authors of notable publications
I report this discussion, which is relevant for the project since it regards the possible deletion of parts of bibliographic data. Your comments are welcome.
I also report that I am unable to use {{Ping project}} with your two lists of participants, i.e. Wikidata:WikiCite/Participants and Wikidata:WikiCite/More/Participants, because the template doesn't work with pages not containing the word WikiProject. So, after the project was moved from Wikidata:WikiProject Source Metadata to the present Wikidata:WikiCite, it has become impossible to use ping project for its members. If possible, I would suggest finding some solution for this issue. --Epìdosis 00:25, 1 October 2023 (UTC)
WikiCite in continued limbo
I have no update but someone asked me again how WikiCite is. The general situation is that from 2017 Wikidata developers have said that WikiCite is too big for Wikidata. The effect is that while WikiCite development continues, contributors have slowed and limited their development of WikiCite content. Another effect is that because WikiCite is limited, anytime anyone proposes a project which is comparable in size to WikiCite, that project is halted immediately due to the size problem. Presumably Wikidata would have a much larger community with many more projects if we could manage the data hosting and querying.
WikiCite content is the biggest single component of Wikidata, and Wikidata is at its limit for database architecture due to hosting lots of content. There have been longstanding proposals to fork WikiCite content away from the main Wikidata. The standing counterargument is that WikiCite is only the beginning and if invited, lots of people would upload much more content to Wikidata about many things. By forking off WikiCite, we also establish the precedent that others should plan to fork off their large collections, and then we commit to federating the various interconnected Wikibase instances hosting all these. To my knowledge the current state of things is that no one representing Wikidata or the Wikimedia Foundation has any concise or comprehensible statement describing future plans, but the developers would like the community to be aware that all this WikiCite stuff is to remain in perpetual limbo until such time as a decision is made. This position began in about 2017. Here is the documentation.
- 2018 Wikidata:WikiCite/Roadmap
- 2019 Wikidata:WikiProject Limits of Wikidata
- 2021 wikitech:User:AKhatun/Wikidata Scholarly Articles Subgraph Analysis
- 2021 Wikidata:SPARQL query service/WDQS backend update/Blazegraph failure playbook
- 2021 WikiCite panel discussion (WikidataCon 2021 recording) (video)
- 2023 meta:WikiCite/Roadmap 2023
If anyone has anything else to share then please do. Bluerasberry (talk) 18:55, 3 October 2023 (UTC)
- @Bluerasberry: A frank summary. Federation has been discussed off and on for a long time; maybe we can talk here about what is missing in the current wikibase ecosystem that would be needed to make it work smoothly? Mastodon is an example of a federated social media ecosystem that seems to be working fairly well. Of course the web as a whole has been that way from the start. What would be needed to extend Wikidata with, say, a WikiCite wikibase and make it all work smoothly? ArthurPSmith (talk) 19:20, 3 October 2023 (UTC)
- Maybe it's time to start thinking how a federated WikiCite should be. Could the coming Data Modelling Days 2023 be the moment to start writing the WikiCite model for its own Wikibase instance? —Ismael Olea (talk) 09:06, 4 October 2023 (UTC)
- So I think the key to this has to be simple contextual namespaces somehow. Within Wikidata Q\d (and P... etc) has a clear meaning, but it's referring only to the entity on wikidata.org. How do we name things to simply relate them across wikibases? If I understand it correctly, to this point federation implementations within wikibase only essentially copy stuff from one wikibase to another (the collection of properties, for instance). What we need is to be able to refer to entities across wikibases without creating new entities within the referring wikibase. The mastodon-style approach would suggest attaching a locator symbol - Q5@wikidata for instance. The local wikibase would need to fetch remote entities as needed and probably cache them somehow for local use. Suppose we have two wikibases, wikidata as it is and a "wikicite" wikibase. So we could have Qxxxx@wikidata items and Qxxxx@wikicite items, Pxxxx@wikidata and Pxxxx@wikicite properties, etc. This is obviously relatively easily extensible, but then how do the @namespaces get translated into actual API endpoints etc? Would simply using the DNS names - wikidata.org etc. be ok? ArthurPSmith (talk) 00:43, 5 October 2023 (UTC)
- Sadly I don't have the knowledge to answer. But I can see a parallel case with Structured Data on Commons, which, AFAIK, runs its own Wikibase instance. Would be great if you could lead an activity proposal for exploring and discussing this case in the Data Days: Wikidata talk:Events/Data Modelling Days 2023. —Ismael Olea (talk) 11:59, 5 October 2023 (UTC)
- Huh - thanks, I somehow missed that was happening. Will check into it. ArthurPSmith (talk) 17:10, 5 October 2023 (UTC)
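To make the `Q5@wikidata` idea above a bit more concrete, here is a minimal Python sketch of how such references might be parsed and turned into URLs. Everything in it is an assumption for illustration: the `HOSTS` table, the placeholder host `wikicite.example`, the names `EntityRef`, `parse_ref` and `entity_url`, and the guess that another Wikibase would expose the same `/entity/` path that Wikidata uses. It does not describe anything Wikibase implements today.

```python
import re
from typing import NamedTuple

# Hypothetical mapping from @namespace to DNS host.
# "wikicite.example" is a placeholder, not a real service.
HOSTS = {
    "wikidata": "www.wikidata.org",
    "wikicite": "wikicite.example",
}


class EntityRef(NamedTuple):
    entity_id: str  # e.g. "Q5" or "P50"
    namespace: str  # e.g. "wikidata"


_REF = re.compile(r"^([QP][1-9]\d*)@([a-z][a-z0-9.-]*)$")


def parse_ref(text, default_namespace="wikidata"):
    """Parse 'Q5@wikidata'; a bare 'Q5' falls back to the default namespace."""
    if "@" not in text:
        text = f"{text}@{default_namespace}"
    match = _REF.match(text)
    if match is None:
        raise ValueError(f"not a valid entity reference: {text!r}")
    return EntityRef(match.group(1), match.group(2))


def entity_url(ref, hosts=HOSTS):
    """Build an entity URL; unknown namespaces are treated as DNS names."""
    host = hosts.get(ref.namespace, ref.namespace)
    return f"https://{host}/entity/{ref.entity_id}"
```

For example, `entity_url(parse_ref("Q5@wikidata"))` gives `https://www.wikidata.org/entity/Q5`, and an unknown namespace such as `Q5@wikidata.org` is simply used as the host name, which is the "just use DNS names" option raised above. Fetching and caching the remote entity is left out entirely.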
If Wikicite is going to be split I was under the impression that this would be only on the triple store level, so you would still have the data in one Wikibase (Wikidata) but you would split Q items based on whether they were scholarly article (Q13442814) or not? — Finn Årup Nielsen (fnielsen) (talk) 16:20, 18 October 2023 (UTC)
- Pretty much, yes. Federation for the most part only affects the SPARQL queries. There is no short-to-medium-term need to migrate the citation data off of Wikidata, so most tools that don't do SPARQL should still work. Infrastruktur (talk) 17:52, 18 October 2023 (UTC)
- Yes I just saw this in the plan published the other day - that was new to me though, previously it sounded like they wanted to split the scholarly articles off into a completely separate wikibase. Maybe I just misunderstood. Anyway, I guess we'll get a feel for this in early 2024 when this starts being tested? ArthurPSmith (talk) 18:51, 19 October 2023 (UTC)
- The short term plan per the WMDE is to split the graph. Scholarly articles is atm. half of Wikidata. This won't affect Wikicite much in the short term. Long term maybe the wikicite stuff needs to be split off, but we are nowhere near close to that, and it would be stupid to do this prematurely without a plan since a lot of labor is involved. I do indeed run into certain queries on Wikidata that time out, but nothing of that strikes me as critical. Yes, Wikidata is slow for some things, but only the query side of things is really in trouble at this point in time. Infrastruktur (talk) 19:23, 19 October 2023 (UTC)
Grants data
Many papers and other research resources which WikiCite indexes are grant funded. We could make Wikidata items for grants and interlink papers, software, data, people, institutions, and funders. We have some pilots in this, but not any showcased complete datasets.
I want to share some ongoing news and trends.
- It is a small thing, but Open Grants (Q109929664) is convening some events in the next month as described at https://www.ogrants.org/upcoming_events . This organization is actually trying to get published grants, and they have a collection of about 200 of them. Here in Wikidata we are mostly interested in grant metadata, but since this is only 200 grants and since that makes a complete corpus, it would be interesting to profile this collection. I am going to their meetings.
- Open Research Funders Group (Q45759536) is a consortium of most of the big United States foundations along with some in the UK. They have representatives from all these foundations meeting in this group to pilot standardization and open publishing of grant metadata. They brokered a relationship with Crossref, and the plan is to assign dois to all grants so that grant citations can be generated. When Crossref has this index, then we will be able to interlink grants like we do for papers.
- I am personally an investigator on a small project called SEEKCommons (Q118147033) where we are trying to raise access and visibility of open environmental resources of interest to community groups. A strategy that we are exploring in this is identifying about 100 National Science Foundation grants where the commitment was producing such resources, then matching each grant to all those resources indexed in Wikidata. The current state of things - and this may surprise non-scientists and outsiders - is that for any given grant, it is hard for anyone to identify what outcome it had. Where we want to take things is that anyone can see whatever results come of sponsored research.
There is not quite a WikiProject which matches the needs for grants, but I am thinking of developing Wikidata:WikiProject Award to include grants. If anyone has thoughts then share. Bluerasberry (talk) 14:50, 20 October 2023 (UTC)
How to ping the project?
I tried using the old and new name, both fail it seems, see https://www.wikidata.org/w/index.php?title=Wikidata:Requests_for_permissions/Bot/LccnBot&diff=2029017611&oldid=2029012620 So9q (talk) 08:37, 13 December 2023 (UTC)
Clarifying guidelines for "Affiliation" and "Employer" properties
FYI I'm currently raising some doubts regarding the current utilization of affiliation (P1416) and employer (P108) (along with some related properties) in the discussion here. I appreciate any feedback in advance. Alexmar983 (talk) 16:17, 17 February 2024 (UTC)
- I second that. Kpjas (talk) 08:30, 25 February 2024 (UTC)
PagesBot
There is currently an RFP for a bot that reads page(s) statements on items of type scholarly article, then infers number of pages statements and adds them with a reference.
Please see: Wikidata:Requests for permissions/Bot/PagesBot
Any comments would be much appreciated! Cheers, Aluxosm (talk) 12:34, 17 March 2024 (UTC)
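For anyone wondering what "infers number of pages" could look like in practice, here is a rough Python sketch. It is not PagesBot's actual logic (see the request page for that); the function name `infer_page_count` and the rules it applies, including the handling of abbreviated ranges like "1234-56", are assumptions made for this example.

```python
import re

# A numeric range with a hyphen or an en dash, e.g. "123-145".
_RANGE = re.compile(r"^\s*(\d+)\s*[-\u2013]\s*(\d+)\s*$")
# A single numeric page, e.g. "7".
_SINGLE = re.compile(r"^\s*(\d+)\s*$")


def infer_page_count(pages):
    """Return the page count implied by a page(s) string, or None if unclear."""
    match = _RANGE.match(pages)
    if match:
        first_text, last_text = match.group(1), match.group(2)
        # Expand abbreviated ranges: "1234-56" is read as 1234 to 1256.
        if len(last_text) < len(first_text):
            prefix = first_text[: len(first_text) - len(last_text)]
            last_text = prefix + last_text
        first, last = int(first_text), int(last_text)
        if last < first:
            return None  # inconsistent range, better to skip than to guess
        return last - first + 1
    if _SINGLE.match(pages):
        return 1
    # Article numbers such as "e1234" carry no page count information.
    return None
```

Returning `None` for anything that is not a plain numeric range keeps the sketch conservative: electronic article numbers, roman numerals and multi-part ranges are skipped rather than guessed.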
April 2024 Wikidata content pie chart
We have an updated pie 🥧 chart for Wikidata:Statistics! Wow the WikiCite slice sure is large and tasty.
Thanks user:VIGNERON for generating this. Bluerasberry (talk) 16:44, 8 April 2024 (UTC)
Wikidata Query Service graph split for WikiCite
I am posting to share early news and to recruit anyone to join future unplanned discussions on the split of Wikidata into two pieces, WikiCite and everything else. There is no other published news but discussions like this are chaotic.
See Wikidata_talk:WikiCite#WikiCite_in_continued_limbo for background. Post here if you want to join discussions or get news. This is both a big deal but also I hardly have more published information to share. I am entering this as a heavy contributor to WikiCite and Wikidata:Scholia, and while there are people who have more information, this is a discussion where people are exchanging information and I do not see anyone with answers to many basic questions. One basic question is "Why?", and the answer to that is that queries to this content are breaking multiple Wikidata:WikiProject Limits of Wikidata. Something is not sustainable; determining what the problem is and what to do in response is a challenge. Split of the graph is a proposal in exploration. Bluerasberry (talk) 16:54, 8 April 2024 (UTC)
- Would love to stay in the loop! Recently I was talking about the WQS performance, and I thought I remembered that a replacement (query service) was being worked on / looked at. I'm not managing to find much discussion about that, however. It looks like this ticket might be a good starting point for a deep dive? My interest in the subject is still mostly as a casual contributor. --Azertus (talk) 12:57, 10 April 2024 (UTC)
- Do you know if those queries timing out are running into fundamental limits of the back-end or are they related to (mostly arbitrary) performance limits set to be able to keep a public service running well? Azertus (talk) 13:01, 10 April 2024 (UTC)
- @Azertus: I believe there will be a significant announcement on this soon, but one of the relevant pages on the current plan is Wikidata:SPARQL query service/WDQS graph split which also has some relevant subpages. ArthurPSmith (talk) 15:49, 11 April 2024 (UTC)
- @Bluerasberry please ping me whenever anything fundamental regarding Wikicite crops up. I'm very keen on Wikicite shaping and growing into something really interesting and worthwhile. Cloning Scholia would be rather futile but creating something with a vision like The Internet Archive would be great IMO. Kpjas (talk) 09:01, 27 April 2024 (UTC)
- @Kpjas: I am sitting on more confusing information than I can process. No one really has a complete story. Part of what I want to do, probably by end of May, is write an article telling some kind of story for English Wikipedia's Signpost newsletter.
- About Internet Archive - check out the Internet Archive Scholar at https://scholar.archive.org/
- I and some other WikiCite people have collaborated a bit with that project. There are several of these - WikiCite, Internet Archive Scholar, Semantic Scholar, OpenAlex, ORCID - which have open data and contribute to each other but no one has written stories about who is doing what, with what funding, which data is unique, and what the challenges are.
- From April 17 WDQS Scaling update is WMF news which is both essential but also not comprehensible without a lot of background.
- If you want to video chat to help me sort notes and outline a Signpost article then I would appreciate. This deserves a few long academic papers but for now short journalism is the achievable goal. Bluerasberry (talk) 15:15, 27 April 2024 (UTC)
- @Bluerasberry Please bear in mind that obviously I am not abreast of the WD/WikiCite state of affairs.
- Apparently for quite a long time I have been confused about the exact aims of WikiCite, guidelines, roadmap and coordination of the project. Overall a lofty and worthwhile endeavour but I think the landscape of knowledge and information management has lately been and is changing now at an astonishing speed. Where does WD stand now ?
- To me personally WikiCite first appealed as a useful repository of scientific papers references. Dozens of Wikipedias adopted templates (Cite Q and the like) to this end (but regrettably not the Polish language WP).
- WikiCite as a global repository for scientific world at large seems to be far away and hard to achieve (am I too pessimistic?).
- I think writing an article for Signpost about all this is really a good idea and I'd support you if I could. WD and WikiCite is still a lot of fun to collaborate with like-minded people. Kpjas (talk) 16:16, 27 April 2024 (UTC)
- @Kpjas:
- Yes, things are changing at an astonishing speed.
- Scopus/Elsevier, Web of Science/Clarivate, and Google Scholar must all have staff who know how much it costs to develop a product. I think a satisfying product could be US$1 million in Wikidata, but they must all have put 10s of millions into theirs. Elsevier and Clarivate charge hundreds of universities US$100,000/month for products where Wikicite/Scholia are usually sufficient.
- Semantic Scholar is the best-funded nonprofit project because it comes from Allen Institute. Our Research is developing OpenAlex right now with US$5-10 million from Arcadia Fund. Internet Archive Scholar may be halted; they need some things. Wikicite's major problem is the failure of backend Blazegraph after Amazon hired all the staff of the nonprofit org developing it.
- Yes, you hit the target, I want WikiCite as the general resource for the world. Any of Wikidata:WikiProject Limits of Wikidata could have been the problem, but the problem really turned out to be the Wikidata_talk:WikiCite#WikiCite_in_continued_limbo issue identified in 2018 with no development since then. I do not think there is anyone who has a plan for what to do after the graph split when we could hit the same issue after 3 more years of uploads.
- The part that has intrigued me is the possibility of offering a lower quality service than the commercial ones, but for free, and to the 90% of the world who cannot afford the paid services. Also, by making any kind of free service, we get a chance to raise ethical issues that companies would not address, and also create a marketplace and customer expectation that there should be really inexpensive services.
- Thanks for talking through a bit and yes I want to talk more. Will get something started for Signpost. Bluerasberry (talk) 19:33, 27 April 2024 (UTC)
Best paper award
Some scientific conferences have (multiple) best paper award(s). How should we model this? @NandanaM: has added best paper information to The 22nd International Semantic Web Conference (Q119153957) and I have currently a Synia panel with this information: https://synia.toolforge.org/#scientificeventseries/Q6053150 It is using the winner (P1346) property and the qualifier object of statement has role (P3831). Does anyone have feedback on this? — Finn Årup Nielsen (fnielsen) (talk) 16:41, 9 April 2024 (UTC)
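As a rough illustration of the modelling described above, the following Python helper builds a SPARQL query that reads winner (P1346) statements together with their object of statement has role (P3831) qualifier from one event item. It is a sketch, not the query behind the Synia panel: the function name `best_paper_query` and the variable names are made up, and it assumes the statements sit directly on the conference item and that the query runs on the Wikidata Query Service, where the `wd:`, `p:`, `ps:` and `pq:` prefixes are predefined.

```python
import re


def best_paper_query(event_qid, limit=50):
    """Build a SPARQL query listing winners of an event with their role qualifier."""
    if not re.fullmatch(r"Q[1-9]\d*", event_qid):
        raise ValueError(f"not a valid item ID: {event_qid!r}")
    if not isinstance(limit, int) or limit < 1:
        raise ValueError("limit must be a positive integer")
    # p: reaches the statement node, ps: its main value, pq: its qualifiers.
    return f"""
SELECT ?winner ?winnerLabel ?role ?roleLabel WHERE {{
  wd:{event_qid} p:P1346 ?statement .
  ?statement ps:P1346 ?winner ;
             pq:P3831 ?role .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
LIMIT {limit}
""".strip()
```

Calling `best_paper_query("Q119153957")` returns the query text for the conference mentioned above; running it is left to whatever SPARQL client is at hand.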
Weird item about an article that has the title of a book, what is it ?
The item Hidden harmonies: the lives and times of the Pythagorean theorem (Q114012895) is classified as a scientific article but has the title of a book I want to cite, and it's not clear whether it's actually about the book, or a review of the book. Can someone sort that out? Is there a need to create an item for the book? author TomT0m / talk page 08:42, 16 April 2024 (UTC)
- @TomT0m: If you follow the DOI or publisher links you'll see it is almost certainly a review, not the actual book. ArthurPSmith (talk) 14:10, 16 April 2024 (UTC)
Wikidata to split as sheer volume of information overloads infrastructure
From the 16 May 2024 issue of The Signpost (Q16639816)
Wikidata to split as sheer volume of information overloads infrastructure (Q126011233) by @Bluerasberry
Kpjas (talk) 09:19, 22 May 2024 (UTC)
Data models
Hi all, as part of the rationalization of the navigation template for data models (approved slowly over the last couple of months) I have pushed for the creation of Wikidata:WikiProject Wikidata for research/Data models/Researchers.
The content of this page is going to evolve as well. I think it should/could become one day the landing page of a redirect named probably Wikidata:WikiCite/Data models/Researchers (or Wikidata:WikiCite/Data Models for Researchers or Wikidata:WikiCite/Data models/authors), but in the meantime we are starting to centralize the discussion. We couldn't even agree on the naming or structure of these pages, but I hope we are making some steps in the right direction. Alexmar983 (talk) 17:23, 5 July 2024 (UTC)
This is clearly a work in progress and I will raise this aspect also at Wikimania where I have a talk titled "Wikidata and authority control for researchers: the cases of Switzerland and Italy", with just a few slides about the general scaffolding that we should have in this area. Researchers’ data are a significant portion of biographical items, so after so many years we should converge at least on some key points. --Alexmar983 (talk) 17:38, 5 July 2024 (UTC)
Property Proposal: indexer
Hi All,
Kindly requesting your input as Support or Oppose, for the following property proposal for 'indexer' at https://www.wikidata.org/wiki/Wikidata:Property_proposal/indexer
You may also Comment to improve or critique the proposal. Appreciate any constructive feedback. Wallacegromit1 (talk) 08:27, 9 July 2024 (UTC)
Q127946760
I have already entered a journal (Q15699685). Now I am trying to enter vol 1 of the same journal. It has 4 issues. Each issue has some articles. How should I enter these in Wikidata? Regards Vjsuseela (talk) 19:15, 27 July 2024 (UTC)
Community input into WDQS graph split: a publication type property proposal
@Bluerasberry: On Thursday I was on the regular WMF call about the WDQS graph split (see the "Wikidata Query Service graph split for WikiCite" thread above) with User:DCausse (WMF) and meta:User:ABaso (WMF). It is now timely to consider a basic point about modelling of WikiCite data, once the split happens.
I got into this area through a discussion on Wikidata talk:SPARQL query service/WDQS graph split/WDQS Split Refinement. The basic point is that certain fundamental terms such as "clinical trial" are in Wikidata at present overloaded: the same item is used for the actual trial and for a report on the trial.
It is not a new idea that a good way to go is to change the current practice of using instance of (P31) both for fundamental identification, as scholarly article (Q13442814) typically, and for publication type information such as systematic review (Q1504425). In fact there was some discussion about that in 2020: see Wikidata talk:WikiProject Source_MetaData/Archive 4#Towards more consistent P31 usage across the WikiCite corpus, a thread introduced by User:Daniel Mietchen. It seems to me the time has come.
Therefore, for reasons given in detail in the WDQS graph split subpage, I favour introducing a new property, which would support statements that give the publication types of a scholarly article. On the call I said I'd start a discussion here, before moving to Wikidata:Property proposal/Generic, which is presumably the right place to make the idea formal. Charles Matthews (talk) 10:14, 12 August 2024 (UTC)
- @Charles Matthews: Yes agreed. I set up Wikidata:WikiProject Clinical Trials and with a group, introduced a few hundred thousand records of clinical trials. It is an error to mark research papers about clinical trials, even if they are the defining papers, as "instance = clinical trial".
- I expect that I support your publication type idea, but let's discuss further and bring in other opinions before making changes at scale. Bluerasberry (talk) 15:26, 12 August 2024 (UTC)
- Lane, thanks. I was first aware of the issue with clinical trial (Q30612) through the way it has two MeSH descriptor ID (P486) statements. When David Causse asked me how widespread this kind of ambiguity is, in the WikiCite publication types, it was clear that there were dozens of cases. Doing some detailed work with MeSH, I came up with a list of 29 items to fix. At this point, I thought it was evident that the current use of instance of (P31) in this area was a kludge, and David agreed that a cleaner model would be a big advantage. Charles Matthews (talk) 06:27, 14 August 2024 (UTC)
I'm mostly a fan of the Lens classification of publication types. Sj (talk) 19:19, 30 August 2024 (UTC)
(136,955,934) Journal Article
(70,975,780) Unknown
(21,906,117) Book Chapter
(8,333,715) Component
(8,072,406) Conference Proceedings Article
(7,337,049) Dataset
(7,065,168) Book
(6,640,794) Dissertation
(3,825,093) Preprint
(1,753,482) Libguide
(1,544,529) Other
(1,231,091) Journal Issue
(1,070,469) Report
(874,716) Reference Entry
(861,058) Conference Proceedings
(564,094) Review
(389,679) Standard
(210,052) Editorial
(194,567) Letter
(69,675) News
(64,450) Journal
(25,175) Clinical Trial
(11,440) Journal Volume
(1,649) Clinical Study
(781) Working Paper
Wikidata:Property proposal/publication type of scholarly article is now live. Charles Matthews (talk) 10:47, 12 September 2024 (UTC)
The proposal was approved, and publication type of scholarly work (P13046) now exists. Populating the WikiCite area with statements using it is a major task. I expect some planning will precede implementation. Charles Matthews (talk) 08:50, 8 October 2024 (UTC)
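To make the modelling change discussed in this thread concrete, here is a minimal Python sketch of how the instance of (P31) values of one item could be split between P31 and publication type of scholarly work (P13046). It is an illustration only, not an agreed migration plan: the function name `split_instance_of` and the `PUBLICATION_TYPES` allow-list are hypothetical, and the fallback to scholarly article (Q13442814) when no core class remains is an assumption that would need community review.

```python
# Sketch only: split overloaded P31 values into a core class and publication types.
SCHOLARLY_ARTICLE = "Q13442814"  # scholarly article, as cited in the thread

# Hypothetical allow-list of publication-type items; a real list would need review.
PUBLICATION_TYPES = {"Q1504425"}  # systematic review, as cited in the thread


def split_instance_of(p31_values):
    """Return proposed statements: P31 keeps core classes, P13046 gets publication types."""
    core = [q for q in p31_values if q not in PUBLICATION_TYPES]
    pub_types = [q for q in p31_values if q in PUBLICATION_TYPES]
    # Assumption: an item carrying only publication types is a scholarly article.
    if pub_types and not core:
        core = [SCHOLARLY_ARTICLE]
    return {"P31": core, "P13046": pub_types}


# An item currently typed as both scholarly article and systematic review.
proposed = split_instance_of(["Q13442814", "Q1504425"])
print(proposed)
```

Items such as clinical trial (Q30612), which are used both for the trial and for reports on it, would need case-by-case handling before being added to such an allow-list.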
Researchonline ljmu
Hi, while reviewing research repositories and databases I discovered this one.
It might be possible to create IDs for articles (see [9] and [10]), and for keywords or subjects ([11] and [12]).
I'm just leaving this here. If it's of any interest, let me know and I can put together a proposal. I don't need it myself, but if someone might use it, I can do so. Alexmar983 (talk) 13:48, 13 August 2024 (UTC)
Data Models
Hi, if someone with more familiarity with the translation code architecture could take a look, I think we should add a subsection about data models, potentially linking to [[Wikidata Wikidata for research/Data models]]. In my opinion, it might also be useful to create a redirect through [[Wikidata /Data models]]. I believe this would be a good practical compromise. Regards. Alexmar983 (talk) 13:55, 13 August 2024 (UTC)
Ongoing imports
Hi, I think that Wikidata:WikiCite#Ongoing imports isn't very practical at the moment. In my opinion, it would be better as a link to a separate subpage without the heavy translation architecture, similar to how we handle the "Properties" section. If we keep the current setup, we might miss some reports on ongoing initiatives.
This is just a minor suggestion; I would be glad to hear your opinion. Alexmar983 (talk) 14:00, 13 August 2024 (UTC)
- Support a subpage would be more usable, I agree. --Epìdosis 16:51, 13 August 2024 (UTC)
New CouchDb WikiProject
Yesterday I started a new project exploring this open-source, JSON-based, highly scalable, clusterable, replicable database as a viable backend for a growing amount of data.
My guess is that if someone were to host this, it would be a valuable addition to the other available backends.
I'm wondering if it could even be a good replacement for WDQS which could be discontinued by WMF.
This would free up resources to invest e.g. in QLever or other alternatives.
WMF could also outsource and create incentives for others to host a Wikidata Graph backend. It's not written in stone that WMF have to host and maintain a graph backend at all. So9q (talk) 07:33, 13 September 2024 (UTC)
Roadmap for 2024
Hi! In light of the recent discussion of mass-import policy in the project chat, and the graph split having been decided, I suggest we discuss updating the roadmap.
My suggestion is this:
- a subproject to import all references used in any Wikipedia into Wikidata and work with communities to use CiteQ everywhere to avoid local idiosyncratic citation templates
- a database of citation events, a structured database of references that appear on Wikipedia articles, with an understanding that not all of them cleanly map to specific sources (this could be used to gather statistics over time of the progress of importing all sources as items)
Additionally, I suggest that the community decide that as much metadata as possible should be stored in Wikidata for each citation.
Additionally, I propose to use CouchDb as the database for the second item above, and to discuss what JSON format is most desirable. The goal is an up-to-date database in which every change to a Wikipedia article updates the citation database based on the wikitext. This database should be hosted by WMF, but I don't know how to make that a reality. So9q (talk) 08:35, 13 September 2024 (UTC)
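As a starting point for the JSON format discussion, here is a minimal Python sketch of what one citation event document could look like. Everything in it is an assumption for illustration: the function `make_citation_event` and all field names except `_id` are hypothetical, and the choice to derive the ID from the wiki, page title and raw wikitext is just one option. The only CouchDB-specific fact relied on is that documents are JSON objects identified by an `_id` field.

```python
import hashlib
import json


def make_citation_event(wiki, page_title, revision_id, raw_wikitext, qid=None):
    """Build one citation event as a JSON-serializable dict (hypothetical schema)."""
    # Deterministic ID: re-processing the same reference on the same page
    # yields the same document instead of a duplicate.
    key = f"{wiki}|{page_title}|{raw_wikitext}"
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return {
        "_id": digest,
        "wiki": wiki,
        "page_title": page_title,
        "revision_id": revision_id,  # revision in which the reference was seen
        "raw_wikitext": raw_wikitext,
        # None when the reference does not map cleanly to a source item.
        "wikidata_item": qid,
    }


# Made-up example values, for illustration only.
event = make_citation_event(
    "enwiki", "Example article", 123456, "{{cite journal |title=Example}}"
)
print(json.dumps(event, indent=2))
```

Keeping `wikidata_item` optional reflects the point above that not all references map cleanly to specific sources; counting documents where it is set would give the import-progress statistics over time.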
New proposal for two identifiers for journals
An opinion is requested from the participants of this project here and here. Thanks, Horcrux (talk) 12:20, 2 October 2024 (UTC)
National efforts
- Wikidata:WikiCite/Italy @Alexmar983, Epìdosis:
- Wikidata:WikiCite/Switzerland @Alessandro Marchetti (WMCH):
- Wikidata:WikiProject PCC Wikidata Pilot, a United States effort which had very active participants
I am just seeing the Italian and Swiss efforts. Hello from the United States! I wonder if we have other national WikiCite programs or pilots. Bluerasberry (talk) 15:25, 19 October 2024 (UTC)
- User:Bluerasberry as far as we could gather, they are listed here. Colombia is starting now with some effort, so we prepared a line for them after we discussed it at Wikimania. There may also be some initiatives in Tunisia; I seem to recall this from a chat message on Telegram (some import of local journals). Other activities could exist, but they are not yet confirmed. Based on our experience, we simply recommend clustering activities at the national level, even if coordination remains limited at the beginning.--Alexmar983 (talk) 19:49, 19 October 2024 (UTC)