Wikipedia:Wikidata/2018 Infobox RfC
Since the 2013 RfC on Wikidata Phase 2, using Wikidata in infoboxes has gone from the concept stage to more extensive mainspace experimental implementations. This has led to some controversies: some editors have been converting infoboxes to use Wikidata, believing use of Wikidata brings significant benefits, while other editors have been removing Wikidata from infoboxes, believing use of Wikidata brings significant problems. The opposing efforts are counterproductive. It is time to ask the broader Wikipedia editor community where to take it from here. 20:49, 6 April 2018 (UTC)
The poll is now closed; for the closing summary, see #Discussion. But note:
At the bottom of that discussion is this summary of the summary: There is a consensus that data drawn from Wikidata might be acceptable for use in Wikipedia if Wikipedians can be assured that the data is accurate, and preferably meets Wikipedia rules of reliability. For the other issues raised within this RfC, there was no clear consensus.
Background information
What is Wikidata?
- See also: RfC: Wikidata Phase 2 (2013), RfC: Wikidata in infoboxes, opt-in or opt-out? (2016) and RfC: Linking to wikidata (2018)
Wikidata is a Wikimedia project that "acts as central storage for the structured data of its Wikimedia sister projects including Wikipedia, Wikivoyage, Wikisource, and others." In the same way that Wikimedia Commons hosts media files, Wikidata hosts claims (a piece of data in a named field in a Wikidata "item" about some person, place, thing, or concept). Most of the information is stored in a language-neutral format, using Q-codes in place of words, so that it can be used across the different language Wikimedia projects (with exceptions for language-specific phrases such as media captions).
Every Wikipedia article has a corresponding "item" on Wikidata. This includes links to pages on the same topic in other language Wikipedias and other Wikimedia projects, as well as labels and descriptions in different languages, and structured statements about the topic. Wikidata has much broader inclusion criteria than Wikipedia, and contains many items that do not currently have Wikipedia articles.
The possibility of using Wikidata in infoboxes was originally discussed in Wikidata Phase 2 back in 2012 (also see Phase 2 technical proposal and the Wikidata deployment Q&A on Meta), which led to the 2013 RfC.
Where does Wikidata information come from?
Wikidata information is added both by individual editors and bot imports. Individual statements can be added by hand, or they can be imported from Wikipedia infoboxes, categories and articles (across all language Wikipedias), or from external databases.
How is Wikidata information included in Wikipedia?
Special code (magic words and modules) can request information from Wikidata. These are usually only used in a template. Every time you preview, save, or purge a page, all requested Wikidata values appear freshly on the rendered page. The information does not appear in the wikitext.
Changes to Wikidata values are stored in the Wikidata item history, rather than in the Wikipedia article history. This is the same as template code (where formatting, displayed parameters and template-stored data can change, although parameter values remain in the article) and inclusions of other pages into Wikipedia articles. Viewing a version of the page from history will retrieve and show current values from Wikidata (and current versions of templates), rather than the page as it appeared on that date.
How are we currently using Wikidata information?
We use Wikidata to provide most of the interwiki links to other language Wikipedia articles (although a small number of custom interwiki links remain locally, e.g., to sections of articles on other languages). We also use Wikidata information in templates such as {{Authority control}}, which provides links to catalogue entries on the article subject.
More recently, a number of infoboxes use information from Wikidata following from the Wikidata Phase 2 RfC in 2013.
How can we use Wikidata information in infoboxes?
Wikidata can be used to display all of the content in infoboxes, just some of it, or none of it. Images, captions, coordinates, locations, links, dates, units (including conversion between metric and imperial), websites, and maps can all be shown using Wikidata values. However, information that isn't structured/well-defined (such as free text or vague statements) can't be described in Wikidata.
Data from Wikidata is fetched and presented in Wikipedia using templates. This can be through existing templates, such as {{Coord}} and {{convert}} (which have been modified to provide access to Wikidata information), or it can be by using templates that provide access to Wikidata-specific "modules" such as Module:Wikidata, Module:WikidataIB and Module:Wd, which are written in a programming language called Lua. These Lua modules can also evaluate the data and pull only data with certain qualities. For example, a module can fetch only data that has a reference (with caveats, see below), and it can be used to select specific fields or suppress fields.
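As a rough illustration of how selecting and suppressing fields could work, the sketch below gates each field on an allow-list and a block-list. It is written in Python rather than Lua for readability, and it is not the code of any of the modules named above. The parameter names echo |fetchwikidata= and |suppressfields=, which are mentioned under option 3E below; the semicolon separator, the "ALL" keyword and the function name should_fetch are assumptions made for this example.

```python
def should_fetch(field, fetchwikidata="", suppressfields=""):
    """Return True if `field` may be filled from Wikidata.

    fetchwikidata  -- semicolon-separated allow-list, or "ALL" (assumed syntax)
    suppressfields -- semicolon-separated block-list (assumed syntax)
    """
    def split(value):
        # Normalise "a; B ;c" into {"a", "b", "c"}, ignoring empty entries.
        return {name.strip().lower() for name in value.split(";") if name.strip()}

    allowed = split(fetchwikidata)
    blocked = split(suppressfields)
    field = field.strip().lower()

    # Suppression always wins over selection.
    if field in blocked:
        return False
    return "all" in allowed or field in allowed
```

With these assumptions, an article that sets the allow-list to "ALL" and the block-list to "spouse" would draw every field except spouse from Wikidata, while an article that sets neither would draw nothing.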
We currently have a mix of opt-in and opt-out Wikidata infoboxes (e.g. {{Infobox person/wikidata}}, which is opt-in, vs {{Infobox artwork/wikidata}}, which is opt-out). A 2016 VPP RfC (RfC: Wikidata in infoboxes, opt-in or opt-out?) did not provide a consensus on which approach to take here, and the options below reflect this.
Infoboxes that are currently entirely from Wikidata can be found at Category:Articles with infoboxes completely from Wikidata.
If there is data in an infobox from Wikidata, how can it be edited?
In some versions of infoboxes, the data resides solely in Wikidata and is just displayed in the infobox in the article; the data itself is not in Wikipedia. If you look at wikitext in the edit window, you will see only a short template for the infobox like {{infobox_gene}}. The infobox will change whenever Wikidata is changed (it is rendered anew each time the Wikipedia page is re-cached).
These infoboxes have a link that sends you to Wikidata to change the data there. You can do this using your Wikimedia unified login. You cannot edit the data from within Wikipedia, although you can provide new values using Wikicode in the traditional way, which are then displayed instead. A prototype for editing Wikidata information from within Wikipedia is under construction by the Wikidata developers, although development work on this is planned for the end of 2018, with a beta version in 2019.
How can I watch changes to Wikidata information?
You can watch Wikidata pages using your Wikidata watchlist, and Wikidata also has a RecentChanges page. You can also follow changes in your Wikipedia watchlist and at Special:RecentChanges. This is not currently enabled by default: you need to go to the 'Recent Changes' and/or 'Watchlist' tabs of Special:Preferences and tick the check-box next to 'Show Wikidata edits in your watchlist' (they are turned on separately). Wikidata edits do not show up in the article history page, analogous to how edits to templates also aren't included in article histories.
Due to the high rate of edits on Wikidata (as each change to a value is saved separately), and the large number of articles that can be affected by an edit, including Wikidata edits on watchlists and RecentChanges causes technical problems. In order to reduce the dangerously high rate of watchlist changes, changes that affect many articles do not appear in the watchlist for more than the first few articles that are affected; the rest are simply dropped (phab:T177707), although changes to the item directly linked to the Wikipedia article are always displayed. Including any Wikidata changes in recent changes/watchlists was completely disabled on Wikimedia Commons and the Russian Wikipedia in October 2017 (phab:T171027) as they are currently the heaviest users of Wikidata; however, this did not apply to any other Wikimedia projects. Additionally, there can be a lag between an edit being made on Wikidata and the change showing up here, which is tracked at d:Special:DispatchStats. Technical work to improve these issues is ongoing, and changes that do not affect the article are no longer shown.
Differences between the Wikidata and en-Wikipedia policies and guidelines
The Wikidata community governs itself, just as the English Wikipedia community governs itself.
We (English Wikipedia) have a number of policies and guidelines that are not used by the Wikidata editing community - they have their own policies and guidelines.
- WP:V and WP:BLP: Wikidata has no sourcing policy. d:Wikidata:Verifiability has been proposed since January 2015. d:Wikidata:Living persons (draft) has been proposed since April 2013. d:Wikidata:Living people was created in September 2017; it is not tagged as a policy or guideline and consists of the Board of Trustees' 2009 resolution on living people. d:Wikidata:Requests for comment/Privacy and Living People is currently running.
- WP:RS
- WP:NOR: Wikipedia prefers secondary sources to primary sources, although both can be used on Wikidata.
- WP:NPOV: Where there are multiple sources with contradictory values, Wikidata stores all of the values (and the references where available). The values can then be marked as 'preferred', 'normal-ranked' and 'deprecated'. However, more fine-grained controls are not available, and there is no Wikidata policy on this.
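The ranking scheme in the last point can be illustrated with a short Python sketch. The list-of-dicts layout and the function name best_values are assumptions for this example and do not reflect how Wikidata stores statements. The rule shown (preferred values win; otherwise normal-ranked values are used; deprecated values are never shown) is one common way of choosing among ranked values, not a statement of policy.

```python
def best_values(statements):
    """Pick the values to display from a list of ranked statements.

    Each statement is assumed to be a dict with a 'value' and a 'rank'
    of 'preferred', 'normal' or 'deprecated' (hypothetical layout).
    """
    preferred = [s["value"] for s in statements if s["rank"] == "preferred"]
    if preferred:
        # Any preferred value hides all normal-ranked ones.
        return preferred
    # No preferred value: fall back to normal rank; deprecated never shows.
    return [s["value"] for s in statements if s["rank"] == "normal"]
```

Note that two contradictory values with the same rank would both be returned, which is the kind of case where the lack of finer-grained controls matters.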
What about references?
Each individual statement on Wikidata is referenced separately. References are stored in two different ways. In most cases they are stored in the 'references' part for each statement, where you can provide reference URL (P854), title (P1476), retrieved (P813), etc. for the given statement. Alternatively, the reference can have its own Wikidata entry (for example, if it is a notable book), in which case the reference can be an instance of stated in (P248), which is sometimes accompanied by additional information such as page(s) (P304). Wikidata also uses imported from Wikimedia project (P143) to show where imported information came from. However, this is frequently used as "Imported from X Wikipedia", which is a circular reference (and as such, this is not treated as a valid reference here, and it is ignored when only importing referenced values). Additionally, in some cases that reference may be a circular reference to unsourced information within Wikidata, and in some cases it is a reference to the Wikimedia Labs.
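The treatment of "imported from" references can be sketched as a simple filter. This is a minimal Python illustration, assuming each reference is a plain dict keyed by property ID; that layout, and the names is_usable_reference and only_sourced, are invented for the example and are not Wikidata's data format or the logic of any existing module. The property IDs are the ones named in the paragraph above.

```python
SOURCE_PROPERTIES = {"P854", "P248"}   # reference URL, stated in
CIRCULAR_PROPERTIES = {"P143"}         # imported from Wikimedia project


def is_usable_reference(reference):
    """A reference counts only if it names a source and is not an import note."""
    props = set(reference)
    return bool(props & SOURCE_PROPERTIES) and not (props & CIRCULAR_PROPERTIES)


def only_sourced(statements):
    """Keep statements that carry at least one usable reference."""
    return [
        s for s in statements
        if any(is_usable_reference(r) for r in s.get("references", []))
    ]
```

A filter of this kind can only tell whether a statement claims a source; it cannot judge whether that source is reliable or whether it supports the value.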
Wikidata stores references without any formatting, and different properties (title, authors, url, DOI, etc.) are stored separately. The reference information can then be reformatted as appropriate when it is fetched from Wikidata. However, we have a lot of different formatting styles here (e.g., see WP:CITEVAR), which complicates the process of reformatting the Wikidata information to follow the different formatting styles.
Titles, other names, descriptions and site links on Wikidata cannot be referenced. Some of these provide context, which can vary between languages. Others, like sitelinks, don't need to be referenced. Other pieces of information, such as images, can have references although they do not need them (and for these properties, referencing requirements can be selectively removed through the infobox code).
Various Wikipedia reference-related procedures, e.g. requesting a page number with {{page needed}}, indicating a failed verification with {{failed verification}}, defunct weblinks rescued by InternetArchiveBot, implementing the outcome of a WP:RSN discussion, etc., need to be adapted for Wikidata, and may need the editor to fix things immediately rather than adding them to backlogs.
References from Wikidata can be shown here using {{Cite Q}} and Module:Wd; however, they are opt-in in most Wikidata infoboxes (set refs=yes to show them; these exclude "imported from" references), and a recent deletion debate about {{Cite Q}} ended with no consensus (with caveats).
Do we have to use Wikidata infoboxes?
No. Wikipedia / Wikimedia projects are not required to use data from Wikidata anywhere. It is a decision to be made by each editing community. Wikis that decide to use Wikidata can determine how they want to use it. The decision is ours to make. Different options are discussed below.
Questions:
This RfC has four questions.
Can Wikidata infoboxes be used in mainspace?
Option # | Short name | Description | Consequences |
---|---|---|---|
1A | No | Infoboxes with Wikidata integration are not allowed in mainspace | Rollback current mainspace uses of Wikidata in infoboxes |
1B | Experiments only | Allow mainspace experiments of infoboxes with Wikidata integration | More or less a status quo; some fully deployed Wikidata infoboxes may need to be rolled back |
1C | Explicit consensus required | An unadvertised local consensus does not suffice to convert an infobox to a Wikidata implementation; broader consensus procedures, such as RfCs, are needed | Neither WP:EDITCONSENSUS nor WP:LOCALCONSENSUS is sufficient |
1D | Passive consensus sufficient | Unadvertised consensus on template talk (possibly a proposal and passive absence of response) is sufficient to convert an infobox to a Wikidata implementation | WP:EDITCONSENSUS does not suffice, while WP:LOCALCONSENSUS may suffice |
1E | Separate Wikidata version | Wikidata infoboxes are allowed, but it must be done in a separate copy of the template that individual editors can choose to use | Would make infoboxes with a "/Wikidata" extension to the template name permanent, concurrently with the original |
1F | Roll-out | Every infobox that is technically ready to convert from local content to Wikidata content may be implemented in mainspace | A local consensus cannot override this Wikipedia-wide RfC |
What can Wikidata infoboxes display?
Option # | Short name | Description | Consequences |
---|---|---|---|
2A | None | Wikidata is not used in infoboxes. | Wikidata is not used in any infoboxes, even if individual editors or WikiProjects would like to use it. |
2B | By-consensus | Only allow Wikidata for approved types of information or approved boxes. | Only use Wikidata with consensus for specific purposes. Possible examples: coordinates, ID numbers, or a chemical-property box. A future Village Pump RFC will be needed to evaluate consensus on specific usages, or to establish another location or process for approval. |
2C | Opt-in | Infoboxes may be modified to show Wikidata values when an article-page requests information from Wikidata. | Wikidata information only appears when the article requests wikidata. For example:
|
2D | Opt-out1 | Infoboxes may be modified to show Wikidata values when a field is absent. | Wikidata values will be displayed if a field is absent. If there's no year parameter, the year from Wikidata will be used. |
2E | Opt-out2 | Infoboxes may be modified to display wikidata values by default. | Wikidata values will be displayed if a field is blank or absent.
Removing Wikidata data from the infobox (without inserting a replacement value) requires
|
2F | Wikidata-only | Infoboxes are not required to support local values. Infoboxes may be modified to accept information from Wikidata only. | There may be no way to alter or delete information by editing the article's wikitext. Editors may be required to edit fields on Wikidata itself. |
2G | Case-by-case | which of the above applies is decided on an article-by-article basis (so doesn't depend on the topic of the infobox, but on which model the editors of the article choose to adopt) | More compliant to the general infobox-related principle that decisions about an infobox are to be taken on the level of the individual article where the infobox may be applied, and not on a level that affects the infoboxes of multiple articles at once. |
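The practical difference between options 2A and 2C to 2F lies in how an infobox treats a parameter that is absent, blank, or filled in. The Python sketch below is one possible reading of the table, not a specification: the function name displayed_value, the requested flag standing in for "the article requests Wikidata", and the choice to let a non-empty local value win under 2C to 2E are all assumptions. It also does not model any mechanism for suppressing a Wikidata value under 2E.

```python
def displayed_value(option, params, field, wikidata_value, requested=False):
    """Return what the infobox would show for `field`, or None for nothing.

    params    -- dict of parameters written in the article's wikitext
    requested -- hypothetical flag: the article asked for Wikidata (2C)
    """
    present = field in params                 # parameter appears in wikitext
    local = (params.get(field) or "").strip() # non-empty local value, if any

    if option == "2F":        # Wikidata-only: local values are not supported
        return wikidata_value
    if local:                 # assumed: a filled-in local value is shown
        return local
    if option == "2C":        # opt-in: only when the article requests it
        return wikidata_value if requested else None
    if option == "2D":        # opt-out1: only when the parameter is absent
        return None if present else wikidata_value
    if option == "2E":        # opt-out2: when the parameter is blank or absent
        return wikidata_value
    return None               # 2A: Wikidata is never used
```

Under this reading, leaving a year parameter in the wikitext but blank hides the Wikidata year under 2D and shows it under 2E, which is the main behavioural difference between the two opt-out variants.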
What reference standard should Wikidata infoboxes require?
Option # | Short name | Description | Consequences |
---|---|---|---|
3A | Policy compliant (all) | Each statement imported from Wikidata must comply with Wikipedia policies, including but not limited to those pertaining to verifiability and biographies of living persons, and should follow the guideline on identifying reliable sources. All such statements must display a reference, even if they repeat a statement that is referenced in the body of the article. | Requires choice 2C. Editors are responsible for examining Wikidata's information and sourcing before activating it. Wikidata should not be enabled for a field which has no data in Wikidata.
There is no software support to regulate the import of changes from Wikidata. Once Wikidata has been enabled for a field, any changes at Wikidata will be imported automatically. It is also not possible to reliably review changes after they have been imported. You may opt-in to show Wikidata on your watchlist, however it does not work reliably. References are imported from Wikidata alongside the infobox information, and appear as inline references in the infobox (e.g., using Module:Wd's 'references' option) |
3B | Policy compliant (contentious only) | Same as option 3A, except that non-contentious statements that are well-referenced in the body of the article may be summarized in the infobox without a citation | |
3C | Require-source | Only Wikidata values which claim a source can be used (excluding things like images). | Sourcing is subject to Wikidata policies, see #What about references?. E.g., we use "onlysourced=yes" as the default for Module:WikidataIB |
3D | BLPs-sourced | Same as option 3C, but applies only to BLPs | |
3E | Challenge | Unsourced Wikidata values may be displayed. If challenged and removed, it should not be restored until a reference is supplied. | Needs manual enforcement. A challenged field must be removed from |fetchwikidata= or added to |suppressfields= .
|
3F | No-Refs | Values are not required to be referenced on Wikidata at all | E.g., we use "onlysourced=no" as the default for Module:WikidataIB |
3G | Case-by-case-Refs | Which one of the above applies is decided in a case-by-case scenario, for each article page where the values are imported | Needs flexible infoboxes, or a variety of infoboxes on the same topic to choose from |
Note: Wikidata references containing the word 'Wikipedia' (such as "Imported from: English Wikipedia") are specifically excluded from all options here to avoid circular references.
What should we do with local data?
Some editors have deleted values from infoboxes so that identical or equivalent values will be pulled from Wikidata. Other editors have restored those local values to the wikitext. Both kinds of edits have little or no effect on the displayed infobox. To avoid fruitless edit warring, we should decide which kind of edit is appropriate and which is inappropriate.
Option # | Short name | Description | Consequences |
---|---|---|---|
4A | Keep-local | Allow local values only for infobox content | Neither Wikidata's advantages nor its problems regarding infobox values are further imported in Wikipedia. Bot operations and/or other techniques such as maintenance categories and edit filters may (need to) be set up to enforce such prohibition on importing Wikidata content to Wikipedia. |
4B | Subst-only | Instead of displaying live values from Wikidata, the values are imported to Wikipedia once, by substituting them; substitution can be triggered again when Wikidata values are improved at a later stage | E.g. {{subst:Wikidata|property|linked|Q4004168|P123}} produces the static value Hogarth Press (i.e. publisher (P123) of A Haunted House and Other Short Stories (Q4004168)): technically this is however not (yet) a universally applicable method, so would need additional programming to make it so. In this option Wikipedia would no longer be dependent on Wikidata changes occurring after the data have been imported. Like for the previous option, this may require bot operations and/or other techniques such as maintenance categories and edit filters to enforce the option and/or warn when values at Wikidata (claims) and at en.Wikipedia (local data) are no longer consistent. |
4C | local data precedence | When local data are defined they take precedence over Wikidata claims | Infoboxes need to be programmed to accept local data overriding imported data for all parameters, so that only if no local value is defined it is imported from Wikidata |
4D | Case-by-case-values | Local data and the possibility to import the corresponding wikidata claims are allowed concurrently, deciding which one to display in mainspace on a case-by-case basis | Local consensus or editor decisions determine whether Wikidata values are used; Adds complexity in that additional parameters with which the choice can be passed to the infobox need to be implemented |
4E | Migrate-override | Wikidata claims take precedence over locally defined values. | local data allowed, but not displayed unless no relevant data exists on Wikidata (they are only visible in editing mode and can be used to check whether the Wikidata values still correspond) |
4F | Migrate & delete | When Wikidata values are imported, local values are deleted | Less confusion where the displayed information originates than the previous two options; may need additional programming to effectively delete mainspace content; When a Wikidata claim is deleted, it is however more difficult to recover the local data than in the preceding options |
Discussion
- The following discussion is an archived record of a request for comment. Please do not modify it. No further edits should be made to this discussion. A summary of the conclusions reached follows.
There is clearly a group who are opposed to the use of Wikidata in any way, and who all voted the identical way: "1A 2A 3A 4A". These 31 Wikipedians (out of the 94 people who participated in the poll part of this RfC) in effect opposed any use of Wikidata data in infoboxes. These 31 are not a majority of the people participating, nor necessarily representative of the larger Wikipedia community, although if this were an actual vote with the option to make it first past the post that would be the "most popular".
No such group stands out amongst the rest who participated in the polling portion. Although more numerous, their preferences were scattered across the various other options; it is clear they are willing to consider using Wikidata, but there is no consensus how to use Wikidata. Again, this does not suggest this is necessarily representative of the larger Wikipedia community. This reading of the poll is supported by a recurring theme in the discussion portion of the RfC, that individual Wikiprojects should be offered the opportunity to experiment with material from Wikidata. This suggestion may warrant further exploration.
Despite this clear bifurcation, there is a consensus on one point: if Wikipedia wants to use data from Wikidata, there need to be clear assurances on the reliability of this data. Of the many options, 3A, "Each statement imported from Wikidata must comply with Wikipedia policies", was the only option in the entire matrix in the poll to gain a majority: 52 percent of those expressing a preference in question 3 supported this option. If we define this to mean that the data drawn from Wikidata must be what is generally considered "reliable", then when we combine the share of votes from two other related options, 3C ("Require source") & 3D ("BLPs sourced"), this results in 78% of participants in the poll seeking some form of assurance about the reliability of the material drawn from Wikidata.
This perceived consensus is in agreement with Wikipedia community beliefs. In general, we want the contents of Wikipedia articles to be as accurate as possible, which we ensure by basing articles on reliable sources. This is not a new development. Some 10 years ago Nature magazine performed an informal review of Wikipedia content as compared to Encyclopaedia Britannica content, which resulted in Wikipedia being found about as reliable as the venerable & widely praised Encyclopaedia Britannica. A Wikipedian at the 2008 Wikimania conference, Andre Engels, made the observation that where Encyclopaedia Britannica argued with Nature magazine about the accuracy of the reviewers, Wikipedians responded by fixing the errors.
This perceived consensus is also echoed in posts in the "General discussion" portion: some participants expressed concerns about where information for Wikidata objects came from, how reliable these sources were, & how they were protected from vandalism. In response, other participants defended the data in the Wikidata objects, arguing that "constraint violation reports" protected against vandalism or decreased accuracy over time. In llywrch's opinion, these two groups were often arguing past each other, often attacking the arguments rather than acknowledging the concerns behind them: one side failed to vindicate the sources it drew the data from (as well as to provide citations), while the other failed to seriously investigate whether or not constraint violations met its concerns for maintaining the quality of its data. Here the RfC process failed, because its participants were more concerned with winning the argument (or the poll) than with arriving at a consensus. If Wikidata content is to be used in Wikipedia, these two groups must listen to each other & respond to each other's concerns constructively.
Lastly, when Wikipedia attempts to work with another Wikimedia project, friction results. The two projects have goals that are different, although not necessarily in conflict. Furthermore, as neither speaks with a unified voice, there will inevitably be some misunderstandings and/or discounting of the goals of the other. It is always helpful for both sides of a debate to make an effort to understand the other's concerns & goals, to reduce unnecessary conflict. Friction has also arisen from the Wikimedia Foundation pushing Wikidata acceptance on other projects. In Wikipedia's case, this has initially taken the form of "short descriptions" drawn from Wikidata being used on the mobile version of Wikipedia -- at the Foundation's direction, not that of either community. This issue may be superseded by the start of Wikipedia:WikiProject Short descriptions, but so far the impact this new Wikiproject will have is unclear.
Summary of this summary: There is a consensus that data drawn from Wikidata might be acceptable for use in Wikipedia if Wikipedians can be assured that the data is accurate, and preferably meets Wikipedia rules of reliability. For the other issues raised within this RfC, there was no clear consensus. --- llywrch (talk), Swarm, Fish and karate 16:37, 13 June 2018 (UTC)
Poll
The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.
- 1A because the data on WikiData is too unreliable and there is no proper way of controlling or checking changes to that data. Wrong material stays for months undetected, possibly damaging our BLPs and other sensitive data, or just resulting in sub-standard infobox material, and it may result in material that may be damaging to our readers. Blindly rolling out WikiData infoboxes also results in transclusion of material that we chose not to display due to our local policies (but will be displayed because it is referenced - except if we explicitly override such material). Displaying WikiData-data may in some cases need extensive programming behind templates and need local options in the pages on the templates to choose the right value. WikiData policies and guidelines are not in line with en.wikipedia's policies and guidelines (and they don't need to be), and WikiData data checking policies and vandal-fighting capabilities are not sufficient. (Note that for me this option includes that sometimes a whole infobox is superfluous, whether with WikiData or not.)
- (failing 1A, my follow up options are 2A (which basically is 1A), 3A with the caveat that that does not exist - data can and is changed while the references stay unchanged so there is no firm link between the reference and the data (and there is a lack of data checking policies and vandal-fighting capabilities), and, strictly, 4A - we should always keep our local data in case we want to override WikiData if we do decide to use that, as we at any time may need to override a value and then we need to revive it from the history of the document, noting there that WikiData can choose to fill fields with data according to their standards, which may be different from our standards; as I said, their vandal-fighting capabilities are not sufficient, and what WikiData policies/guidelines decide to be the value that is to be in a field may not align, at any time, with what en.wikipedia wants to display). --Dirk Beetstra T C 21:07, 6 April 2018 (UTC)
- After this long discussion, and thinking about this a bit further, it is my view that data (when transcluded) is sourced from WikiData. WikiData is not a reliable source by any of the definitions that en.wikipedia holds for that (the data may be correct (after checking with the local source on WikiData provided for that datapoint), but that does not make the data reliable).
- The only way where I could consider to transclude data from WikiData, is if they would, drastically, change their datamodel, and turn into a reliable source - which likely means that they would need to find ways to properly check their data, and being able to protect or mark data that has been verified. --Dirk Beetstra T C 10:59, 20 May 2018 (UTC)
- Definitions (as these do seem to be used in arguments over and over):
- By our definitions of how WikiData data is displayed in our articles, the data is 'transcluded' (see Transclusion and Wikipedia:Transclusion):
"transclusion is the inclusion of part or all of an electronic document into one or more other documents by hypertext reference. Transclusion is usually performed when the referencing document is displayed, and is normally automatic and transparent to the end user.[1] The result of transclusion is a single integrated document made of parts assembled on the fly from separate sources, possibly stored on different computers in disparate places."
and "Transclusion means the inclusion of the content of one document within another document by reference. In Wikipedia transclusion, the MediaWiki software will refer to the content of one page, the template, for inclusion into the content of any other page, the target page."
- By that definition of transclusion, WikiData is the source of that information, per Wikipedia:Identifying_reliable_sources#Definition_of_a_source/Wikipedia:Verifiability#Reliable_sources:
"The piece of work itself (the article, book)"
The data is hosted on WikiData; WikiData is the piece of work itself, which is sourced onto Wikipedia. The data that is on WikiData is sourced elsewhere (imported from somewhere else), or 'unsourced' (entered manually but not obtained from an external source to WikiData), and a datapoint may be referenced independently.
- WikiData is an open Wiki by all definitions applied on Wikipedia, see Wikipedia:Identifying_reliable_sources#User-generated_content:
"Content from websites whose content is largely user-generated is also generally unacceptable. Sites with user-generated content include personal websites, personal blogs, group blogs, internet forums, the Internet Movie Database (IMDb), Ancestry.com, content farms, most wikis including Wikipedia, and other collaboratively created websites. In particular, a wikilink is not a reliable source"
and d:Wikidata:General_disclaimer ("WIKIDATA MAKES NO GUARANTEE OF VALIDITY"). Whether data on WikiData is referenced itself is irrelevant - much information on en.wikipedia is referenced, but en.wikipedia remains an unreliable source.
- The references on WikiData data are according to WikiData standards. WikiData has its own sourcing requirements. WikiData has NO obligation to follow en.wikipedia sourcing standards, nor has it any obligation to follow other content standards like WP:BLP, WP:NOT, etc.
- WikiData has no control measures to ensure that their data is correct, that references actually state what is claimed, whether data is still the same as what is originally referenced, whether the reference is still the same as original, or whether the independent reference is a reliable source by any standard.
- In short: data transcluded from WikiData is transcluded from an unreliable source (by en.wikipedia standards). --Dirk Beetstra T C 06:49, 21 May 2018 (UTC)
- Sure. Data transcluded from the English Wikipedia is transcluded from an equally unreliable source. Not sure why you need so many words to express such a simple statement. An open Wiki (including all Wikimedia projects) can not be a reliable source. We have this written somewhere in a policy for ages.--Ymblanter (talk) 07:22, 21 May 2018 (UTC)
- @Ymblanter: Having seen your statement below regarding filtering of 'bad sources', I still doubt that we are on the same page about what data to exclude and include by selection of said bad sources. Maybe I need to be even more elaborate? The argument 'we can filter data so we exclude data that has a bad source' is utter nonsense, because without extensive artificial intelligence you cannot filter data that has a 'bad source'. And we have an essay, probably also for many years, that negates an argument along the lines 'it is fine to use data from the external unreliable source WikiData, because en.wikipedia also transcludes data from an equally[citation needed] unreliable source': WP:OTHERCRAPEXISTS (and by said argument, it is fine to reference data to imdb, because imdb is just as unreliable[citation needed] as en.wikipedia). (Unfortunately, I think that en.wikipedia is more reliable than WikiData by approximately an order of magnitude.) --Dirk Beetstra T C 11:02, 21 May 2018 (UTC)
- First, reliability of the English Wikipedia in terms of WP:RS is exactly the same as Wikidata, and is the same as IMDB. It is zero. Nil. Furthermore, one does not need an artificial intelligence to determine which sources used on Wikidata are reliable. For a very simple reason - the vast majority of these sources (99.9999% or higher) were added by bots. What bots were doing is well documented. For example, we know that if the statement is unsourced or if it is sourced to the English Wikipedia or IMDB it is not reliable and must not be transcluded. If it is sourced to a reliable database such as Australian Biographic Dictionary it is reliable and should be transcluded. Bots did not add statements from sources for which we would have any difficulty determining whether they are reliable or not. The rest of the sources on Wikidata were added by humans manually, and the overwhelming majority of these humans are experienced Wikipedia editors who know the sourcing policies. There might indeed be some sources added as vandalism, or some sources which have been vandalized, but I am sure the scale of this vandalism is way smaller than vandalism on Wikipedia. For one, I have never seen a vandalized source on Wikidata.--Ymblanter (talk) 11:13, 21 May 2018 (UTC)
- Besides that it sounds really strange to me that refspam has not started on WikiData yet (I know that spammers strike sometimes on WikiData first, their links may be transcluded on hundreds of Wikis in one shot), but fine. So at best we are at the WP:OTHERCRAPEXISTS argument: "it is fine to use data from the external unreliable source WikiData, because en.wikipedia also transcludes data from an equally[citation needed] unreliable source". --Dirk Beetstra T C 11:45, 21 May 2018 (UTC)
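Illustrative note: the filtering rule argued over in this thread (drop statements that are unsourced or sourced only to the English Wikipedia or IMDb, keep statements sourced to a database considered reliable) might be sketched as below. This is a minimal sketch under stated assumptions, not code used by any actual infobox module: the `Statement` type, the `UNRELIABLE_SOURCES` set and both function names are hypothetical. It checks only the name of the cited source; it does not check whether the cited reference actually supports the value, which is the objection raised above.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical denylist for illustration only; not an agreed en.wikipedia list.
UNRELIABLE_SOURCES = {"English Wikipedia", "IMDb"}


@dataclass
class Statement:
    """A simplified stand-in for one Wikidata statement (assumed shape)."""
    value: str
    sources: List[str] = field(default_factory=list)


def is_transcludable(statement: Statement) -> bool:
    """Return True if at least one cited source is not on the denylist.

    Unsourced statements, and statements sourced only to denylisted
    sources, are rejected.
    """
    return any(src not in UNRELIABLE_SOURCES for src in statement.sources)


def filter_statements(statements: List[Statement]) -> List[Statement]:
    """Keep only the statements that pass the source-name check."""
    return [s for s in statements if is_transcludable(s)]


if __name__ == "__main__":
    sample = [
        Statement("1900"),                                       # unsourced
        Statement("1901", ["English Wikipedia"]),                # denylisted only
        Statement("1902", ["Australian Biographic Dictionary"]),
    ]
    for kept in filter_statements(sample):
        print(kept.value)
```

A statement citing both a denylisted and a non-denylisted source passes in this sketch; whether that is the desired behaviour is a policy choice the thread does not settle.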
- (failing 1A, my follow up options are 2A (which basically is 1A), 3A with the caveat that that does not exist - data can and is changed while the references stay unchanged so there is no firm link between the reference and the data (and there is a lack of data checking policies and vandal-fighting capabilities), and, strictly, 4A - we should always keep our local data in case we want to override WikiData if we do decide to use that, as we at any time may need to override a value and then we need to revive it from the history of the document, noting there that WikiData can choose to fill fields with data according to their standards, which may be different from our standards; as I said, their vandal-fighting capabilities are not sufficient, and what WikiData policies/guidelines decide to be the value that is to be in a field may not align, at any time, with what en.wikipedia wants to display). --Dirk Beetstra T C 21:07, 6 April 2018 (UTC)
- 1F, 2E, 3D, 4F is my preference. Wikidata infoboxes are now technologically mature enough for this, and are well proven both here and on other wikis - Wikidata is used in hundreds of thousands of infoboxes across different languages and projects now. I don't buy the unreliability arguments for Wikidata as much as I don't buy them for Wikipedia - more eyes on data will help resolve those concerns, and most of the examples of unreliability I've seen have been over-dramatised and vastly distort the reality. For Q1, 1E is close behind as an interim option, particularly for more tricky cases such as BLPs (although it means a lot of duplication of templates). 1C or 1D would be OK. 2D would also be OK, as they also balance hiding controversial info while being simple to understand/implement. I don't think we need to keep duplicates of information (in the same way as we don't keep interwikis here any more), hence my vote for 4F, but 4C would also be OK. 1A 2A and 4A would be shooting ourselves in the foot. 4B would be very hard to implement, and inherently duplicates info, which is bad. BTW, sorry to everyone for how long this RfC has turned out to be! Thanks. Mike Peel (talk) 21:25, 6 April 2018 (UTC)
- On Question 2: I think this is the most useful question on which to gain a consensus. Having consistency across en.WP on how Wikidata is included or excluded from infoboxes will build editors' confidence in their use of those infoboxes and in knowing that they can edit any infobox and have it display as they intend. In August 2016, we successfully decided that infoboxes should use only one parameter, |coordinates=, for coordinates, instead of a variety of latitude and longitude parameters, and I believe that has led to significant improvements in how coordinates are used and displayed in infoboxes and articles. I believe that deciding on a single way to show or hide wikidata values in infoboxes across en.WP will lead to a similar consistency and ease of use. Good luck, everyone. – Jonesey95 (talk) 22:49, 6 April 2018 (UTC)
- These are big questions, and I've suggested on the talk page we should split this !vote section to allow each to be considered separately. (Follow-up perhaps best there). I hadn't been following this page, so wasn't aware this RfC was in the offing. But my first reaction is to think: Devolve decision to WikiProjects -- but require approval from relevant WikiProjects in the area of each template or set of templates for each step up the ladder of usage. Some WikiProjects I believe track and manage their Wikidata items very tightly -- I'm particularly thinking of the project behind {{Infobox_gene}} here. If the WikiProject is very confident of its data management and change tracking on Wikidata, then it makes no sense to me to force them to have to manage data in two places, and force them to track changes and possible vandalism in duplicated template data here as well, which may be a lot messier to model at scale.
- There are other projects which are particularly strong and capable -- I'm thinking in particular of some of the transport projects, MilHist, Medicine, astronomy -- which probably also have the capability to tightly monitor 'their' items on Wikidata if they want to. Some of the international projects from particular parts of the world may also find Wikidata may be the best channel to stay current and get data updates from their home wikis. I see no advantage in banning these projects from drawing from Wikidata, if that is what they want to do, and are comfortable that they can manage. (Perhaps keeping reference data here, and tracking divergences with Wikidata that may appear, or perhaps not). The latter may also be a technique that works for projects with very wide data, where there might be controlled bulk updates on Wikidata -- some of the monuments projects for example; or artists, art, and artworks, where there's been a real drive to get data into Wikidata from a very wide range of external sources, that it would be useful to be current with. At the other extreme, there may be WikiProjects that may have confidence that a very particular focus area of their subject is well-curated -- there is now superb data in Wikidata on 20th-century members of the UK parliament, for example, that is very actively curated.
- Given all of that, I think a "one size fits all" policy is probably a mistake. Yes, there are probably aspects where it makes sense for en-wiki to lay down blanket requirements for safeguards: sourcing and monitoring of certain aspects of data on BLPs for example. But I do think it makes sense to allow WikiProjects the freedom to authorise the use of data from areas of Wikidata that they may have confidence in, if they have confidence in that data, and also confidence in their own capability to make the call. Jheald (talk) 23:56, 6 April 2018 (UTC)
- Coding my response, I think that translates to 1C (WikiProject discussion required); 2E (if the WikiProject is okay with the infobox, this should be allowed. The requirement for WikiProject approval or final say might also be read as 2B or 2G. I do think the WikiProjects should have the final say, including as to which fields should or should not be drawn from Wikidata, but in most cases if using Wikidata at all, I think 2E is probably the mode that makes most sense. But an alternative for sensitive fields might be to always keep a local value, and flag when that differs. There seems to be no particular way to code that, but it might follow from requiring WikiProject discretion under #1); 3E (An infobox, for the most part, is meant to recapitulate information presented (and sourced) in more detail elsewhere in the page. The wikidata value should be open to challenge, ie a reference being insisted on, if no such source is given in the main text; but even then it shouldn't be automatically excluded (even for BLPs) if it is merely banal or uncontroversial. That seems to be the standard applied to non-wikidata infoboxes, and it seems to work well enough. I think 3E is compatible with allowing bots to auto-challenge fields that are of particular sensitivity -- in any case they should comply with groundrules set by wikiprojects). 4F 4C. (I'm happy with fields being dematerialised to Wikidata, if WikiProjects approve; but an option to allow editors to make local over-rides in wikitext must be preserved. These local over-rides in turn could then be dematerialised in a future sweep, if Wikidata had been updated to match them. This would not apply to any fields WikiProjects consider so sensitive that they need to be represented as local values, as discussed under 2 above; but, again, this would seem to be covered by giving Wikiprojects the discretion to set the details of which fields Wikidata values can directly be used for, under 1 above.) Jheald (talk) 23:24, 6 May 2018 (UTC)
- Support and agree with Jheald. A blanket policy for all our 5 million articles is a mistake. We should allow local consensus in topic-specific areas. This will better reflect whether the editors who are most accustomed to that area have confidence or not in Wikidata, based on their experience and exposure in the use of Wikidata in that topic area. A blanket policy will be extremely counterproductive, either allowing Wikidata in sparsely edited areas with a potential for vandalism, or disallowing it in areas where there is heavy editor activity and confidence in Wikidata.--Tom (LT) (talk) 00:02, 7 April 2018 (UTC)
- @Tom (LT): A general "support" vote here doesn't make sense... But that said, can you explain why you think that this issue varies between topics, please? Thanks. Mike Peel (talk) 00:41, 7 April 2018 (UTC)
- @Mike Peel Too many options here. I support inclusion and oppose a blanket yes or no. Some areas have many editors and some have few editors. Some areas (eg where I work, medicine and anatomy) have a strong Wikidata presence and active WD editors, and good experience with WD. Other areas may not have the same experience, exposure or WD presence. --Tom (LT) (talk) 03:20, 7 April 2018 (UTC)
- Can you explain why you're trying to influence other peoples' votes and declaring that they don't make sense? And demanding explanations for them? — Preceding unsigned comment added by 68.234.100.169 (talk) 19:51, 9 April 2018 (UTC)
- @anon I wasn't trying to influence the !vote, I was just pointing out that it doesn't make sense to just say "support" where there are 4 questions and a set of answers, with an expectation of choosing between those answers. No demands were made, just a request. Thanks. Mike Peel (talk) 20:25, 9 April 2018 (UTC)
Closer: note the following 4 votes are from the same editor (I have no problem with that, but people sometimes skim when counting votes ... this is just to clarify for those people). - Dank (push to talk) 03:31, 5 May 2018 (UTC)
- That's because there were four questions, so I answered each one separately, with the expectation at the time that the RfC might be reformatted into four separate voting sections and it would have been easier to unpick my votes when that happened. Please feel free to reformulate them into a single vote if you think that is somehow less confusing. --RexxS (talk) 20:10, 5 May 2018 (UTC)
- Done, striking, thanks. - Dank (push to talk) 13:39, 7 May 2018 (UTC)
- Votes as follows:
- 1D because it's the Wiki-way that we deal with consensus normally.
- 2C for most infoboxes; 2E for infoboxes in areas where all of its uses would be curated on deployment (e.g. astronomy).
- 3C for most fields because that's what we require normally. There may be some fields (e.g. images) where 3E or 3F would be sensible.
- 4C we should always retain the ability to override a Wikidata value with a locally supplied one. --RexxS (talk) 00:49, 7 April 2018 (UTC)
- Oppose. Why do we need Wikidata info in infoboxes? It would just be a good opportunity for vandals. (Remember the guy who added fake ads to templates?) Lojbanist remove cattle from stage 01:13, 7 April 2018 (UTC)
- The correct way of phrasing that vote would be 1A, not Oppose {{3x|p}}ery (talk) 01:18, 7 April 2018 (UTC)
- 1F, 2D, 3E, 4F/C - I support full rollout of Wikidata, but in a phased manner. It should start by replacing images (completely), and some info like coordinates, dates, and other numbers should be Wikidata only unless disputed on the article talk page. Such a migration will help convert the data into semantic data, and would be highly beneficial for other languages. Further info which is widely disputed, like religion, relations, etc., can be made opt-out, so when there's a dispute a local value can be supplied to override Wikidata. I further don't buy the argument that the only info infoboxes should hold is what is in the article already. I mean, anyone can always add infobox info to the articles; who's stopping you? Capankajsmilyo (talk) 01:56, 7 April 2018 (UTC)
- Strong support as 1F, 2E, 3E, 4F. Yes, it is mostly uncharted territory for most editors, but it is time to evolve. The way we handle data is primitive compared to the more advanced websites on the internet. Enabling Wikidata in infoboxes will be a significant advancement in our editing environment, and will make things much easier for a large number of dedicated editors. Rehman 03:52, 7 April 2018 (UTC)
- 1F, 2E, 3E, 4C (but migrate all enwp data with a view to deleting local copy if satisfactorily migrated). My second choice, which I would still be happy with, is 1D, 2D/2E, 3D, 4C/4D. Wikidata is an invaluable resource and, imho, the future of Wikipedia. Best, Kevin (aka L235 · t · c) 05:08, 7 April 2018 (UTC)
- 1D, 2E, 3E, 4C—This should allow editors on roughly WikiProject level to make case-by-case decisions collaboratively, depending on the quality of both Wikipedia and Wikidata within their field of work. Quality of both projects (correctness, completeness, level of sourcing, amount of vandalism, etc. …) vary from topic to topic, and there is also different need for automation depending on the topic. 1D (passive consensus) would allow to discuss Wikidata use without the need to have a project-wide RfC that potentially attracts lots of wikipolitics. 2E and 3E resemble the situation of enwiki in my field of work (almost all information is non-controversial and correct, but it often isn’t inline-referenced at Wikipedia). 4C is an easy way to allow overwriting of Wikidata values in case one’s not happy with it, but I see the risk that lots of local data is going to be removed in order to get the Wikidata value shown; it would probably be useful to somehow compare values first with tracking categories and so on. —MisterSynergy (talk) 06:13, 7 April 2018 (UTC)
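Illustrative note: the "local value overrides Wikidata" behaviour described as 4C in the votes above, combined with the suggestion to compare values through tracking categories before removing local data, might look roughly like the sketch below. Everything here is an assumption made for illustration: the function name `resolve_field` and the category labels are hypothetical, and real infobox modules on en.wikipedia are written in Lua, not Python.

```python
from typing import Optional, Tuple


def _clean(value: Optional[str]) -> Optional[str]:
    """Treat missing, empty and whitespace-only values as absent."""
    if value is None:
        return None
    value = value.strip()
    return value or None


def resolve_field(local: Optional[str],
                  wikidata: Optional[str]) -> Tuple[Optional[str], Optional[str]]:
    """Pick the value to display and a (hypothetical) tracking category.

    A locally supplied value always wins; the category records how it
    compares with the Wikidata value so that divergences can be reviewed.
    """
    local, wikidata = _clean(local), _clean(wikidata)
    if local is not None:
        if wikidata is None:
            return local, "Local value, none on Wikidata"
        if local == wikidata:
            return local, "Local value same as Wikidata"
        return local, "Local value different from Wikidata"
    if wikidata is not None:
        return wikidata, "Value drawn from Wikidata"
    return None, None


if __name__ == "__main__":
    print(resolve_field("12,345", "12,400"))
    print(resolve_field(None, "12,400"))
```

Under this sketch, local values in the "same as Wikidata" category are the ones that could be removed without changing what readers see, while the "different" category lists the cases that need a human decision first.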
- 1A 2A 4A, viewing Wikidata as a valuable project but one whose referencing stringency is weaker and which is potentially vulnerable to malicious injection, so which should remain secondary. However, if there is a consensus to proceed, 1D 2B/C 3C 4C could preserve safeguards while allowing areas with specific WikiProject oversight to utilise data integration, as others have described above. AllyD (talk) 07:57, 7 April 2018 (UTC)
- 1A 2B 3A 4A Wikidata and Commons are the same in the sense that some of the page is stored elsewhere. However, this is storing core content elsewhere, and not just pretty images. We have known policies and guidelines that such information must adhere to, and I do not think that we can trust Wikidata to meet these standards on all occasions. That being said, if this is implemented, then 2B is fine -- there is some metadata and other bits and bobs of information that can be stored on Wikidata and used here, as specified in the consequences column, that would not be particularly problematic. talk to !dave 10:05, 7 April 2018 (UTC)
- Also, I do agree. абвгдеээээ???? This RfC is confusing. Why not support or oppose? talk to !dave 22:25, 7 April 2018 (UTC)
- 1F, 2F, 3C, 4F Migrating infoboxes to Wikidata would be a big win for the Wikimedia movement, and to have the biggest impact we need to go all in. With all infobox data in Wikidata, we will have many more pairs of eyeballs on Wikidata than at present, and so the reliability of the data will increase. And by requiring that the data at Wikidata be sourced, we will increase the number of citations in Wikidata as well. The net effect will be reliable, structured data that anyone can use, and the impact of that should not be underestimated. This is something much bigger than just the English Wikipedia - it would benefit the entire human race, in ways which we cannot yet fully predict. If there are problems with editing or watching Wikidata values from this site, they can be addressed with new software. A gradual rollout is probably best, so that we can identify any potential pain points and fix them before doing mass migration. — Mr. Stradivarius ♪ talk ♪ 11:16, 7 April 2018 (UTC)
- 1D, 2G, 3D, 4D. I do not think the creators of this RfC are interested in motivations, so I better save my time for more useful things. For a full disclosure, I am an administrator and bureaucrat on Wikidata.--Ymblanter (talk) 11:44, 7 April 2018 (UTC)
- 1F; 2F or 2E (let template editor choose: 2F better for new or rewritten templates; 2E when it simplifies migration); 3F (it's a waste of time to discuss, just edit the Wikidata items if some statement seems insufficiently referenced); 4F or 4E (avoid duplication, but allow to add new overrides in parameters on a case by case basis to simplify management). --Nemo 13:11, 7 April 2018 (UTC)
- 1A, 2A, 3A, 4A. Wikidata is a different project. What they choose to do with their data is fine, but at this point, I'm not seeing how it begins to support Wikipedia's desire to improve referencing. If 2A and 3A were supported properly, I could see 1B and 4B but at this point, I'm not seeing how Wikidata meets our policies on sourcing. Ealdgyth - Talk 13:42, 7 April 2018 (UTC)
- 1A, 2A, etc.
Halfway between 1A and 1B, and you can infer the rest from that. A better answer would be: Unask the question. Some editors like the results of Wikidata content in articles they work on, some don't. So, that means you have to let people choose, right? This is faulty logic. Everyone is in favor of getting other people to help with work that they would otherwise have to do, but that's the wrong question. The right question is: Do you feel any sense of connection to our community and its policies? If you don't, then are you at least willing to help deal with problems that come up in the articles you care about? If not, then why are you trying to volunteer someone else to clean up after the mess that your vote is causing? The way to get the right questions and answers is to look closely at those wikiprojects where people are happy with wikidata integration, and keep asking questions until we understand clearly why they're happy ... and to do the same for those wikiprojects where it's not working well. When we understand the difference between what works and what doesn't, then we can craft the right RfC. - Dank (push to talk) 15:06, 7 April 2018 (UTC) Tweaked. - Dank (push to talk) 18:01, 20 April 2018 (UTC)
- My vote here was misinterpreted in a tally below as being a non-vote. I'm not happy voting 1A; I don't want to risk alienating good-faith contributors and discarding high quality work. But apparently, any nuance might be used as an opportunity to discount my vote, so I'm discarding the nuance in the vote itself. - Dank (push to talk) 03:43, 24 April 2018 (UTC)
- My sense of the RfCs from years ago is that some voters (including me) were hopeful that this new thing called Wikidata would get better over time. What I see here is voters who are saying that, if anything, things are worse, that the problems caused by the lack of policies and different policies and norms at Wikidata are growing. What's very frustrating is that this impression of the direction Wikidata is taking is quite common among experienced Wikipedians, but only a small percentage of us are voting here. Democratic values and norms [note: not democracy] die when people choose not to participate; this is a problem that we need to figure out how to fix. - Dank (push to talk) 13:57, 7 May 2018 (UTC)
- @Dank: My perspective is that there are experienced Wikipedians working with Wikidata, and that things are always improving there (in particular, arbitrary access was a big technical breakthrough, and the average # of statements/references per item is continually increasing). But those that are complaining about absence/difference in policies often aren't willing to help fix the issues they're complaining about, which is a shame. Thanks. Mike Peel (talk) 14:53, 7 May 2018 (UTC)
- Even if there was a willingness to help I don't see how the culture of wikidata can be helped. I don't think overriding the wikidata community's wishes to generate the policies we want would be a good idea; or to clash with people from other wikis with different standards there. Even if the policies were changed in word, they would need to be enforced and thus the bot operators, admins, and every one else need to change how they work. So there isn't a real way to fix the issues other than importing a lot of enwiki people to completely change the community and standards; I don't see that going very well/happening easily. (and a lack of willingness to help can as well stem from a feeling that it is a waste of effort when we've got a perfectly acceptable system here) Galobtter (pingó mió) 15:02, 7 May 2018 (UTC)
- It is not a perfectly acceptable system because information just does not get to the English Wikipedia in years whereas it does get to Wikidata quickly. We have plenty of outdated information which nobody cares / is able to update.--Ymblanter (talk) 16:07, 7 May 2018 (UTC)
- Then again, the information in wikidata may get updated but it may not be as correct. Priorities may be different. Our data gets updated reasonably quickly IMO because of our size (for smaller wikis it makes much more sense). And if we did want, we can perfectly run bots ourselves. Galobtter (pingó mió) 16:25, 7 May 2018 (UTC)
- Your reply possibly means you just do not work in the areas where data does not get updated for years. As someone working on Russian and Ukrainian localities I see it on a daily basis (the last real-life example I had was from two days ago; Wikidata had it correctly). Concerning the bots, there is a topic below about them, with the conclusion that it is not really possible.--Ymblanter (talk) 16:26, 7 May 2018 (UTC)
- I mean a bot to do similar as done on wikidata Galobtter (pingó mió) 16:38, 7 May 2018 (UTC)
- (ec)Can't speak for others, but my main reasons for not being willing to go to Wikidata to fix the issues is that a) I don't see the purpose of Wikidata for enwiki (the purpose of Wikidata in itself is not my concern here), apart from Interwikilinks; b) I have seen the treatment and success others have got in trying to fix some of the issues (e.g. the things Nikkimaria tried to achieve, and the disinterest of Wikidata to consider their arguments); c) some editors I utterly dislike are the big shots at Wikidata, domineering many discussions (and no, this is not about Mike Peel or RexxS or Ymblanter, but some others, including one editor banned on enwiki for good reasons). Reading the Wikidata portal and similar places gives me very little incentive to go over there and "help fix the issues". Oh, and d) I find the editing of Wikidata extremely tedious and uninspiring. Directly editing a database is much more inhuman, uncreative and boring than editing enwiki. While it is apparently much more to the liking of some editors, I see it as contrary to what enwiki is and to what made enwiki the relative success it has become, i.e. humans writing an encyclopedia collaboratively, not bots filling up an encyclopedic database. Fram (talk) 15:06, 7 May 2018 (UTC)
- What I'm saying is, intentionally, a gross oversimplification: I hear more unhappiness from Wikipedians these days about Wikidata. Don't take my word for that ... we should rely on RfCs, as we're doing here ... but my perception informs my vote, as it does for the other voters here, which is a better explanation of the votes we're seeing than spurious arguments such as "These voters are bad people who are unwilling to compromise". This doesn't contradict what Mike is saying above, because things on the Wikidata side might in fact be getting better. The meta-issue here is troubling for me ... when faced with questions involving integration with an outside project that doesn't entirely share our values and culture, when Wikipedians are told "Okay, this is going to be different from Wikipedia, but we're going to display it on Wikipedia anyway ... we know this causes problems, just deal with it" ... the general response of Wikipedians has been remarkably passive; most of us aren't even showing up to vote. The thing that's always kept Wikipedia strong, through one crisis after another, has been the development of cultural institutions, community norms, and core policies that we usually manage to hold on to, at least on average, at least during big RfCs. But this isn't something I've spent much time studying. - Dank (push to talk) 15:50, 7 May 2018 (UTC)
- Well, if we concentrate on the big picture, Wikidata is very big and full of, well, data. It is full of data of all possible types - reliably sourced, badly sourced, unsourced, BLP, genes, VIAF, Commons categories etc. People who work here with data know this very well. Some of them found niche uses for some of these data (which is reliably sourced and is in some specific categories). The telescope infobox and {{Authority control}} have been already mentioned; {{Commonscat}} has not been mentioned but is another example. I personally work a lot with one infobox which I am not going to mention here but which pulls two of the fields from Wikidata and I still have to see that it malfunctions. Here we have people working with these templates. We also have people not working with these templates but who are unhappy with Wikidata out of general principle, for example, because visible subjects get often vandalized, and the reaction time to revert vandalism is hours or days, not seconds (or a multitude of other problems). Because of this they vote here 1A, 2A, 3A, 4A, thus basically, not being experts working with acceptable data, telling experts what to do. As simple as that. It is now likely to be closed as no consensus, but I can easily imagine that an increasing participation could shift the result towards total prohibition of Wikidata.--Ymblanter (talk) 16:20, 7 May 2018 (UTC)
- Even if there was a willingness to help I don't see how the culture of wikidata can be helped. I don't think overriding the wikidata community's wishes to generate the policies we want would be a good idea; or to clash with people from other wikis with different standards there. Even if the policies were changed in word, they would need to be enforced and thus the bot operators, admins, and every one else need to change how they work. So there isn't a real way to fix the issues other than importing a lot of enwiki people to completely change the community and standards; I don't see that going very well/happening easily. (and a lack of willingness to help can as well stem from a feeling that it is a waste of effort when we've got a perfectly acceptable system here) Galobtter (pingó mió) 15:02, 7 May 2018 (UTC)
- @Dank: My perspective is that there are experienced Wikipedians working with Wikidata, and that things are always improving there (in particular, arbitrary access was a big technical breakthrough, and the average # of statements/references per item is continually increasing). But those that are complaining about absence/difference in policies often aren't willing to help fix the issues they're complaining about, which is a shame. Thanks. Mike Peel (talk) 14:53, 7 May 2018 (UTC)
- Preferences with comments inline
- 1: D E I believe that Wikiprojects, who are de facto responsible for maintaining the majority of infobox mechanics and infobox data should be allowed to decide on a case by case basis.
- 2: G Again see above.
- 3: A or C
- 4: C Seddon talk 15:22, 7 April 2018 (UTC)
- 1F, 2E, 3D, 4F I fail to understand how anyone could think storing information about e.g. 12,000 genes in anything but a structured form shared across wikis is a good idea if we want the information to be correct and up to date. --D Wells (talk) 16:16, 7 April 2018 (UTC)
- 54367QRTE$#$$$^. I am not sure how this RFC could be more confusing. I agree that a blanket policy for this across all articles is a very bad idea. --Rschen7754 18:57, 7 April 2018 (UTC)
- I was not going to say anything else, but I am disappointed at the direction the discussion is going (though I suppose I should not have expected any differently). I do not think this all or nothing, my way or the highway, "everything must use Wikidata" and "nobody may use Wikidata" approach is going to lead to any consensus in either direction. I would encourage both parties to come to a compromise. After all, consensus is one of the Wikimedia principles. --Rschen7754 07:56, 14 April 2018 (UTC)
- 1B, 2B, 3A, 4C. As it stands question 3 is pointless. All data showing on a wikipedia article is *required* to be compliant with ENWP policies. Until such time as wikidata implements either compliant (or stricter, like Commons) policies, anything drawn from wikidata is required to adhere to local policies. If it isnt, its subject to removal regardless of any vote here. Only in death does duty end (talk) 19:54, 7 April 2018 (UTC)
- 1A, 2A, etc... there are three fundamental problems with using Wikidata: first (and foremost) is the problem of policy compliance (especially in WP:BLP situations)... the second is that since it compiles much of its data FROM Wikipedia articles, using it IN Wikipedia articles creates a circular reference situation... third, figuring out how to correct errors in Wikidata is not intuitive. We really need to stop thinking of information in infoboxes as a collection of “data” (externally stored and imported into our articles by a bot) - the information in an infobox needs to be seen as part of the article TEXT (albeit text presented in a quick-lookup format). It needs to comply with policy, and must be easy to edit (on the page where it appears) if there is an error. Wikidata does not pass either requirement. Blueboar (talk) 20:23, 7 April 2018 (UTC)
- 54367QRTE$#$$$^ This RFC is extraordinarily confusing. If you pull data from wikidata, it should be hardcoded into an infobox so that it does NOT update when Wikidata does. This opens the door to all kinds of hard-to-detect vandalism. I have no problem using wikidata when a better source for that information cannot be found, or the information is uncontestable. Tazerdadog (talk) 20:56, 7 April 2018 (UTC)
- This seems like you are suggesting 4B. {{3x|p}}ery (talk) 22:46, 7 April 2018 (UTC)
- 4B with a caveat that a human needs to be in the loop when the data is added in the first place. A bot mass-importing values from Wikidata unchecked has the potential to be a huge mess. Similarly, care needs to be taken to avoid the problem of circular references Blueboar has accurately identified. Wikidata should be treated as a poor and unreliable (albeit sometimes convenient) source. Tazerdadog (talk) 03:06, 8 April 2018 (UTC)
- 1A or a cautious 1B; 2A or a cautious 2B; 3A or 3B; 4A While there may be limited situations in which Wikidata is useful, we should be very cautious about its use, require consensus for each use and stick firmly to the principle that an encyclopaedia is not merely a collection of data, but should be prose, written, edited and discussed by people, requiring reliable source citations. I've added a more philosophical analysis below. Bondegezou (talk) 21:36, 7 April 2018 (UTC)
- 1F 2E 3C 4E Wikidata is the future. It needs a lot more references up there, but that will only happen once we get going on it and start using the data. I only started using Wikidata because of Template:Taxonbar and now i update it all the time. I see this is the next step for the Automatic Taxobox system as well. Nessie (talk) 23:00, 7 April 2018 (UTC)
- 1A, 2A, 3A, 4A And in passing, I think this is one of the worst-presented RfCs I've ever seen, which is quite a strong statement. It seems likely to further inflame an entrenched division between Wikidata supporters & others, and also to yet-further inflame the arguments over inclusion/not of infoboxes. Espresso Addict (talk) 03:17, 8 April 2018 (UTC)
- 3A is what matters, and everything flows from it. See my note in the discussion please (search for "the wikidata thing is hard"). Jytdog (talk) 04:30, 8 April 2018 (UTC) (added search term :) Jytdog (talk) 23:05, 9 April 2018 (UTC))
- 1C, 2B/2C, 4C, assuming that the community keeps its head on straight when there are RfCs calling for explicit consensus. I don't know where to say this but Wikidata should never be used in an infobox for anything where BLP applies. No way jose. (authority control is obscure enough where it doesn't matter, but infoboxes are up high where everybody sees them and the chance for mischief is too high) Jytdog (talk) 07:44, 7 May 2018 (UTC)
- Bold-revert-discuss is the Wikipedia way. So are verifiability, reliable source, neutral point of view and biographies of living people etc. If something fails one of these it must be revertible from Wikipedia, because these are not necessarily policies of Wikidata, and if one does not know the policies of Wikidata one should not mess with their content. Also Wikipedians must not be coerced to edit Wikidata. We are volunteers and will edit what we choose to edit. Attempts at coercion will drive editors away, which I think is generally accepted not to be a good thing. They also foster resentment and incivility, also generally considered to be sub-optimal. Following this reasoning:
- 1D Passive consensus sufficiently covers BRD, with the proviso that if reverted, discussion is required and if there is no consensus, the default is not include.
- 2C, D, or E by WikiProject consensus. When an article is part of several WikiProjects, the consensus comes from the project for which the article is highest importance, but if this is contended, an RFC would be required. (Also in its way, BRD, but with a default suggestion.)
- 3A or 3B by WikiProject consensus. When there is disagreement between WikiProjects, the information can be seen as challenged, so must have a reference, which pushes it to 3A.
- 4C Local data precedence. When local data are defined they take precedence over Wikidata claims, as this is the simplest way to avoid coercing Wikipedians to edit Wikidata. Cheers, · · · Peter (Southwood) (talk): 07:47, 8 April 2018 (UTC)
- 1A, 2A, 3B and I don't understand the objections to the format of the RfC. Fundamentally, I do not think we should be allowing information that is unsourced into Wikipedia, and that's what Wikidata is: unsourced. Circular with Wikimedia projects, certain types of vandalism would be hard to correct (as you're trying to prove a negative), and this is not providing us with extra data integrity because we would need to include references on en-WP for all information included from Wikidata, and then when Wikidata is edited these references need updating (and users cannot be reliably informed of these changes, due to the poor Wikidata watchlist and non-existent cross-wiki watchlist system). I opine this with reluctance because I like the idea of Wikidata, but it has no sourcing policy. We should not use it anywhere on en-Wikipedia. — Bilorv(c)(talk) 09:53, 8 April 2018 (UTC)
- A cross-wiki watchlist does exist and it is much, much better now. User:Capankajsmilyo(Talk | Infobox assistance) 03:25, 20 April 2018 (UTC)
- 1A 2B 3A 4A per Jytdog, Beestra; wikidata data cannot be monitored - it doesn't feature in watchlists, unless enabled - and then you get flooded with irrelevant language changes; it isn't visible, it's an extra step for people editing to understand how to edit wikidata, systems of Q2131234123 or whatever that is. Bots on wikidata don't need to follow our policies; the same with the data etc. Essentially, if we're displaying data in extremely visible places, it must follow our policies, not wikidata's extremely lackluster policies (hardly, one could say, even the bare minimums of WMF policy on BLPs)
- I'd be horrified if 1F is done; I expect mass confusion, and people not knowing how to even edit the infobox; the editing would be annoying, and it'd be very easy for wikidata's lack of admins, people etc to allow vandalism to quickly surface.
- I think the issues need to be ironed out first, if they can be ironed out. Perhaps some pieces of metadata, that don't really need changing anyhow, can be stored on wikidata, thus 2B Galobtter (pingó mió) 14:41, 8 April 2018 (UTC)
- Actually, amend to triple strong oppose use of wikidata per my comment here - where I again see how clear it is that there is no vetting at all of what occurs there, images being added if they vaguely match with no checking if it is even correct. I think I may even retract that 2B - I don't think anything on wikidata can be trusted to remain correct, and not be unvettedly mass changed, vandalized and what not. Galobtter (pingó mió) 03:58, 12 April 2018 (UTC)
- 1F: roll-out Wikidata infoboxes are an invaluable tool that Wikipedias should start to adopt.
- 2D: opt-out 1 local consensus should be able to override or remove inaccurate entries in Wikidata.
- 3B: Policy compliant (contentious only) Unsourced controversial information should obviously be removed. But information that is sourced elsewhere in the article shouldn't need citations, because like the lead, the infobox's job is to summarize the article.
- 4C: local data precedence Wikipedia editors should be able to easily override Wikidata information based on local consensus. AdA&D ★ 16:41, 8 April 2018 (UTC)
- 1A, 2A, 3A, 4A No use of Wikidata in infoboxes at any time, all information to be generated and controlled locally. Beyond My Ken (talk) 18:32, 8 April 2018 (UTC)
- 1A, 2A, 3A, 4A. I don't think Wikidata values are currently sufficiently well watched or sourced for any use in live Wikipedia, in info boxes or articles, and i remove such use when i see it. If people insist on SOME use, then 1C, 2B, 3A (3A is not negotiable IMO) 4C AND 4B. I don't trust Wikidata beyond this. DES (talk)DESiegel Contribs 03:00, 9 April 2018 (UTC)
- 1A, 2A, 3A, 4A per DESiegel. Like Bilorv I don't understand the objections to the format of this RfC; I don't think it's confusing at all. Double sharp (talk) 04:49, 9 April 2018 (UTC)
- 1A, 2A, 3A, 4A. I think this RfC should have been a second step, not a first one, as it focuses only on infoboxes and not on the more general issue of Wikidata use in enwiki. But since we have it, the above are my preferences, with 3A only if 1A and 2A are not accepted of course, and with for "references" read "reliable references, not wikipedia, findagrave, familysearch, ...". As we can't disallow the use of these sources on Wikidata (which has its own policies and doesn't need to comply with our sourcing requirements), it would probably necessitate the creation of some "infoboxsource-blacklist", which would only create additional drama and overhead. For most information, there is no reason to use Wikidata in the infobox instead of local data, and a lot of reasons not to (confusing for newish editors and often for long-time editors as well, absent from page history, more risk of info conflicting with text of article, ...). Wikidata in infoboxes (and elsewhere in enwiki articles) also overrules our blocks and page protection. Fram (talk) 12:32, 9 April 2018 (UTC)
- @Fram: So, Fram, 12000 infoboxes for genes. Are you *personally* going to undertake to update those, when they get updated on Wikidata? Or should they just be left to rot? Jheald (talk) 12:47, 9 April 2018 (UTC)
- Jheald, no idea what you hope to achieve with this kind of aggressive questioning. Do you personally maintain all infoboxes in articles with Wikidata-driven infoboxes we have now? No? Then, according to your logic, you have no right to support the use of such infoboxes? Or what? Do you guarantee that, when infobox data gets updated on Wikidata, it is actually an improvement and not vandalism or an error? Please, let's discuss the use of Wikidata in infoboxes without making people personally responsible for either site/side. Fram (talk) 12:55, 9 April 2018 (UTC)
- Okay, so you're not going to personally own the consequence of your !vote. So who do you think would undertake this make-work that you would create? And how, given a vote for 1A incorporates a blanket ban on bot-importing from Wikidata? Jheald (talk) 13:25, 9 April 2018 (UTC)
- There is no blanket ban on bot-importing data from Wikidata: adding data from Wikidata directly into infoboxes (into the code of the enwiki article, not through a call which is run every time someone sees the article) is not discussed in this RfC. The question is about showing data which is stored on Wikidata, not about importing data from Wikidata into enwiki. Of course, when one imports data from Wikidata in this way, one can wonder why one shouldn't simply import the data from the reliable source behind the Wikidata data in the first place, reducing the chances that it has been vandalized between the import in Wikidata and the import in enwiki. But all of this is off-topic for this RfC. Fram (talk) 13:32, 9 April 2018 (UTC)
- 1E, 2D or 2E, 3D, 4 don't know, my observation is that Wikidata is a newer project which is growing and improving rapidly and that some of the complaints against Wikidata may become invalid through documentation and technical development. It appears as though at least some of the resistance to use of Wikidata data on en.wiki is the feeling of loss of control of what is displayed on Wikipedia. My strong suggestion is that Wikidata instructions on basic editing and functions are improved. I'm unsure what this rapid development means for this RFC. John Cummings (talk) 15:43, 9 April 2018 (UTC)
- 1B, 2C, 3B, 4B. My main concerns with wikidata on WP is that vandalism is not as easily noticed and removed because it's not going to be on watchlists, and a lot of the data is unverified. 3B and 4B mitigate this. Natureium (talk) 17:22, 9 April 2018 (UTC)
- 1A, 2A, 3B, 4A; I echo the concerns of Bilorv, BMK, and others. En.wikipedia's ability to fight vandalism is sufficient, but barely so. I've seen no reason to suggest that Wikidata is at least as reliable; rather the opposite. The concept of Wikidata is wonderful. I really like it. The reality is...something else entirely. --Hammersoft (talk) 19:44, 9 April 2018 (UTC)
- 1A, 2A, 3B, 4 A-or-B. There are pros and cons with using wikidata, however aiming for some kind of 'middleground' just creates a mess with the worst of both worlds. It leaves us with a wasteful back-and-forth struggle, trying to drag the status-quo towards one of two inevitable stable endpoints. Either we don't use wikidata and we avoid all of the downsides, or we fully embrace wikidata and accept all of the downsides to fully realize any benefits it may bring. Any middleground is an unstable mess. I've been studying wikidata and involved with it, and the problems outweigh any asserted advantages. Using wikidata is like a giant automated bot silently importing remote content - except that content doesn't even show up in the article-source. You can't see it on edit, you can't change it on edit, you can't even find it with the search engine - because it's not there. It only exists in the read-view. The automated import bypasses all of our review and control processes, including page-protection and even user-blocks. A banned edit-warrior can bypass a fully protected page by editing the content on Wikidata, and it gets automatically imported. Options 3A and 3B aren't even possible, when edits at wikidata are auto-imported to here. It is also a serious problem that wikidata content is not subject to our policies, guidelines, and norms. The norms and standards and goals of the wikidata community are incompatible with our own - this includes the prevalent and abhorrent disdain for BLP, atrocious expectations on sourcing, lack of regard for promotional content, poor handling of vandalism, and more. The wikidata community is almost more a bot farm than anything else. Wikidata makes it extremely difficult to impossible to deal with unusual infobox values, such as date of birth "1882 or 1883".
We seriously do not need to confuse new users with magical-content that doesn't exist in the articles, and which they have to learn an entirely different pain-in-the-ass system to edit the content. And not just new editors, some experienced editors have decided that they are uninterested or unwilling to edit the content over at wikidata. And I'll implicitly echo the critical concerns posted by others. Alsee (talk) 22:28, 9 April 2018 (UTC)
- 1B, 2B or 2C, 3A, 4C or 4D. Generally support local consensus for some areas, especially at Wikiproject level, that might allow more use, or less. @User:Jheald, if you think the medicine project "probably also have the capability to tightly monitor 'their' items on Wikidata if they want to" - have you asked them if they agree with that? I think you'll find they strongly disagree. Most people still have no idea how to alter anything on Wikidata, which I know puzzles adepts. But then article-writers are puzzled why many other editors are so reluctant to write anything in text (on article pages rather than talk pages). Johnbod (talk) 02:00, 10 April 2018 (UTC)
- 1C or 1D, 2D, 3D, 4C or 4D -- Generally, in certain domains, Wikidata data is good as or better than existing data on EnWiki -- we need to start getting eyes on that data, and letting Infoboxes incrementally, and through local consensus develop an understanding of what is appropriate for each local group of data. Discussing incrementally fields, and adding those to an infobox is the only way for us to take advantage of that latent value -- especially around things that we can't reasonably maintain locally. WikiProjects and other working groups which maintain the infoboxes should defacto be responsible for the way in which data should be used locally -- and we should empower them to make these decisions, Sadads (talk) 02:30, 10 April 2018 (UTC)
- 1D, 2E, 3C, 4other -- I would prefer a (much) stronger interpretation of 'experiments' and go with 1A (to include experiments of whole series of articles). But 1D comes closest to that for now. An RfC every time is overkill. For question 2, option D is simply most practical (assuming 3 is strict enough). The box should only be implemented when there's sufficient reliable data on Wikidata across the series, and on a field-by-field basis one could determine where that is the case. If the data is only strong enough for opt-in: don't do it. If we know a better answer: include it in the article (or better: improve Wikidata!). Sourcing (3C) should be enforced strictly, but again in a practical sense. I will assume however that all circular references are left out of consideration (no references to Wikimedia projects or major crowdsourced projects like IMDB etc). This seems a big enough endeavor to figure that out. For 4, I want whatever is necessary to effectively implement 2E (none of the options in 4 seem to support that?). effeietsanders 17:02, 10 April 2018 (UTC)
- 1D, 2G, 3G -- We should do our best to treat Wikidata usage as any other issue on Wikipedia. Case-by-case judgements and implicit consensus are king. No comments regarding 4. A very important use-case is Template:Infobox anatomy, which uses Wikidata for Terminologia anatomica entries, which are both highly reliable, and very useful. There is less control on Wikipedia than on WikiData regarding these entries. Carl Fredrik talk 20:48, 10 April 2018 (UTC)
- 1A, 2A Wikidata is a wiki that "anyone can edit" and is thus an unreliable source. Per WP:V, an English Wikipedia policy, "In Wikipedia, verifiability means that other people using the encyclopedia can check that the information comes from a reliable source" (my emphasis). Wikidata should not be used as a source for anything in article space whether in infoboxes or elsewhere. --Malcolmxl5 (talk) 01:11, 11 April 2018 (UTC)
- @Malcolmxl5: No one is proposing using Wikidata itself as a source, all statements imported into enWiki infoboxes are themselves referenced in wikidata so anyone can check the source of the information. --D Wells (talk) 01:31, 11 April 2018 (UTC)
- @D Wells: En.wikipedia is an encyclopedia that anyone can edit - still we have much info that is referenced deeply and properly, and absolutely correct. That still does not make Wikipedia a reliable source. It is easy to scrape information from imdb and put it on my private blog. That does not make imdb a reliable source, it does not make my private blog a reliable source. If I copy a PD list of data from a reliable source and put it on my blog, then my blog is still not a reliable source. If data on WikiData is reliably sourced it is still not a reliable source of data. And neither the current data structure nor the current policies and guidelines of WD allow for that. --Dirk Beetstra T C 06:46, 11 April 2018 (UTC)
- @Malcolmxl5 and Beetstra: You're mixing 'places where the data is held' and 'places that provide authority that the data is reliable'. Think of it more like the template namespace - you can transclude content from there, but you're not using that as a reference/authority. The authority is through the reference to the original source, and Wikidata's data structure supports references (and we can even show those references here if we want). Thanks. Mike Peel (talk) 16:02, 11 April 2018 (UTC)
- Err... support Wikidata in infoboxes generally, I guess? The format of this RfC is extraordinarily bad. There is far too much background and it's far too technical. We should be commenting on a single well-defined issue, not being offered multiple choice solutions to four disparate ones. I don't see how you can maintain any illusion that this is a discussion and not a poll. I pity whoever ends up having to close it and I would seriously challenge the idea that the result reflects any sort of reasoned consensus. – Joe (talk) 20:27, 11 April 2018 (UTC)
- 1F, 2D, 3D, 4C I think Wikidata is a super valuable resource which we should use, but local fields should still be used for now unless Wikidata has the sourced information. Daylen (talk) 04:51, 12 April 2018 (UTC)
- 1F, 2C, 3A*, 4C** *=remembering that wikipedia does not consider wikipedia to be a reliable source. **=allow local overrides with the expectation that there is also a human-readable explanation for the override somewhere standardised. Stuartyeates (talk) 09:40, 12 April 2018 (UTC)
- 1A, 2A, 3B, 4A, we have a bad enough problem with accuracy of information on Wikipedia without having to police further inaccuracies in Wikidata and infoboxes. SandyGeorgia (Talk) 11:52, 12 April 2018 (UTC)
- 1A, 2A, 3A, 4A, per several above. WD is a separate project with near a zero focus on referencing or reliability. It's a limited concept, not a holy grail, and while it contains data, it does not provide knowledge or understanding. - SchroCat (talk) 14:48, 12 April 2018 (UTC)
- 1A, 2A, 3B, 4A per all concerns with referencing and accuracy. Wikipedia risks becoming even less reliable. I have no idea how and no inclination as to how I might change wikidata if I found an error. J3Mrs (talk) 16:54, 12 April 2018 (UTC)
- I had just that problem today. Any central location for reporting errors there is very well-hidden. But points made at https://www.wikidata.org/wiki/Wikidata_talk:Community_portal#Where_to_report_errors_in_the_data? do seem to get addressed, sometimes the same day, sometimes after weeks or months. Johnbod (talk) 17:07, 12 April 2018 (UTC)
- 1A / 2A / 3A / 4A. That is to say, at no point in time should Wikipedia content auto-generate based on Wikidata. We struggle with content verification and vandalism reversion here. This makes an already difficult problem worse by requiring editors to become conversant with an entirely separate project (one with very idiosyncratic rules and formatting!) and by somewhat reducing the ability of invested editors to monitor articles via the watchlist. Furthermore, I strongly oppose permitting individual projects to, effectively, set their own policy-level standards and practices in this manner. Wikidata is a fascinating aspirational project, but it is a different project than Wikipedia, with different goals, expectations, and standards. I can conceive of a day in the future where those goals and standards align and we're able to work with them in a more integrated fashion. But that day is not today, and is not in the altogether near future, either. Squeamish Ossifrage (talk) 20:53, 12 April 2018 (UTC)
- 1D / 2G / 3G / 4D I related some cases which importing data from WikiData (mainly at anatomy field). As far as I committed, There are no "unreferenced data" or "neglected low quality data". If importing data is unreferenced or low quality (or there is no meaning to use WikiData), I simply oppose such data import. --Was a bee (talk) 09:34, 13 April 2018 (UTC)
- 1A, 2A, 3A, 4A. Information should be handled locally. There is no reason to be presenting data from an entirely separate project as our own when we have no control over it. Importing data makes it extremely difficult for editors to track via watchlist what an article is saying, increases the chances of self-contradictory articles, and makes it considerably more difficult for editors to track down and change incorrect information (and anonymous/new editors might as well give it up as a lost cause). The Wicked Twisted Road (talk) 18:04, 13 April 2018 (UTC)
- 1A, 2A, 3A, 4A. The problems over accessibility of the information to edit and the restrictions on what is available in wikidata makes it almost impossible to use in a meaningful way. As an example, on references/dates being pulled over from wikidata there has been no thought about how these are going to be synchronised with the style of the article they are being pulled into. For examle should they be in short format, should they use last/first style, first/last style or vauthor style for authors etc. How about handling of named references to enable consolidating of references in infobox and body? Far too many problems to enable this to be use productively. Keith D (talk) 23:06, 14 April 2018 (UTC)
- @Keith D: On reference/date format: there has been thought about that, and things like mdy vs dmy are already implemented. References are more complicated since there are so many different styles, but in principle it's possible to match those since Wikidata contains structured info that can be reformatted. In practice that's been avoided thus far since there were other issues to deal with first, and it would ideally be better if we just used a single standard reference format. Using the same reference in both the infobox and the body (using cite web) is already possible if both are pulled from Wikidata, but refs are currently disabled in most Wikidata-enabled infoboxes by default (set refs=yes to show them in e.g., telescope infoboxes). If we go with the A options, then these issues are unlikely to be solved in the near future, though, since there's then no motivation to do so. Thanks. Mike Peel (talk) 21:44, 16 April 2018 (UTC)
- Just to note there are 5 possible formats for dates, the DMY & MDY versions also have long & short month variants & then there is also the ISO variant. Keith D (talk) 10:49, 17 April 2018 (UTC)
- And that is just the Gregorian calendar .. where Hijri may be more appropriate in some cases, as well as cases of 'the first day of March'. Now I agree that past dates are pretty immutable, display in an infobox would probably require a 'date-display_option = DMY' to follow the state of the page. --Dirk Beetstra T C 10:54, 17 April 2018 (UTC)
- @Beetstra and Keith D: In, e.g., {{Infobox person/Wikidata}}, the parameter is "dateformat = dmy" or mdy. There hasn't been demand for other formats, so I don't think they're coded up, but I suspect it's straightforward to add them if needed. Thanks. Mike Peel (talk) 22:06, 17 April 2018 (UTC)
- The other 3 formats are specific to references. Keith D (talk) 22:59, 17 April 2018 (UTC)
- @Mike Peel: ‘it’s straightforward to add them if needed’ yes, 800 wikis have to program each template to accommodate their needs, and editors have to figure out how to use the parameters correctly. —Dirk Beetstra T C 03:37, 18 April 2018 (UTC)
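For readers following the date-format thread above, the five variants Keith D lists (long and short DMY, long and short MDY, and ISO) can be sketched roughly as below. This is only an illustration in Python, not the code used by any Wikidata-enabled infobox. The keys "dmy" and "mdy" come from the `dateformat` parameter mentioned above; the keys "dmy-short", "mdy-short" and "iso", and the function name `format_date`, are assumptions made up for this example.

```python
from datetime import date

# English month names, written out so the output does not depend on the locale.
MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def format_date(d: date, style: str) -> str:
    """Render a Gregorian date in one of five display styles.

    "dmy" and "mdy" mirror the parameter values quoted in the discussion;
    the other three keys are hypothetical names for this sketch.
    """
    long_name = MONTHS[d.month - 1]
    short_name = long_name[:3]  # e.g. "Apr", "Sep"
    if style == "dmy":
        return f"{d.day} {long_name} {d.year}"
    if style == "dmy-short":
        return f"{d.day} {short_name} {d.year}"
    if style == "mdy":
        return f"{long_name} {d.day}, {d.year}"
    if style == "mdy-short":
        return f"{short_name} {d.day}, {d.year}"
    if style == "iso":
        return d.isoformat()
    raise ValueError(f"unknown date style: {style}")


if __name__ == "__main__":
    example = date(2018, 4, 17)
    for key in ("dmy", "dmy-short", "mdy", "mdy-short", "iso"):
        print(key, "->", format_date(example, key))
```

The sketch covers only Gregorian dates with full day precision; it does not address the Hijri calendar or imprecise dates raised by Dirk Beetstra.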
- 1F, 2F, 3E (or 3D), 4F (or 4E), but most importantly implement editing Wikidata values in the local template-editing interface. Any policy issues can be resolved by talking things out over at Wikidata, in which case there's no real difference from local policy discussions. Doing structured-data work locally is a duplication of effort since Wikidata is already doing it. —{{u|Goldenshimmer}}|✝️|they/their|😹|T/C|☮️|John 15:12|🍂 02:58, 16 April 2018 (UTC)
- 1C, 2F, 3A (or 3D), 4F (or 4C) - In situations where Wikidata has references and is unlikely to be controversial or a target of spam/vandalism, I see no reason not to use it. There's no good way to tell which infoboxes fit that criteria apart from a per-infobox affirmative consensus; relying on "passive consensus" seems like it will make arguments worse when they eventually happen. If/when tooling is improved (i.e. to allow watchlists to detect changes cascading from Wikidata) and other problems are known/fixed, it may be worth having a follow-up discussion. Content (at least for BLPs) must be referenced, beyond that I see no reason to do a half-rollout. power~enwiki (π, ν) 00:24, 17 April 2018 (UTC)
- 1A, 2A, 3A, 4A - It is a separate project and one with insufficient sourcing requirements and oversight. We don't need to do this to appease our corporate overlords. Carrite (talk) 01:54, 17 April 2018 (UTC)
- 1A, 2A, 3A, 4A - Wikidata is useful for someone, somewhere, but not us. We should keep local control of all our information, rather than farm out the work to a project whose long-term viability I'm nowhere near convinced of. Courcelles (talk) 20:04, 17 April 2018 (UTC)
- NOTE: Some !voters below may have arrived via CANVASS. This RFC was posted & linked on the Wikidata central Project Chat at 04:57, 18 April 2018, section title Enwiki RFC raises concerns. [1] Multiple responses below are from users with greater than 98.4% Wikidata edits / less than 1.6% EnWiki edits (when considering Wikidata edits & EnWiki edits only). Alsee (talk) 10:51, 25 April 2018 (UTC)
- Threaded discussion moved down to Status section.
- 1D, 2D, 3C, 4C Based on my experience with WD implementation on cswiki. The implementation was based on RfC (closed in September 2016 after 3 months of discussion), in which there were many concerns of vandalism etc. Conversions started on minor infoboxes with limited audience, but nowadays almost all of the most popular infoboxes are using WD. Concerning the vandalism, from my point of view (2,100 pages on watchlist on enwiki and 19,971 items on watchlist on WD) the rate of vandalism is considerably lower on WD in comparison with enwiki (constraint violations system helps a lot) although the response rate to vandalism is better.--Jklamo (talk) 09:03, 18 April 2018 (UTC)
- 1A. This is an overly complex RfC. All those pen icons in infoboxes are an eyesore. There is one "edit source" / "edit" tab at the top of each article, and often each section, and that should be a sufficient interface to edit the encyclopedia. I think Wikidata may be most useful as a quality-control tool for Wikipedia, but the thrust of this may lead to quality degradation rather than improvement. In an encyclopedia that anyone can vandalize, an independent Wikidata repository can serve as a useful tool for vandalism detection and reversion. wbm1058 (talk) 18:23, 18 April 2018 (UTC)
- 1A. This is a solution in search of a problem. Use of Wikidata in infoboxes introduces all kinds of user experience problems; where's the magic data coming from? How does a user add a parameter locally when there is nothing but one template call representing the whole infobox?--a situation I find confusing and frustrating even when I entirely understand what is going on under the hood. The ridiculous little "pencil-to-edit" after every field in the box has no realistic use: infobox data is largely static, and the only thing that the pencil does is make vandalism easier, and we won't be able to see that vandalism in a straightforward way, while we can now. Editorial control of what fields are shown becomes difficult, with the inevitable creep of more and more Wikidata properties showing up without knowledgeable editors attached to a given article x ever being made aware of a change to that article's infobox via their watchlist. I get that infoboxes from Wikidata may be very useful on small projects, but they are not useful here. I have never seen so much pushing of a tech for its own sake (many other initiatives are tied, though). These WD usage proposals are a massive change for no advantage to a status quo that works. Today I happened on a telescope article pulling everything from Wikidata, including an image. I wonder how long I could get away with a vandalized image there. As a Wikipedia editor, under these proposals I cannot edit that image. I cannot see the filename. A Wikidata user I must become. Come the F on. Massive increases in complexity that make a Wikipedia editor's life harder. I get that plenty of people think that the proposed integration is "really neat", technologically. So do I, but if organizations ran on "really neat" they'd be ruined. Among other problems, the downside from potential diffusion of inaccurate data is just enormous. I have reviewed many Wikidata entries in my time and many of them have questionable assertions. 
(Note to a bot: the artist was a "follower of artist X", not "artist X", according to your reference URL--but go ahead and pollute a few dozen entries like that based on a scrape of a European museum web site.) The easiest path to imagine is that Wikidata becomes a less and less reliable concoction of true and false data over time, as the number of Q-items grows. Outriggr (talk) 02:40, 19 April 2018 (UTC)
- 1F, 2D, 3G, 4C. I don't like how this RfC is built. It's too complex, some choices are absurd and half the choices permit to ban the use of Wikidata. For me the goal of this over-complexification is to remove the possibility of the use of Wikidata information. The people who don't want to use Wikidata just have to type 1A, 2A, 3A, 4A, but the people who want to use Wikidata have to carefully read and select their choices... --Nouill (talk) 03:27, 19 April 2018 (UTC)
- 1A, 2A, 3A3B, 4B The potential for abuse or vandalism is, per below discussion, high and the benefits would be low. Question 3 is a non question - all content should conform to policy, no matter where it comes from. 4B allows rapidly importing data without all of the inconvenience - and allows for quick and easy modification should the information be incorrect. Moreover, the proposed implementation of a bot to monitor eventual inconsistencies (should consensus be something other than 1A, 2A, 3A) would allow fixing monitoring Wikidata more easily, which could possibly mitigate (in my opinion not enough, but I digress) some negative consequences. 198.84.253.202 (talk) 04:09, 19 April 2018 (UTC)
- 1A, 2A, 3A, 4A Per Dirk, Carrite and Courcelles. shoy (reactions) 19:40, 20 April 2018 (UTC)
- 1A, 2A, 3A, 4A Information displayed on enwp needs to be subject to enwp policies, with any disputes resolved on enwp. Kanguole 12:31, 21 April 2018 (UTC)
- 1A, 2A, 3A, 4A - Wikidata IMHO should stay off Wikipedia as a whole - All information displayed in the infobox needs to meet our policies and guidelines. –Davey2010Talk 13:01, 21 April 2018 (UTC)
- 1D, 2C, 3A, 4C [This is pending the fix below] The ability to have Wikidata changes only to properties used within the "English article" appear in the watchlist adds sufficient quality assurance that I am now happy to support the use of Wikidata items within Wikipedia with consensus. Doc James (talk · contribs · email) 20:17, 22 April 2018 (UTC)
- I tried to test that today, User:Doc James, by putting Weißenhorn on my watchlist. Seen that there were 13 edits to that page today, that I have enabled the settings to see changes to WikiData items in my watchlist, and that my watchlist spans now from 9:34 to 02:29 (the edits were at 9:30/9:31) it seems strange that I am not seeing anything in my watchlist. --Dirk Beetstra T C 06:46, 23 April 2018 (UTC)
- User:Beetstra Yah was testing and noticed this as well. Spoke with and showed the problem to the programmer on these efforts and they say they will have the fix rolled out in a week or two. My response depends on this being fixed. Doc James (talk · contribs · email) 08:17, 23 April 2018 (UTC)
- User:Beetstra it is now working mostly. Have requested that changes to aliases for languages other than EN not show up. Doc James (talk · contribs · email) 21:24, 13 May 2018 (UTC)
- 1D/E, 2E, 3D, 4C Wikidata is ready to be a valuable complement for several infoboxes organically, but not others yet. It's only practical that we broaden its use over time as it is improved, and as more features like the improved watchlist functionality mentioned above by @Doc James: become available. It is important that local data always takes precedence, though.--Pharos (talk) 19:26, 24 April 2018 (UTC)
- 1A, 2A, [3A], 4A. Disclosure: helped prepare this RfC, that is, until this happened (5 March), so I certainly didn't come here from an outside canvassing operation. For the record, I agree with several commentators on this page that the RfC setup is below par. I tried to remedy that, that is, until I gave up after 5 March: it was simply not possible to get this RfC in some or another acceptable shape, at least all my efforts in that sense failed, so this is what we'll have to work with. My choice is partially based on that experience: I was working towards some intermediate solutions (i.e. somewhere in the middle between "A" and "F" options): lack of constructive collaboration towards an acceptable "middle of the road" set of options made clear that nobody wants it – the dominant feeling being "all or nothing" on either side. Then for me too the choice presented itself as "all or nothing". For me it's "nothing" then in the infoboxes realm. Infoboxes are a contentious realm in Wikipedia (ArbCom procedures etc), and there are always at least a few bombs ticking towards their next explosion. Wikidata being contentious too in English Wikipedia, mixing the two together leads to a compound that is ready to explode any time, without even a preliminary ticking sound. For me Wikidata, seen from English Wikipedia's perspective, is an excellent authority control system, i.e. a connector box (interwiki links, connecting to other authority control systems like BnF's), with no content (i.e. other than authority control numbers and names of articles in other wikis) in its own right that would be useful for English Wikipedia, at least no such other content in its own right to which English Wikipedia should link with a live connection. Too many problems. Please see the last panel of this comic which explains part of my rationale. 
The "3A" option is in square brackets while in the current confusing layout of the RfC the "3A" option (confusing "consequences" with "prerequisites") can only be combined with "2C" (which I don't support). --Francis Schonken (talk) 10:47, 26 April 2018 (UTC)
- 1D/E, 2E, 3D, 4C — Passive consensus is fine & separate templates should be created for templates that have particularly high number of uses; it's the easiest way to demonstrate the potential of WD infoboxes with minimal disruption. We shouldn't use blank fields to suppress Wikidata (although we do need an education campaign to explain this to surprised editors). We should allow editors to simply override Wikidata values, and shouldn't change Wikipedia data automatically.--Carwil (talk) 11:40, 26 April 2018 (UTC)
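The precedence rule described by Pharos and Carwil above (a local value always overrides Wikidata, while a blank local field does not suppress Wikidata) can be illustrated with a short sketch. This is a hedged example in Python, not the logic of any actual infobox module; the function name `resolve_field` and the treatment of whitespace-only values as blank are assumptions for illustration.

```python
from typing import Optional


def resolve_field(local_value: Optional[str],
                  wikidata_value: Optional[str]) -> Optional[str]:
    """Pick the value to display for one infobox field.

    Assumed rule for this sketch: a non-blank local value wins;
    a missing or blank local value falls back to Wikidata, so a
    blank field does not suppress the Wikidata value.
    """
    if local_value is not None and local_value.strip():
        return local_value.strip()
    return wikidata_value


if __name__ == "__main__":
    # Local value present: it overrides whatever Wikidata says.
    print(resolve_field("1 January 1990", "2 January 1990"))
    # Blank local field: the Wikidata value is shown.
    print(resolve_field("", "2 January 1990"))
```

Under the opposite convention, where a blank field is used to suppress Wikidata, the blank case would have to be handled separately from the missing case; the sketch deliberately does not do that.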
- 1A, 2A, 3A, 4A. Infoboxes are, in the main, intended to be a summary (and not an addition) of key features in an article. Almost all of these key features should, ideally, be in the text of the article, and be supported by inline citations in the article, with the citations including at least some basic information (primarily title, author, date, publisher, page number/location, and a url if available). Certain infoboxes are subject to further restrictions, for example, WP:BLP, which requires high quality sources. The current implementation of Wikidata-enabled infoboxes allows the inclusion of data from Wikidata, with the only requirement being that the data item has some sort of reference (a bare url or a reference to an existing Wikidata item suffices). There are no checks to see if the data is mentioned in the article, is the same as shown in the article, or is referenced in the article.
- Furthermore, any changes to the data in Wikidata are displayed automatically, with no hint to anyone watching the Wikipedia page that something has changed. And yes, I'm aware that I can watch the wikidata item, but the working of this seems to have a severe time lag, and also floods me with information such as label changes in other languages, additions of unreferenced data etc that have no bearing on the wikipedia article. Also, if I do see a wikidata edit that has an impact and needs changing I have to go and edit the item in Wikidata. I've done a fair amount of work on Wikidata, so I'm capable of this, but I totally understand the reluctance of some editors to get involved in this process, and it should not be a required skill for editors who wish to change something on a Wikipedia page.
- And also, Wikidata (like Wikipedia) cannot be regarded as a reliable source. Although these, and other crowd-sourced projects, may often be a source of useful information (and even include appropriate references), we should not take them on face value. Instead, we check the given source and, if appropriate, use that to verify, and provide citations for, information in the article. It should be noted that Wikidata has no policies on Verifiability (there is a proposal which only has a handful of edits since January 2015), copyright violation, or sensitive data on living persons.
- In summary, the current implementation of Wikidata in infoboxes stretches, sometimes to breaking point, Wikipedia content policies on Verifiability and Biographies of living persons, and severely restricts the conduct policy of Consensus, and provides new and interesting ways of breaking the policy on Vandalism. It also drives a horse and cart through MOS guidelines on Infoboxes and the content guideline on identifying reliable sources which specifically states that a wikilink is not a reliable source, yet the current implementation implies that a link (the pencil icon) to an external (sister) project is somehow a sufficient reference.
- And yes, I'm well aware that the standards of use of infoboxes within Wikipedia is not as good as it should be, but providing more ways of getting them wrong is not the way to improve them.
- BTW I think Wikidata is an interesting, useful resource that could be used extensively in Wikipedia. See my addition (A third way?) to the discussions below. Robevans123 (talk) 15:06, 27 April 2018 (UTC)
- 1E 2C 3A 4C Use of wikidata should be permitted with oversight from an editor that chooses to use it. Acebulf (talk) 02:57, 30 April 2018 (UTC)
- 1D 2G 3C 4C My primary motivation is that I think centralising this information is long term (next 20 years) required to sustainably evolve and grow English Wikipedia and other language versions of it (esp with the slow decline in editor activity continuing). I think it is therefore important that English Wikipedia leads the way in developing best practices for every Wikimedia wiki to be able to access all this information. This will long term free up volunteer hands to work on that where their effort will add the most value. I appreciate the host of problems this introduces and the uneasiness this causes for some editors. But the only way to work on groundbreaking things like this is to continuously experiment with them, to find their real world limitations, to discuss and to discover and work on new ideas. By excluding such activity, however difficult or occasionally disruptive as it might be, we relegate ourselves to becoming a museum. And when we exclude ourselves from the process, we exclude ourselves from having influence over the long term shape and form of WikiData. I see most of this discussion as a very reactionary, activistic, singular and needlessly agonistic soapboxing of (perceived and real) problems. I'm not in favour of widely rolling out Wikidata everywhere in the wiki, but I'm squarely against excluding it. —TheDJ (talk • contribs) 09:52, 30 April 2018 (UTC)
- I haven't responded to any !vote so far, but this one really takes the piss. "a very reactionary, activistic, singular and needlessly agonistic soapboxing", really? "the only way to work on groundbreaking things like this is to continuously experiment with them, to find their real world limitations, to discuss and to discover and work on new ideas." After five years of such "continuous experiment", there may well be a time to take a step back, see what has been achieved so far and what has been proven to be unworkable, and to decide that it was nice to experiment, but that it didn't work out and that the disadvantages are too severe. That's not "very reactionary" etcetera, that's normal practice. An experiment only makes sense if people are afterwards allowed, without being needlessly insulted, to decide that "no, it didn't work for us". You may disagree, and you may perceive the advantages as more important than the disadvantages, but this discussion has been for the most part one of the more calm, balanced ones about the use of Wikidata on enwiki. "And when we exclude ourselves from the process, we exclude ourselves from having influence over the long term shape and form of WikiData." So what? We already don't have such influence, and a sudden influx of enwiki editors to try to get such an influence would not be welcomed, just like a sudden influx of Wikidata editors to influence and shape our policies and future wouldn't be welcomed. Their policies on protection, BLP, notability, blocking, reliable sourcing, ... basically, their policies and practices on nearly everything are completely different from ours, and this difference hasn't decreased during the 5 years of experiments with infoboxes and so on we already had. "I think it is therefore important that English Wikipedia leads the way in developing best practices for every Wikimedia wiki to be able to access all this information." 
The needs of enwiki are completely different from the needs of small language versions. For many small language versions, the choice is between no data or automation-generated data, and Wikidata may play a role in this if their needs are bigger than their standards of reliable sourcing and so on. For enwiki, the situation, even with the stabilization of our editor base, is quite different, and in general using Wikidata is a step backwards wrt sourcing, maintenance, quality, ... and at the same time alienates a serious part of the remaining editor base. You won't increase your editor base by adopting and pushing something which has so much opposition after so many years, you will simply replace human, collaborative work, the basis of what enwiki is, with bot-generated content of dubious quality and with a lot less control over it. Basically, you will in the long run turn enwiki into a clone of Wikidata (just look at the auto-generated "articles" Wikidata and its opponents are pushing). Which may not be reactionary, but certainly is activistic, singular, and needlessly agonistic (and antagonistic and agonizing). Fram (talk) 14:15, 30 April 2018 (UTC)
- It is no surprise to me that you do. It is clear that we have very different viewpoints about what is best and likely for the future of Wikipedia. —TheDJ (talk • contribs) 14:47, 30 April 2018 (UTC)
- Yes, but I don't describe people with an opposing viewpoint and their comments as "a very reactionary, activistic, singular and needlessly agonistic soapboxing". If such comments by pro-Wikidata people would be the norm, I would have a very bleak view of "the future of Wikipedia" and the retention of our editor base. Luckily most people, from both sides, have more respect for opposing viewpoints than what you expressed here. Fram (talk) 15:44, 30 April 2018 (UTC)
- The rhetoric here really doesn't help. From either side. But I have to call out Fram here - please could you also be more respectful of opposing views here? Thanks. Mike Peel (talk) 01:28, 5 May 2018 (UTC)
- Which opposing view have I been disrespectful of? The only disrespectful vote was the one by DJ, and I called him out on that. I have not commented on other votes just because everyone is entitled to their opinion without being called "reactionary" etcetera. No idea why you have to call me out and not the DJ, apart from the fact that my vote doesn't align with yours... Fram (talk) 15:54, 5 May 2018 (UTC)
- Yes Wikidata, 1F, 2F, 3A, 4E Infoboxes contain structured data. Anyone who wants to edit structured data should begin to expect that they should take their editing to Wikidata, and phase out the practice of Wikipedia's way of managing structured data. Many aspects of Wikidata are not mature. Infoboxes on any language Wikipedia should facilitate very easy editing in Wikidata, and anyone comfortable editing Wikipedia should also feel comfortable editing Wikidata. The interface we have now does not permit easy Wikidata editing of infoboxes on Wikipedia, and I recognize that as a barrier to adoption. Still - the future is in Wikidata. I !voted for maximum Wikidata adoption, but at same time, I know that we have to pace the roll out with experiments and staggered changes. Wikidata has huge potential to increase quality control and the scope of Wikimedia coverage. I see its integration as inevitable in the near future. The sooner we scale up experiments with it, the smoother the transition will be, and sooner we greatly improve the quality and scope of English Wikipedia coverage and all language Wikipedia coverage. For roll out, the local consensus of WikiProject participants seems to me like the best way to decide which infoboxes to convert first. Blue Rasberry (talk) 19:01, 30 April 2018 (UTC)
- I do not think there is any reason we need a separate database to hold infobox data, when we already have a database of articles here that we can edit as usual. So I would support 1A and 2A. I remember being very surprised when I learned that any infoboxes already tried to use Wikidata - that should already have required some clear RFCs in favor before trying such a radical change to the way that article information is stored and edited. I don't view "shared data across multiple languages" as a particularly urgent goal. But I do think that all the information from our articles should be clearly visible in those articles, and editable without going to any other website. — Carl (CBM · talk) 00:52, 1 May 2018 (UTC)
- @CBM: I think you missed Wikipedia:Requests for comment/Wikidata Phase 2. We can have a separate database here for infobox data, but that also means that every other Wikipedia language also needs to have its own database. The question here is whether we should try to have a single database across all of the language projects, which is quite a big issue for those of us who work across multiple languages. Mike Peel (talk) 01:21, 5 May 2018 (UTC)
- That RFC does look to have been very sparsely commented - this one seems to be getting much more feedback. Personally, I think it is up to each language to maintain its own articles. I don't think we need any wikidata-like database here, beyond the article text we already have. There's no reason that infobox data or other data should be kept separately; that only leads to confusion and difficulty when trying to edit. — Carl (CBM · talk) 10:53, 5 May 2018 (UTC)
- By RfC standards, the Phase 2 RfC was well commented, with around as many votes across two sections as this one has in its single section, albeit from fewer voters. At 58K of prose, it is nowhere near as large (or rambling) as this one, but calling that "sparsely commented" is a big stretch. It was comfortably big enough – and conclusive enough – to establish a consensus on inclusion of Wikidata in Wikipedia infoboxes. That's something that this highly polarised discussion won't be able to do, principally because very few contributors this time seem interested in finding a consensus; most seem only interested in pushing their own POV. --RexxS (talk) 20:26, 5 May 2018 (UTC)
- 1D, 2E, 3G, 4D. We should try to use technical tools like Wikidata for live updating of numbers whenever possible, but need to respect local editor consensus to use other data if that is better in their editorial judgment. What TheDJ says makes a lot of sense. —Kusma (t·c) 18:03, 1 May 2018 (UTC)
- 1A, 2A, 3A, 4A. Not user friendly, difficult to maintain, poor oversight at wikidata, a solution in search of a problem. Yilloslime (talk) 03:40, 2 May 2018 (UTC)
- 3A. Everything needs to be properly sourced. NinjaRobotPirate (talk) 18:45, 3 May 2018 (UTC)
- 1A, 2A, 3A, 4A, it makes things difficult to make even simple corrections. You should be able to do a simple search, when in edit, to locate the text that needs to be changed. It also allows the infobox to be out of step with the text without any obvious way of correcting it. A change on an external website should not be allowed to change the details in an article which should be under human control. 86.187.162.55 (talk) 14:50, 4 May 2018 (UTC)
- 1F, 2E, 3D, 4F per Mike Peel et al. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:22, 6 May 2018 (UTC)
- 1F; 2E; 3C; 4C Richard Nevell (talk) 11:25, 7 May 2018 (UTC)
- No preference for any of the options (is this a valid !vote?). This is something which I do care about a little, but I think it would be reasonable to assume that regardless of the outcome of the RfC, Wikidata may be sufficiently reliable and sufficiently not vandalized (such that no one complains about it, anyway) in some years. Maybe it won't be, but it's really too early to tell, so I shouldn't waste my time now trying to figure out which side of the issue I want to be on. There won't be a clear consensus anyway, so there will inevitably be another RfC at some point. Jc86035's alternate account (talk) 13:21, 7 May 2018 (UTC)
- 1F, 2E, 3D, 4C, like with Mike Peel, though I would push for us to use bots to find non-BLP statements that aren't "sourced" in Wikidata and fix them as a community. James F. (talk) 19:57, 7 May 2018 (UTC)
- 1E, 2D/2G, 3D, 4C - Although Wikidata is improving, there are still many mistakes and errors there, and we can't do a full roll-out just yet (epicgenius (talk)):
- For question 1, I would prefer the option E (separate Wikidata version) because some of the data on Wikidata may be wrong/unsourced, or need to be elaborated. For instance, on Wikipedia, you can add text notes like {{efn}} to infoboxes, while on Wikidata, you can't do that. Therefore some pages should use a local version of the infobox, while others can use the Wikidata version. I'm assuming that for Wikidata infoboxes displayed on Wikipedia, there is a link so someone can edit values on Wikidata. This will be a little bit of a problem if a Wikipedia editor is unable to edit values on Wikidata; hence, we should keep a local version of an infobox.
- For question 2, I would prefer either option G (case by case basis), for the reason I just described, or option D (opting in), because some Wikidata entries might be good enough that Wikipedia can use them.
- For question 3, I would pick option D (BLP-sourced), except with the stipulation that if the information in the infobox is already in the article but not sourced in the infobox, then the relevant infobox value may remain. On Wikipedia, there is a MOS guideline, WP:INFOBOXCITE, which states
References are acceptable in some cases, but generally not needed in infoboxes if the content is repeated (and cited) elsewhere or if the information is obvious. If the material requires a reference (see WP:MINREF for guidelines) and the information does not also appear in the body of the article, the reference should be included in the infobox.
- For question 4, I choose Option C (local precedence), again for the reason I elaborated in Question 1. --epicgenius (talk) 14:46, 9 May 2018 (UTC)
- 1D, 2D, 3C, 4C. I believe this would strike the correct balance. The data is there to be used, within reason. — AfroThundr (talk) 01:34, 10 May 2018 (UTC)
- NOTE: The RFC was indirectly re-canvassed at Wikidata central Project Chat at 01:52, 10 May 2018[2] (which leads here via[3]). Alsee (talk) 03:29, 13 May 2018 (UTC)
- 1F, 2E, 3C, 4F. But 3F for fields that usually do not require a reference on Wikipedia, for instance the portrait of a person. Syced (talk) 06:47, 11 May 2018 (UTC)
- Note: The above editor acknowledges arriving at the WikidataMOS RFC via the partisan-notification advertisement at Wikidata,[4] and the !vote there was followed by the !vote here 18 minutes later. Alsee (talk) 03:29, 13 May 2018 (UTC)
- 1A, 2A. Wikidata is a gigantic security hole for the convenience of corporations, with no benefit to readers over the old system of interwikis (indeed it has the disadvantage that erroneous matches are almost impossible to correct). Wikipedia is an encyclopedia, not a machine-readable database. Plus, in addition to its contribution to the risk, I share others' concern that Wikidata makes a lie of "anyone can edit". Wikidata is undermining what we do, not assisting it. If (some) articles must have infoboxes, at least let them be Wikipedia infoboxes, not outgrowths of a rival project with hostile objectives. Yngvadottir (talk) 19:13, 12 May 2018 (UTC)
- Support: 4F, because it makes filling in information easier. 4F is used on the French Wikipedia and it works perfectly well. --Railfan01 (talk) 21:06, 12 May 2018 (UTC)
- 1D/1F, 2E, 3C, 4C. 3C over 3DEFG because I've noticed that statements with refs tend to be more reliable and to have been placed more carefully than unreferenced statements, which may have just been placed willy-nilly and/or by a beginner bot-operator. 4C now and for a while, increasing gradually to 4E then 4F over the years as WD matures. ~ Tom.Reding (talk ⋅dgaf) 12:33, 14 May 2018 (UTC)
- 1A, 2A, 3A, 4A, all per the dearth of oversight at WD—or to clarify, responsible management—and a tendency to underexpose their information to WP:RS. Also, per the supposed tenet of being the encyclopaedia that anyone can edit, the rather coy suggestion that instead of actually being able to edit WD, one can just
provide new values using Wikicode in the traditional way
...err, just carry on editing WP as usual and wait for WD to catch up and do it for us? No thanks; material on Wikipedia should be changeable on Wikipedia by as many user groups as possible: that is a key tenet. —SerialNumber54129 paranoia /cheap shit room 14:02, 14 May 2018 (UTC)
- 1C, 2C, 3A, 4C Wikidata is an elegant solution, and sooner or later I believe it will be both necessary and desirable to go that way. However, currently Wikidata content provenance is too unsafe and modifying it is too user-unfriendly to make it a reliable baseline option. Use for specific uncontroversial and agreed-on purposes, and keep our house standards (not spotless but at least enthusiastically patrolled) for everything else until that tool has been improved a lot more. --Elmidae (talk · contribs) 19:45, 14 May 2018 (UTC)
- 1A, 2A, 3A, 4A-wikidata is an unreliable source and should never be used on WP anywhere.Smeat75 (talk) 18:36, 15 May 2018 (UTC)
- 1C, 2C-D, 3C, 4C Wikidata can be used if broad consensus is present, but must be subject to simple local override to control vandalism. Additionally, since Wikidata is not a reliable source, an additional source must be present. The WMF is clearly working on some improvements to further increase the palatability of Wikidata usage. I would also favor creating a special expedited consensus seeking procedure (like a BRFA) for Wikidata specifically, which would assist in the resolution of disputes. Tamwin (talk) 19:56, 15 May 2018 (UTC)
- 1A, 2A, 3A, 4A—once upon a time, I was mildly excited about the potential for WikiData, but the implementations that have been inflicted on us, and the contempt WikiData's gatekeepers have shown for those who have raised concerns, have completely soured me on WikiData. It can't be trusted and is a maintenance nightmare. Curly "JFC" Turkey 🍁 ¡gobble! 22:55, 15 May 2018 (UTC)
- 1C, 2B, 3A, 4C Wikidata can be a valuable tool and source for many kinds of information, but currently it should only be used in cases where it's demonstrably reliable. It should not be the default, and must be easy to override with local information so that errors can be corrected without going through WD. BegbertBiggs (talk) 18:04, 17 May 2018 (UTC)
- 1A, 2A, 3A (including outside infoboxes), 4A (because infobox content is frequently the subject and cause of intractable disputes). Should those not pass, then: 1B, followed possibly by 1C after another RfC, and subject also to 2B and 2D, 3A (because it's already policy), 3C, 4B and 4C. Post-script: I also agree with Rschen7754's "54367QRTE$#$$$^" comment. This is no way to do an RfC. Ask one question, then do another RfC to ask another question. Only a fraction of our geekiest editors are going to have the patience and the data-parsing habits to even understand this RfC. That means that any outcome other than the restrictive 1A 2A 3A 4A is going to be a WP:False consensus because only editors already deeply steeped in data-wrangling would even arrive at them much less understand their implication, and those !voters are a narrow minority who do not represent editorial much less readership norms. (I'm in that minority, so it's not a criticism of them, just an observation that they're probably IT professionals in particular, with a bias toward centralization and automation of data.) — SMcCandlish ☏ ¢ 😼 10:20, 18 May 2018 (UTC)
- 1A, 2A, 3A, 4A I had WD included in my watchlist a while ago, but removed that. What with all the myriad bot edits adding/changing/deleting all kinds of info that I'm not even remotely interested in, it clogged up my watchlist too much. Meaning that any article I have on my watchlist and that uses WD can be vandalized without me even noticing. --Randykitty (talk) 14:09, 18 May 2018 (UTC)
- 1A,2A,3A,4A As it is right now, Wikidata is unreliable and doesn't comply with content policies we have here. In the future this can always be revisited, if and when Wikidata makes some serious improvements.--Rusf10 (talk) 06:51, 20 May 2018 (UTC)
- 1A,2A,3A,4A. As it is now Wikidata is not ready to be used in infoboxes, with the possible exception of a few values 2B. Wikidata has imported all sorts of information from various Wikipedias and cites different Wikipedias as sources, if you look at what they as a main rule call references. Proper references are cited once in a blue moon. The system for adding proper references in Wikidata is so cumbersome that it is barely possible even if you have a URL, but with any paper sources it is just beyond most people to figure out how, and if you do, it is so much work it's not worth it. When Wikidata has cleaned up the mess they have made of their references it can be reconsidered. In Norwegian Wikipedia there was a short-lived experiment to show the Wikidata references, but it did not look good at all. One article had for a short time 10 citations from VIAF, BIBSYS etc. for one date of birth. ツツDyveldi ☯ prat ✉ post 20:18, 21 May 2018 (UTC)
- 1C, 2C, 3A, 4C at best. I'd really rather keep all data on en, so we can see its provenance easily, but I'm willing to try relaxing it a bit, as long as it's not overriding local data. --SarekOfVulcan (talk) 20:56, 21 May 2018 (UTC)
General discussion
- I think there should be a discussion section under each question. I don't see how four questions, each with six or seven options, can be managed in a single discussion. – Jonesey95 (talk) 22:05, 6 April 2018 (UTC)
- Let's see how the discussion goes, and if need be we can split this 'General discussion' section into several parts. Let's not mix questions with discussions, though. Thanks. Mike Peel (talk) 22:14, 6 April 2018 (UTC)
- As I expressed on the talkpage, I am somewhat afraid, like Jonesey95 expresses, that this will result in an endless soup of choices with no clear consensus (hence status quo, hence with similar fights/edit wars on the contentious subjects) in the end (though I am not sure whether the alternative set-ups would be different). —Dirk Beetstra T C 22:19, 6 April 2018 (UTC)
- Mike, I don't see how we're going to gain consensus with multiple questions and so many options. SarahSV (talk) 01:55, 7 April 2018 (UTC)
- SarahSV, we tried as hard as possible to put the options in sequential order from "less wikidata and most restricted" to "more wikidata and least restricted". Anyone who !votes A will generally prefer B over C, and anyone who !votes F will generally prefer D over C. If there isn't a consensus at one end of the spectrum, the hope is that consensus could be found by grouping responses from one end or the other until a compromise-consensus-point is reached. Alsee (talk) 22:58, 9 April 2018 (UTC)
- In reality you have only 3 possible choices
- accept all extracted data from WD without any constraint on infobox approval or data quality
- refuse any use of WD
- use of WD in infoboxes respecting the following requirements:
- - infobox templates have to be approved by local wikiprojects
- - infoboxes manage local data and when local data are present, the infobox has to display those local data over WD data
- - only referenced data in WD should be displayed in the infobox
- - a white list or a black list of references can be used to filter referenced data in WD in order to perform a deeper selection of WD data.
- All other possible combinations are worthless. So this RfC could be simplified to one question: do you accept the use of WD data in an infobox if the four constraints above are met?
- The only way to allow the use of WD in WP is to fix a high quality data constraint for WD: if WD data displayed in infoboxes have a higher quality than the local data and if for any reason, someone can provide an even better local value which will have the priority, what are the arguments against use of WD data? Try to learn from past experiences: there is a demand for good quality data, so just give what people want if you want to see this RfC accepted or stop now. Snipre (talk) 02:39, 7 April 2018 (UTC), @Mike Peel: Snipre (talk) 02:41, 7 April 2018 (UTC)
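A minimal sketch of how the four constraints listed above might combine when choosing what an infobox field displays. It is an illustration only, not code from any existing template or module: the `Statement` class, the `infobox_value` function and all parameter names are hypothetical, and sources are reduced to plain strings.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Optional, Set


@dataclass
class Statement:
    """Hypothetical stand-in for one Wikidata statement and the sources cited for it."""
    value: str
    sources: List[str] = field(default_factory=list)


def infobox_value(
    local_value: Optional[str],
    statements: Iterable[Statement],
    template_approved: bool,
    allowed_sources: Optional[Set[str]] = None,  # white list; None means "no white list"
    blocked_sources: Optional[Set[str]] = None,  # black list
) -> Optional[str]:
    """Return the value to display, or None if the field should stay empty."""
    blocked = blocked_sources or set()

    # Constraint 2: local data, when present, is displayed over WD data.
    if local_value:
        return local_value

    # Constraint 1: the infobox template must be approved by the local wikiproject.
    if not template_approved:
        return None

    for statement in statements:
        # Constraint 4: filter the cited sources through the white/black lists.
        usable = [
            s for s in statement.sources
            if s not in blocked and (allowed_sources is None or s in allowed_sources)
        ]
        # Constraint 3: only statements that still have a reference are displayed.
        if usable:
            return statement.value

    return None
```

With these assumptions, `infobox_value("Jerusalem", [...], True)` returns the local value regardless of what the statements contain, and a statement with an empty `sources` list is never displayed.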
- My problem is with thinking of infoboxes as “data” in the first place. Speaking for myself only, I don’t WANT “data”. I want our infoboxes to present quality “information” ... and I think that information needs to be added by an actual person (an editor who has researched the subject and fact checked what he/she has added... and who can be held accountable if he/she does not follow our policies and guidelines). Most of the problems with Wikidata stem from the flawed idea that our infoboxes can (or should) be automated...with a bot compiling information instead of an editor. No, no no... a thousand times no. Blueboar (talk) 20:52, 7 April 2018 (UTC)
- Data is not knowledge (Tuomi I (1999), "Data is more than knowledge: implications of the reversed knowledge hierarchy for knowledge management and organizational memory", doi:10.1109/HICSS.1999.772795). I am an academic researcher in informatics and knowledge management, and we make a distinction between data and knowledge. Wikipedia articles should contain knowledge, and knowledge requires the interpretation and contextualisation of data. This means that, at least in some cases, an automated importing of data is always going to fail. In other words, what Blueboar said.
- My research looks at electronic healthcare records and how they are used. There's lots of data in these that we'd often like to import for a variety of functions. Sometimes that works, but sometimes it doesn't. Because a record is not merely a container of data, as we describe in Greenhalgh et al. (2009; doi:10.1111/j.1468-0009.2009.00578.x). The same principles apply to Wikipedia articles. Articles are not mere containers of data, and that still applies to their infoboxes. They are more than that and thus you can't remove the human editor.
- (Note how I am citing WP:RS? That's more than Wikidata does! ;-) )
- Wikidata has many great uses, but it is an ontological mistake to believe that we can or should supplant human editing of text on a large scale with Wikidata entries. Too often, the content of infoboxes requires discussion. For example, how many seats did the Conservative party get at United Kingdom general election, 2017? It depends on whether you include the Speaker or not, thus the footnote in the infobox. Who directed The Matrix? The Wachowski Brothers, the Wachowski Sisters or the Wachowskis? Where was Natalie Portman born? Jerusalem? Or Jerusalem, Israel? There is no single truth to any of these. Ergo, they have to be worked out through WP:CONSENSUS. Better to do that in Wikipedia. Bondegezou (talk) 21:36, 7 April 2018 (UTC)
- @Bondegezou: Your first example fails because the number of seats won by the Conservative party at the United Kingdom general election, 2017 isn't a piece of data held on Wikidata, so it's not liable to be imported anyway. Infobox fields are, for the most part, pieces of data, and hence are suitable for being imported from Wikidata. Paul Pogba's Current team is Manchester United, and I know that because at Paul Pogba (Q129027), I find that it's sourced to an article in Le Monde, Paul Pogba, le phénomène (Q17195332) (see how Wikidata clearly quotes a reliable source; that's more than Wikipedia does! ;-) Not only that, but when Pogba changes his current club and his Wikidata entry is updated, we could arrange for that to automatically update in all 64 language Wikipedias where he has an article. It's common sense that it is advantageous to the majority to have simple facts sourced from a single place, which eases maintenance and updating. The names for the director (P57) of The Matrix (Q83495) are Lilly Wachowski, Lana Wachowski. No ambiguity there and a perfectly good redirect for the reader to find out more. You'll find that Natalie Portman (Q37876) has her place of birth (P19) in Jerusalem. What's the problem with that linked article? --RexxS (talk) 22:21, 7 April 2018 (UTC)
- There have been long arguments over whether The Matrix infobox should say the Wachowski Brothers or Lilly and Lana or something else. Whatever the right answer to that, it's not something that can be settled easily. It needs to be discussed on Wikipedia. Importing from Wikidata thus fails. The import from Wikidata you list would go against current consensus on the article (although personally I'd prefer Wikidata's answer in this case).
- As for Natalie Portman, most places of birth in most infoboxes list a town and a country. Portman's doesn't because of disputes over Jerusalem (half of the city is, arguably, not in Israel). Consensus, after much edit-warring, is not to use the standard form "Jerusalem, Israel", but to just say "Jerusalem". Again, it needs to be discussed on Wikipedia. (What country does Wikidata say Jerusalem is in?)
- The underlying problem is that "simple facts" are frequently not simple, nor facts. That's not me saying that: that's the conclusion of entire disciplines of academic thought. Bondegezou (talk) 22:32, 7 April 2018 (UTC)
- The arguments against using Lilly Wachowski, Lana Wachowski (which of course redirect to The Wachowskis) are not arguments against getting that sort of data from Wikidata – only an argument for ensuring that a locally agreed consensus is always capable of overriding anything fetched from Wikidata in a particular case. I'm firmly in favour of that, and every single one of the functions I've written to fetch data from Wikidata complies. Setting |director=The Wachowski Brothers in the infobox will always result in The Wachowski Brothers, even in a Wikidata-enabled infobox.
- Similarly for Natalie Portman, any controversy can be solved by setting |birth_place=Jerusalem, although it results in exactly the same linked article as using the Wikidata call. If you follow the link that the call produces, you'll see that our article on Jerusalem takes no sides on the question of its country. I'm not sure how we can do any better than that for our readers. There are, of course, many examples of non-simple facts that require discussion on Wikipedia, but if the outcome is something that the Wikidata call doesn't match, it's simple to use the locally agreed outcome instead. --RexxS (talk) 23:43, 7 April 2018 (UTC)
- @Blueboar: So I guess you only wear clothes that have been made by an actual person, rather than a machine, so that they can be held responsible for the occasional manufacturing flaw? The Luddites would have been proud of your stance. Much of the information in Wikidata was copied from English Wikipedia – especially from the infoboxes – and that information was originally added by an actual person who had done actual research on the topic and actually fact checked it. And yes, they were actually held accountable if they didn't follow PAG. --RexxS (talk) 21:37, 7 April 2018 (UTC)
- So it’s a WP:CIRCULAR situation... wikidata pulling information from one Wikipedia article, holding it... and then sending that information back to another Wikipedia article. Does wikidata now export and import the cited sources that verify the information? (Last time I checked, it didn’t... has that flaw been fixed?) Blueboar (talk) 00:10, 8 April 2018 (UTC)
- It's only CIRCULAR in the way that ripples are in water: they spread outward. One of the 280 Wikipedias has a sourced fact that it donates to Wikidata, and all the other projects now have access to it. So it's more "wikidata pulling information from one Wikipedia article, holding it... and then sending that information back to hundreds of other Wikipedia articles". Wikidata doesn't export anything; it's the individual Wikipedias that choose to pull information from Wikidata, and that includes references if those are required and available. --RexxS (talk) 02:27, 8 April 2018 (UTC)
- @Blueboar: If we were using Wikidata as a reference, then it would be circular. However, we're not - and statements on Wikidata should be referenced in the same way that they are in Wikipedia articles. It's more like copy-pasting a sentence from one Wikipedia article to another - you wouldn't then think of the other Wikipedia article as the reference, would you? Thanks. Mike Peel (talk) 16:21, 8 April 2018 (UTC)
- yes, we can copy and paste - IF we also copy and paste the citation. THAT would be OK... However, we can’t copy material WITHOUT the citation, paste it into another article and then say “but it’s cited in some other article”. Information needs to be cited wherever it appears... in every article where it appears. One major problem is that wikidata does not copy and paste references along with the material. It is effectively saying “a Wikipedia article was the source for this data... trust us” - and when it exports that data back to some OTHER Wikipedia article, that does set up a circular situation (if not the one that WP:CIRCULAR addresses) - wikidata is taking info from Wikipedia (without also taking any citations) and then exporting it back to Wikipedia (without any citations). Blueboar (talk) 22:15, 8 April 2018 (UTC)
- @Blueboar: Ironically, most enwp infoboxes do not include references, otherwise it would be a lot simpler to copy the references over (manually/automatically) at the same time as the data. Instead, you need to go digging through the article text to find which reference contains the data, if indeed it is referenced/has a live link to access the info. If we only used referenced data from Wikidata in infoboxes, then our % of referenced info in infoboxes would go right up. ;-) Note that 'imported from Wikipedia' references are ignored by the code here to avoid that being counted as an actual reference. Thanks. Mike Peel (talk) 23:04, 8 April 2018 (UTC)
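A minimal sketch of the kind of check described above, in which 'imported from Wikipedia' references are not counted as actual references. It is not the code used by any Wikipedia module: the function name and the rule are assumptions for illustration. It relies only on the shape of a statement in Wikidata's JSON (a `references` list whose entries hold `snaks` keyed by property ID) and on P143 being the "imported from Wikimedia project" property.

```python
# Properties assumed to mark a reference as an import note rather than a source.
# P143 is "imported from Wikimedia project"; the set is an illustrative choice.
IMPORT_PROPERTIES = {"P143"}


def has_real_reference(statement: dict) -> bool:
    """Return True if at least one reference on the statement is not an import note."""
    for reference in statement.get("references", []):
        properties = set(reference.get("snaks", {}))
        # A reference block counts only if it is non-empty and carries no
        # import property (assumed rule: any P143 snak makes it an import note).
        if properties and not (properties & IMPORT_PROPERTIES):
            return True
    return False
```

Under this rule a statement whose only reference is "imported from English Wikipedia" is treated as unreferenced, while a statement that also cites, say, a "stated in" (P248) source passes.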
- @Blueboar: (ec) Seems to me that such concerns with respect to infoboxes are overdone, and should be put into perspective. In most cases, one would expect the infobox information to be either a summation of basic fundamental information about the subject, or an encapsulation of information presented in greater detail (presumably with sourcing) elsewhere in the article. In such cases if the reader wants to investigate a claim, it should usually be easy for them to find a reference for that claim in the main article, or to note that there is a discrepancy. In the general run of things we're fine for infoboxes to summarise article content without anyone crying 'circularity', and without the referencing being duplicated; so why such a big deal if Wikidata has drawn on the article content, and then the infobox has drawn on Wikidata? A second set of fields are those which relate to external identifiers, which almost by definition are self-referencing. Between these two cases, most of the content in most infoboxes is accounted for. As a result, most infoboxes are not referenced, many attempts to add explicit references are removed as unnecessary clutter, and the community is broadly fine with that. So why shouldn't that apply to Wikidata infoboxes too? -- at least when the relevant WikiProjects are comfortable with the fields in question, and consider the corresponding Wikidata content is broadly well-monitored and well-curated.
- I see your !vote was for 1A -- a blanket ban on any infobox drawing anything from Wikidata. So it seems reasonable to put User:D Wells's question to you: what would you do with the 12,000 infoboxes about genes on Wikipedia, to keep those clean and correct and up-to-date? For a project like that, to keep the data updated and synchronised with the major international databases, and monitored for changes, it is far easier to deal with Wikidata items (that are then accessible to all wikis), with all the data immediately available and reviewable directly from database queries, rather than having to scrape and parse 12,000 wiki pages every time you want to review their content. But apparently you think it is necessary for every update to Wikipedia to be manually scrutinised. So who do you propose is going to do that work? You? No, I didn't think so. Or are we just going to leave WP's gene coverage to rot? Jheald (talk) 23:30, 8 April 2018 (UTC)
- yes, we can copy and paste - IF we also copy and paste the citation. THAT would be OK... However, we can’t copy material WITHOUT the citation, paste it into another article and then say “but it’s cited in some other article”. Information needs to be cited wherever it appears... in every article where it appears. One major problem is that wikidata does not copy and paste references along with the material. It is effectively saying “a Wikipedia article was the source for this data... trust us” - and when it exports that data back to some OTHER Wikipedia article, that does set up a circular situation (if not the one that WP:CIRCULAR addresses) - wikidata is taking info from Wikipedia (without also taking any citations) and then exporting it back to Wikipedia (without any citations). Blueboar (talk) 22:15, 8 April 2018 (UTC)
- @Blueboar: If we were using Wikidata as a reference, then it would be circular. However, we're not - and statements on Wikidata should be referenced in same way that they are in Wikipedia articles. It's more like copy-pasting a sentence from one Wikipedia article to another - you wouldn't then think of the other Wikipedia article as the reference, would you? Thanks. Mike Peel (talk) 16:21, 8 April 2018 (UTC)
- It's only CIRCULAR in the way that ripples are in water: they spread outward. One of the 280 Wikipedias has a sourced fact that it donates to Wikidata, and all the other projects now have access to it. So it's more "wikidata pulling information from one Wikipedia article, holding it... and then sending that information back to hundreds of other Wikipedia articles". Wikidata doesn't export anything; it's the individual Wikipedias that choose to pull information from Wikidata, and that includes references if those are required and available. --RexxS (talk) 02:27, 8 April 2018 (UTC)
- So it’s a WP:CIRCULAR situation... wikidata pulling information from one Wikipedia article, holding it... and then sending that information back to another Wikipedia article. Does wikidata now export and import the cited sources that verify the information? (Last time I checked, it didn’t... has that flaw been fixed?) Blueboar (talk) 00:10, 8 April 2018 (UTC)
- My problem is with thinking of infoboxes as “data” in the first place. Speaking for myself only, I don’t WANT “data”. I want our infoboxes to present quality “information” ... and I think that information needs to be added by an actual person (an editor who has researched the subject and fact checked what he/she has added... and who can be held accountable if he/she does not follow our policies and guidelines). Most of the problems with Wikidata stem from the flawed idea that our infoboxes can (or should) be automated...with a bot compiling information instead of an editor. No, no no... a thousand times no. Blueboar (talk) 20:52, 7 April 2018 (UTC)
@Only in death: Commons generally has much less strict policies than Wikipedia. Commons has no policy on original research, nor on verifiability, nor on weight, nor sourcing – the very policies that you complain about on Wikidata. Only Commons' copyright policy is on a par with that of Wikipedia. And yet we happily add images, audio clips and videos to our articles without being troubled by Commons' lack of compliance with our policies. Isn't it completely muddled thinking to oppose including information from Wikidata, while welcoming the same information if it was imported from Commons as an image? --RexxS (talk) 21:37, 7 April 2018 (UTC)
- We don't wholesale automate the addition of images, audio clips and videos from Commons. We consider each addition. Thus the analogy breaks down. Bondegezou (talk) 22:00, 7 April 2018 (UTC)
- How does it break down? The policies are the same whether applied to a manual edit or an automated one. Images are updated and removed by bot as well as by humans. "Until such time as wikidata implements either compliant (or stricter, like Commons) policies, anything drawn from wikidata is required to adhere to local policies". If the compliance is good enough for Commons, it's good enough for Wikidata. Thinking otherwise is mere prejudice. --RexxS (talk) 22:28, 7 April 2018 (UTC)
- The entire debate here is about automation, so of course it matters whether we're talking manual or automated edits. There is no bot treatment of images that compares to the more Wikidata-friendly end of the proposals above. Bondegezou (talk) 22:38, 7 April 2018 (UTC)
- No, the debate here is about whether we take information from Wikidata, and in what way. Fetching information from Wikidata using a function call is no more "automated" than using image syntax to fetch an image from Commons (except we can programmatically tailor the information in far more ways than we can the images). --RexxS (talk) 02:27, 8 April 2018 (UTC)
- Sigh, commons requires all media is freely released - no fair use exceptions. For the scope in which the vast majority of commons media is to be used on ENWP its restrictions are stricter, we allow fair use, commons does not. If we want to use media from commons we have a reasonable expectation that it is compatible when we request information from commons due to its internal policies and fairly rigorous community. Wikidata wants to insert information into articles that not only has no requirement to be reliably sourced or otherwise cross-compliant with our policies for information, but as has been shown already as in the ongoing short description fiasco, it is now actively disrupting the quality of wikipedia articles as presented to the public. I am not going to get into an extended argument about this again, it's been said repeatedly in a number of venues. Until wikidata implements policies that are either as strict/stricter as ENWP's policies for information and data, I will continue to oppose any implementation of wikidata-drawn information displaying on a wikipedia article that cannot be quickly and simply over-written on a local level, or that is of the limited/specific purpose like the examples in 2b. Only in death does duty end (talk) 03:47, 8 April 2018 (UTC)
- Double sigh, Commons' "no fair use" is a copyright policy. For all of its content, in every way that's relevant to this discussion, its restrictions are far laxer than those of enwp. There can be no expectation that any media we import from Commons complies with our key content policies of WP:OR or WP:V. There is a complete absence of "internal policies and fairly rigorous community" on Commons when it comes to the sourcing of any content. At least we can filter out unsourced content from Wikidata - as has been made clear regularly. It is untrue to assert that "it is now actively disrupting the quality of wikipedia articles as presented to the public". You can't give diffs to back that up because it's fake criticism. You can drop your opposition now because every single piece of wikidata-drawn information displaying in a Wikipedia infobox can be quickly and simply over-written on a local level. That's already guaranteed as a consequence of the outcomes of Wikipedia:Requests for comment/Wikidata Phase 2, which still applies. --RexxS (talk) 13:15, 8 April 2018 (UTC)
- My opposition will stand until Wikidata has sufficient policies and a population that will enforce them that are on a level with ENWP. When that time comes, feel free to contact me, and not before. Only in death does duty end (talk) 18:54, 8 April 2018 (UTC)
- User:RexxS, first of all, that is a WP:OTHERCRAPEXISTS argument. Secondly, if I pull data from Commons or from WikiData I have to be able to defend it here. As it stands, we could transclude a nearly empty infobox with perfectly referenced data NOW, and someone on WikiData could add data later with an insufficient reference (or change the data while leaving the sufficient reference). Everyone knows that that (->) is not Donald Trump, so if I choose to transclude File:Donald Trump.jpg I am at that point responsible for it being the actual picture. If someone changes it here to File:A monkey enjoying his snack in Daman-e-Koh.jpg it is clear, if someone overwrites the file on Commons the result here is clear. That is much different from me changing the boiling point of trimethylphosphine on WikiData to 324°C - there are only very few editors who know it is wrong, and a couple more that suspect that it is wrong, I am sure that 99.9% of the Wikipedia editors have to check (and that number is likely very similar in the case of knowing that the birthday of Donald Trump is not 22 May 1944). One of the most difficult vandals I ran into here on Wikipedia was an IP editor who changed physical constants of obscure materials. That makes you completely paranoid about IP editors changing numerical data. --Dirk Beetstra T C 06:27, 8 April 2018 (UTC)
- To the closer of the RfC: the status quo should be seen as the starting point. If there is no consensus for a change, the status quo prevails. Accordingly, unless there is affirmative consensus either for less Wikidata (e.g., 1A etc.) or for more Wikidata (e.g., full rollout), no changes to policy should occur. Best, Kevin (aka L235 · t · c) 23:37, 7 April 2018 (UTC)
- The Wikidata thing is hard.
- Some editors are very happy that Wikidata exists, and want to leverage all that data and make all kinds of interesting things happen.
- Some editors love words and text, and look askance at this whole 'data' thing at best and, at their most negative, are hostile to it and its infobox manifestations everywhere.
- I am not in either of those buckets.
- The hardest thing for me, is the difference in policy regimes, governance, and heck, even ethos between the projects. Big sweeping changes are made to Wikidata by bots, and there are not humans watching every field, and weird shit happens, and when that gets imported to en-WP, we see weird shit here. Which is upsetting, for me at least.
- We experienced this in WPMED when somebody ran a bot that added a bunch of chemicals used in experiments in cells, to a field "drugs used to treat" in entries about diseases/conditions, and then these additions came into en-WP articles about those diseases in a wikidata-pulling infobox. Just argh. (and the wikidata person who ran that bot insisted throughout, even as we killed that field in the infobox here, that bot run was fine because it added "relevant" data to wikidata. double argh)
- The WMF reading team has exacerbated this by using Wikidata descriptions in a bunch of places, like navigation and brief descriptions in mobile and in apps.
- I have complained about this a lot already. I just came across another instance of it last week, when I was browsing on mobile and was surprised to see Christopher John Boyce described as an "American bank robber" in the list of "related" articles that the WMF reading team cause to be generated at the bottom of articles. I went today to find out how that got into Wikidata, and it was this diff, by a bot. In 2015. Three years of stupidness.
- en-WP is not perfect by miles and miles but we don't need more random "data" dumped into WP. I spend at least half my time removing bad edits by humans.
- (and yes i fixed it there. This was not a happy thing for me - it was more like blackmail. I simply don't want to edit in Wikidata; I don't care about it. I also don't want to force the values of en-WP on them. Let them evolve as they like. I just don't want that data in en-WP if it is not policy compliant per en-WP policies and guidelines, and doesn't have some kind of controls on it, to ensure it stays that way.)
- From my perspective it is "garbage in, garbage out", a) for en-WP, b) at this point in time. That may change. Jytdog (talk) 05:00, 8 April 2018 (UTC)
I fully agree with Jytdog here, but adding a caveat. It is possible to change the data in Wikidata, while leaving the reference that is there intact. At that point, the data is referenced but wrong (e.g. diff resulting in this official website, where it is patently untrue it is 'Imported from English Wikipedia'). Such edits can stand for months (in this case 2 months, 12 days), and there is no reason that this cannot happen with more sensitive and/or better referenced data. This is particularly egregious when someone on WikiData changes the data (but not the reference) on sensitive material that here is displayed on a WP:BLP - and I am not sure to whom I now should attribute the resulting display of bad data here, and I can only hope that WikiData will sanction editors who vandalise WikiData as a hobby.
It is stated above that data can be pulled from xx.wikipedia, so that all other 800 Wikis can use it. The problem is that the data being pulled is unreferenced (or if it is referenced, the references are not necessarily pulled either). The data-pull also does not record from which revid the data is pulled, so that we could find a reference easily. --Dirk Beetstra T C 06:27, 8 April 2018 (UTC)
- @Beetstra: Of course, it's always possible to change data here on Wikipedia and leave the reference intact. See edits like this where a date is changed from December to November, even though the source clearly states December. Wikipedia also suffers from citation dislocation where editors rearrange text and separate the fact from its reference, or insert unreferenced information within a block of text that was referenced. Also it's easy to insert unreferenced values into infoboxes, because we don't insist on each item having its own reference, but we can filter out unreferenced information drawn from Wikidata automatically. Using Wikidata, the data pull occurs every time you load an article page, so we always get the most up-to-date revision, and there's a link available so it only takes one click to see where the data came from and what its reference is. --RexxS (talk) 13:15, 8 April 2018 (UTC)
- @RexxS: Exactly (and yet another WP:OTHERCRAPEXISTS argument) .. and you think that having the data on WikiData will make that easier to control? No, I think it is actually worse - I see a shift in spammers targeting WikiData first, as that gives them access to 800 wikis in one single step. This edit was the very first spam performed by this spammer with this link, targeting WikiData (and is out of the same set of editors that I mentioned in my post above, which was transcluded for over 2 months (telling me that their MO is successful)). Combine that with the fact that Wikipedia has an order of magnitude more active administrators, bots that are controlled to reasonable editing speeds and have a decent vetting process, and an editor base that is significantly larger. And if we would turn on WikiData monitoring in en.wikipedia to the fullest your en.wikipedia watchlist becomes flooded. And I know, the data on WikiData is just a click away .. however here on Wikipedia it is not even that one click away. --Dirk Beetstra T C 14:09, 8 April 2018 (UTC)
- Every example can be dismissed as a "WP:OTHERCRAPEXISTS argument". The argument still holds: You're content to see unsourced stuff imported from Commons but oppose sourced information coming from Wikidata. I don't think keeping data on Wikidata makes it easier to control at present, but English Wikipedia does not have the majority of active editors across all Wikimedia projects, and the balance will shift further away from here as time goes on. Recruiting some active editors from all Wikimedia projects to also curate the data on Wikidata is an obvious step forward. I agree we're not there yet. To see the references for a particular fact in a Wikipedia infobox usually requires searching through the text to find the corresponding information, then following the link somewhere after that to see the citation. I suggest you'd find it easier in a Wikidata-enabled infobox to just click the pen-icon link and look at the references on Wikidata. Here are Douglas Adams' occupations: screenwriter, novelist, science fiction writer, writer, musician – see how easy it is to get at the references? --RexxS (talk) 16:28, 8 April 2018 (UTC)
- That is not a reference, User:RexxS. That is data. Now show me a WP:RS so I know it is correct. That it is mentioned in a database that states that part of the data (which part?) is from imdb does not help. --Dirk Beetstra T C 18:56, 8 April 2018 (UTC)
- The reference is clearly visible if you follow the link, Dirk, where you'll see something like "screenwriter ... 2 references: stated in Bibliothèque nationale de France, BnF ID = http://catalogue.bnf.fr/ark:/12148/cb11888092r;" Surely you're not trying to tell me that Bibliothèque nationale de France is not a WP:Reliable source? --RexxS (talk) 19:19, 8 April 2018 (UTC)
- @RexxS: .. who got their data from imdb, see WP:ELPEREN#IMDb. —Dirk Beetstra T C 19:36, 8 April 2018 (UTC)
- But we're not citing IMDb, we're citing BnF - an independent third party source with a reputation for fact-checking and accuracy, per WP:RS. Are you really challenging the fact that Douglas Adams wrote "Un cheval dans la salle de bains (Dirk Gently's Holistic Detective Agency)"? There are dozens of reliable sources for that "sky is blue" kind of fact. --RexxS (talk) 19:48, 8 April 2018 (UTC)
- Beetstra yes I agree and this is why I said "some kind of controls on it, to ensure it stays that way.". Wikidata is (like Wikipedia) a really crazy idea in its own field - an open database. Stepping outside my en-WP hat and just thinking about it as a database, this is too crazy. In database projects I have been involved in, we were extremely careful about who had rights to change things and procedures they had to use. Those were databases we relied on to make decisions and that we used for automated procedures, and the data needed to be good or the subsequent procedures and decisions based on queries would have been screwed up and bad things would have happened. We were so wary of "garbage in garbage out". The "drugs used to treat" mess showed me the consequences of that with respect to Wikidata and content about health here in en-WP. I am not comfortable leveraging Wikidata here as long as that project's governance is so ... undeveloped. Jytdog (talk) 16:08, 8 April 2018 (UTC)
- @Jytdog: I have said this before - in principle WikiData is a great idea and it should be fully integrated into the crude data of Wikipedia. The interwiki part is a great success. The main problem is that, just as on Wikipedia, it is completely free to edit anything in that database. There has been absolutely no control whatsoever regarding what is going in; their focus has been to import, import, import, without first thinking about what they actually want it to be (or what it should be). One of the greatest problems of en.wikipedia is the notorious unreliability of material. We sometimes come back after years to add references (I can't believe that in 2018 we still have 1333 articles in Category:Articles_with_unsourced_statements_from_April_2008). And now we have imported that into a database, ripped out of its context which may have explained whether a datapoint is reasonable.
- IMHO, they have missed a chance. There is a lot of immutable data on the Wikipedias that they could get, referenced and all, and lock it down. There is way more in official, vetted databases that you can download, import and lock down. The rest of the data, you take, unreferenced as it may be, and leave it there. Until someone properly refs it and checks it, and then you .. lock it down. There is simply no reason for the man in the street to be allowed to change the boiling point of water (at standard temperature and pressure), the birthday of Donald Trump, the official website of Microsoft, or the atomic weight of the 12C isotope of carbon (at least not without good reason). There is a reason to change the presentation, of how we describe the boiling point of water, the official website of Microsoft, the birthday of Donald Trump, and the atomic weight of 12C, and that is the function of an encyclopedia that anyone can edit. That is the fundamental difference between Wikipedia/WikiSource/Wikiversity/Wiktionary on one end, and WikiData on the other. Because that was not locked down in the start - no mechanism for that was set or developed, most of the correctly entered data could very well have changed with respect to the attached reference (if any), and we really have no clue what is correct and what is not correct on WikiData anymore (even when the material is sufficiently referenced). --Dirk Beetstra T C 16:39, 8 April 2018 (UTC)
- The problem isn’t with the compilation and storage of data on Wikidata... the problem is with the subsequent retrieval and presentation of that data in Wikipedia. If the retrieval can not present the data in a way that conforms to our policies, wikidata simply can’t be used. Blueboar (talk) 15:51, 8 April 2018 (UTC)
- Yes exactly. Wikidata is what it is, and it is not our business what can/cannot or should/should not be there. We are concerned with what happens here. Jytdog (talk) 15:55, 8 April 2018 (UTC)
- These are the cities twinned with Birmingham, taken from Wikidata: Frankfurt, Chicago, Leipzig, Johannesburg, Lyon, Milan, Changchun. What part of that retrieval and presentation is problematical? How does it "not present the data in a way that conforms to our policies"? If you want it as an accessible list, that's easy: Or would you prefer an unbulleted list? Where are the examples of the problems, and why do you think that problems can't be fixed? --RexxS (talk) 16:26, 8 April 2018 (UTC)
- Unfortunately the reference link for these twinned cities on Wikidata is broken. I tried to fix, but could not find anything on the website. · · · Peter (Southwood) (talk): 18:23, 8 April 2018 (UTC)
- @Peter: Linkrot can happen anywhere, but the Wayback Machine is your friend. The claims were added in December 2013 and you can see the website as it was then by looking at http://web.archive.org/web/20131202223641/http://www.birmingham.gov.uk/twins although we really ought to find something up-to-date now. I wonder if Andy can help? --RexxS (talk) 19:26, 8 April 2018 (UTC)
- User:RexxS I don't intend to go back and forth forever but you missed the in Wikipedia part. En-WP has policies and guidelines, and the generation and maintenance of data in Wikidata is not subject to en-WP policies and guidelines... so when stuff comes here it may or may not happen to comply with en-WP policies and guidelines. If it does comply, this is an accident. It is in my view just kookoo to open the door to that. Jytdog (talk) 20:17, 8 April 2018 (UTC)
- (edit conflict) I'm glad you don't intend to go back and forth forever – we need you spending your time doing good work stemming the flood of spam we get masquerading as articles. Anyway, just as there is considerable overlap between the Wikipedia and Wikidata communities, there are natural overlaps in policies, so correspondence is not accidental. We still retain the safeguard of only allowing sourced data into an infobox, although I understand we can't guarantee that the source verifies the information without checking (just as we can't for information added locally to a Wikipedia infobox). It's okay to disagree on how much we can trust those curating the information on Wikidata to keep it honest, but my experience is they are no better and no worse than those doing the same job on Wikipedia, as they're often the same people. --RexxS (talk) 20:33, 8 April 2018 (UTC)
- Birmingham City Council's website isn't what it was. They seem to have deleted all sorts of useful, pertinent information. Perhaps they assume it can all be found on Wikipedia now ;-) But seriously, there have been no new twin cities in the last decade, so the archive link is good enough. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:25, 8 April 2018 (UTC)
- Andy: I found these: d:Talk:Q2256 --RexxS (talk) 20:33, 8 April 2018 (UTC)
- RexxS thanks for your reply. But please don't oversell the development of policy in Wikidata. You know as well as anybody that there is no V, no BLP, etc etc. Just some undeveloped proposals and citing of the WMF board statement on personal information, etc. And there is always this muddle over the fact that some people are active in both editing communities; this doesn't make any difference with regard to essential differences between consensus (or lack thereof) in the respective communities with respect to V, BLP, etc. The communities govern themselves very differently. Jytdog (talk) 20:52, 8 April 2018 (UTC)
- First, Birmingham has 9 sister cities (correctly listed on Wikipedia): in addition to Frankfurt, Chicago, Leipzig, Johannesburg, Lyon, and Milan, there are also Guangzhou, Changchun, and Nanjing as can be seen at http://web.archive.org/web/20141104171730/http://www.birmingham.gov.uk/cs/Satellite/eia?packedargs=website=4&rendermode=live from the Wayback Machine from 2014 (where the cities are referred to as "partner cities" or "sister cities"). BTW this information is still available on the Birmingham city website under "European and International Affairs" in the sections Europe, China and the Far East, Commonwealth, and North and South America. These pages date from 2016, and all cities are referred to as "sister cities".
- The Wikidata item for Birmingham shows 9 cities: 6 statements with references as given in the example above, and 3 statements without references (or imported from the Italian wikipedia) for the cities of Guangzhou (correct and from the Italian page), Xi'an (wrong), and Zaporizhia (also wrong). I did wonder where these came from. It appears that the Wikidata items for these cities show Birmingham as a sister city, so well-meaning wikidata editors have thought it would be useful to add this to the Birmingham wikidata item... Now the {{wdib}} template Rexx used to create the list of sister cities is clever enough to only show the wikidata statements that have a reference. But I found the reference used in the Zaporizhia wikidata item (BTW a bare, dead url), and copied it into the wikidata item for Birmingham, and lo and behold, Zaporizhia appeared in the example above. I've undone my "error" on Wikidata, but this action is something that could easily be done by a well-meaning bot or user...
- As to the appearance of the list of cities in the example given, it is basically ok but could do with an "and" before the last item. But I do wonder if the use of a template like this is very helpful. The cities are shown in the order of the statements in the wikidata item. What if I want to show them alphabetically or in order of twinning? It is also impossible to edit within wikipedia - you have to switch to wikidata - not everyone will be interested in doing that. Are we saying that Wikipedia editors have to become skilled in Wikidata?
- Also on appearance, the list should look something like:
- ^ "China and the Far East". Birmingham City Council. 2016. Retrieved 9 April 2018.
- ^ "Europe". Birmingham City Council. 2016. Retrieved 9 April 2018.
- That is, with references (preferably with a bit more than the basic url), if you want it to conform with our policies.
- I think the above shows that, although Wikidata can be a good starting point for an article, it isn't always up to date, is prone to both good faith errors and vandalism, and that nothing beats some editor research and checking... Robevans123 (talk) 00:21, 9 April 2018 (UTC)
- @Robevans123: All of that is true, but it's all true for Wikipedia as well. If we only correct the list of sister cities on Wikipedia, we have one project with accurate information. If we correct the information on Wikidata, we potentially supply accurate information to 280 other Wikipedias. That's got to be worth trying for. I'm pleased you noticed that the results of pulling data from Wikidata is dynamic, and so it should be: if somebody makes the effort to add references on Wikidata, we want the referenced information to appear immediately. If you have a screen reader, you'll hear that the lists above are marked up as real lists (and that's why there's no "and"). As for alphabetically, how does this look: Or if you want to see even unreferenced claims:
- You can make another call that returns a citation from Wikidata, but there is no way of working out which citation style is in use in any particular article, so I haven't implemented it, as I'm not about to hand yet another excuse to complain to the usual suspects. --RexxS (talk) 01:20, 9 April 2018 (UTC)
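To make the behaviour discussed in this thread concrete, here is a minimal Python sketch of "show only referenced statements, optionally sorted alphabetically". It is not the actual {{wdib}} / Module:WikidataIB code: the `Statement` class, the `list_values` function, and the `"ref"` placeholder are assumptions made for illustration, and the city data is simply what is described in the comments above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Statement:
    """One Wikidata-style claim: a value plus zero or more references."""
    value: str
    references: List[str] = field(default_factory=list)


def list_values(statements, only_sourced=True, alphabetical=False):
    """Return claim values, optionally dropping unreferenced claims and sorting."""
    values = [s.value for s in statements if s.references or not only_sourced]
    return sorted(values) if alphabetical else values


# Statements as described in the thread; "ref" is a placeholder, not a real citation.
twinned = [Statement(city, ["ref"]) for city in
           ("Frankfurt", "Chicago", "Leipzig", "Johannesburg", "Lyon", "Milan")]
twinned += [Statement(city) for city in ("Guangzhou", "Xi'an", "Zaporizhia")]

print(list_values(twinned, alphabetical=True))   # referenced claims only, sorted
print(list_values(twinned, only_sourced=False))  # all nine claims, item order

# Copying any reference onto a claim is enough to make it show up in the filtered list.
twinned[-1].references.append("ref")
print(list_values(twinned, alphabetical=True))
```

The last step mirrors the Zaporizhia experiment: a filter of this kind checks that a reference exists, not that the reference is live or supports the claim.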
Authority control
- For those who advocate removing all references to WD, please consider data that acts as Help:Authority controls to articles, which is itself a form of referencing - we in anatomy use e.g. links to Terminologia anatomica, an international (and multilingual) listing of anatomical structures. We would have to re-port this and several other authority controls into WP and then manually defend it against vandalism, which is not uncommon given these terms are immutable but our articles are constantly the subject of defacement and the infobox is one of the first things vandals see. Authority controls are a form of references linked to the article and I feel may require separate consideration given the emphasis that we shouldn't use wikidata owing to WP:V concerns--Tom (LT) (talk) 12:02, 9 April 2018 (UTC)
- @Tom (LT): '... and then manually defend it against vandalism' .. but now you have to manually defend it also against vandalism on an external wiki where there are likely less eyes on the subject than here (except for the few that look at WikiData through their en.wikipedia watchlist). There is unfortunately no way to lock down immutable data on WikiData (after it is properly checked and verified to be correct) - it is either the whole page or nothing. --Dirk Beetstra T C 12:15, 9 April 2018 (UTC)
- @Beetstra: Far easier to re-sync Wikidata to an external source than a myriad of Wikipedia articles. Also far easier to query for changes to a particular field over a particular period. For cases where a periodic re-check is enough, it's a lot easier to do on Wikidata.
If you ban imports from Wikidata, are YOU going to undertake to do all the updates manually, if external IDs change? Jheald (talk) 12:32, 9 April 2018 (UTC)
- Oh, it is rather easy to run a bot on en.wikipedia to update the fields or to add fields. We have been doing that for years. And since we are talking immutable data, let's just find our way to protect the really immutable and checked data. It does not make it in any form more difficult, and at least we can protect the data here. Unless WikiData will implement a way where the truly immutable data can be individually made immutable. --Dirk Beetstra T C 12:46, 9 April 2018 (UTC)
- @Beetstra: Except that external IDs often aren't immutable. They get merged, split, updated. The rate varies from database to database, but if you don't keep them actively synchronised, they do rot.
So once they get re-sync'd on Wikidata, then what? You bot-import them to here, you say. But that's not compatible with 1A. Which is why a blanket !vote for 1A is fundamentally clueless. Jheald (talk) 12:58, 9 April 2018 (UTC)
- @Jheald: The problem is, that WikiData data, referenced or not, gets actively vandalized, spammed, &c. (yes, the same crap happens here). The data there is unprotected (same crappy problem as here), and part of it is unreliable (just as here). The sourcing requirements on WikiData are far, far less strict than Wikipedia, and there is no protection of the data that is there (same as here). But WikiData has way less people monitoring it (I know, this RfC also acts as a good reason to try and attract (well, force) more people to watch WikiData data if 1F is implemented). The data that is there and is 'referenced' either is fine, or it is not referenced at Wikipedia standards, or it is referenced but since vandalized, or it is correct but unreferenced, or incorrect and unreferenced. But we have NO clue at which level each P in each Q is (the handful that I ran in just a day or 3 ago contained 1 that was repaired only after 2 months, and one is still there after 3 1/2 months - a WP:POINT violation on my side if you wish).
- No, I would not import them from WikiData, fully in line with my 1A - I would import PubChem numbers from PubChem, I would import CAS numbers directly from CAS (they don't allow that ..), I would import everything directly (that is what I have done in the past, with an attempt at protecting those datapoints years after we had the data .. an impossible and gargantuan task on en.wikipedia). I would not take the risk that the data in WikiData gets changed between one of their imports and the moment that we decide to take the data from them. And there is no reason to use WikiData as an intermediate if I can run a bot to do it directly (the result should be the same, with more certainty that the data is still correct).
- The immutable data is immutable indeed until a database updates (though to take a chemical identifier, the CAS, those numbers are practically 'for life' except under some very rare circumstances). But there is no reason for 'the man in the street' to mutate those 'at will'. They are immutable until proven needed to be mutated due to a database update (which for some data may happen regularly, for others never). IMHO, it would be easy to implement a setting on data where one P-field per Q-set can be set to 'immutable', and have a group of editors promoted to 'mutators' who are the only editors who can mutate that immutable data. Then when you have a bot update the data from a verified database, they mark that P in each Q as immutable. Editors with a reviewing-type of right can do the same, they check if a certain manual P is properly referenced and correct against that reference, and can mark it as immutable. It is so simple (and yes, so un-wiki-like). At that point, if en.wikipedia trusts the referencing standard of WikiData, and we have that immutable, unvandalizable data, then we can have a new RfC to see whether opinions have changed. Until then, I remain a firm 1A supporter - even if the quality is the same, I prefer to maintain that data here then having to maintain it through an intermediate. --Dirk Beetstra T C 13:34, 9 April 2018 (UTC)
- Note: if even only 0.1% (but growing) of the data on WikiData that we would have here on Wikipedia would be reliably correct and immutable I would have supported a laxer option (likely a 1D, but possibly a 1F) for that data, with support of removing local data if it is the same as the WikiData version (but still having the possibility to override a WikiData datapoint, but with an alerting categorization that the WikiData value is challenged). --Dirk Beetstra T C 13:52, 9 April 2018 (UTC)
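The per-statement locking proposed above (one property on one item flagged immutable, changeable only by a "mutator" group) can be sketched in a few lines of Python. This is purely an illustration of the proposal, not an existing Wikidata feature: the class names, the `"mutator"` group string, and the `lock` parameter are all hypothetical. The example value (atomic number of helium, Q560 / P1086 = 2) is the one mentioned elsewhere in this discussion.

```python
from dataclasses import dataclass


class LockedError(Exception):
    """Raised when a non-mutator tries to change an immutable claim."""


@dataclass
class Claim:
    value: str
    immutable: bool = False


class Item:
    """Hypothetical item holding at most one claim per property."""

    def __init__(self, qid):
        self.qid = qid
        self.claims = {}

    def set_claim(self, prop, value, editor_groups=frozenset(), lock=False):
        current = self.claims.get(prop)
        is_mutator = "mutator" in editor_groups
        if current is not None and current.immutable and not is_mutator:
            raise LockedError(f"{self.qid}#{prop} is locked")
        if lock and not is_mutator:
            raise PermissionError("only mutators may lock a claim")
        # A locked claim stays locked when a mutator updates it.
        stays_locked = lock or (current is not None and current.immutable)
        self.claims[prop] = Claim(value, immutable=stays_locked)


helium = Item("Q560")
helium.set_claim("P1086", "2", editor_groups={"mutator"}, lock=True)

try:
    helium.set_claim("P1086", "3")  # ordinary editor, no special group
except LockedError as err:
    print("rejected:", err)

print(helium.claims["P1086"])
```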
- @Beetstra: Sorry, but I don't find your reasoning well grounded:
- - WD is vandalized/spammed, WP is vandalized/spammed, this is a fact. But how could you assume that WD is more vandalized than WP? Any statistical analysis about that? And if WD is vandalized to the same extent as WP, would you agree that there is no reason to consider WP as better than WD?
- - The point of the existence of non-sourced data in WD is not relevant to decide if WD can be used or not in WP because WD allows you to filter them. You don't like unsourced data? Well, ignore them by using the filter tool.
- WD is full of sourced data from unreliable references? Well, ignore sourced data from unreliable references by using the filter tool.
- Why do you care about unsourced data/bad quality data if you can discard them from your data extraction? You can add this feature in the code of your infobox once and that filter will be applied at each opening of the WP article. WD is full of unsourced data/bad quality data but WD gives you the way to get rid of this unwanted stuff, problem solved, chapter closed, unless you can demonstrate that unsourced data/bad quality data from WD are not correctly filtered by the filter tool.
- The only argument against sourced data in WD is that the amount of good quality and sourced data represents too low a percentage of the data in a relevant field. Is that the case? That remains to be demonstrated. Is it a permanent drawback? No, as sourced data in WD is growing. So based on that argument we could say WD is currently not interesting but not that WD will never be interesting.
- - You mention the fact that it is better to populate infoboxes directly from the original databases than to use WD as an intermediate. You are right, but can you tell us more about that regular update? Is there any plan or bot to perform that task? When I see that contributions have gone unverified for several years (see the CHEMBL identifier in Hydrochloric acid, which has been flagged as unverified for more than 4 years), I have some doubt that this will actually be done.
- - I won't spend time on the claim that WP data in infoboxes are well sourced. If I take the boiling point of dichlorine, I can't find the link between the value and the source. Yes, I know, there is a reference list, but from which reference was the value taken? Can you really affirm that all references give the same value for the boiling point? Is it not possible that the value becomes more accurate with time?
- - You affirmed that WD was not properly designed to fight vandalism or error. You completely skip tools like constraint violation reports, which offer an always updated list of some deviations from some rules (format, single rule, unique rule,...).
- I could continue, but I think the main idea is there: WD is not ready for a full application in the diverse WP fields, but if a WP wikiproject is confident in a dataset in WD and properly maintains the dataset in WD, why do you restrict the use of WD for that particular application? Instead of pointing out the weaknesses of WD, why don't you list the minimal requirements to allow the use of WD in WP? And be fair: don't impose stronger requirements on WD than the ones applied in WP:en. You mentioned the fact that the WD community was too small to monitor all contributions in WD, but can you show that all WP:en articles are in the watchlist of several contributors and that the contributions in those articles are regularly checked? If not, how can you point to that as a weakness in WD and not in WP? Snipre (talk) 22:30, 10 April 2018 (UTC)
- "But how could you assume that WD is more vandalized than WP ? Any statistical analysis about that?" .. WikiData has a smaller editor base, it has a smaller admin base (the latter by a factor of 10). There are no clear BLP policies, spam policies, and I gave an example of spammed urls that stay there for months (one still being there). I do notice on meta that there start to be spammers that target WD in their very first edit (as that one that was there for over 2 months). Moreover, it is not my task to show that WD is more reliable than WP.
- "The point of the existence of non-sourced data in WD is not relevant to decide if WD can be used or not in WP bcause WD allows you to filter them." - I agree, I am talking however also about the referenced data on WD - taking our previous point: you have no clue whether the data is correct with respect to the reference, because even the referenced data can be changed without changing the reference. Then the policies and guidelines on WD are not the same as en.wikipedia, the data may be referenced to references that we would not use (and that is often on a case-by-case data, I have been discussing the French library - where the statement was that it was the French library, of course it must be reliable. However that specific item clearly states that it uses imdb as a source. That something is published by an established publishing house does not necessarily make a single datapoint reliable. And that with less stringent policies in the first place.
- Updating directly vs. indirectly. Yes, my experience with the chemical infobox and updating is exactly that - it takes an humongous amount of effort to do that with one - verify whether the datapoint is properly linked to the page where you are talking about (and that requires proper knowledge of chemical compounds ..). We have spotted errors in those databases or in the databases between each other. There are CAS numbers that are linked to multiple compounds, there are compounds with multiple CAS numbers. CAS takes an effort in repairing that. But a plain database upload 'we have hexane here, they have hexane there, import identifier' fails on many other compounds. En.wikipedia has always been reluctant to updating data directly from databases for a good reason. Do you expect me to trust that because it now happens on WikiData that those en.wikipedia concerns do not apply?
- Yes, there are datapoints that depend on the reference - however each of those datapoints is still immutable against the reference (ref A is saying 100, ref B is saying 100.1 - that 100 is NOT going to change with respect to A, that 100.1 is NOT going to change with respect to B). Yes, it is likely that there will be a more accurate reference C at some point, who comes to 100.094. That does not change that reference A said 100 and reference B said 100.1. It does mean that we would upgrade that we use 100 to 100.1 to 100.094, and change the reference. That does not make the datapoint mutable. And that is for physical data, the same reasoning goes with your birthday - it is unlikely that that will ever be defined more precise. There is absolutely 0 reason to change that data with respect to the references that are there.
- So we have BLPs - where I think it fails, we have chemical compounds - where I think it fails. I am sure that there are many other fields where it fails. Many of our pages do not 'belong' to one WikiProject (where WikiProjects do not own pages) - so do we have to have fights between WikiProjects where one thinks that WD is good enough and the other doesn't, have lengthy discussions properly organized 'yes, for this group of articles a local consensus exists to use WikiData'. If I read the !voting above, there are many people who have concerns for the overall use, do you think that these editors could be convinced of a local consensus. IMHO, it is a question between 1A and 1F - either we don't, or we go full. Anything between that is a recipe for problems, discussions, fights. And no, WD is not ready for that.
- My minimal requirements are there - 3A - all data should be compliant to OUR policies and guidelines, which means that WD's policies and guidelines should be more strong than Wikipedia's, and that WD's protection of said data is (way) more stringent than ours. WD is by far not there yet, and I am sorry to see that the way WD started up (first shoot, then ask questions) does make it very unlikely that it will be up to that task in any near future. Until then, no. We are already fighting whether infoboxes should be included in the first place. --Dirk Beetstra T C 06:19, 11 April 2018 (UTC)
Constraint violation reports
Hmm... Although the topics currently discussed on this page include important points and I share the same concerns, it seems that more practical information about the current state of WikiData is needed, especially about well-formatted data like identifiers. As far as I can tell from searching this page with CTRL F, no one has yet mentioned the most powerful counter-vandalism and quality-control tool in Wikidata. It's the "Constraint violation report".
"Constraint violation report" is quality control tool which tells us things, for example, typo, duplicated data, weird data in context of surrounding data, and so on. Let me put here example of "constraint violation report" from my experience. For example, I can see "Unique value violation" list about Terminologia Anatomica ID (d:Property talk:P1323) at d:Wikidata:Database_reports/Constraint_violations/P1323#"Unique_value"_violations (ignore other sections. This is not well customized report. I simply put example from my editing region.)
This list is updated daily basis by bots. But what this list means? This list asks me question like "These two different pages have same TA98_ID. Is this OK?"
- This is sometimes OK (for example, if "Head" and "Human head" have same TA98_ID, that would be intentional. Not error or vandalism.)
- Sometimes not-OK (for example, if "Kidney" and "Brain" have same TA98_ID, that is weird.)
As additional information, these "constraints" are customizable and set at d:Property:P1323#constraints. You can see the whole list of available "constraints" here (d:Help:Property constraints portal). From my personal experience, when I exported anatomy-related data to WikiData from WikiPedia, I fixed many data entries which were suggested as potential errors by the "constraint violation report", as in the above example. And these days, I sometimes check the constraint violation report page. Then I see that the data is safe and OK even now.
This simple fact, "I fixed many data" and "safe and OK even now", means, to me, that "Data in WikiData has better quality than data which was stored in infobox of WikiPedia". This is a very simple point.
There is no substitute for the "constraint violation report" in WikiPedia, unless you program your own software like Wikipedia:WikiProject Chemicals/Chembox validation (This is a very great project. But it is difficult for ordinary editors in other fields to imitate it.)
As my experience, WikiData is safer and better place for data which has well-formatted style, like identifiers. From this reason, currently I personally feel "more reliable" if identifiers are retrieved from WikiData. Because "constraint violation report" pages guard us. --Was a bee (talk) 19:55, 10 April 2018 (UTC)
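A "unique value" check of the kind described in this comment boils down to grouping items by identifier and listing any identifier that appears on more than one item. The following Python sketch shows the idea only; it is not the bot that produces the real reports, the function name is an assumption, and the identifier strings are made-up placeholders rather than real TA98 IDs.

```python
from collections import defaultdict


def unique_value_violations(value_by_item):
    """Return {identifier: [items]} for every identifier used on more than one item."""
    items_by_value = defaultdict(list)
    for item, value in value_by_item.items():
        items_by_value[value].append(item)
    return {value: sorted(items)
            for value, items in items_by_value.items()
            if len(items) > 1}


# Placeholder identifiers, chosen only to reproduce the two situations described above.
ta98 = {
    "Head": "ID-1",
    "Human head": "ID-1",   # probably intentional
    "Kidney": "ID-2",
    "Brain": "ID-2",        # weird, worth a human look
    "Liver": "ID-3",
}

for value, items in sorted(unique_value_violations(ta98).items()):
    print(value, items)
```

As the comment notes, the report only raises the question; a person still has to decide whether each shared identifier is intentional or an error.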
P.S. If you are interested in "constraint violation", I think you will also be interested in the higher-level tool "Complex constraint violations". For example, at d:Wikidata:Database reports/Complex constraint violations/P25, data items which meet certain criteria are automatically listed. For example, "Mother was born after her child was born". Such a list is created by analyzing the mother-child relation and birth/death data in the whole 40 million item pages. Although I haven't checked each item, if an item is listed there, I can guess that something is wrong. This list is set at the property talk page (d:Property talk:P25) using a SPARQL query (diff). If you don't know SPARQL well, you can request a query at d:Wikidata:Request a query. This kind of automated error detection, or quality control, is possible only in WikiData. --Was a bee (talk) 19:59, 10 April 2018 (UTC)
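The "Mother was born after her child was born" check is done on Wikidata with a SPARQL query; the same logic can be sketched in plain Python over a small in-memory table. Everything here (function name, record names, dates) is invented for illustration and is not taken from the actual report.

```python
from datetime import date


def mother_born_after_child(birth, mother_of):
    """Return (mother, child) pairs where the mother's birth date is later than the child's."""
    flagged = []
    for child, mother in mother_of.items():
        # Skip pairs where either birth date is unknown.
        if child in birth and mother in birth and birth[mother] > birth[child]:
            flagged.append((mother, child))
    return sorted(flagged)


# Hypothetical records: the second pair contains an obvious data error.
birth = {
    "mother_a": date(1950, 1, 1),
    "child_a": date(1975, 6, 1),
    "mother_b": date(1990, 3, 3),
    "child_b": date(1960, 2, 2),
}
mother_of = {"child_a": "mother_a", "child_b": "mother_b"}

print(mother_born_after_child(birth, mother_of))
```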
- @Was a bee: So I see 31 violations on the 47 cases of P1108 (~75%). I know that that is just a one-item test group (others are staggering, I see 94477 for P1057 ('Too many results. 89476 records skipped.'), but how is this going to give me trust that those 31 cases are correct information? And how do you there monitor vandalism if you have already 94477 which have problems? Or when one is there a false positive (hence flagged but correct) and is changed so becomes unflagged and incorrect (Human head and head have the same TA98_ID you say, but they are flagged, if I change one of them they are different, one is incorrect, and they are not flagged ..). To me, the only way to know if something is correct is by locking it down in a checked and correct state. --Dirk Beetstra T C 06:39, 11 April 2018 (UTC)
- @Beetstra you are mistaking the point of the TA links. Like medicine, anatomy infoboxes link to relevant databases and almanacs. We link relevant articles to the international standard nomenclature for that topic... in layman's terms, like linking to the Oxford or Collins dictionary definition. Thus your concern is about whether there is more vandalism on EN WP (greater views, greater ease of editing, greater edits) or Wikidata (less views, harder to edit, less edits). I don't think there's any data presented in the above discussions to validate either side of this point but it would be interesting to know.--Tom (LT) (talk) 11:35, 11 April 2018 (UTC)
- @Tom (LT): I agree, we don't really know. Point in fact is that I just ran into another case of spamming where WD was targeted broadly (links were added to multiple Qs, where some were rather sought after), and specifically, broader than on any other Wiki where the links were spammed. As with the situation mentioned earlier, one of the links stayed for 37 days before being removed (addition 1st of March, removal 7th of April).
- But besides that, I think my point is that the vandalism on WD should be significantly less than on en.wikipedia - sneaky number-changing is rather difficult to spot on en.wikipedia (many people don't see it even when it is on their watchlist, you have to either know, or be willing to check - you cannot blindly revert as you have to check whether the number got improved or not, and biting is an issue there), and it will be less visible if someone does that through WD. That requires eyes on WD monitoring (especially since the en.wikipedia watchlist, at this moment, does not necessarily show that). And I think my deeper point is that I really do not see why anyone would ever need to change the birthday of Donald Trump. There may be reasons to change the way we display it ('4-11-2018' (MDY, Gregorian), '11-4-1900' (DMY, Gregorian), '25-7-1439' (DMY, Hijri), '11th of April, 2018' (written out)), but the underlying number(s) do(es) not change (P(birthday_day)=4; P(birthday_month)=11; P(birthday_year)=2018; P(birthday_calendertype)='Gregorian').
We can discuss whether data is vandalized, or how much it is vandalized, we can discuss whether the data is generally correct or not (and to what percentage), we can discuss whether the data is properly referenced or not, we can discuss the chances whether references were added after the fact, or facts were changed after the reference. My point is that for a lot of data it is plainly true that once it is correct, it is never ever going to change (and there is no reason to change it), some data has multiple 'corrects' (3 different papers describing the boiling point of water at different precisions - still each of those is hard-linked to the reference, that reference is not going to change, that number is not going to change, ever), there is data that regularly (but not continuously) changes (but is now correct and will only change upon a change of external database), and there is data that is regularly changing (soit, we have to live with that). But currently, we cannot even trust the truly immutable data. And if I have to check on en.wikipedia whether the reference supports that today is 11-4-2018, or I have to go to WikiData to check whether the reference supports that today is the date displayed on en.wikipedia, then I prefer the former. As it stands now, material on WD is not even reviewed for whether it is correct with respect to the reference that is there (or maybe multiple WD editors review it, but they don't know of each other that that has been done (since the last change), nor do we know it), let alone that it is locked down in a correctly supported state.
- I wonder how much bogus data there is on WD that is supported by an equally bogus reference, but which would be transcluded onto en.wikipedia because it is 'properly' referenced - point is that if on en.wikipedia a bogus piece of data is entered with a bogus reference it is properly visible, while that material could very well be added after a fully enabled WD infobox is added and hardly anyone notices (and now think BLP). --Dirk Beetstra T C 12:22, 11 April 2018 (UTC) (adapted --Dirk Beetstra T C 12:46, 11 April 2018 (UTC))
- @Beetstra:
"So I see 31 violations on the 47 cases of P1108 (~75%)."
"I see 94477 for P1057"
This is not a problem. There are various kinds of constraint (see d:Help:Property constraints portal). What you mentioned are violations of the "type constraint" (see d:Help:Property constraints portal/Type). In current usage, although the type constraint is added to many properties, this violation is generally not treated seriously (no one cares so much).
- Analyzing the cases which you raised: in the electronegativity (P1108) case, the option parameter did not match the current instance/subclass usage in "atom" items. So I updated that[5]. At the next update, most "violations" should disappear. For chromosome (P1057), I fixed the setting[6] to match the current main usage of the property, which is the item type "gene". At the next update, the number of "violations" should largely decrease.
- After the constraint setting is nicely configured, "type violations" highlight "wrong usage of the property". For example, if electronegativity (P1108) is used in a "chimpanzee" type page (e.g. Bubbles (Q997482)), it is strange ("What is electronegativity of monkey??"). In such a case, I can guess that somebody mistakenly used electronegativity (P1108) at a monkey item page. Or somebody changed (vandalized) the item type at an atom page from "chemical element" to "chimp". For example, as far as I see in the current type violations of electronegativity (d:Wikidata:Database reports/Constraint violations/P1108), there is no such case. So I can say that there is no vandalism, at least related to item type, in electronegativity.
- Your data locking feature is interesting idea. I like that. If locking functionality is discussed in future, I'll follow that discussion with supportive mind :) For example, atomic number of Helium is "2" (d:Q560#P1086). If this data "2" is locked, Wikidata can provide next level reliability. Various new WikiData feature discussion is on going at here (d:Wikidata:Contact the development team). --Was a bee (talk) 16:37, 11 April 2018 (UTC)
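A "type constraint" as described in this reply can be pictured as: for a given property, list every item that uses it but is not an instance of one of the allowed classes. The Python sketch below is a simplified illustration under that assumption. The dictionary layout, the function name, and the claim values are hypothetical; only the property number P1108 and the item numbers Q560 (helium) and Q997482 (Bubbles) come from the discussion above.

```python
def type_violations(items, prop, allowed_types):
    """Return ids of items that use `prop` without being an instance of an allowed type."""
    flagged = []
    for qid, item in items.items():
        uses_prop = prop in item["claims"]
        has_allowed_type = bool(set(item["instance_of"]) & set(allowed_types))
        if uses_prop and not has_allowed_type:
            flagged.append(qid)
    return sorted(flagged)


# Simplified stand-ins for item pages; the claim values are placeholders.
items = {
    "Q560": {"instance_of": ["chemical element"], "claims": {"P1108": "placeholder"}},
    "Q997482": {"instance_of": ["chimpanzee"], "claims": {"P1108": "placeholder"}},
    "Q0-example": {"instance_of": ["chimpanzee"], "claims": {}},
}

# Electronegativity on a chimpanzee item is flagged; an item that never uses P1108 is ignored.
print(type_violations(items, "P1108", {"chemical element"}))
```

Note that a check like this catches a property used on the wrong kind of item, not a plausible-looking wrong value on the right kind of item.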
- I can see how that might catch greater vandalism, but not the sneaky stuff. As mentioned earlier, if s.o. changes the boiling point of trimethylphosphine to 150 degC, hardly anyone but a hardcore chemist will know that it is wrong. Those who notice may want to check, but to all of them the data seems reasonable.
- The locking of data is not something new. There are cases on Wikipedia of locked down infoboxes (template editor protected transclusions of data-filled infoboxes). That data is immutable and referenced, and locked down in that state. The only reason for change is if someone can make a case for change. —Dirk Beetstra T C 03:38, 12 April 2018 (UTC)
- Template editor protection and transclusion! That locking system is interesting.
- About nuance level data quality like examples what you raised (e.g. birthday of famous person, URL of certain official website, boiling point of certain chemical substance, etc) is basically hard to evaluate with automated algorithm in WikiData. About this point, I think, although technological aspect play roles in some degree (e.g. protection is possible or not), the most important factor would finally resulted in "human", not technology nor system. In other words, "Regarding that data, where is the most powerful contributor among all wiki around the world, now?" I put examples:
- 1. If the most powerful contributor is in en.wikipedia, then en.wikipedia should not import data from WikiData. But it is better to export the data periodically to WikiData and share it among the other wikis. Although I don't know the chemistry area well, the chemistry project in en.wikipedia currently seems to be such a world's best.
- 2. If the most powerful contributor is in WikiData, it is better to import data from WikiData. It makes en.wikipedia articles more reliable and excellent. For example, the gene/protein data realm is of this kind. I suppose you already know that the User:ProteinBoxBot project has shifted its data storage center from en.wikipedia to WikiData. So an enormous quantity of data related to genes/proteins is now in WikiData. They keep the data in WikiData fresh through periodical updates. As far as I know, this project is the world's best project in the gene/protein data realm among all wikis.
- Because even if locking-system is implemented, if there were no good editor/maintainer, as you can imagine, system itself can't afford good things to users/readers. So "Where is good editor?" is finally matter. Blanket rule would not be efficient. --Was a bee (talk) 21:49, 12 April 2018 (UTC)
!voting style
As usual the Wikipedia "discussion" process is plain old "politicking" and a "public" forum for criticizing, influencing, discrediting and trashing the "votes" and opinions and viewpoints of others. To some, at least. One has to wonder just what kind of "democracy" some who demand explanations for the opinions and votes of others are used to in the "real world". And why they react negatively when subjected to similar treatment from others. But then again, those who play the other "team" in the same game and submit to their demands and let themselves be influenced by them have their own role in creating and responsibility for the "democracy" they "create". — Preceding unsigned comment added by 68.234.100.169 (talk • contribs) 19:58, 9 April 2018 (UTC)
- @68.234.100.169: It's WP:NOTAVOTE. The point of an RfC is to draw out why people take the positions they do; and it's entirely appropriate on both sides for people to raise questions, to see whether those positions stand up to challenge and fully consider all the angles. If you think this is "as usual", then you're right, this is "as usual", because this is how it is meant to work, when there are questions to thrash out. Jheald (talk) 20:44, 9 April 2018 (UTC)
- IP 68.234.100.169, if you are referring to Mike Peel then I object. Mike and I are about as far apart on this issue as possible, and I could have asked the exact same question he did. There's nothing wrong with asking someone to expand/clarify on their !vote.
- Actually it's not considered improper or unusual to counter-argue or try to change someone's !vote in an RFC. A RFC is discussion on the merits and not a vote. But Mike didn't even do that. Alsee (talk) 23:56, 9 April 2018 (UTC)
Text based editing vs code based editing
Just a side comment - perhaps not related to the RFC, but worth noting. I think a lot of the resistance to wikidata may stem from the fact that wikidata is not “user friendly”. To understand how wikidata works (and to even read a wikidata page), you have to have some idea of how all the parameters and coding behind wikidata work. As someone who is very TEXT based, when I look at wikidata, I become quickly confused by all the parameters, P-numbers, Q-numbers etc. This makes it overwhelming (to the point of impossible) for a text based editor (like me) to figure out what’s going on... how to edit wikidata. Wikipedia is supposed to be the encyclopedia that “anyone” can edit... including technophobes like me. Currently, Wikipedia is relatively easy to edit... it is text based... just go to edit mode and type words ... then hit save. Unfortunately, the more we incorporate wikidata into Wikipedia, the harder it becomes to edit Wikipedia. If we roll this out fully, it will simply confuse and drive away text based editors like me. Wikipedia will become “the encyclopedia that SOME (those who can figure out wikidata) can edit... but others can’t.” Blueboar (talk) 12:16, 11 April 2018 (UTC)
- Yup, I noted that in my oppose of use of wikidata. I'm pretty techy, and it still took me at least 10 minutes just to figure out vaguely what is going on and how to edit it, and it is still much more annoying to edit - only one field at a time, to add a statement, you have to do so many clicks and what not; everything is cumbersome, and I have no idea what half the things are about. For something as vital and widely used as infoboxes it should be very easy to edit, add a lot of information manually, or remove etc - there shouldn't be barriers to human editors. So maybe only things that should be updated en masse/with bot can be moved there (identifiers etc) but only very maybe.. Galobtter (pingó mió) 04:07, 12 April 2018 (UTC)
- Even with the bot thing, you'd have to ensure that the bot edits update correctly, which may not be the case, and one'd probably need to impose something close to enwiki bot policy on there.. Galobtter (pingó mió) 04:10, 12 April 2018 (UTC)
Images
Well a major advantage of such a change could be fixing of Category:No local image but image on Wikidata (8,941) which has more than 8200 articles at present. If any of the Wikidata bashers want to do that manually, they're most welcome. But I don't see that happening anytime soon (I mean whatever is not forever) without use of Wikidata (especially if they want to opt for manual editing). Capankajsmilyo (talk) 17:48, 11 April 2018 (UTC)
- Are you now suggesting to automatically transclude those images? It still needs someone to check whether the image on WikiData is the right image. That set is just a (small?) subset of all pages with infoboxes without image. —Dirk Beetstra T C 03:20, 12 April 2018 (UTC)
- The transcluded image will be checked by anyone and everyone reading that article. But this will add images to thousands of articles which don't have one. I don't see how keeping the articles (for which images exist) without an image is better. Further, not every article is GA/FA. There are stubs and starts as well. Do you want to apply a blanket policy on all? Capankajsmilyo (talk) 03:31, 12 April 2018 (UTC)
- Also it might be a good idea to have some statistical data of how many incorrect images were added vs how many correct were before arriving at a decision of blanket ban. Capankajsmilyo (talk) 03:34, 12 April 2018 (UTC)
- Especially for those low view stubs, people won't know what the correct image is supposed to be Galobtter (pingó mió) 03:54, 12 April 2018 (UTC)
- A bot could do it easily if we wanted...it wouldn't be approved though, as no human is checking the images, but if you're going to automatically add the image the same problem is there Galobtter (pingó mió) 03:40, 12 April 2018 (UTC)
- The very first one I checked on was incorrect - it said there was a picture for Héctor Abad Gómez, but it was actually of Héctor_Abad_Faciolince. The second one was fine (Syamer Kutty Abba), so it'd be useful for someone manually/semi-automatically with checking to go through the category, but blanket adding images, definitely not. Looking at the history, the clearly incorrect image was added twice separately by different people, and looking at the tag, it appears automatically - no bot on enwiki would add an image (presumably) because it had the same first two words, and anyone who did it semi-automatically would be blocked very quickly. This clearly shows the different standards - on wikidata, anything goes as long as there is more, where on enwiki there is much more emphasis that bots be very vetted for errors, and edits too. Galobtter (pingó mió) 03:54, 12 April 2018 (UTC)
- That category isn't even right in many instances. The first one I checked was Sarah Onyango Obama, which has an image (the same as in Wikidata), just not in the infobox. Which is good as it is an image of her house, not of the person. Branko Oblak should, according to Wikidata, get the image File:Josip Katalinski at World Cup 1974 against Zaire.jpg (and many other language versions of his article already do, through Wikidata). It is an image of Katalinski though... Humphrey Ocean has the image, but above the infobox, not in it, which allows it to be shown a bit larger. Eddie Ockenden has as Wikidata image a picture of his whole team, which isn't really suitable for an infobox. Neither Wikidata nor the image at Commons gives any indication which person Ockenden is. A similar issue with George Odgers, but there the Wikidata item is included in the article outside of the infobox, with a caption identifying him.
- Basically this is a good example of why simply using Wikidata to fill our infoboxes is way too often a bad idea, and local control of the data is preferable. Fram (talk) 12:44, 12 April 2018 (UTC)
- You do have control over local data in options other than 1A. You can always specify a local image in case you don't wanna use Wikidata image. Wikidata images can be fetched to reduce the pages in the category specified. If you want to do it by yourself, you're most welcome. But thousands of articles without any images (when we do have them) does not make sense to me. User:Capankajsmilyo(Talk | Infobox assistance) 04:17, 18 April 2018 (UTC)
- Not with 4E and 4F. And if we have to override Wikidata all the time, then what's the use of a Wikidata-based infobox instead of using a local one immediately? Fram (talk) 04:27, 18 April 2018 (UTC)
- 2-3 instances won't qualify as a decent sample size I guess. How many incorrect images vs correct images were imported in say 1000 imports? User:Capankajsmilyo(Talk | Infobox assistance) 04:43, 18 April 2018 (UTC)
- If half the images are incorrect a sample size of 10 is enough really. I checked another two: File:UCV_2015-210_Harry_Abend,_1994.JPG was an image of Harry Abend's work which you wouldn't want for his infobox (one image of which is already there in the article). For another, the image was fetched from wikidata and was fine. The next had a sculpture image: File:June_Gordon_Marchioness_of_Aberdeen_and_Termair_bronze_bust_by_sculptor_Laurence_Broderick.jpg which was included in the article and shouldn't be in the infobox. The next one was Lauren_Ambrose; the image, File:Lauren_Ambrose_2000.jpg, was already included below and may not be good for the infobox (and definitely shouldn't be repeated both in the infobox and below). So 1/4 for usefulness, I suppose, 2/4 - no that isn't a suitable infobox image - and 1/4 already included. Galobtter (pingó mió) 04:59, 18 April 2018 (UTC)
- The statistics could be very different for different classes of articles, for example BLP and localities. Even if 50% BLP images are incorrect (which still sounds very strange to me), it could be that only 1% of locality images is incorrect.--Ymblanter (talk) 05:13, 18 April 2018 (UTC)
- The argument "have to override Wikidata all the time, then what's the use" sounds like if we have to monitor/edit the Wikipedia all the time what's the use. It also assumes that each and every image of Wikipedia is incorrect which is an absurd and unsubstantiated assumption. User:Capankajsmilyo(Talk | Infobox assistance) 04:43, 18 April 2018 (UTC)
- The essential point is if we're overriding it half the time, then it isn't useful. If we introduce 4000 incorrect images for 4000 correct ones that doesn't seem overall improvement - there'd be a massive drop in quality. Galobtter (pingó mió) 04:59, 18 April 2018 (UTC)
- How are you so sure that it's 50/50 and not 95/5 (correct vs incorrect) on a sample size of 8000. I don't think anyone has even assessed 100 images till date. If so, please share actual stats. Further Ymblanter's point carries weight that this will differ by Wikiproject. Telescopes already use it and are happy with it. BLP don't. So we might need to consider the breakup of such stats by Wikiproject before arriving on any conclusion. This might be suitable for some type of articles and not for others. That must be thoroughly assessed. User:Capankajsmilyo(Talk | Infobox assistance) 05:24, 18 April 2018 (UTC)
- Even a small sample size of 4 shows 4/6 - 66% - useless/bad. According to statistical significance calculation using an online calculator, it is 66% -37.89 - so only <1% chance it is even 30% good. Basically, it would be extraordinarily unlikely if I just happened to find a very bad sample. I'm not sure why it would be better with non-BLPs. (you are welcome to disprove that though, if you wish to check some of the images, on BLPs and non-BLPs) Galobtter (pingó mió) 05:35, 18 April 2018 (UTC)
- In essence, we have no reliable data on how reliable Wikidata is, and from personal experience many of us have found major problems with some of the data. It would be gross irresponsibility to massively import data that is unreliable and unchecked, but also excessively restrictive not to allow the use of data which is reliable and has been checked, providing that it can also be reasonably easily curated by the end users, ie. Wikipedians. Both the "no Wikidata under any conditions" and the "always use Wikidata" extremists are shutting doors unnecessarily. Wikidata has a long way to go, but it might get there. Some projects are more sensitive to bad data, they should not take the chance until it can be shown that Wikidata fits their requirements. Other projects find Wikidata useful, let them use it, they will test the systems and get changes made, and as time goes by Wikidata may become acceptable to more projects, and they can start to use it. Trying to force this either way will not work well, and will polarise the communities. This has parallels with WMF pushing unready software. When pushed, people push back, sometimes quite violently. The status quo allows us to bicker interminably at a sort of ground level, and get some work done. That is how it goes at Wikipedia. · · · Peter (Southwood) (talk): 05:55, 18 April 2018 (UTC)
- Exactly that.--Ymblanter (talk) 06:05, 18 April 2018 (UTC)
- I do understand that, but IMO I don't think the data quality is ever going to improve, nor is there a way to make sure to get the certain data that is reliable and checked, and thus the experiments cannot be limited to that; I think that is inherent in wikidata policy, of its desire to get as much information as possible, with minimal BLP and sourcing policy, acceptance of crappy sources et al. If it ever improves - and this will take a long time - then we can reevaluate, but currently allowing experiments/wikiproject use can/has easily allow bad data. There is nothing preventing bot importation of data directly from the high quality sources, as Beetstra points out - wikidata doesn't offer exclusive data, and thus it wouldn't be "excessively restrictive not to allow the use of data which is reliable and has been checked" (is data "checked" on wikidata - most of it is bot imports, which we can as easily do here if we wanted - often we don't want, because bots are bad with ambiguity, similar names et al, but problems with that, as I pointed out above about wrong images being added seem to exist on wikidata - so it is more like mass importation from wikidata circumvents our own community desires regarding data)
- Not only that, I think many people will be confused by the extra steps to edit on wikidata, and I expect lot less editing of infobox fields if they are shifted to wikidata - from confusion of where the data is, to learning a new interface etc. Having the data on wikidata is basically a barrier to editing Galobtter (pingó mió) 08:34, 18 April 2018 (UTC)
- Regarding directly importing, that is what I have been doing in a grey past on Wikipedia. It is an enormous task .. you have to check every single datapoint: is this identifier really describing this subject? That takes up to minutes per subject. For a lot of data you cannot say that you are correct by blind bot-importing directly from outside databases. I would be surprised if you get >95% correct (and that is partially because also Wikipedia is not correct in the first place). Below an editor talked about 95% correct, that is by FAR not satisfactory. --Dirk Beetstra T C 08:46, 18 April 2018 (UTC)
- Regarding the difficulty of making sure that identifier is really describing the subject. Agree 100%, indeed that is what I was saying about "often we don't want, because bots are bad with ambiguity, similar names et al, but problems with that, as I pointed out above about wrong images being added seem to exist on wikidata - so it is more like mass importation from wikidata circumvents our own community desires regarding data". In Wikidata importations that checking isn't done. Galobtter (pingó mió) 15:45, 18 April 2018 (UTC)
Thanks to the enabling of automatic image inclusion from Wikidata (i.e. showing an image in an infobox without the image being defined here), we now have many articles with the same image twice, either above and inside the infobox (e.g. Eshin Nishimura), inside and below the infobox (James Preston Poindexter), in two separate infoboxes (e.g. Robert Little (minister)) or twice inside the same infobox(!) (e.g. William T. Dixon, John M. Brown, Christopher Payne, William R. Pettiford, Rufus L. Perry, Benjamin F. Lee, Joseph C. Price, Joseph Endom Jones, ...) These are clear examples of the implementation of such Wikidata imports making articles actively worse here. Fram (talk) 07:27, 18 April 2018 (UTC)
- Do you really think that those images should be outside infobox rather than being in it? As far as I see, in the examples given above (by you) Wikipedia editors placed those images (incorrectly) outside infobox, when they should have been inside it. And no-one, especially Wikipedia administrators opposing wikipedia did anything to fix that. User:Capankajsmilyo(Talk | Infobox assistance) 08:01, 18 April 2018 (UTC)
- You have noticed that the vast majority of examples I gave had images inside the infobox already? And as far as I can tell, there is no rule that says that images should be in the infobox and not above it (WP:MOSIMAGES and the linked page certainly don't mention it). Apart from that, can you please stop with the needlessly inflaming add-ons to your posts? From "Wikidata bashers" in your first post to this subsection to "especially Wikipedia administrators opposing wikipedia" (admins have no more or less authority over content than any other editor, and I think you mean "wikidata", not "wikipedia"), you insert comments which don't help this discussion at all and needlessly antagonize and divide the discussion. Fram (talk) 08:07, 18 April 2018 (UTC)
- "had images inside the infobox already": who said that? Why are you trying to put words in my mouth? I said they should be inside infobox. Are you saying that Wikipedia prefers infoboxes without an image, with the image above/below them? I see no common-sense especially no design-sense in that. User:Capankajsmilyo(Talk | Infobox assistance) 08:13, 18 April 2018 (UTC)
- I didn't put words in your mouth, I was asking you whether you noticed that most of them already had an image in the infobox before the automatic wikidata inclusion of an image created a duplicate image in it? The 8 examples I gave for "twice inside the same infobox" and the one example of "in two separate infoboxes". Fram (talk) 08:21, 18 April 2018 (UTC)
- (edit conflict)No, most of them are because you didn't disable fetching from wikidata if the infobox was embedded, actually - those were the cases of two images in one infobox. You should make sure your changes to the infobox don't cause problems, Capankajsmilyo; I've reverted the edit for those problems. Those uses may need improvement, or they may simply be a different style- either way that doesn't mean problems should occur. Galobtter (pingó mió) 08:18, 18 April 2018 (UTC)
The stats given above for incorrect images don't conform with the actual ones. A simple check on {{Infobox religious biography}} shows that out of 654 articles that have images on both Wikidata and Wikipedia, 75% were exactly the same. The remaining 25% were the ones that were different, and out of those, I doubt that even 50% (i.e. 12.5%) are incorrect ones. So in total the experiment done on a sample of 654 shows >80% (I would say >95% in personal opinion) of images conform to Wikipedia standards. User:Capankajsmilyo(Talk | Infobox assistance) 08:31, 18 April 2018 (UTC)
- As opposed to what would be nearly 100% of the images that we have do comply with Wikipedia policies, which overall still means we go down. Now, you checked only one datapoint, if you score 95% on 1, you score 77% on 5, 60% on 10, 35% on 20. And that on BLPs and in an infobox that has even more fields. --Dirk Beetstra T C 08:37, 18 April 2018 (UTC)
- 80% is terrible... even 90% is very very bad - I'd expect >99% of our images to be correct. And of the extra ones it'll be even worse; the similar ones are likely imported from us, and give no benefit. Galobtter (pingó mió) 08:42, 18 April 2018 (UTC)
- (ec)Many images on Wikidata are taken from them being used on the enwiki article in the first place. Taking these into the count of how many images on Wikidata are correct is circular reasoning. The question is, for images where we don't have an image in our article yet, or where the image on Wikidata is different to the one used here, how many of the images we find on Wikidata are acceptable for use in an infobox here, or are an improvement over what we already have in the second case. Fram (talk) 08:44, 18 April 2018 (UTC)
- I agree - I'd never heard of the category & the first 2 I looked at were wrong - #1 Kenneth Allen (physicist) had an image here (and in the infobox), #2 Harry Abend (an artist) had an image of a work on WP (not in infobox), but no portrait image either on WP or WD. Mind you, #3 Kirsty Alley worked ok, & has gone from the cat since I added a pic. But not an impressive argument for WD use. Johnbod (talk) 13:31, 21 April 2018 (UTC)
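The compounding figures Beetstra gives above (95% on one field dropping to about 77% on 5, 60% on 10 and 35% on 20) are what you get if every infobox field is correct independently with the same probability. A minimal Python sketch under that independence assumption follows; the function name is illustrative only and does not come from any tool mentioned in this discussion.

```python
def all_fields_correct(per_field_accuracy: float, n_fields: int) -> float:
    """Probability that every one of n_fields is correct.

    Assumes each field is right independently with the same probability,
    which is a simplification: real infobox fields may share error sources.
    """
    if not 0.0 <= per_field_accuracy <= 1.0:
        raise ValueError("per_field_accuracy must be between 0 and 1")
    if n_fields < 0:
        raise ValueError("n_fields must be non-negative")
    return per_field_accuracy ** n_fields


if __name__ == "__main__":
    # Reproduce the field counts used in the comment above.
    for n in (1, 5, 10, 20):
        print(n, round(all_fields_correct(0.95, n), 2))
```

With a per-field accuracy of 0.95 this gives roughly 0.77, 0.60 and 0.36 for 5, 10 and 20 fields, close to the numbers quoted. It illustrates the arithmetic only and is not a measurement of Wikidata quality.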
1F&(4E|4F) combination
Just for clarity, the 1F&(4E|4F) combination still allows that if someone sees a value being transcluded from WikiData with which the observer disagrees (either it is wrong, because it does not have our flavour of display, etc.) to be overwritten with a local value (I know, if it is wrong it would be preferred that it is corrected on WikiData .. but that aside)? --Dirk Beetstra T C 12:27, 12 April 2018 (UTC) (adapted to include option 4E, which is the same, thanks User:Fram --Dirk Beetstra T C 14:00, 12 April 2018 (UTC))
- That's not how I understand it. In 4E and 4F, Wikidata data always gets precedence, only when there is no Wikidata value can a local value be shown. If you see an error on Wikidata, you need to correct it there. How we should deal in this scenario with data which is acceptable on Wikidata, but unacceptable here (e.g. because of different sourcing requirements, different BLP policies, ...) is not clear. Simple things like genre warriors on music articles can then simply move to Wikidata, and then we would need to get them blocked or restricted there or the Wikidata item protected there, to get a stable infobox here.
- Basically, we can no longer enforce our own policies if 4E or 4F gets accepted. Fram (talk) 12:55, 12 April 2018 (UTC)
- Well we can. It's called 'Remove/replace the infobox with one that doesn't call wikidata' as I am not about to start messing around on another project to solve their problems. Only in death does duty end (talk) 13:00, 12 April 2018 (UTC)
- @Only in death: but that would violate 1F 'Every infobox that is technically ready to convert from local content to Wikidata content may be implemented in mainspace' - if that passes everyone who would then re-replace the infobox with the WikiData would abide by the '1F' consensus here, and stand fully in their right. --Dirk Beetstra T C 13:55, 12 April 2018 (UTC)
- Anyone who replaced an infobox (without fixing it) that contained vandalised/blatantly incorrect data would be equally at fault as whoever altered the data on wikidata and would end up sanctioned accordingly. 1F would not in any way negate our content and sourcing policies. It would just as Fram points out, make it hard to enforce without taking heavy-handed actions. Only in death does duty end (talk) 14:04, 12 April 2018 (UTC)
- .. and that starts already by editorial choice, not only vandalism and blatantly incorrect data, think D-M-Y/M-D-Y, °F/°C/K, ise/ize, ... I cannot fathom what disagreements (my understatement of the day) this all would cause. --Dirk Beetstra T C 14:11, 12 April 2018 (UTC)
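The two readings of precedence in this thread can be put side by side. The Python below is only an illustration of the resolution order, assuming each field has at most one local and one Wikidata value; the function names are hypothetical and do not correspond to any actual module or template.

```python
from typing import Optional


def resolve_wikidata_first(local: Optional[str], wikidata: Optional[str]) -> Optional[str]:
    """4E/4F as Fram reads them above: Wikidata wins, local is only a fallback."""
    return wikidata if wikidata is not None else local


def resolve_local_first(local: Optional[str], wikidata: Optional[str]) -> Optional[str]:
    """Local-override reading: a local value, when present, replaces Wikidata's."""
    return local if local is not None else wikidata


if __name__ == "__main__":
    # An editorial-choice disagreement (date format), not vandalism.
    local_value = "12 April 1950"
    wikidata_value = "April 12, 1950"
    print(resolve_wikidata_first(local_value, wikidata_value))
    print(resolve_local_first(local_value, wikidata_value))
```

Under the first function a disputed value can only be changed at Wikidata, which is the enforcement concern raised above. Under the second, a local value settles what the article shows.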
Local files
Wikidata only allows files from Wikimedia Commons. What would happen to non-free fair use logos used in infoboxes? Daylen (talk) 04:15, 14 April 2018 (UTC)
- There are some technical options to do that (using local data, using different property, and so on). But I think simply that it is not suitable to replace such data field by Wikidata (e.g. logo field in company infobox, etc). Wikidata is useful, but only for limited data field. Not omnipotent. --Was a bee (talk) 06:00, 14 April 2018 (UTC)
- Sure, fair use images must stay here, and the corresponding data can not be replaced.--Ymblanter (talk) 07:38, 14 April 2018 (UTC)
- You can always keep and use local images. Wikidata images can be used when no local image is specified. User:Capankajsmilyo(Talk | Infobox assistance) 04:22, 18 April 2018 (UTC)
- Not with 4E and 4F. Fram (talk) 04:25, 18 April 2018 (UTC)
Middle ground
Except for the extremists, 1A/1F, middle ground seems achievable if we include wikidata but keep an option to override the values with local data. I am not sure extremists are ever gonna change their views, especially 1A ones. But for others, there are ample workarounds and consensus options. One argument I've read was that Wikidata is about "only bots", which is untrue. Most of the edits on Wikidata I have seen (for Property values) were by humans. Another was that by piping in Wikidata, control is lost. Again untrue. By piping in Wikidata, you just add another option to add data. You can always customise using |fetchwikidata=, |onlysourced=, etc. Plus you can override with local values too. Another untrue statement was that enwiki is a source at Wikidata. Although it is a source there, the module and the template have been coded in such a way that they do not pipe in values whose source is given as enwiki, etc. when |onlysourced= is set to yes. There seem to be a lot of misconceptions and misinformation prevailing about Wikidata, which need to be calmly looked at before arriving at any consensus. Also, it is worth considering that this discussion is not about Wikidata, but modules and templates fetching values from there which has been customised to enwiki policies. Capankajsmilyo (talk) 09:56, 14 April 2018 (UTC)
- The only problem being that a middle ground is in this case a recipe for continuing fights. Note that infoboxes and WikiData both already have been looked at by ArbCom, this is material that could end up there again and again.
- There are cases on Wikipedia where we have chosen an extremist stance - NFCC being one. Sometimes it is the better choice, and we can always revisit this at a later stage. At least for me, I can foresee scenarios where I can be convinced in the future. —Dirk Beetstra T C 10:03, 14 April 2018 (UTC)
"modules and templates fetching values from there which has been customised to enwiki policies". Some of the people working on these modules and templates have absolutely done an admirable job. However, it would be next to impossible to design a biographical infobox template that automatically pulls all of its information from Wikidata, and have it comply 100% with enwiki policies without human review and intervention. Plus, the implementation you describe with onlysourced etc is the current status; that's not the only option presented in this RfC and not what you yourself supported. Nikkimaria (talk) 12:55, 14 April 2018 (UTC)
- Why should this template pull 100% info from Wikidata? I thought this RfC is about 0% vs non-zero. I do not think there is a significant population of users who want to make every template pull all info from Wikidata.--Ymblanter (talk) 13:46, 14 April 2018 (UTC)
- The 1F,4F combination would pull all the data that any implemented template requested. In the case of the {{Infobox person/Wikidata}} template, this would be 23 data items (some of which can have multiple values). By default, only items that are sourced are included. The only option a Wikipedia editor would have would be to suppress individual items. Of the 26 !voters who expressed a preference, 3 went for the 1F,4F combination. Robevans123 (talk) 15:21, 14 April 2018 (UTC)
- My position on this has been that a "one-size-fits all" approach is a bad idea. I can understand the concerns about BLP, but do those same arguments apply to, say, Template:Infobox road, which does use Wikidata for automatically copying over maps between wikis? --Rschen7754 17:46, 14 April 2018 (UTC)
- Pardon the parody, but we should seek a "middle ground" about which side of the road to drive on. Either we can all drive in the middle, or we can drive on either side at random. The responses to this RFC cluster at the two ends for good reason. Aiming for some kind of "middle" is the worst possible outcome... and it's an unstable mess. Either we should fully embrace the positives of Wikidata and use it everywhere it makes sense (1F), or we should fully eliminate the negatives of Wikidata and not use it (1A). It's massively disruptive to have the two sides fighting over each template one by one, and it creates months of disruption and staggering wasted work when even a single template or template-field is converted in one direction then re-converted in the opposite direction. No, no, no, and hell no. Flipping back and forth is unacceptably disruptive and wasteful. Alsee (talk) 14:02, 16 April 2018 (UTC)
- That is completely not my experience with Template:Infobox road and its pulling the map data from Wikidata. There have been plenty of fights over that infobox, but none related to Wikidata. --Rschen7754 18:14, 16 April 2018 (UTC)
- I doubt that traffic on two-dimensional roads is a good analogy here. Wikidata integration does not become magically better: for that it needs a pathway to success. We could discuss a bit more where we would like to end up eventually, provided that all technical challenges and quality assurances are being met. Most of the non-A options are a careful step towards the potential that Wikidata has to offer. We need some level of integration, even if it is only in some well-defined corners of our universe, to provide space for the technology to develop. If there are actual scenarios in play, we can learn what works well, and what doesn't. It's not as much on which side of the road we'd like to drive, but perhaps a question of whether we want to start thinking about permitting flying cars - and under which conditions. effeietsanders 21:48, 16 April 2018 (UTC)
- There's a reason why question 1 isn't just yes/no, and that's because middle grounds are possible, providing they are agreed upon and are followed without editwarring. I think we're seeing groups at either end because people hold strong views here, but that doesn't rule out the middle options. Thanks. Mike Peel (talk) 21:50, 16 April 2018 (UTC)
- I disagree, we don’t need to facilitate to help WikiData to develop, we do not need some level of integration. Moreover, we have run this experiment now for some time, but there was no drive to improve, there is a drive to implement and a drive to extract (and I can’t escape the feeling that the current fights about infoboxes and about wikidata are actually about this). Again like NFCC, we do not allow experiments, we have clear rules that cannot be negotiated. You can do your experiments in project namespaces, but 1A is simply a ‘no wikidata infoboxes in mainspace’. —Dirk Beetstra T C 03:46, 17 April 2018 (UTC)
- @Beetstra: "no drive to improve" is harsh and unjustified. Just look at the work that's been going into Module:WikidataIB, and the various infobox templates, to improve support for different features. And bear in mind that South Pole Telescope, the example given above, didn't have an infobox before this work started! Mike Peel (talk) 22:10, 17 April 2018 (UTC)
- @Mike Peel: Wikidata has run bots that imported spam, just approved of the data promise. WikiData runs bots at incredible import speeds. There is no mechanism to protect data with respect to their references. Lots and lots of material is unreferenced, and there is no guarantee that data is correct with respect to the references there are. It is an ocean of data.
- We had, with a number of editors, a drive to verify a crude 250000 numerical, nearly immutable, identifiers, and that did never finish, and is in an inbetween state, now unmaintained. I have had hope on WikiData, and there have been some discussions around that, but seen how such drives go on en.wikipedia, I will not hold my breath. I am sorry, we have different goals, and those goals do not scale linearly with en.wikipedia’s goals.
- (that is a red herring, right: en.wikipedia could just have added an infobox and get the data - that could very well have been done without any help from Wikidata). —Dirk Beetstra T C 03:32, 18 April 2018 (UTC)
- The unsourced data or data having Wikipedia as source doesn't get piped with |onlysourced= set as yes. So why are you trying to imply that it does? User:Capankajsmilyo(Talk | Infobox assistance) 04:26, 18 April 2018 (UTC)
- Sure, you can omit all data that is unsourced, problem is that there is no control over the data that is sourced. I have to assume that all is properly sourced and that it has not changed since. --Dirk Beetstra T C 07:08, 18 April 2018 (UTC)
- Actually there is and it's implemented using abuse filter. User:Capankajsmilyo(Talk | Infobox assistance) 07:51, 18 April 2018 (UTC)
- Looking at the log (500 most recent entries), I see no filter actions for "changed sourced data" or something similar. "Adding unsourced ethnicity" is the only one. Can you please indicate which abuse filter you mean. Fram (talk) 08:01, 18 April 2018 (UTC)
- There are no tags for this at all. 'new editor changing data' is the closest maybe. Quite some of those are either good faith or plainly correct, though. --Dirk Beetstra T C 08:33, 18 April 2018 (UTC)
- I want to give another general thought, and this "middle way" seemed to be a place to do it. There are some infoboxes where the content is really about data. Like Template:Infobox gene. I think the data-driven biology people have a nice relationship with the wikidata people and a bunch of them are wikidata people, and the data that gets presented in our infobox is technical stuff that to be frank general readers don't care about. I don't want to get in the way of that. But for infoboxes that have content that everyday people would care about -- that takes no special mojo to understand or think about -- I am very much against Wikidata being used. That is why I am supporting 3A above anything else, and explicit consent only, so that we can allow it where it is OK but rule it out where it has a risk of being harmful. Jytdog (talk) 20:11, 7 May 2018 (UTC)
Dates
Dates in Wikidata are defined as Universal Time without the possibility of specifying a time zone. Dates in English Wikipedia are generally local time. These are not compatible. Therefore, dates should not be imported from Wikidata. Jc3s5h (talk) 15:17, 19 April 2018 (UTC)
- That is generally not an issue. From what I'm aware, there was a single telescope article affected by that - the first light time for Hale Telescope. Where it is an issue, then the dates can be locally defined, or perhaps a module could be written to do the time zone conversion (since it's mostly simple arithmetic). Thanks. Mike Peel (talk) 15:32, 19 April 2018 (UTC)
- Sources normally give dates in the local time zone, but more often than not, do not state the time of day. It's impossible to correctly input such a date into Wikidata; Wikidata can't represent it. Thus the majority of Wikidata dates are garbage. Modules can't remove the stink from garbage. Jc3s5h (talk) 16:38, 19 April 2018 (UTC)
- So what you're saying is that we shouldn't trust any date on any website unless it also specifies a timezone - which we don't normally do here on enwp? GIGO? Thanks. Mike Peel (talk) 20:27, 19 April 2018 (UTC)
- Most sources, including most websites, do not explicitly state time zones. This means the time zone must be determined from context. Most sources that Wikipedia uses place the responsibility to make whatever time zone determinations are desired upon the reader. Most Wikipedia articles do exactly the same, place the responsibility of determining the time zone upon the reader, if the reader needs that degree of accuracy. Wikidata does not permit entering a date with an unspecified time zone, so is adding a time zone statement (Universal Time) to statements from sources that were agnostic about time zones. In effect, Wikidata is misquoting sources. Jc3s5h (talk) 07:18, 20 April 2018 (UTC)
I do not really see the problem here - I agree that the source timing could be off by about 23 hours (as my DOB would be one day earlier or one day later depending on where you were observing my birth), but there is also no problem for WikiData to include next to years, months, days, hours, minutes and seconds a timezone qualifier (where needed) and a calendar qualifier (Gregorian / Hijri / other systems). Whether all of those need to be filled in on a date is up to the person who adds that data. All of that can then be calculated back. --Dirk Beetstra T C 07:37, 20 April 2018 (UTC)
- Wikidata lacks the development bandwidth to fix the problem. The bugs have been reported years ago and only a few of the multitude of problems have been fixed. See T87764 Jc3s5h (talk) 07:47, 20 April 2018 (UTC)
- @Jc3s5h: I am going to rephrase that: 'Wikimedia lacks the development bandwidth to fix any problem'. Anyways, the only requirement I see is that things should work per en.wikipedia standards (or better) on wikidata if we are to import their data. --Dirk Beetstra T C 08:41, 20 April 2018 (UTC)
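The day shift argued over in this thread can be shown with a minimal Python sketch. It assumes a hypothetical event that a source reports as 26 January 1949, 22:06 local time at UTC-8; the event, the time of day and the offset are illustrative and not taken from Wikidata. Reading the same instant as Universal Time gives a different calendar date:

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC-8 offset, used here only as an illustrative local time zone.
LOCAL = timezone(timedelta(hours=-8), "UTC-8")


def utc_date_of(local_dt):
    """Return the calendar date of the same instant expressed in Universal Time."""
    return local_dt.astimezone(timezone.utc).date()


# Hypothetical event as a source would report it, in local time.
local_event = datetime(1949, 1, 26, 22, 6, tzinfo=LOCAL)

print("local date:", local_event.date().isoformat())
print("UTC date:  ", utc_date_of(local_event).isoformat())
```

When the source gives only a date and no time of day, the conversion cannot be done at all, which is the case the thread disagrees about.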
Regarding 3B vs 3A
Currently, consensus appears to be heavily skewed towards the 4 "A options". However, I personally think that 3A is a bit too excessive. Therefore, let me make the case for 3B.
Non-contentious content which is well sourced within the article does not usually (actually, almost never) have an additional source (or a ref tag pointing to a source elsewhere in the article) just because it's in the infobox. For example, all info in this infobox is non-contentious and cited elsewhere in the article. No need for a repeat. Same for this other example and plenty of others. In the spirit of avoiding WP:CITEOVERKILL, I don't see why we should require duplicate/repeated sources simply because something is summarized in an infobox. 198.84.253.202 (talk) 16:17, 21 April 2018 (UTC)
- This is moot if there is consensus for 1A, which means no Wikidata in the first place. {{3x|p}}ery (talk) 22:31, 21 April 2018 (UTC)
- This does however show one of the problems with WikiData integration. If the data is the same as in the text, and in the text it is referenced, then I could indeed say that the data is properly referenced. But you get a ‘say where you got it’ type of problem in most other cases where the material is contentious: The material on WikiData may have another reference than used anywhere in the document, or material in infoboxes that is not ‘prose material’ and hence not referenced in the text. (And for the inverse, there is material that is properly referenced on en.wikipedia in the prose, and generally included in the infobox, but which would not be transcluded because it is currently not referenced on WikiData (e.g. data that was currently imported from an unreferenced infobox on en.wikipedia, because it is thoroughly referenced in prose).) There is still so much to do on WikiData before en.wikipedia is ready for this. —Dirk Beetstra T C 03:59, 22 April 2018 (UTC)
Simple vandalism on Wikidata affects enwiki for hours after it has been reverted
Jajaxdelol
I was just now editing Windmill Hill, Avebury (Infobox World Heritage Site) when I spotted the vandalism visible in the screenshot. Getting rid of the vandalism here isn't hard. But I am unable to find the vandalism on Wikidata, never mind correct it.
The same vandalism also affects our (enwiki) infoboxes in Beatrice oil field (Infobox power station), Sandys Row Synagogue (Infobox religious building), The Sharp Project (Infobox project), Half-Mile Telescope (infobox telescope), ... (each time, I only give one example, but as I am writing this Google finds 67 articles with this vandalism). So this is one bit of stupid vandalism, affecting a wide range of articles here, but which can't easily be found and reverted even by experienced enwiki editors.
It turns out that this is a bit of vandalism which is over 24 hours old, but was (only?) live at Wikidata for 25 minutes: someone changed the English label for "United Kingdom" to "jajaxedlol". Although this has been reverted more than 24 hours ago, it has a lasting effect on dozens of pages here (Google lists 67, searching on enwiki directly gives 242 pages[7]).
So 242 (or more) pages, across multiple Wikidata-enabled infoboxes, have been vandalized in this most basic way (nothing clever or sophisticated) for more than a day, because a) vandalism detection and reversion on Wikidata is too slow and haphazard, and b) the changes on Wikidata get ported to enwiki at seemingly random intervals, causing shortlived vandalism to remain live here anyway. Fram (talk) 11:34, 25 April 2018 (UTC)
- As I understand it, we have a parser that stores the parsed page. In that, it does not re-parse the infobox until the page is edited, and it gets only re-parsed after x amount of time. So if a vandalized page with a template transcludes a wrong value, it will put the vandalized WikiData data in the page on a refresh, and only remove it on the next refresh. With some bad timing that could indeed stay for hours while it is only live for seconds on en.wikipedia.
- If a page on en.wikipedia gets vandalized, the page gets reparsed - when it gets reversed, it gets reparsed. Template-vandalism on en.wikipedia could stay longer, as that depends on the same refresh of parsed pages. --Dirk Beetstra T C 09:26, 26 April 2018 (UTC)
- Which is why we have protected thousands of templates on enwiki, to avoid as much as possible that one vandal can vandalize hundreds of articles at once. In this case, an editor could actually change the label (the "title" of the article on Wikidata) of the United Kingdom without any problem and without being reverted for more than 20 minutes, affecting many enwiki templates at once. It's a kind of problem we finally mostly have gotten under control on enwiki, but where we would then open the gates again by relinquishing our control. Fram (talk) 09:43, 26 April 2018 (UTC)
- And that is why it should be possible to protect immutable data ('individual Qs belonging to individual Ps') on WikiData (and probably also a whole set of Ps, Qs can already be protected). --Dirk Beetstra T C 09:54, 26 April 2018 (UTC)
- I can confirm Fram's observations on the "jajaxedlol" issue, which I saw yesterday while working through some Template talk:Infobox World Heritage Site#Step by steps. I couldn't find out what had caused it (as Fram apparently eventually did), so after thinking some time about how to report it here I eventually didn't while not knowing how to present it in a way that illustrated what exactly was going wrong. --Francis Schonken (talk) 10:47, 26 April 2018 (UTC)
- Ah, every time I see I only see more problems with wikidata.. Galobtter (pingó mió) 15:15, 27 April 2018 (UTC)
We are now a full week after this vandalism edit, but an en.wikipedia search still returns hundreds of results for jajaxedlol. For the same reason, the Catalan Wikipedia has hundreds of pages for gilipollas.
Although the same thing would be happening, likely, if Heart of Neolithic Orkney was changed to have those parameters, it would still only be one page that would be affected. That would indeed be worse if the template was changed (but the template in this case is protected ..).
The possibilities of this are infinite. --Dirk Beetstra T C 07:05, 1 May 2018 (UTC)
As of now, searching for 'neolithic orkney heart united kingdom' within Wikipedia does NOT find the article Heart of Neolithic Orkney, as that term was not contained in the (vandalized) article at the time of indexing. --Dirk Beetstra T C 08:44, 1 May 2018 (UTC)
- @Beetstra: I can't reproduce either of these search results. The first search only returns this page, the second one does find the article you mentioned. Mike Peel (talk) 12:53, 1 May 2018 (UTC)
- @Mike Peel: wow. For me now the same. Does performing the search update induce the reindexing? Anyway, this google search still worked for me at this moment. --Dirk Beetstra T C 13:19, 1 May 2018 (UTC)
- @Beetstra and Mike Peel: Oddly, I tried the first search earlier this morning (~10 am), and got 100 (?) articles, but the first few articles I looked at didn't actually contain "jajaxedlol". Go figure... Robevans123 (talk) 18:12, 1 May 2018 (UTC)
- @Robevans123: the ones I checked also did not contain the vandalism anymore. All cached data I presume. —Dirk Beetstra T C 19:14, 1 May 2018 (UTC)
Raider
- WP:CIRCULAR example:
- 10:16, 31 May 2017: the Madara Rider article is vandalized
- 01:59, 9 July 2017: the vandalized version of the image caption in the infobox is transferred to Wikidata (obviously without checking the reference as given in the infobox, which is 43, at the UNESCO World Heritage Site website)
- 01:59, 9 July 2017: local data of the infobox are deleted at Wikipedia, thus importing the erroneous image caption directly from Wikidata (= infraction on the WP:CIRCULAR policy).
- --Francis Schonken (talk) 10:00, 1 May 2018 (UTC)
- WP:CIRCULAR doesn't apply since no source is being claimed here. Thank you for fixing enwp vandalism that lasted for 11 months though! Mike Peel (talk) 12:50, 1 May 2018 (UTC)
"Also, do not use websites that mirror Wikipedia content or publications that rely on material from Wikipedia as sources. Content from a Wikipedia article is not considered reliable unless it is backed up by citing reliable sources. Confirm that these sources support the content, then use them directly". We are using material from a website that mirrored Wikipedia data, and that does not provide a reliable source for said information. Here, the mirror is unsourced as the person mirroring it did not copy a source, whereas our Wikipedia article has a reference regarding the naming. --Dirk Beetstra T C 12:55, 1 May 2018 (UTC)
- No, you're misunderstanding what you're quoting. No-one is saying that we should use Wikidata as a reference for the information, which is what 'source' means in this case. Mike Peel (talk) 13:02, 1 May 2018 (UTC)
- And I think that that is part of the problem that is being discussed here .. is taking data from WikiData 'sourcing' data, and is WikiData then at the same moment the reference to said data ("The word "source" when citing sources on Wikipedia has three related meanings: .. The piece of work itself (the article, book)", WP:V). If I use data from website X, then I say where I got it, 'Website X', if I use stuff from WikiData, then I should say where I get it, 'WikiData', which is hence my source of data. That is why I think that we should not use data from WikiData, sourced or unsourced, unless WikiData can vouch for the data, and that it is correctly referenced. I am sorry, the sourcing requirements on WikiData have to be more strict than on en.wikipedia, you may have to have references for material that would not need references on Wikipedia. --Dirk Beetstra T C 13:19, 1 May 2018 (UTC)
- @Mike Peel: what an abject nonsense. I say there's a WP:V infraction (as in the lead sentence of WP:V: "In Wikipedia, verifiability means that other people using the encyclopedia can check that the information comes from a reliable source"), that is, an infraction by not heeding the WP:CIRCULAR part of that policy. Then you start Wikilawyering – please think through what you're trying to say: not a WP:V infraction? Then plain and simple vandalism... or did you mean that the 10:16, 31 May 2017 edit was not plain and simple vandalism? If you say that your error is not that you erroneously and unknowingly sourced your error to a Wikipedia vandal, then what you did is drawing in vandalised content from an unreliable source... In my book, if the WP:V transgression excuse is discarded, all that remains is plain vandalism. It is my strong opinion that you shouldn't be let near to a single Wikidata infobox, certainly if you start to prefer "vandalism" as an excuse over "WP:V infraction". --Francis Schonken (talk) 13:43, 1 May 2018 (UTC)
- @Mike Peel: Actually, what you did there is a perfect example of one of my concerns regarding WikiData: the collection of data is (much) more important than any check that what is imported is correct, that there is no (sufficient) drive to have the data correct. The motto is: import, import, import. Combine that with the fact that the data, referenced or not, can be (and is) vandalised, and that it pays to spam WikiData as a shortcut to 800 wikis. —Dirk Beetstra T C 19:14, 1 May 2018 (UTC)
Trocolandia
Similar to #Jajaxdelol above:
- 14:19, 28 May 2018 – d:Q183 is vandalised
- 14:40, 28 May 2018 – vandalism reverted
- 07:13, 31 May 2018 (UTC) The "Trocolandia" spoof still appears on a few dozen English Wikipedia articles
--Francis Schonken (talk) 07:13, 31 May 2018 (UTC)
I started Wikipedia cleanup:
- [8] – still 29 instances to go --Francis Schonken (talk) 07:49, 31 May 2018 (UTC)
- [9] – still 28 instances to go --Francis Schonken (talk) 08:05, 31 May 2018 (UTC)
- [10] – still 27 instances to go --Francis Schonken (talk) 08:32, 31 May 2018 (UTC)
- [11] – still 26 instances to go --Francis Schonken (talk) 09:13, 31 May 2018 (UTC)
- [12] – still 25 instances to go --Francis Schonken (talk) 09:24, 31 May 2018 (UTC)
- [13] – still 24 instances to go --Francis Schonken (talk) 09:47, 31 May 2018 (UTC)
- [14] – still 23 instances to go --Francis Schonken (talk) 09:58, 31 May 2018 (UTC)
- [15] – still 2 instances to go (some refreshing finally seems to have kicked in) --Francis Schonken (talk) 10:12, 31 May 2018 (UTC)
- [16] – all instances of the spoof "flushed" at English Wikipedia --Francis Schonken (talk) 10:45, 31 May 2018 (UTC)
In sum:
- This vandal act persisted on Wikidata for 21 minutes, in a single entry
- As a consequence of the short-lived vandalism of one Wikidata entry, this vandalism was multiplied at least 30 times on English Wikipedia articles, and it took around 200 times as long as the exposure period of the original vandalism to get rid of it in English Wikipedia.
Completely unacceptable MO, if you ask me. --Francis Schonken (talk) 10:45, 31 May 2018 (UTC)
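The "around 200 times" figure can be checked against the timestamps given in this section. The sketch below assumes that all three timestamps are UTC and that the 10:45 "flushed" report marks the end of the exposure on English Wikipedia; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps as listed in this section (assumed to be UTC).
vandalised = datetime(2018, 5, 28, 14, 19)  # d:Q183 is vandalised
reverted = datetime(2018, 5, 28, 14, 40)    # vandalism reverted on Wikidata
flushed = datetime(2018, 5, 31, 10, 45)     # last instance reported gone from enwiki

exposure_on_wikidata = reverted - vandalised
lingering_on_enwiki = flushed - reverted

# Dividing two timedeltas gives a plain float ratio.
ratio = lingering_on_enwiki / exposure_on_wikidata

print("live on Wikidata:      ", exposure_on_wikidata)
print("lingering on enwiki:   ", lingering_on_enwiki)
print("ratio (lingering/live):", round(ratio, 1))
```

Under these assumptions the vandalism lingered for a little over 68 hours after a 21-minute exposure, which is consistent with the rough factor of 200 stated above.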
- Comment: This is basically technically avoidable effect. And I suppose in most cases, Lua is set in label-vandal-proof manner. The reason this happens is fetching "Label" as text in infobox. If fetching sitelink text, this doesn't happen. --Was a bee (talk) 08:14, 31 May 2018 (UTC)
- @Was a bee: thanks for the suggestion on how to address this, could you please proceed with a WP:SOFIXIT? The affected infoboxes are afaik {{Infobox World Heritage Site}} and {{Infobox pyramid}}. Ask Mike Peel, who afaik set up both of these boxes with Wikidata functionality, for assistance if needed (I'm not an expert in Lua programming). Thanks. --Francis Schonken (talk) 08:42, 31 May 2018 (UTC)
- @Francis Schonken: These seem to come from {{Wikidata location}}. So I posted idea feedback at there (Template_talk:Wikidata_location#Label_text_or_sitelink_text). As additional information, this type of phenomena (one edit in WD affects hundreds of pages in WP) happens basically only in certain properties of wikibase-item (one of 15 datatypes). Among those, I think country (P17) would be the most widely used case of such property. As you can see, this section (#Trocolandia about Germany) and above section (#Jajaxdelol about England) are both about country. In this sense, I think these are very good examples for thinking about this phenomena. I think there are various options about this issue, from simply "not-use-WD" to "replacing data once by WD", or some other technical ways.--Was a bee (talk) 15:57, 2 June 2018 (UTC)
What triggers updating the en.wikipedia cache to refresh data from WikiData - I presume that a WikiData edit does not trigger the cache-refresh on en.wikipedia for pages using that data. --Dirk Beetstra T C 08:56, 31 May 2018 (UTC)
- I assume this works exactly in the same way as templates - refreshing a template does not trigger the page-refresh of the pages using the template; one needs to refresh them one by one.--Ymblanter (talk) 09:03, 31 May 2018 (UTC)
- Changing a template does trigger the refreshing of pages (see here). Galobtter (pingó mió) 09:07, 31 May 2018 (UTC)
- Strange. This is definitely not my experience.--Ymblanter (talk) 09:10, 31 May 2018 (UTC)
- OK, what happens on en.wikipedia is that when a page is edited, it gets refreshed on the spot in the cache. If a template gets edited, ALL pages that are using said template get into the 'refresh queue', and their display will be refreshed in due time. My question is basically: if I edit a WikiData item, do all Wikipedia pages that depend on that item get added to the 'refresh queue', or does that only happen when either the page that is transcluding the data is refreshed, or when the template it is transcluding is? --Dirk Beetstra T C 09:32, 31 May 2018 (UTC)
- (ec)It's not instantaneous, but it certainly is a lot quicker than when it comes down from Wikidata. See e.g. this and the previous change to a local template, which show up in all (12) articles already. This one as well is immediately reflected. Fram (talk) 09:35, 31 May 2018 (UTC)
- On templates with a small number of transclusions it goes rather fast, on templates with thousands and thousands of transclusions it does take considerable time. --Dirk Beetstra T C 10:40, 31 May 2018 (UTC)
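The effect under discussion, where a cached render outlives a short-lived change at the source, can be illustrated with a toy Python model. This is not MediaWiki's parser cache or refresh queue: the `ToyWiki` class and its methods are hypothetical, and the model simply assumes that a page is re-rendered only when something explicitly triggers it (an edit, a null edit or a purge).

```python
class ToyWiki:
    """Toy model: pages keep a cached render until something re-renders them."""

    def __init__(self, label):
        self.label = label  # stands in for a label fetched from a remote item
        self.cache = {}     # page title -> cached rendered text

    def render(self, page):
        """Re-render a page from the current label and store the result."""
        self.cache[page] = f"{page}: located in {self.label}"
        return self.cache[page]

    def view(self, page):
        """Readers get the cached render; the label is not consulted again."""
        if page not in self.cache:
            return self.render(page)
        return self.cache[page]


wiki = ToyWiki("Germany")
wiki.view("Marmorpalais")            # first view caches the good label
wiki.label = "Trocolandia"           # short-lived vandalism at the source
wiki.render("Marmorpalais")          # page happens to be re-rendered in that window
wiki.label = "Germany"               # vandalism reverted at the source
stale = wiki.view("Marmorpalais")    # readers still see the vandalised render
fresh = wiki.render("Marmorpalais")  # a null edit or purge forces a re-render

print(stale)
print(fresh)
```

In this model the revert at the source changes nothing for readers until each affected page is re-rendered, which matches the page-by-page cleanup described in this section.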
- “Trocolandia” no longer appears on any page, after I performed null edits on all of the remaining ones from the search above. I also opened a Phabricator ticket, T196057, for the issue, in case this is not yet known. —MisterSynergy (talk) 10:26, 31 May 2018 (UTC)
- Not on Wikipedia, but Google still happily finds pages with Trocolandia ... Now we have to wait for the Google cache to refresh. --Dirk Beetstra T C 10:51, 31 May 2018 (UTC)
- For me, Google still finds Marmorpalais and GEO600 in Trocolandia. —Dirk Beetstra T C 03:24, 5 June 2018 (UTC)
- Google tells me that Trocolandia is at Marmorpalais, Upper Harz Water Regale, File:Bautzen-nach1620-Merian.jpg, Commons-File:Aufgeschütteter_Fehlboden.jpg, and at a bunch of addresses reusing Wikipedia content. Rather significantly, I found that 12 days after the vandalism was fixed, Wikidata itself still displays the Trocolandia vandalism at Q240190, Q882802, Q2348799, and Q1874400! It particularly concerned me that Wikidata might continue to feed Trocolandia to Wikipedia via these items. I tested it, and I'm thankful to report that it doesn't. When a page is linked to one of those contaminated items, Wikidata bypasses the garbage item-value and imports it as "Germany". Alsee (talk) 23:43, 9 June 2018 (UTC)
A third way?
Wikidata is a useful and expanding resource which has the potential to be a useful tool to find information to create and improve articles on Wikipedia. I just don't believe that including information automatically in infoboxes is the best way to make use of this resource. However, in Wikipedia, we have various policies and guidelines regarding the use of bots and semi-automated tools which we can use to make edits. I would like to propose a mechanism using similar tools and processes that could be used to utilise more information from Wikidata, while meeting Wikipedia's policies and guidelines.
Page stalkers of RexxS might have seen a suggestion I posted on his talk page a while ago. At the time I thought it might be a way of monitoring what has changed on Wikidata. I now believe that it should be re-purposed to make suggestions to Wikipedia editors (using article talk pages) about useful information that could be used in the article. I've amended the proposal slightly.
So, this suggestion uses two bots (one working on Wikidata, and the other on Wikipedia) although the two might possibly be combined into one:
- The Wikidata bot (WD BOT) regularly monitors wikidata items used in a Wikipedia infobox template, for example, the {{Infobox person}} template.
- A list of template parameters and their matching wikidata properties needs to be generated when setting up the bot. Also, a list of wikidata items that need to be monitored needs to be generated. This could be generated from the What Links Here page, or possibly by adding a property to the Wikidata item indicating which infobox is used in an article.
- If any of the watched properties of the item have changed since the time of the previous check, WD BOT generates a report detailing the changes in a log file somewhere on Wikidata.
- The Wikipedia bot (WP BOT) regularly checks the log file on Wikidata.
- If the log file has been updated with changes since the last check, WP BOT writes a report of the changes on the talk page of the appropriate Wikipedia page. The report will also include details of the reference information on Wikidata, preferably in a cite template such as {{Cite Book}}.
- Any Wikipedia editor who has the page on their watchlist will see that the talk page has been updated (provided they haven't hidden bot edits).
- The Wikipedia editor can then check whether the changes to the infobox are appropriate and referenced and if the information is from a reliable source. They can then make any appropriate changes to the article; adding to infobox, adding text to the article to show the new/changed data in the article, adding a citation to the text using an existing source, or creating a new reference using the information in the WP BOT report.
- Any changes are recorded in the page history of the article, so appear on its watchlist and can be easily reverted/challenged/discussed.
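As an illustration only, the change-detection step in the list above (compare the watched properties of an item against the previous check, then write a report) might be sketched like this. The snapshot format, the parameter names and the function names are hypothetical and not part of the proposal; P569 (date of birth) and P19 (place of birth) are existing Wikidata property IDs, but the mapping is only an example.

```python
from typing import Dict, List, Tuple

# Hypothetical mapping from infobox parameters to Wikidata property IDs.
# The parameter names are illustrative only.
WATCHED = {"birth_date": "P569", "birth_place": "P19"}

# A snapshot maps a property ID to the list of values seen at check time.
Snapshot = Dict[str, List[str]]
Change = Tuple[str, str, List[str], List[str]]


def diff_watched(old: Snapshot, new: Snapshot,
                 watched: Dict[str, str] = WATCHED) -> List[Change]:
    """Return (parameter, property, before, after) for each watched change."""
    changes = []
    for param, prop in sorted(watched.items()):
        before = sorted(old.get(prop, []))
        after = sorted(new.get(prop, []))
        if before != after:
            changes.append((param, prop, before, after))
    return changes


def format_report(item_id: str, changes: List[Change]) -> str:
    """Build the plain-text report a bot could post on an article talk page."""
    lines = [f"Wikidata item {item_id}: {len(changes)} watched value(s) changed"]
    for param, prop, before, after in changes:
        lines.append(f"* {param} ({prop}): {before} -> {after}")
    return "\n".join(lines)
```

Properties that are not in the watched mapping are ignored, so unrelated edits to the item would not generate a talk page report.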
I've not said anything on the frequency of the bot operations. RexxS pointed out that the {{Infobox person}} has roughly 270,000 transclusions, and with 20 parameters would require roughly 6,000,000 properties to be checked. Bearing in mind that most information in infoboxes is static and that Wikipedia is a work in progress, the bot work could be split into manageable chunks, for example, 40,000 checked each day completing a full check each week, or 10,000 checked each day completing a full cycle each month.
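The figures above can be checked with a few lines of arithmetic. This is only a sketch using the numbers quoted in the paragraph (270,000 transclusions, 20 parameters, and the two suggested daily batch sizes); the function name is made up for the example.

```python
import math

TRANSCLUSIONS = 270_000   # approximate figure quoted for {{Infobox person}}
PARAMETERS = 20           # parameters checked per infobox

# Total property checks for one full pass: 5,400,000, i.e. roughly 6 million.
total_checks = TRANSCLUSIONS * PARAMETERS


def days_for_full_cycle(pages: int, pages_per_day: int) -> int:
    """Number of daily batches needed to visit every page once."""
    return math.ceil(pages / pages_per_day)


weekly = days_for_full_cycle(TRANSCLUSIONS, 40_000)    # 7 days
monthly = days_for_full_cycle(TRANSCLUSIONS, 10_000)   # 27 days
```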
If such a process can be made to work then it may be possible to increase the level of automation, and also to feed back on reference information to Wikidata from the sources used in Wikipedia. Robevans123 (talk) 15:06, 27 April 2018 (UTC)
- I suggested this at the preparatory phase of this RfC. There was zero interest in developing this proposal.--Ymblanter (talk) 15:17, 27 April 2018 (UTC)
- @Ymblanter: Thanks - I don't think I was watching the talk page for this Rfc back then, but I've just had a dig in the archives and seen your proposal. Great minds and all that... Well, there's some interest now. We'll see if anyone else bites! Robevans123 (talk) 15:47, 27 April 2018 (UTC)
Sure, and if the material is wrong since the beginning, we will still never notice. And this still requires bots on two sides to monitor. I have this EXACT system running on en.wikipedia alone - it has proven impossible to maintain, and impossible to get correct values in it due to the humongous amount of work needed to sift through tens of thousands of datapoints, and it is because of that utterly unreliable. When these bots start flagging thousands of datapoints, of which hundreds are actually correct, volunteers will quickly start ignoring them, as the work is simply too much to finish. When this RfC sits out and closes, I will propose a completely different solution to that problem, likely obsoleting any bots. But if Wikipedia is going to get any credibility in the future, we’ll have to take some drastic steps (which some wikiprojects already took). Unfortunately, WikiData missed that boat, and will now face the problem on their own. —Dirk Beetstra T C 19:06, 27 April 2018 (UTC)
Hi all, we have written a grant proposal called GlobalFactSync. At its core, it is quite similar to the third way described here. Coming from DBpedia, the backend side is our stronger suit, as we have been monitoring and extracting data from Wikipedias for 10 years. We also have a mapping from infobox properties to Wikidata properties, done by volunteers (we are trying to merge and improve it in the course of the grant project). What I noticed in the discussion above is that doing maintenance via a bot seems to have some drawbacks, i.e. the bot needs to monitor changes on both Wikipedia and Wikidata day and night. We have implemented such a system, called DBpedia Live, which processes all edits on several Wikipedias (it broke three weeks ago, since there was a switch to Kafka for updates and it still used the very old OAI-PMH). The English WP has 130k mainspace edits per day on average. While this can be done, it is a lot of work upfront. Hence, we were suggesting a user script/gadget that does it on a request basis, i.e. each time an editor uses the gadget to get suggestions and compare enwiki to Wikidata and other languages, the data is accumulated.
We were also discussing adding info on discussion pages. It can be done, but there is an extra overhead in keeping the discussion page in good shape, i.e. once the infobox is updated the bot needs to remove or update the flag from the talk page, so it causes an extra edit for each edit. We didn't propose this in the grant proposal, as it needs a greater consensus in the WP communities than a user script / gadget. However, if such a great consensus could be reached, we could provide the API and backend for the information necessary and help with the bot. We could also collaborate on the mapping of template parameters to Wikidata properties. We can also provide the list of infobox templates used for each Wikidata item, as mentioned above, for all Wikipedias. That is already available.
Anyhow, the main idea here is that automation does not mean taking the control from the editor, but automating the way information change is displayed and accessible to get a better handle on the data. SebastianHellmann (talk) 12:11, 3 May 2018 (UTC)
How does the data actually arrive in an infobox?
In response to Beetstra's contention above that data is sourced from Wikidata, it's important to clear up some misconceptions. First, the data that arrives in infoboxes on Wikipedia is not transcluded from Wikidata. Transclusion is defined at Transclusion as "the inclusion of the content of a document into another document by reference". There is no document on Wikidata that content can be transcluded from, nor is the data that arrives passed by reference. The action of importing the data takes values that are individually filtered to meet the needs of Wikipedia. Each value that is imported will have an associated reference that can be checked, just as any fact in any infobox should have. Further, it is nonsensical to call data "reliable" or "unreliable", as those are properties of the sources that verify the data. The reliability of the source cited on Wikidata is as easily ascertained as a source found on Wikipedia. In Wikipedia, which is not considered a reliable source itself, we are able to take sourced content (along with its references) and add it to another article. We do not reject the content as "unreliable", nor do we reject the sources because they were cited on Wikipedia. This is because Wikipedia is an unstructured repository of content and references. Wikidata is a structured repository of content and references. While Wikidata has policies less rigorous than Wikipedia's, it would indeed not be appropriate to indiscriminately import content from Wikidata. But that is not what happens in our Wikidata infoboxes: they discriminate between content that is sourced and that which is not; they allow the editor at the article level to check that source; and they allow that editor to reject, replace, or amend the content of the infobox. The sourcing requirements for content on Wikidata are not relevant to the question "Can Wikidata infoboxes be used in mainspace?" because our infoboxes only see the fraction of Wikidata content that has an associated source. If someone feels it necessary to verify the content against its source and the source's reliability, they do that in exactly the same way as they would verify content on Wikipedia, that is, by checking the source and the source's reliability. Content and its sourcing that is found on Wikidata is not somehow mystically "tainted" by Wikidata's policies or data model. That content and its source is independent of the repository where it has been collected. --RexxS (talk) 00:23, 21 May 2018 (UTC)
- I'm sorry, but I disagree strongly with almost everything that you write above. I'll mention just one detail. Checking the original sources of stuff that gets into WP via WD is absolutely not the same as checking sources on WP. If I want to check the source of something imported from WD (and I don't care how you call that), I first have to go to WD, then find the thing I want to check, then find its source, then check that source. On WP, all I need do is go to the reference, then go to the source. In addition, if something on WP is changed, I see that in my watchlist. If, say, an official website is "hijacked" by some spammer on WD, I won't see that. Etc etc etc. --Randykitty (talk) 02:42, 21 May 2018 (UTC)
- You can import data together with the source (and actually block data which have no source or a bad source from being imported).--Ymblanter (talk) 07:29, 21 May 2018 (UTC)
- @Ymblanter: You keep on saying (and others do the same): "You can import data together with the source (and actually block data which have ... a bad source from being imported)". Can you show me a real example of data that is being transcluded (please use the right term, this is transcluded, not imported) from WikiData where a 'bad source' is filtered? Does 'bad source' equate to 'from Wikipedia'? Because we have more 'bad sources' than Wikipedia itself. And how does said template that is to transclude data from WikiData decide that the data actually NEEDS a source (see WP:BLUE) in the specific instance of transclusion? --Dirk Beetstra T C 10:55, 21 May 2018 (UTC)
- I am not a right person to answer this. I once inquired whether this is possible, and was told it is possible. We need a more tech-oriented person to explain how this is possible.--Ymblanter (talk) 10:59, 21 May 2018 (UTC)
- I know exactly what can be done and how: I can filter data on BLPs by having a reference, or not. If it has a reference, I can see whether that reference is from some typical unreliable sources (say, imdb), and exclude those. After that, it becomes impossible, as it becomes judgement. There is, as we already discussed on WP:RS, no way short of artificial intelligence to decide whether a certain source is a reliable source for the fact. For some data it is a case-by-case judgement. That means that every transclusion (at edit or when added later) needs to be manually checked whether the transcluded fact is reliably sourced. The only way that I see around this would be a change in data model. But at this time, plain filtering by domain (which is limited and heavy) is the only thing LUA can do. --Dirk Beetstra T C 11:41, 21 May 2018 (UTC)
- No, this is incorrect. One can make a list of sources which were used by bots to source the statements in Wikidata (which is not so long) and to see in advance which sources are acceptable and which are not. This is the only manual work which can be done (plus possible vandalism on Wikidata as I explained above).--Ymblanter (talk) 11:45, 21 May 2018 (UTC)
- @Randykitty: Checking the original sources of stuff that gets into WP via WD absolutely is the same as checking sources on WP.
- If you want to check the source of something imported into an infobox from WD, you click the pen-icon link and look at the reference; if it's online you follow the hyperlink, otherwise you find the print version.
- If you want to check the source of something manually added to an infobox directly on Wikipedia, you first have to find the text in the article that has the same information, then click superscripted link and look at the reference; if it's online you follow the hyperlink, otherwise you find the print version.
- The only difference is that the fact imported from WD to the infobox is guaranteed to have a reference by default; the fact added to an infobox directly on Wikipedia has no such guarantee.
- "if something on WP is changed, I see that in my watchlist." Yes, and if the same thing is changed on WD, I see that on my watchlist as well. You're imagining problems that don't exist. --RexxS (talk) 14:17, 21 May 2018 (UTC)
- Yes, I can see WD edits in my enWP watchlist, I know that. Originally, I switched that feature on, only to have my watchlist flooded with inconsequential changes (adding of a label in Swahili, all kinds of irrelevant database identifiers, and whatnot). So I switched that off and won't switch it on again. So, no, I won't see any WD changes. --Randykitty (talk) 14:44, 21 May 2018 (UTC)
- So let's be accurate. You choose not to see WD changes, but you choose to see WP changes. And that is your basis for rejecting WD and accepting WP. The only difference between the two cases is that you choose not to look at changes occurring in one of them. I expect the closer will give that reasoning the weight it deserves. --RexxS (talk) 15:16, 21 May 2018 (UTC)
- Yes, that's what I wrote, isn't it? The way WD works, if I add it to my watchlist it's flooded with irrelevant stuff, making it difficult or almost impossible to filter out the stuff that's important. If that means that I now have to accept unreliable stuff that is prone to vandalism in WP articles, so be it. --Randykitty (talk) 15:24, 21 May 2018 (UTC)
- @RexxS: I am sorry, I am wording this stronger than Randykitty: This is utter, pertinent bullshit.
- The material is on WikiData, and a piece of code is making it appear in en.wikipedia. That is exactly the same as how templates work, a mechanism that we call transclusion. We ‘refer’ to the data we want to see, and the code makes it appear. It is not imported, it stays on WikiData. En.wikipedia does NOT have control over what is on WikiData.
- Regarding the filtering. Yes, we have LUA code that filters data to define whether data has a reference. That code can, to a certain extent, also filter the type of reference (exclude ‘from Wikipedia’). It is impossible for the LUA code to filter whether the attached reference is a correct source (does it actually state the statement), whether the value has been altered without touching the reference, or whether the reference actually constitutes a reliable source (imdb for a BLP; though I could write code to exclude imdb, I cannot filter ALL unreliable sources known to mankind in our LUA code, especially knowing that often context matters). No, RexxS, you cannot assure that all material on WikiData that IS referenced is suitable for transclusion. And I know you are now going to suggest that we should then by hand override all data that we can’t use, which just negates the ease of use, and makes infoboxes more confusing to editors: transclude all from Wikidata, hiding where data is, but not this, that and the other, and for such use this local value.
- As I wrote elsewhere, it stretches my AGF that every editor who is transcluding an infobox NOW is going to check whether all data is correct and whether all data is referenced according to our referencing standards. Especially for newbies. Moreover, I can NOW add a wikidata infobox with field ‘day of birth’ empty on both wikipedia and wikidata. Then someone can come next week to WikiData and add a birthday referenced to an unreliable source (but stating the source). The result is that THEN en.wikipedia starts to transclude that data. Am I NOW responsible for enabling the transclusion of BLP sensitive data with a bad reference that is going to be added in the future? Do I HAVE to return to check when WikiData changes? If that happens multiple times, do I get blocked because I am not adhering to the BLP policy? Or are you going to block any editor on WikiData if that editor repeatedly adds BLP sensitive data with references that fail en.wikipedia reference standards? I am waiting for our banned editors to use WikiData as a proxy.
- Going on on references, there is WP:BLUE material on WikiData .. that we don’t transclude because it is not referenced. So more local settings: only referenced data from WikiData, but this is fine because it does not need a reference.
- The whole ‘you can filter by references’ argument is a red herring, a carrot. There is not enough control (until we teach our LUA infoboxes artificial intelligence).
- Yes, data is tainted because it is on WikiData: it is an unreliable source for our data, not following our referencing standards, and we do not have a reliable system for filtering data that we want or not. Until WikiData changes its data model, it is not suitable for use on wikipedia.
- @RexxS: as to your definition of Transclusion:
In computer science, transclusion is the inclusion of part or all of an electronic document into one or more other documents by hypertext reference. Transclusion is usually performed when the referencing document is displayed, and is normally automatic and transparent to the end user. The result of transclusion is a single integrated document made of parts assembled on the fly from separate sources, possibly stored on different computers in disparate places.
- the data that we display in a fully wikidata enabled infobox arrives in the article by transclusion of said infobox, where the data displayed is transcluded WikiData data. That means that if the template changes, or the data on WikiData changes, the displayed article is changing.
- In the end it does not matter - the editor who is importing data from WikiData, or the editor who is adding code resulting in the transclusion of data from WikiData, has sourced information from an unreliable source, and is responsible for what is displayed in the article (and, IMHO, also for data that is added later to WikiData if that is resulting in transclusion onto the article that they added the automatic transclusion to). —Dirk Beetstra T C 04:15, 21 May 2018 (UTC)
- @Beetstra: I'm also sorry, but the only bullshit is your fake claims of problems that don't exist.
"The material is on WikiData, and a piece of code is making it appear in en.wikipedia. That is exactly the same as how templates work." Nonsense. Templates dumbly transclude all of the marked content from a document. Data from the Wikidata database is examined and filtered by the code that imports its values. We can control what is imported and reject unsuitable data or choose to replace it with a local value as we see fit. We can't do that with transclusion. The control is just as strong as the control we have over locally added data.
- You complain that "It is impossible for the LUAcode to filter whether the attached reference is a correct source ... whether the value has been altered without touching the reference, does the reference actually constitute a reliable source ... especially knowing that often context matters." Yet you wilfully ignore the fact that you can't know that a reference manually added to infoboxes (in the few cases they exist) is a "correct source", or whether the value in the infobox has been altered without touching the reference, or that the reference actually constitutes a reliable source. It's just the same set of problems whether the infobox content and reference comes from Wikidata or from somewhere else in the article (or from an editor's OR). And the solution remains the same: you check the fact against the reference that supports it, a job made easier in practice by using content from Wikidata. But you continually choose to ignore that fact.
- I do not assume that "all material on WikiData that IS referenced is suitable for transclusion". It's not transcluded. It's filtered to remove the most egregious stuff, which is more than can be said of material added locally. What is left may or may not meet our sourcing requirements, but it still stands a better chance than what can be added locally by any editor, which is completely unfettered.
- Our infoboxes are designed by default so that fetching Wikidata has to be enabled on each article. Why should we not ask editors who add such an infobox to be responsible for checking that the data they are enabling comes from a reliable source? We ask the same of editors who add a non-Wikidata infobox (in theory). It stretches my AGF that you believe that happens every time.
- I can add a non-Wikidata infobox with the field 'day of birth' empty. If someone comes next week to Wikipedia and adds a birthday referenced to an unreliable source (but stating the source), are you going to start complaining about me adding the infobox that allowed the editor to add unreliably sourced information? Because that's exactly analogous with what you're demanding of me if I add a Wikidata-enabled one. In every case, it's the same whether we grab the sourced content from Wikidata or from an editor locally. You might as well claim that "local editors are not a reliable source", because it's just as nonsensical as your fake claim about Wikidata being a source.
- Your admission that you think Wikidata taints a source is telling. If the same content and same source were found on es-wiki, that would be fine with you. But the moment that content and source is fetched from Wikidata, you're up in arms. Your argument is nothing more than IDONTLIKEIT.
- It's not my definition of transclusion. It's Wikimedia's. If you're going to use the word, use it properly. The infobox template is transcluded into an article, of course, but the content of each field does not come from that template document. The date of death of George Harris (theologian) is no more "transcluded" from Template:Infobox person/Wikidata than the date of death of George Harrison is "transcluded" from Template:Infobox musical artist. While we're looking at those two, tell me how long it takes you to verify the fact in each case. The Wikidata one is verified in a few seconds; a much longer time is needed to verify the non-Wikidata one. Let anybody who has an open mind do the same. They will soon see whether your fears are real. --RexxS (talk) 15:11, 21 May 2018 (UTC)
- We are again running in circles, RexxS. At least for the third time. To me, WikiData is at the very best the same as en.wikipedia, and with what I see going on, I don’t think it is. —Dirk Beetstra T C 16:11, 21 May 2018 (UTC)
- You know, Dirk, I can agree with your assessment that "WikiData is at the very best the same as en.wikipedia" – at least when we're thinking about sourcing. I do not now (nor ever will, I suspect) claim that Wikidata has sourcing any better than English Wikipedia's. For me, the value of making use of Wikidata is what it brings to the smaller Wikipedias, and the convenience of having a single place to update facts as they change. Where we disagree, I guess, is that I think there's enough of Wikidata that has decent sourcing to make the effort of fetching infobox data worthwhile; and you don't. I'm prepared to leave it at that. Feel free to have the last word. Cheers --RexxS (talk) 21:12, 23 May 2018 (UTC)
- @RexxS: and I fully agree with that. There is value in WikiData. The system is great, it allows for significantly increasing reliability of what we do here. But, in my opinion, not in the current state/data model. I currently see two heavily pro-WikiData editors bickering heavily (almost an edit war) about transcluding or not transcluding a WikiData infobox because there is badly sourced data from WikiData (or not). Both WikiData and infoboxes have been in front of ArbCom (both sent back).
- If this RfC closes as 1A, I hope that WikiData starts thinking. Very much unlike any other Wiki, WikiData has the potential to be a reliable source. In my opinion, a true wiki-way does NOT make sense on WikiData (and I have been saying similar a couple of times). It is late, but not too late, and it will take a humongous effort. —Dirk Beetstra T C 03:30, 24 May 2018 (UTC)
- Guys. It really doesn't matter whether we call the process by which the Wikidata data arrives here "transclusion", "importation", or "Abra Kadabra Presto!". Look at the conditions and outcomes, and make your arguments for or against those rather than quibbling over terminology. Nikkimaria (talk) 13:37, 21 May 2018 (UTC)
- Exactly, Nikkimaria. --Dirk Beetstra T C 14:32, 21 May 2018 (UTC)
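For readers following the technical side of this thread, the kind of reference-based filtering being debated can be sketched as below. This is a hypothetical illustration, not the code used by any infobox module: the claim and reference dictionaries, the `imported_from_wikipedia` flag and the blocklist are all assumptions made for the example. It drops claims with no reference, references marked as imported from Wikipedia, and references whose URL is on a domain blocklist; as noted in the discussion, it cannot judge whether a source actually supports the value or is reliable in context.

```python
from urllib.parse import urlparse

# Illustrative blocklist; a real list would need community agreement.
BLOCKED_DOMAINS = {"imdb.com"}


def is_blocked(url: str) -> bool:
    """True if the URL's host is a blocked domain or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in BLOCKED_DOMAINS)


def acceptable_reference(ref: dict) -> bool:
    """Reject 'imported from Wikipedia' references and blocked domains."""
    if ref.get("imported_from_wikipedia"):
        return False
    url = ref.get("url")
    if url and is_blocked(url):
        return False
    return True


def filter_claims(claims: list) -> list:
    """Keep only claims that retain at least one acceptable reference."""
    kept = []
    for claim in claims:
        good = [r for r in claim.get("references", []) if acceptable_reference(r)]
        if good:
            kept.append({**claim, "references": good})
    return kept
```

Everything that passes such a filter would still need a human check against the cited source, which is the point both sides of the thread agree on.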
Notifications
I haven't read this whole thread, but in case it hasn't been mentioned before, there needs to be better automatic notification of when a Wikidata edit affects a Wikipedia article. E.g. if you edit something on Wikidata, it should show up as an edit/change to any affected Wikipedia pages. Implementing this will be a significant technical challenge however. SharkD Talk 08:33, 21 May 2018 (UTC)
- That feature has been available for some time. If you look at a standard watchlist, there's a line prefixed "Hide:"; if you untick the "Wikidata" box, you'll see changes to Wikidata items associated with the pages on your watchlist. However, up until recently, the only way to get information from a Wikidata item into a Wikipedia infobox was to read the entire item and pick out the property you want. That meant that Wikidata could not determine which property is used on Wikipedia, so every change to the Wikidata item showed up in watchlists, which was too much for many editors, and they switched it off. I have an updated version of the Lua module:WikidataIB waiting in its sandbox which utilises newer developments that allow Wikipedia to read just the properties required for the infobox, not the whole Wikidata item. That means the Wikidata part of the watchlist can now be filtered selectively to only display changes that will affect the infobox. That is still being refined (for example, we want to see changes to the English label, but not the Portuguese label, etc.). I'm loath to roll out a major change to code while the RfC is still ongoing, but I assure you that the Wikidata developers are keen to implement precise watchlist notifications, and I expect a big reduction in false positives soon. --RexxS (talk) 10:38, 24 May 2018 (UTC)
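The selective watchlist filtering described above could look roughly like the following sketch. The change-record format and the function name are hypothetical; this is not how MediaWiki or Wikibase represent changes, only an illustration of the idea that a change is shown when it touches a property the infobox actually reads, or a label in a language the wiki displays.

```python
from typing import Iterable, List, Set


def relevant_changes(changes: Iterable[dict],
                     used_properties: Set[str],
                     label_languages: Set[str]) -> List[dict]:
    """Keep only changes that could alter what the infobox displays.

    Each change is a hypothetical record such as
    {"kind": "property", "key": "P569"} or {"kind": "label", "key": "pt"}.
    """
    shown = []
    for change in changes:
        if change["kind"] == "property" and change["key"] in used_properties:
            shown.append(change)
        elif change["kind"] == "label" and change["key"] in label_languages:
            shown.append(change)
    return shown
```

With `used_properties={"P569"}` and `label_languages={"en"}`, a change to the Portuguese label or to an unused identifier property would be hidden, while a change to the date of birth or the English label would still appear.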
Status
A quick summary, 1A vs non-1A is 27/25. User:Capankajsmilyo(Talk | Infobox assistance) 04:03, 17 April 2018 (UTC)
- But the vote isn't 1A vs non-1A. Please don't summarize 6 options into "1 vs. all" (using your method gives "1F vs non-1F is 10/42", any reason you don't add that result?). Either give a votecount for all options (plus the "others"), or don't bother as it isn't a vote anyway. Fram (talk) 11:32, 17 April 2018 (UTC)
- If you're interested in the statistics, see this spreadsheet. By my count, 1A vs. non-1A is currently 22.5 to 26.5, but please let me know if I've miscalculated (you can see the counting on the second sheet). And as I said on the talk page when I first posted that sheet, "Of course, that's the crudest possible measure of this RfC, and doesn't take into account people's comments, but it might be a useful first-look." Thanks. Mike Peel (talk) 13:25, 17 April 2018 (UTC)
- Something that will have such a profound effect on Wikipedia should be more widely advertised. 188.28.163.10 (talk) 15:16, 17 April 2018 (UTC)
- @anon any suggestions of additional places to advertise it? Thanks. Mike Peel (talk) 22:10, 17 April 2018 (UTC)
- @Mike Peel: Capankajsmilyo brought this up just now in Wikidata's Project chat. mahir256 (talk) 05:04, 18 April 2018 (UTC)
- WP:CANVASS anyone? Are we now going to inject WikiData editors here to make a decision on how Wikipedia is going to use WikiData? --Dirk Beetstra T C 08:05, 18 April 2018 (UTC)
- It should be not difficult to figure out if we have votes of editors normally not active on the English Wikipedia.--Ymblanter (talk) 08:12, 18 April 2018 (UTC)
- Now, since this is a quite important feature and situation that we are talking about, should this RfC not be on a site-notice or similar? --Dirk Beetstra T C 08:23, 18 April 2018 (UTC)
- Agree with Beetstra on this. User:Capankajsmilyo(Talk | Infobox assistance) 08:33, 18 April 2018 (UTC)
- I mean if individual RfAs are on a watchlist notice, seems like this should have a watchlist notice... Galobtter (pingó mió) 08:46, 18 April 2018 (UTC)
An interesting thing to see is that the Portal discussion has got more than 500 votes and many more comments. This discussion seems to have accumulated 50-60 so far. Some of the locations/strategies used there for advertisement can be used here as well. User:Capankajsmilyo(Talk | Infobox assistance) 06:01, 24 April 2018 (UTC)
Another thing that can be done is including the link to the discussion in {{Infobox}} so that it can be seen on every page / template using an infobox. User:Capankajsmilyo(Talk | Infobox assistance) 06:03, 24 April 2018 (UTC)
- I got a signpost today regarding portal RFC, but this RFC was not mentioned there. I think, this should have been in there. User:Capankajsmilyo(Talk | Infobox assistance) 03:15, 26 April 2018 (UTC)
- Yup, people only seem to care when something is up for deletion, not when say something could affect ~3 million pages not in a deletion related way. Any way of a watchlist notice? The only reason it has 400 !votes is because it had a notice on ~1500 portal pages. Galobtter (pingó mió) 06:11, 26 April 2018 (UTC)
- The large intro is probably dissuading many people too, as is its general boringness, I suppose. Like the bike-shed effect. Galobtter (pingó mió) 06:17, 26 April 2018 (UTC)
- Threaded discussion moved down from voting section:
- I only see four users below who did not vote 1A; these four users have 8K, 1K, 243K, and 46K edits on the English Wikipedia, respectively.--Ymblanter (talk) 11:06, 25 April 2018 (UTC)
- Since you want more detail, of the four (current) such !votes: The first has 832k Wikidata edits vs 8k EnWiki edits (99%), and was in the middle of a simultaneous Wikidata-run while they !voted. The second has 74k Wikidata edits vs 1k EnWiki edits (98.4%), and had been active on Wikidata more recently than EnWiki before !voting. The third is not in question, they were locally aware of the RFC during drafting. The fourth has a significant percentage of activity on both sites, and could credibly have arrived to !vote here via local advertizement or via Wikidata canvass. The origin of future !votes is an open question. Alsee (talk) 11:43, 25 April 2018 (UTC)
- Right, but you yourself only have 6K votes here. I would prefer you to not judge users based on solely bad faith assumptions, in particular if these users have contributed to the English Wikipedia (and, for that purpose, to Wikidata) more than you did.--Ymblanter (talk) 11:58, 25 April 2018 (UTC)
- Ymblanter it is unhelpful to gratuitously attack my edit count. It is unhelpful to incorrectly assume that I "judge[d] users based on solely bad faith assumptions". I did not judge the users at all. The sole issue is how the !votes arrived on this page. Biased participation can manufacture a fictional-majority in any direction on any issue. It is perfectly appropriate for me to notify the closer that some !votes may be present due to de facto notification of users with a predetermined position. It is then up to the closer to determine how to handle that concern. BTW I invite you (or anyone else) to move my replies and yours down to the discussion section, starting from your 11:06 25 April comment. We don't need to clutter the !voting section with this dialog. Alsee (talk) 12:56, 25 April 2018 (UTC)
- I don't see why this needs to be in the !vote section at all, so I've moved the whole lot down. Note that the mention on Wikidata's project chat was not posted as an invitation to participate, but as a FYI/start of discussions there about some of the concerns raised here. Thanks. Mike Peel (talk) 14:20, 25 April 2018 (UTC)
- First, intent is irrelevant as far as the RFC itself goes. The post at Wikidata had a de facto effect of canvassing, and it is reasonable to alert the closer to the concern. As an independent matter, the sequence and timing of events makes your interpretation of intent rather implausible. The user has previously been warned for canvassing. They !voted on April 7. Ten days later they posted a vote-count that the RFC was going against them. In a matter of hours this sequence played out: an IP replied to the vote-count by suggesting more advertizing, someone asked where to advertize it, then the user posted at Wikidata with the section title Enwiki RFC raises concerns and a link for Wikidata users to click over to here, then they "agree"d with wanting more advertizing. It strains credulity to suggest that they were utterly unaware that their actions were going to have the effect of a WP:Votestacking advertizement in their preferred direction, when they were in the middle of noting that the RFC was going against them and discussing their desire to advertize. Alsee (talk) 01:03, 27 April 2018 (UTC)
- Why is this posted on my talk page by you? Are you an administrator? Cos if you are not, I would like to know what's happening. Further, what is it I'm being accused of? You people raised concerns about Wikidata verifiability, and you want me to not even tell the people there that their policies are flawed (in the views of the editing community here)? Capankajsmilyo(Talk | Infobox assistance) 01:44, 27 April 2018 (UTC)
- Discussed on usertalk. Alsee (talk) 02:02, 27 April 2018 (UTC)
- How can you assume that this RFC is going against my direction? As far as I can see there's 50% votes on both sides. From the limited experience of RFCs of mine, this leads to a state of no consensus even if 5-10% up/down. And how is your opinion to not advertise on English Wikipedia via signpost and/or infobox in good faith? Capankajsmilyo(Talk | Infobox assistance) 02:19, 27 April 2018 (UTC)
- How the RFC is going is irrelevant to whether or not something is canvassing. However your posting of your votecount here shortly before posting at Wikidata:Project_chat suggests that you were concerned with the votecount when you posted at Wikidata:Project_chat. Alsee (talk) 02:27, 27 April 2018 (UTC)
What is a Wikidata infobox
Is it an infobox which pulls all fields from Wikidata (option i) or some fields from Wikidata (option ii)? My understanding was that we are discussing option ii, and if 1A passes that outlaws any infoboxes pulling anything from Wikidata (for example, fields like Commons Category or Official Website - we have plenty of those). This is not the choice I support, but I only have one vote, fine. I see however that some people oppose use of infoboxes clearly under the understanding that we are talking about option (i) - all or nothing. This needs to be clarified, otherwise it makes the whole RfC even more confusing.--Ymblanter (talk) 05:24, 19 April 2018 (UTC)
- @Ymblanter: This is about infoboxes pulling some, or all, fields from Wikidata (not your option i or ii, but both). This is about whether infoboxes should pull any data from WikiData. And Official Websites are just as vandalism prone (if not more) than others, as I see now spammers targeting WikiData first, and that those websites stay vandalized for months is not helping their cause. --Dirk Beetstra T C 07:40, 19 April 2018 (UTC)
- Then I believe this vote is invalid since it is based on a false premise.--Ymblanter (talk) 07:59, 19 April 2018 (UTC)
- It would still be true if there is one local field and one magically appearing WikiData field (though the editor did not express that). And see the confusion when someone is locally removing a wrong field, and sees another, (or even the same) magically staying there. Or when fields are added they don't do anything because they are overridden by a WikiData value.
- The case this editor describes, however, is explicitly mentioned in the lede of this RfC: '[[File:South Pole Telescope infobox from Wikidata.jpg|thumb|right|An example infobox built entirely from Wikidata information. This is used in the [[South Pole Telescope]] article and if you look at the edit window, all you will see is {{tl|Infobox telescope}}.]]'. So this is about option 'i' (what this editor apparently !voted on), and about your 'option ii' (which encapsulates option 'i' as an extreme case). --Dirk Beetstra T C 08:18, 19 April 2018 (UTC)
- I believe that your argument is incorrect, but I have no intention of entering into a long argument (which in the past never led to anything good), so I will leave this for a closing admin to decide on.--Ymblanter (talk) 08:20, 19 April 2018 (UTC)
- @Ymblanter: I would suggest that you take up this question with the caster of that !vote then to get it clarified. --Dirk Beetstra T C 08:22, 19 April 2018 (UTC)
Use Wikidata to pre-fill templates
I'm coming to this discussion late and currently have no time to go through all the different proposals and !votes. So I'll throw out my idea here, apologies if somebody else has already brought up something similar. I share the concern of those people who feel that WD is much more vandalism-prone than WP. As far as I see, the templates most involved are authority control, infoboxes, and the {{official website}} template. Would it be possible to have WD pre-fill these templates on demand when an editor adds them to an article? The editor would then have to verify the data and be responsible for including them into the article. Once that is done, subsequent editors could have a button that would cross-check info in the article with the WD item and, again, editors would have the choice of including those data or not. In no case would WD (or a bot) add data directly from WD into WP. Just a thought, thanks for listening... --Randykitty (talk) 13:13, 11 May 2018 (UTC)
- See Template:Infobox UNESCO World Heritage Site#Pre-loading Wikidata values, which has sort of an implementation of that idea for a particular template. --Francis Schonken (talk) 13:21, 11 May 2018 (UTC)
- The problem is that such a setup only works in one direction: the content from Wikidata comes here, but then any changes to it aren't fed back to Wikidata. The system works best when it's bi-directional: we gain information from Wikidata in infoboxes here, and we share back new information into Wikidata where it's then used in other language Wikipedias and elsewhere. For comparison, it's like saying "let's keep all images used in enwp here, and have a script to copy new ones over from Commons", which would be a nightmare. Thanks. Mike Peel (talk) 13:43, 11 May 2018 (UTC)
- The problem with possible vandalism from Commons is minimal compared to the damage that WD can do. And I don't see why WD cannot regularly scan changes to an infobox and "leech" whatever changes were made. But that's their problem, not ours. --Randykitty (talk) 14:13, 11 May 2018 (UTC)
I still don’t see why we need to import from WD. Their data has been downloaded from external sources or added manually, and there is no quality control (as the example by user:Fram of the vandalism that Mike Peel copied over shows), nor sufficient vandalism control. Whether there is a reference to said source is no guarantee that the data has not been vandalised on WD since (forgetting that said source would be a reliable source for that data on en.wikipedia in the first place). It is much better to use the same external database or careful editors directly, and control the data here; our vandalism control outnumbers WD's by at least an order of magnitude. No, user:Mike Peel, it is not ‘much better’, it is orders of magnitude worse, and I would, at best, allow careful single pulling of data from WD at the discretion of single editors (who can be ‘blamed’ for being not careful enough). The bidirectional route is, at the moment, a recipe for disaster. --Dirk Beetstra T C 18:29, 11 May 2018 (UTC)
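As a rough illustration of the pre-fill and cross-check workflow proposed at the top of this section, here is a minimal Python sketch. It is not an existing tool: the function names `prefill` and `cross_check`, the field names, and the sample values are hypothetical. Nothing is written to the article or to Wikidata automatically; both functions only produce text or a report for an editor to review.

```python
def prefill(template, fields, wikidata):
    """Build template wikitext from Wikidata values for an editor to verify."""
    lines = ["{{" + template]
    for name in fields:
        # Missing values are left blank for the editor to fill in by hand.
        lines.append(f"| {name} = {wikidata.get(name, '')}")
    lines.append("}}")
    return "\n".join(lines)


def cross_check(local, wikidata):
    """Report fields whose local value differs from the Wikidata value.

    Returns {field: (local value, Wikidata value)}; a missing value is None.
    The editor decides what, if anything, to copy in either direction.
    """
    diffs = {}
    for name in sorted(set(local) | set(wikidata)):
        if local.get(name) != wikidata.get(name):
            diffs[name] = (local.get(name), wikidata.get(name))
    return diffs


draft = prefill("Official website", ["url"], {"url": "https://example.org"})
report = cross_check(
    {"name": "Example", "opened": "1990"},
    {"name": "Example", "opened": "1991", "architect": "A. N. Other"},
)
print(draft)
print(report)
```

The design choice here is that the script never edits anything: it drafts and compares, and the editor who saves the result remains responsible for it.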
Wikidata now showing more "appropriately" in watchlists
The way changes to Wikidata show up in your watchlist on English Wikipedia has recently been improved (this happened a couple of days ago). Previously, if a Wikidata property was used within a Wikipedia page, any change to any property in the entire item on Wikidata would create a line in your watchlist.
Now a line will appear in your watchlist only if a change occurs to the specific property used (or the aliases). In my opinion this is a significant step forwards so congrats to the Wikidata team. I have requested the option not to show changes to "aliases". In my opinion this significantly improves quality assurance for material used from Wikidata and thus makes me more comfortable supporting Wikidata use within Wikipedia. Doc James (talk · contribs · email) 21:12, 13 May 2018 (UTC)
- Well done to Ladsgroup and the rest of the folks who have been working on that. Following improvements to the Wikidata–Lua interface, I've been able to do an overhaul to Module:WikidataIB (in the sandbox while testing) that ensures it now reads only the property related to the infobox field, rather than the whole Wikidata entry, so that should also reduce the number of false positives in watchlists when Wikidata is changed. --RexxS (talk) 21:47, 13 May 2018 (UTC)
- @Doc James: this helps in keeping an eye on the data, but this does not take away my other concerns. --Dirk Beetstra T C 05:17, 14 May 2018 (UTC)
Time to find an admin to close?
Given that this RFC has been open for over a month, and that discussion seems to be dying down, is it time to request that an uninvolved admin review and close? Blueboar (talk) 12:10, 11 May 2018 (UTC)
- The request has been sitting there for about a week. We clearly have a shortage of admins in a suicidal mood.--Ymblanter (talk) 12:14, 11 May 2018 (UTC)
- Since 6 May [17]--Ymblanter (talk) 12:18, 11 May 2018 (UTC)
- Any closer claiming a consensus would be expected to be able to defend that decision. I would not wish to be that person. A no consensus close might also be challenged, but then the onus would be on a challenger to show how this collection of diverse opinions could be construed as a consensus, and for what. If nobody will touch it we can eventually draw the conclusion that no-one is willing to claim a consensus. If no-one steps forward to close by the end of May, I suggest that we accept no consensus as the default. Cheers, · · · Peter (Southwood) (talk): 10:07, 12 May 2018 (UTC)
- We can still post once directly at AN (the current announcement is in a transcluded template) and explicitly ask for three closers.--Ymblanter (talk) 10:10, 12 May 2018 (UTC)
- NO. This has to be independently closed. I can agree with a closing committee, but not with a ‘by default’ close. —Dirk Beetstra T C 12:01, 12 May 2018 (UTC)
- Really, Dirk? What sort of closure are you expecting?
- Q1: Can Wikidata infoboxes be used in mainspace? - only a minority voted "No".
- Q2: What can Wikidata infoboxes display? - an even smaller minority voted "Nothing".
- Are you really going to insist that there is consensus for those two outcomes? Because it sure doesn't look like any consensus to me. If it lies open forever, it will just as clearly signal "no consensus" as would a default close. --RexxS (talk) 12:35, 12 May 2018 (UTC)
- And that is exactly why I want this to be properly closed. ArbCom is ready for more fights about infoboxes and wikidata, and leaving this open makes that clear that they need to look at it. If there is no consensus then have that properly codified. —Dirk Beetstra T C 13:09, 12 May 2018 (UTC)
- Assuming a “no consensus”... the real question that the closers will have to address is: what does that “no consensus” MEAN? Does it mean there is no consensus FOR using Wikidata in infoboxes (default to not using it) or does it mean there is no consensus AGAINST using Wikidata in infoboxes (default to allowing it) or does it mean no consensus AT ALL ... for OR against... (With no default position. In which case the debate remains unresolved, and will require additional RFCs). Blueboar (talk) 14:41, 12 May 2018 (UTC)
- The status quo is defined by Wikipedia:Requests for comment/Wikidata Phase 2: "It is appropriate to modify existing infoboxes to permit Wikidata inclusion when there is no existing English Wikipedia data for a specific field in the infobox". A "no consensus" close will not alter that. --RexxS (talk) 14:54, 12 May 2018 (UTC)
- Although it's probably fair to say that that particular RFC would have turned out differently with the experience that we have now, 5 years later... --Randykitty (talk) 15:18, 12 May 2018 (UTC)
- To be fair, it wouldn't turn out differently. The majority of editors then, as now, were in favour of some Wikidata integration, carefully implemented. We've had an explosion of content on Wikidata leading to many of the observed problems, but we've also created the tools to make infoboxes far more flexible, putting the decision on whether to import Wikidata in the hands of the editors at the article level, and ensuring that unsourced data is filtered out by default. We also now have the experience of very successful opt-out infoboxes such as {{infobox gene}} and {{infobox telescope}}, as well as offering opt-in alternative versions of other templates by creating a "/Wikidata" version. Editors have all the control they need to ensure that they can choose in any given article, by local consensus, to allow or disallow Wikidata integration right down to the individual field level. That's a perfectly tenable position that is no different from the consensus-driven process of adding (or removing) any other content to an article, and the majority recognise that. --RexxS (talk) 18:26, 12 May 2018 (UTC)
- Sigh, and that is why I think a committee should look at the results and not have a by default close. The whole sourcing question is a big issue, and if we do end up in the middle, we will need an RfC on that whole sourcing question - there is IMHO a whole discrepancy in the whole definition of sourcing with respect to WikiData that cannot be addressed within the constraints of this RfC (it is too late ... I should have thought of this earlier). —Dirk Beetstra T C 15:04, 13 May 2018 (UTC)
- Why will we need an RfC? The sourcing of infobox information from Wikidata is subject to just the same requirements as the sourcing of infobox information locally on Wikipedia, except that in the Wikidata case, we can arrange to automatically exclude any data that is not sourced. The necessity of being able to verify the accuracy of the sourcing is identical in the two cases, except that in the Wikipedia case, you often have to search through the article text to find the reference to check. The downside of getting information from Wikidata is that the watchlisting is still crude, but it's improving steadily. What's the question you want to ask? Perhaps someone can answer it in a satisfactory manner for you, without needing another 30 days of inconclusive debate. --RexxS (talk) 17:42, 13 May 2018 (UTC)
- No... The information from Wikidata ISN’T subject to the same strict sourcing requirements. That’s one of the primary reasons why so many editors oppose incorporating it. Blueboar (talk) 20:48, 13 May 2018 (UTC)
- Yes. You misunderstand. All of the information that is used in Wikipedia is subject to the same strict sourcing requirements, regardless of whether it comes from Wikidata or from a local edit – by definition. And the same mechanism is available to verify the compliance of that sourcing: check the reference. If that misunderstanding is one of the primary reasons for opposition, it's a pretty clear indication that there's a need to inform folks better. --RexxS (talk) 21:15, 13 May 2018 (UTC)
- Exactly, Blueboar, it IS NOT. That is why anything but 1A requires follow up RfCs. —Dirk Beetstra T C 02:36, 14 May 2018 (UTC)
- Yes it IS. The Wikipedia community creates the sourcing requirements for its content, and nobody else. There is no need for follow up RfCs, especially when you can't even articulate what questions need to be answered. --RexxS (talk) 11:52, 14 May 2018 (UTC)
- Yes, we have our sourcing requirements, RexxS, and WD fails that. And I have articulated that in my !vote, WikiData does not meet our sourcing requirements (with the whole sourcing on WikiData being a massive red herring). --Dirk Beetstra T C 14:12, 14 May 2018 (UTC)
- No, you're wrong. The text brought in from Wikidata is filtered by default to ensure it has a reference, so Wikidata's internal policies are irrelevant. You might as well complain that we shouldn't include text from Encyclopedia Britannica 1911, because EB's sourcing requirements don't match ours. We ensure that content on Wikipedia meets WP:V by checking the reference. That is how we ensure our sourcing requirements are met. It is just as simple (and often simpler) to check that a fact in an infobox is supported by its reference when it comes from Wikidata as when it has been added by an editor locally. The only red herring is your insistence that the sourcing requirements on Wikidata have any effect on the sourcing requirements here on Wikipedia. --RexxS (talk) 14:45, 14 May 2018 (UTC)
- And that is where the discrepancy in understanding lies. Let’s first see what this RfC brings, shall we? I am waiting for the closing committee. --Dirk Beetstra T C 20:25, 13 May 2018 (UTC)
- Once again, that is a red herring. Material can be referenced against an unreliable source, still it would be incorporated if not filtered out (especially if the data ref is added after the fully enabled box is inserted and the editor decided to opt-in for all referenced data; you can’t opt-out on unsuitably referenced data when it is not there yet). Data can be altered without changing the reference (meaning that an editor could have checked his BLP data to be correct, only to see it vandalised later). Data however that is unreferenced and suitable to be transcluded is not, because it is unreferenced, even if that data does not need a reference (sky, color=blue). But it is all beside the point, WikiData is an unreliable source, it does not matter whether a datapoint is referenced (properly) or not, it is unreliable by nature. --Dirk Beetstra T C 16:21, 14 May 2018 (UTC)
- Your argument is the crimson clupeid. Material added locally on Wikipedia can also be referenced against an unreliable source just as easily. The only difference is that checking for verifiability is somewhat easier in a Wikidata infobox because: (1) the link to check the reference is always present in a field added from Wikidata, unlike the multitude of unreferenced fields added locally; and (2) you are guaranteed that a reference actually exists when the information comes from Wikidata, unlike the locally added values which may not be referenced anywhere, and you have to search through the entire article to establish that. In the cases where a non-Wikidata infobox actually has references, it is just as simple, nay simpler, for the data to be changed without changing the reference – with exactly the same consequences for the editor who sees his contribution vandalised later. As for the case of data that is suitable for transclusion without filtering, it is a simple matter to turn off the filtering for fields like "image". You're making up a problem that doesn't exist. You can't manufacture a case that distinguishes Wikidata from local edits when exactly the same considerations apply to each. Your characterisation of Wikidata as "an unreliable source" is misguided. Wikidata is not now, and never has been a source, unreliable or otherwise. It's merely a container. You might as well claim that paper is an unreliable source because some books contain falsehoods. The information contained in Wikidata is no more reliable or unreliable than the data in the current infoboxes on Wikipedia - and that's not surprising because a lot of it originated there anyway. What is beside the point is this big fuss over a pretence that we don't have control of what we show our readers. We do, and that's a fact, no matter how inconvenient to your theories. --RexxS (talk) 16:49, 14 May 2018 (UTC)
- I give up, you simply don’t get it. We’ll see each other in a next discussion after this RfC closes I guess. —Dirk Beetstra T C 17:35, 14 May 2018 (UTC)
- I think you'll find you're the one not "getting it", but no doubt we will explore that another time. --RexxS (talk) 19:30, 14 May 2018 (UTC)
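Since much of the exchange above turns on what "filtered by default to ensure it has a reference" does and does not check, a minimal Python sketch may help. It is not the code of Module:WikidataIB: the claim layout is a simplified stand-in loosely modelled on Wikidata's JSON, the function names are hypothetical, and treating a reference consisting only of P143 ("imported from Wikimedia project") as unsourced is an assumption made for illustration.

```python
def is_sourced(claim, ignored_props=("P143",)):
    """True if the claim has a reference using any property not ignored.

    This checks only that a reference exists. It cannot tell whether the
    cited source is reliable or whether the value still matches it.
    """
    for reference in claim.get("references", []):
        used_props = set(reference.get("snaks", {}))
        if used_props - set(ignored_props):
            return True
    return False


def infobox_values(claims, only_sourced=True):
    """Return the values an infobox field would display."""
    return [c["value"] for c in claims if not only_sourced or is_sourced(c)]


claims = [
    # Has a "stated in" (P248) reference: kept by the default filter.
    {"value": "1 January 1950", "references": [{"snaks": {"P248": ["Q1"]}}]},
    # No reference at all: dropped by the default filter.
    {"value": "blue", "references": []},
    # Only "imported from Wikimedia project": dropped under the assumption above.
    {"value": "2 January 1950", "references": [{"snaks": {"P143": ["Q328"]}}]},
]

print(infobox_values(claims))
print(infobox_values(claims, only_sourced=False))
```

Both positions in the thread are visible in the sketch: the filter guarantees that a displayed value carries a reference to check, and it says nothing about the quality of that reference or about later changes to the value.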
- And that is exactly why I want this to be properly closed. ArbCom is ready for more fights about infoboxes and wikidata, and leaving this open makes that clear that they need to look at it. If there is no consensus then have that properly codified. —Dirk Beetstra T C 13:09, 12 May 2018 (UTC)
- Really, Dirk? What sort of closure are you expecting?
- Any closer claiming a consensus would be expected to be able to defend that decision. I would not wish to be that person. A no consensus close might also be challenged, but then the onus would be on a challenger to show how this collection of diverse opinions could be construed as a consensus, and for what. If nobody will touch it we can eventually draw the conclusion that no-one is willing to claim a consensus. If no-one steps forward to close by the end of May, I suggest that we accept no consensus as the default. Cheers, · · · Peter (Southwood) (talk): 10:07, 12 May 2018 (UTC)
- Since 6 May [17]--Ymblanter (talk) 12:18, 11 May 2018 (UTC)
Why is it that Wikidata proposers see this 4-question, six-option-for-each RfC (which they designed) now as a "Wikidata vs. no Wikidata" RfC? If that was the question you wanted to ask, then you should have asked that question. To now claim that the option of the 6 you proposed which got the most support is the one that has been rejected is disingenuous. If you don't ask a yes-no question, then don't interpret the results as a yes-no outcome. Fram (talk) 06:52, 14 May 2018 (UTC)
- @Fram: There were a wide range of options to see where people were on the distribution, and that means that the options aren't exclusive/uncorrelated (someone that picks one answer is likely to be happy with an adjacent answer, possibly except when they're at one end of the scale). It's not a simple yes/no in the same way that it's not a simple pick-the-one-with-the-most support (plus it's a !vote), but given the distribution (a lot of people going for option A) it makes sense to also look at that vs. the other options (as per Nouill's comment above). Also, you yourself had input into the design of this RfC, as did many others, so you can't now claim it's just been designed by Wikidata proposers! Mike Peel (talk) 11:07, 14 May 2018 (UTC)
- My input was to remove the most obvious bits of bias. I gave my misgivings about the current RfC (and my opinion about what kind of RfC should be had instead) before the start of the RfC. My input into the design of this RfC was not a stamp of approval for it. I note, from the final discussions before this went live, that Beetstra, Francis Schonken, and others had misgivings about it, and that e.g. Beetstra explicitly said that starting with a straightforward yes-no question would have been better. You didn't go with it, but want to twist the results into a straightforward yes-no anyway, because otherwise you have one option with a lot of supports, and 5 with each a lot less support, and that gives a different impression than your grouping now of 2-6 together. Fram (talk) 11:25, 14 May 2018 (UTC)
- You had your chance to frame the question Can Wikidata infoboxes be used in mainspace?, Fram, so it's no good whining over the outcome now where only a minority answered that question in the negative. There's no way that a minority view can be a consensus, so accept it and move on. --RexxS (talk) 11:52, 14 May 2018 (UTC)
- I had my chance to frame the question? Where? Anyway, I'm not whining over the outcome, I'm not the one here repeatedly claiming how the closers should read the results or that option X or Y has been voted out. There is a minority for all questions, which is logical if you divide an RfC into so many questions; I tried to frame it as a yes-no question. You had your chance to support that question at the time, but decided against it. To claim the result as a yes-no vote now and dismiss anyone who sees through that obvious redefining of the question as "whining" is not helpful. If the outcome is so obvious, then why the need from at least three people not voting for A to put this here and pre-decide the outcome? The only thing obvious is that there is no consensus for B, C, D, E or F, but for some reason none of you feel the need to dwell on those, only focusing on A again and again. Fram (talk) 12:29, 14 May 2018 (UTC)
- Sure, you had your chance. And you are whining about the outcome, even though the question was your preferred Can Wikidata infoboxes be used in mainspace?. Let's face it, you're only complaining because the majority didn't agree with you. You don't see me complaining that the majority didn't agree with me, do you? This RfC was problematical, not because of Question 1 (which asks a simple question that has a simple answer); and not because the answers were pre-packaged into a multiple choice; but because of the colossal amount of misinformation and misunderstanding spread about beforehand. My aim now is to examine what works well and build upon that, and discuss what concerns folks have and find ways of resolving them. Your position has been akin to Ian Paisley's "No surrender" stance that blocked progress for years. It's time for you to start seeking compromise and working with those who are doing their best to ensure that Wikidata integration produces results that the majority can be happy with. --RexxS (talk) 13:03, 14 May 2018 (UTC)
- I see you reframing the RfC to suit your purposes. You still spout the same "Wikidata is progress, anyone opposing integration of Wikidata is hampering progress" nonsense. I am more than happy to work with "those who are doing their best to ensure that Wikidata integration produces results that the majority can be happy with", but I haven't found them so far. I have found people doing all they can to ensure that the Wikidata version of infoboxes remain, or the Wikidata info in infoboxes, like Mike Peel with the World Heritage Site infobox, or you in multiple instances of the biography infobox. I don't need your patronizing holier-than-thou words when your deeds speak for themselves, and I have seen often enough what "progress" Wikidata infoboxes have brought after 5 years of experimenting. But apparently this is due to my stance that blocked this "progress" for years, even though only one Wikidata infobox (which was really dreadfully implemented) has been reverted due to my "stance" and you and others were free to continue developing and showing the progress. But of course, if you want to dismiss all people not wanting more of these infoboxes as being duped by the "colossal amount of misinformation" and not as having genuine concerns and enough valid arguments to oppose the continuation of this experiment, then it is easy to ask for compromise. The majority agreed with no one here, not just with me, as this is an extremely divisive issue, which has been shown again and again. The smart thing then is to let such an RfC run its course and let uninvolved admins close it without trying to influence them. All my "whining" has been in response to people like you who just can't let this happen and feel that they need to pre-summarize the outcome for some unclear reason. A great basis for compromise and collaboration you are forming there. Let's just drop this silly conversation and let people simply close it without "helping" them with these counts, shall we? 
Fram (talk) 13:17, 14 May 2018 (UTC)
- You're still hoping that this RfC will be closed in your favour, aren't you? Well, what will your response be when it isn't? Will you actually stop insulting me ("holier-than-thou" indeed), and start working with me to improve how Wikidata is brought into Wikipedia? You obviously need reminding that I've created no infobox templates other than cut-down ones for demonstration purposes, like {{Infobox video game series/Wikidata}}. I have worked for five years refining the tools that editors have asked for: ensuring that every call will defer to any locally supplied value; implementing whitelisting and blacklisting of fields to enable opt-in infoboxes where control is at the individual article level; ensuring that by default only Wikidata that has a reference is imported; providing convenient links to the Wikidata entry to edit it or to check references; and more. You only have to look at the "1A, 1B, etc." votes to see that many are made under false assumptions, with no experience or knowledge of the conditions governing how Wikidata information is able to be imported. So I don't think I'll be giving you a free run to misdirect the closers in the vain hope that they might think that a minority somehow represents a consensus. --RexxS (talk) 13:45, 14 May 2018 (UTC)
- No, I'm hoping that whoever closes this will see the attempts from you and others to influence his close. If you don't want to be insulted, then drop the personal comments and you won't be replied to in kind. I have no interest in working with anyone with the attitude you display here though. Can you give a link to any post by me instructing the closers how to close this? Or are all my posts on this subject simply replies to others proclaiming triumphantly that A hasn't gotten a majority, and keeping silent on the much lower scores for B, C, D, E, or F? "You obviously need reminding that I've created no infobox templates other than cut-down ones for demonstration purposes" No, I don't need reminding of this, both because it's information that doesn't interest me, and because I didn't claim that you created infoboxes or not. It would be nice if you could reply to what has actually been said (or just not reply at all). Not to what hasn't been said, not to what you think people hope or believe. I could just as well claim that you still hope that whoever closes this will dismiss most "A" votes "because they were made under false assumptions". Fram (talk) 14:04, 14 May 2018 (UTC)
- And regarding your request as to what needs to be done if this one closes as no-consensus, then we will need a follow up RfC. And that will be the case for anything that is not 1A or 1F. --Dirk Beetstra T C 14:12, 14 May 2018 (UTC)
- @Fram: Keep insulting me then - it just shows the weakness of your arguments, as does your disdain for working with other good-faith editors like myself. Here's the link to you trying to influence the closers by whinging about Q1, Can Wikidata infoboxes be used in mainspace?. Now, can you give a link to any post by me instructing the closers how to close this? Or are all my posts on this subject simply replies to others proclaiming that there is a consensus while ignoring the fact that their preference is not shared by the majority? After all, I didn't start this thread.
... your deeds speak for themselves, and I have seen often enough what "progress" Wikidata infoboxes have brought after 5 years of experimenting.
- it seems you did create the false impression that I had been creating "Wikidata infoboxes". So yes, you do need reminding of the facts when you have the nerve to criticise my efforts to accommodate every concern that's been raised by now. It must be really galling for you to see how much real progress has been made in meeting genuine concerns, but of course it will never be enough for you, will it? As for my expectations, I'll certainly explain to you the false assumptions that several A votes were made under if you ask. --RexxS (talk) 14:32, 14 May 2018 (UTC)
- I guess both Beetstra and myself (and many others presumably) are then unable to read, write, and comprehend arguments. If you read my reply to this thread as an instruction to closers how to close this, then so be it. To me, something like "Are you really going to insist that there is consensus for those two outcomes? Because it sure doesn't look like any consensus to me. If it lies open forever, it will just as clearly signal "no consensus" as would a default close. Your post of 12:35, 12 May 2018 (UTC) is the one focusing solely on the "A" option for some reason, as if it was an A vs not A question. Followup replies by you have "A "no consensus" close will not alter that. " and " the majority recognise that." and "another 30 days of inconclusive debate.". Adding to that the claim I already repeated above, about "A" votes being made "under false assumptions". Oh, and "it's no good whining over the outcome now where only a minority answered that question in the negative. There's no way that a minority view can be a consensus, so accept it and move on." I don't think I'll bother explaining the discrepancy between what I said about your work on Wikidata infoboxes and your defense of them, and your "false impression" statement above. If you can't see the difference between what I actually said, and what you read in it, then it is another piece of evidence that it is pointless continuing this discussion. 
Fram (talk) 04:36, 15 May 2018 (UTC)
- I indeed do not see anymore how I have to explain that WikiData data is not reliable. As I said, anything else than a full 1F or a full 1A closure will need follow up discussion, and that is best performed in the form of an RfC. How that follow up RfC is going to be is depending on how this RfC is closed - if it is a default 'no consensus' close, then that will be an RfC similar to this one but likely more direct, or an RfC that is going to specify how to interpret specific questions regarding the 'intermediate' close. --Dirk Beetstra T C 06:34, 15 May 2018 (UTC)
- And I don't see why you insist on trotting out patent falsehoods like "Wikidata is not reliable". That is pure nonsense – propaganda for those who wish to resist progress in any form. Anyone who looks at Wikidata can find huge amounts of reliable information, and I find it astonishing that you could think that folks are so gullible that they might fall for your claims. What unreliable data is present in the infobox at South Pole Telescope? Twenty pieces of information, and every one of them pulled from Wikidata. How about a more modest effort: François Barbé-Marbois whose infobox has only six pieces of information, each of which can be easily verified by following the icon-link. There is more information available on Wikidata, like the name of his wife, but it's unreferenced and so does not appear. By default we don't bring in unsourced claims, and that means that the editors on Wikipedia are in control of what comes from Wikidata. It matters not a jot what Wikidata's sourcing policies are, because we can enforce our own. Until you realise that, there won't be a common starting point for further discussion, but if you want to waste another month arguing from an unreal position, I guess that's your prerogative. I would prefer to spend my time improving the encyclopedia. --RexxS (talk) 17:21, 15 May 2018 (UTC)
- It is an open wiki, and an unstable one, the very definition of an unreliable source. Wikidata gets spammed, vandalised, filled with vandalised data because the editor does not care to check. WikiData has the same disclaimer as Wikipedia. You have no clue whether data is correct. It is not propaganda, it is simply a site that fails blindly our sourcing requirements. I don’t know how often I have to repeat that. I don’t trust em.wikipedia to be a reliable source, I don’t trust es.wikipedia to be a reliable source, I don’t trust a sentence on en.wikipedia that ‘’has’’ a reference without checking a reference. I don’t trust WikiData, and you expect that Wikipedia will be more trustworthy after blindly and automatically importing data from another site ‘because we only use data that is referenced’. WikiData IS an unreliable source. I spend my time on improving this encyclopedia, not by making it worse with this blind importing of unreliable data. Have you read WP:V recently? Don’t you see that WikiData fails that? That the material is sourced does not matter, a lot of material on en.wikipedia is sourced, but WikiData has no control over how well it is sourced (just like en.wikipedia), or whether the statement has not been changed since.
- I know I'm not convincing you, and probably most of the other WikiData fans, and that is likely why we need a follow up RfC on any close that is not pure 1A or pure 1F.
- I said earlier that I was going to stop, but I am afraid that we will need to repeat that any claim of reliability on WikiData is utter nonsense. —Dirk Beetstra T C 20:13, 15 May 2018 (UTC)
- @RexxS: In case you think that I am the only person saying that WikiData is an unreliable source, then please read the !vote of the last two !voters. From the very first all the way to the now last, it is a recurring theme: WikiData is an unreliable source. —Dirk Beetstra T C 20:30, 15 May 2018 (UTC)
- Wikidata is not a source. At all.--Ymblanter (talk) 20:41, 15 May 2018 (UTC)
- What? So where is Wikipedia getting that data? —Dirk Beetstra T C 21:02, 15 May 2018 (UTC)
- Editors add it, just like they do on Wikipedia. Wikipedia is not a source, either. --RexxS (talk) 21:21, 15 May 2018 (UTC)
- (edit conflict) And I don't know how often I have to repeat that Wikidata is not a source. It is a repository. The facts stored there are either sourced or not. The sources for those facts may be reliable or not. You know exactly if the data is correct in just the same way as you do for data stored on Wikipedia: you check the reference. Did you read what you wrote?
"I don’t trust a sentence on en.wikipedia that ‘’has’’ a reference without checking a reference."
and that's exactly the same for a fact imported from Wikidata. Why are you claiming that there is any difference? The steps you go though to meet WP:V are just the same. There is nothing about Wikidata that inherently makes the information found there unreliable, any more than that sentence on Wikipedia that you are willing to check. Why not check the fact imported into an infobox from Wikidata and test whether it meets our sourcing policies? If it doesn't, you can block it, fix it, or supply a value that you know to be correct. That's no different from any other fact in any non-Wikidata infobox. I also spend my time improving this encyclopedia, by making sure that editors have the tools to make proper use of information from Wikidata without having to do it blindly. You can stop any time you like, but you've still not refuted my demonstration that Wikidata contains reliable information in the two infoboxes I quoted to you. Here's another two at random: Very Large Telescope and Robert Axelrod. You'll see that the first has some local parameters which override the data on Wikidata; the editor has control. In the second, Wikidata gives his occupation as "mathematician", but the infobox doesn't show that because on Wikidata it's only sourced to the English Wikipedia, so it never arrives in our article. What's left is information that can be checked against the sources, just the same as in any other infobox. How many more examples of reliable data on Wikidata do I have to give you before you concede that the way we draw information from Wikidata into an infobox means that such information is made just as reliable as what we would find in an infobox that was populated completely by locally supplied values? --RexxS (talk) 21:21, 15 May 2018 (UTC)
- ”It is intended to provide a common source of data which can be used by Wikimedia projects such as Wikipedia ..”. Every place that contains data is a source. WikiData is a repository for data, a common source of data. If I use it, I sourced it on WikiData, and WikiData sourced it elsewhere. CNN is a source, PubChem is a source, my mother is a source, any place you get data from is a source. WikiData is a source. —Dirk Beetstra T C 21:49, 15 May 2018 (UTC)
- From WP:RS:
Definition of a source ... The word "source" when citing sources on Wikipedia has three related meanings: The piece of work itself (the article, book); The creator of the work (the writer, journalist); The publisher of the work (for example, Random House or Cambridge University Press).
Which of those three possibilities does Wikidata fit? It is not the work itself, as that is the facts that are imported. It is not the creator of those facts, as they come from external authors. It may well be considered the publisher of the facts, but like a Wikipedia, it is only ever a republisher of information that has already been published elsewhere. The reliability of the facts that arrive in Wikipedia are not affected by their transmission via Wikipedia – they were either reliable or not where they were originally published. Again, I'll make it clear: to judge the reliability of any fact on Wikipedia, whether added locally or imported from Wikidata, you have to look at the real source, the one you find by examining the reference. And the result is identical, regardless of whether the source is cited at the bottom of the article or on Wikidata. No matter how you look at it, Wikidata is not a source in any meaningful sense of the word for any of the facts that arrive in our infoboxes. --RexxS (talk) 22:26, 15 May 2018 (UTC)
- Of course it is A piece of work. It is a database, a list of information, it has information aggregated. That is a laughable argument. Let me dumb it down: it is an unreliable database of information from which Wikipedia gets part of the information it displays, and for every infobox that displays correct data transcluded from WikiData I give you 50 correct statements on Wikipedia (with references). That however, does not make wikipedia, nor wikidata, a reliable place to withdraw information from. Both are open wikis. Both do not have the sufficient protection mechanisms that are needed to make it reliable (even if mostly correct). Unlike Wikipedia, however, WikiData has the possibility to become a reliable database with information (if they change their model and work on it), in which case I will fully support transclusion of data as that would (massively) increase the reliability of wikipedia. Until then, as you say, it makes, at best, no difference, and I think that WikiData is just bringing the quality of en.wikipedia down. —Dirk Beetstra T C 03:34, 16 May 2018 (UTC)
- If we apply this reasoning consistently, information which is added to Wikipedia articles from other Wikipedia articles is sourced to Wikipedia (and there is quite a lot of non-trivial information of this kind, for example, in the articles on railway stations the whole railway lines are imported via templates and cannot be modified within the article). Even more, since the templates typically do not transclude the sources, by this logic, this information is not reliable.--Ymblanter (talk) 06:38, 16 May 2018 (UTC)
- Wow ... You just confirmed Wikipedia:General disclaimer and WP:VERIFY. Wikipedia is not a reliable source, WikiData is not a reliable source, external open wikis are not a reliable source. Whether it is referenced or not only gives you a chance to verify, it does not make it a reliable source. --Dirk Beetstra T C 07:08, 16 May 2018 (UTC)
- Sure, but I do not see anybody around on a crusade to eliminate templates in Wikipedia because they transclude data from an unreliable source and are difficult to edit. The discussion on what to do with vandalism on Wikidata, and how Wikidata-transcluded fields in the template can be edited is a valid discussion. The full rejection of Wikidata because it is not a reliable source is not a valid viewpoint. It is not a source, and should not be viewed as such.--Ymblanter (talk) 07:22, 16 May 2018 (UTC)
- What, is that what I am doing, being on a crusade to eliminate things? Wikipedia is not a reliable source, and we are all working to get it better. Wikidata is a source, it should be viewed as such. You are aggregating data from other sources, but by that it becomes a source - what else is the use of WikiData, just aggregate data . If I create an open Wiki on Wikia, and I only add information with references there, I am still an open wiki and I am still not a reliable source, even if all material is true.
- Now secondly, Wikipedia has a massive editor base and a significant number of admins. They have significant capabilities in vandal fighting and all those edits happen here. And we do have capabilities of vandal fighting on templates (and by the way, many templates use data within the page format the display, so vandalism happens on the article, not on the template - many templates with higher risk are protected so they cannot be vandalized, all measures to increase the reliable use of templates).
- But now that we have established that Wikipedia is not a reliable source, in part because we transclude templates that can be altered behind the scenes, how do you think that turns out if we transclude data from an external source through templates that can be vandalized into our articles?
- Again, if WikiData would be more stringent on the correctness of their data than Wikipedia itself is, then it would make sense to transclude their data (as that would make the data on Wikipedia more reliable), but since that is not the case, transcluding that data will only make Wikipedia less reliable. I am willing to change my mind when that happens, but that needs a lot of work to be done on WikiData. --Dirk Beetstra T C 07:49, 16 May 2018 (UTC)
- Right, we are back to our discussion of half a year ago. You think we should forget about Wikidata and bot-import data from databases directly to the English Wikipedia (and other Wikipedias and sister projects, which you do not have to care about, should themselves organize a similar process). I disagree with that, and I think the positions are pretty much clear. (I still insist that saying that Wikidata is used as a source is incorrect - we do not have a single piece of data outside of Wikidata article sourced to Wikidata). They are probably not going to change. The problem is that stupid bot owners for whatever reason are unwilling to perform this export directly to the English Wikipedia (possibly because many of them just do not care about the English Wikipedia) and they perform this import of data - reliably sourced - to Wikidata. Try to convince them to come here, or perform your own bot import, may be the problem is solved then.--Ymblanter (talk) 08:00, 16 May 2018 (UTC)
- I don't need to convince them, our editors are more than capable. And I see scenarios where it is possible to use WikiData, but that needs a change in how WikiData operates, it needs to become significantly stricter in sourcing ánd protection so that the reliability (not correctness) is better than Wikipedia.
- Indeed, no data is sourced to WikiData, because then you would show that it is wrong - but if the infobox would tell you where the data was from, it is from WikiData, and that is the source of the information ... there should be a citation that the data is from WikiData, or, since WikiData is not a reliable source, the data should not be there in the first place. --Dirk Beetstra T C 09:06, 16 May 2018 (UTC)
- We are going in circles. By the same logic, template transclusion at Wikipedia should not exist.--Ymblanter (talk) 09:15, 16 May 2018 (UTC)
- No, because templates can be protected and/or templates display information that is in the article. Moreover, if you transclude a template, you know what you transclude, unless the template is altered (of which many are protected) the display does not change. Here you transclude a template, and when you add data on WikiData that has a reference, the page changes. Yes, we are unreliable, that we already know. But WikiData is less reliable at the moment and not in the article - not even on this wiki.
- The point is, we have reliable data from a non-MediaWiki source, we include the information and add a reference to the reliable data .. done. But if you import that data into WikiData, you have it stored on an unreliable source (even if correct), and then Wikipedia transcludes it from the unreliable (even if correct) source (WikiData). By your logic, when Wikipedia is putting a statement with a reference, Wikipedia becomes a reliable source. And that is not so. The very fact that anyone can change the fact, change the reference, or change both makes it, even when correct, still unreliable.
- But I agree, we are going in circles. --Dirk Beetstra T C 10:46, 16 May 2018 (UTC)
- @Beetstra: Sorry, but your behaviour is totally unfair: "it needs to become significantly stricter in sourcing and protection so that the reliability (not correctness) is better than Wikipedia." You can't require that WD has a higher reliability than WP; you can only require that WD meets the same requirements as WP, and hope that WD can provide better reliability.
- Then always the same argumentation, which takes no account of reality:
- "no data is sourced to WikiData": wrong, some data are sourced, in the same way as on WP, and it is possible to filter data according to their source
- "the infobox would tell you where the data was from, it is from WikiData": wrong, you can display the source data beside the value in WP. And if you don't trust WD as a repository for source data, how can you continue to work on WP, where the same reality exists: everyone can modify WP in the same way as WD. And we have the same way to monitor modifications as on WP, using the watchlist. Snipre (talk) 11:18, 16 May 2018 (UTC)
- Totally unfair? That is what we see with commons, they are stricter on fair use than we are. And I possibly could live with a similar reliability .. but we are far away from that.
- Again the same red herring, 'it is possible to filter data according to their source' - yes, but a) you don't know if the data changed under the source, b) whether the data is actually properly supported by the source, or whether the source is a source that we would use (for that data). Then there is data that does not need a source in some cases, and does in others. "sky:color=blue".
- No, the data is from WikiData, it is sourced from WikiData, and WikiData sourced it somewhere else - and I can get that data as well. I agree that it helps that we can now see more of the changes, but we don't know whether the data that is transcluded was right in the first place. Guys, WikiData is an unreliable source, it is an open wiki. --Dirk Beetstra T C 12:03, 16 May 2018 (UTC)
- We seem to be re-arguing everything already stated with the !votes in the threads above. No need to repeat ourselves. The closers will read all our comments and make a determination. Let’s leave them to it, shall we? Blueboar (talk) 12:29, 16 May 2018 (UTC)
- Indeed. I'm a bit disturbed by all these "1A is definitely losing" attempts, above, to cloud an incoming closer's decision-making and assessment process, especially since the statistical technique, if you can even call it that, is childishly faulty, comparing 1A against all other options combined. The air is thick with desperate bullshit in here, so I feel impelled to counter that manipulation by blowing some holes in it. 1A is clearly the single leading outcome, by a very wide margin. As of this writing, and to just take question 1 (since if it goes in the direction of 1A, it makes the other questions moot), I see a 41 : 3.5 : 5.5 : 13.5 : 3 : 17.5 ratio (in A through F order, counting things like "1D or 1F" as 0.5 each). Even if you do add it up "all against 1A" (which is patently manipulative, and if turned around each in turn into "all against 1B", etc., looks very bad for every option except 1A) it's 41 : 42, which in any sane assessment will come out as a consensus for 1A – a single solidly supported outcome with consistent rationales, versus a bunch of random stuff with incompatible rationales – or as "no consensus" if the closer is spineless.
However, there are more important considerations here than the raw numbers (what with this being not a vote, which is part of why I changed the heading that implied that it was one). First off, 1F is irrational. It reads, "Every infobox that is technically ready to convert from local content to Wikidata content may be implemented in mainspace", but the entire nature of the dispute is that many in the community do not believe any infobox is "ready" to be converted to WD, because WP's core content policies do not apply on WD, so we have no en.WP content control over what appears on en.WP via WD. That is, 1F is a textbook example of the fallacy of begging the question. Second, 1B has no bearing whatsoever on 1A versus any other option(s), because it's a sandbox. Indeed, 1A could proceed as the consensus, and 1B could happen anyway, because we can do anything reasonable in a sandbox that isn't presented as an actual article (otherwise the entire "Draft:" namespace and much of userspace and the project namespace would be deleted immediately). 1C is essentially identical in actual meaning to 1A, other than it concedes that a future RfC might decide otherwise on a case-by-case basis. Given the overwhelming solidarity behind 1A, an RfC producing a 1C exception outcome for any given template is unlikely. Though 1A doesn't state that a future RfC might overturn it or make limited exception to it, we all know it, and it's implicit in 1A's meaning, because no RfC or other decision here is Forever, and every consensus can change. Lastly, 1D and 1E are basically more rational takes on the vague sentiment behind 1F. If you're going to do math, add 1A and 1C versus 1D, 1E, and 1F, and ignore 1B as noise. The 1A + 1C outcome is clearly strongly dominant. This is another case of "too soon", as so many of us have been saying about WD for several years now.
Rather than fix the core (namely core content policy) problem, by rejiggering WD to support tagging of data with different sourcing standards, so that data can be extracted on sourcing-filtered basis, WD's fans keep re-re-re-RfCing this stuff in hope of WP:WINNING by exhausting the opposition (another pair of classic fallacies). It's patent forum shopping, and it needs to stop. — SMcCandlish ☏ ¢ 😼 15:58, 18 May 2018 (UTC)
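The grouping suggested above (1A plus 1C against 1D, 1E and 1F, with 1B set aside) can be checked with a short Python sketch. It uses only the six tallies quoted in the comment as of that writing; the variable and function names are illustrative. Note that, with these six figures, the options other than 1A sum to 43, so the quoted "41 : 42" presumably reflects a slightly different count.

```python
# Tallies for question 1 as quoted in the comment above (options A to F).
tallies = {"1A": 41, "1B": 3.5, "1C": 5.5, "1D": 13.5, "1E": 3, "1F": 17.5}


def total(options):
    """Sum the quoted tallies for the given option labels."""
    return sum(tallies[o] for o in options)


# "All against 1A" comparison criticised in the comment.
all_but_1a = total(o for o in tallies if o != "1A")

# Grouping proposed in the comment: 1A + 1C versus 1D + 1E + 1F.
restrictive = total(["1A", "1C"])
permissive = total(["1D", "1E", "1F"])  # 1B left out as a sandbox option

print(f"1A vs all others: {tallies['1A']} : {all_but_1a}")
print(f"1A+1C vs 1D+1E+1F: {restrictive} : {permissive}")
```

This is only arithmetic on the quoted numbers; it says nothing about how a closer should weigh the rationales.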
- @SMcCandlish: While I'm not sure about the discussion in this section, you can hardly claim that the RfC is on the whole forum shopping. There hasn't been an organized RfC on this in ages, and this RfC has been directly requested by the Arbitration Committee in order to resolve the content disagreement within the community. I'm not sure about the conduct in the section, but the RfC itself can hardly be called forum shopping. Tamwin (talk) 19:13, 23 May 2018 (UTC)
- The WD fans pushing, pushing, re-pushing, the same basic "let us replace WP content with WD content, never mind your concerns" thing, in essentially the same terms over and over again, without the concerns being addressed (yet the concerns remaining the same and being widespread among editors, i.e. representing a consensus that they're legitimate issues that need to be resolved first), well, yes, it is forum shopping. The sad thing is, a lot of us see the potential of WD, but the WD crowd don't want to accede to anything, they just want it their way, and now. But it's not ready for primetime, not when you're talking about directly integrating WD content into en.WP and other sites with their own verifiability and other policies. — SMcCandlish ☏ ¢ 😼 22:17, 23 May 2018 (UTC)
" ... without the concerns being addressed ..."
That's a lie. Every legitimate concern raised with me about the code I write to import Wikidata into infoboxes has been addressed, and resolved. You should be ashamed of yourself for spreading such patently untrue propaganda. --RexxS (talk) 11:00, 24 May 2018 (UTC)
"The sad thing is, a lot of us see the potential of WD, but the WD crowd don't want to accede to anything, they just want it their way, and now." I don't think this is true at all. This to me mostly seems like a "we will take the problems in the short term, to solve them in the long term" vs. a "we don't want any problems in the short term, call us when it works" type of debate. And neither is acceding on the opposing viewpoint. It's actually not so different from inclusionists vs deletionists debates we have had in that regard. —TheDJ (talk • contribs) 11:39, 25 May 2018 (UTC)
Posted closure request on AN Galobtter (pingó mió) 13:58, 14 May 2018 (UTC)
- How is it that I found this page only now, because today I happened to casually take a look at an infobox talk page? --Robertiki (talk) 15:30, 27 June 2018 (UTC)
- Robertiki the RFC was unusually well advertised, including but not limited to announcement on Centralized Discussion.[18] There will always be people on both sides of an issue who don't notice an appropriately advertised RFC until it's over. If you want to more actively participate in RFCs in general, you can always join Feedback request service. You could also watchlist Village Pump (proposals) and/or {{Centralized discussion}}. Alsee (talk) 16:19, 27 June 2018 (UTC)
Delay makes checking Wikidata changes from enwiki impossible
Above, there are some discussions about using the watchlist or recent changes on enwiki to keep an eye on Wikidata changes affecting enwiki articles. Which sounds all fine and useful, until one realises that once again, the delay between a change on Wikidata, and its appearance on our watchlist or on recent changes, has now become more than 15 hours (you can check the current situation on this page). Which means that no Wikidata changes appear in recent changes, and in your watchlist they only appear way down your list, long after you have checked the other (enwiki) changes made at that time.
This is an unworkable situation and sets the door wide open for vandalism (or good faith mistakes) which can't be spotted by regular vandalism watchers or most watchlist users on enwiki. This isn't the first time this happened, I have dropped notes about such delays and longer ones regularly in 2017 as well. Fram (talk) 10:59, 24 May 2018 (UTC)
- That's unusual. Pinging @Lydia Pintscher (WMDE):. Mike Peel (talk) 11:42, 24 May 2018 (UTC)
- Like I said, not really unusual at all. There are better periods (although even then the delay is often too much to be useful for recent changes, but not for the watchlist), but these bad times happen too often as well. Fram (talk) 11:49, 24 May 2018 (UTC)
- It's unusual enough that you felt it necessary to mention it here and now. From my checking of it in the past, it's normally minutes, not hours. Mike Peel (talk) 13:38, 24 May 2018 (UTC)
- Thanks a lot for the ping. That is definitely not normal as you can see on https://grafana.wikimedia.org/dashboard/db/wikidata-dispatch?refresh=1m&orgId=1. Something broke and my team is looking into it right now. --Lydia Pintscher (WMDE) (talk) 15:21, 24 May 2018 (UTC)
- Ok we found the issue. It was a case of trying to improve something and breaking it instead. Fixed now. The lag is still high but going down now. Sorry again for the issue. --Lydia Pintscher (WMDE) (talk) 18:25, 24 May 2018 (UTC)
- I mentioned it "here and now" because multiple edits on this page right before my post were discussing the use of the watchlist and recent changes to watch relevant Wikidata changes. As this discussion was already spread over different sections, it seemed easier to add a new section especially for this. Fram (talk) 04:30, 25 May 2018 (UTC)
Low visibility of post-facto edits on WikiData
- At 07:07, 30 May 2018 the article Diamela del Pozo was created. It has {{Twitter}} and {{Facebook}} on it.
- The WikiData item at that time is NOT linked to the new Wikipedia article. See box 'Wikipedia' in top-right, only showing an 'es' entry in this revid of 04:47, 13 February 2018 on WikiData.
- That results in the article NOT transcluding the Twitter and Facebook, but displaying "{{Twitter}} template missing ID and not present in Wikidata." and "{{Facebook}} template missing ID and not present in Wikidata."
- At 07:08 the item is linked to the en.wikipedia article. That action shows as "07:08 Q21127480 (diff | hist) . . Nick Number (talk | contribs) (A Wikidata item has been linked to this page.)"
- That not only results in the appearance of the interwiki in es in our article, but also in the transclusion of the Twitter and Facebook (they now display as "Wikidata/2018 Infobox RfC on Twitter" and "Wikidata/2018 Infobox RfC on Facebook"). (The watchlist message from WikiData does not say that it actually resulted in the transclusion of the Facebook and Twitter.)
This shows that data appears in articles after an edit is performed, and that an editor who edits here does not know what data will be transcluded in the future. This is a new BLP, and in the minutes after creation (and, for a relatively unknown person, probably in the hours/days after creation; about 2 hours later the number of watchers is still '2', me now being one of them to be able to test the above, the other I assume being the page creator) editors could introduce all sorts of data into BLPs that are not noticed by anyone on en.wikipedia but the page creator (and only if they both watchlist their new article and have WikiData changes turned on). With editors regularly being away for several hours, and only 2 page-watchers, edits on WikiData are easily missed. --Dirk Beetstra T C 06:23, 30 May 2018 (UTC)
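The sequence described above can be sketched in Python as a simplified model. This is not the real template code: the function `render_link`, the dictionary layout and the placeholder handle are all assumptions made for illustration. Only the item ID Q21127480, the error text and the property P2002 (Twitter username on Wikidata) are taken from the thread or from Wikidata itself.

```python
# Simplified, hypothetical model of a Wikidata-aware link template.
# Not the real template code; the data structures are assumptions.

def render_link(site, prop, page, sitelinks, items, local_id=None):
    """Return the text the template would display on `page`."""
    if local_id:                       # ID typed into the article itself
        return f"{page} on {site} [{local_id}]"
    qid = sitelinks.get(page)          # item linked to this enwiki page, if any
    value = items.get(qid, {}).get(prop)
    if value is None:
        return "{{" + site + "}} template missing ID and not present in Wikidata."
    return f"{page} on {site} [{value}]"


# The item already exists on Wikidata but has no enwiki sitelink yet.
items = {"Q21127480": {"P2002": "example_handle"}}  # placeholder value
page = "Diamela del Pozo"

before = render_link("Twitter", "P2002", page, sitelinks={}, items=items)

# The item is linked on Wikidata; the article wikitext is untouched.
after = render_link("Twitter", "P2002", page,
                    sitelinks={page: "Q21127480"}, items=items)

print(before)
print(after)
```

In this model the two calls differ only in the `sitelinks` argument, which stands for state held on Wikidata. The displayed text changes although nothing in the article was edited, which is the effect the example is meant to show.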
- I'm not sure what the problem is that you're trying to point out here? The same editor who created the enwp article added the sitelink on Wikidata (@Nick Number:), so it would be reasonable to assume that the Wikidata-enabled templates were added with the intention of showing the info from Wikidata after they added the sitelink. If you're watching the article, then you'll see "A Wikidata item has been linked to this page." as a line in your watchlist. If you're doing new page patrol then you either see the error messages or the post-sitelinked version of the article. Do you somehow want the Wikidata sitelink to be added before the article is created? Mike Peel (talk) 11:27, 30 May 2018 (UTC)
- No, this is to show that data is/can be added after a transcluding template is being placed (which is obvious). On low-watched articles (as this article is), that means that only very few editors will see material being added to such an article. There are above continuing arguments that people who add a WikiData-transcluding template to an en.wikipedia article are responsible for checking whether the material that is transcluded at that time is correct/verified/appropriate - but if that material is added later (as this example technically shows - there are no further issues in this example) there may be no-one to check whether the material is appropriate for en.wikipedia. --Dirk Beetstra T C 12:31, 30 May 2018 (UTC)
- So...I'll just keep doing what I'm doing then. If I'm understanding correctly, there's no great harm in having broken templates on the article for the <1 minute between creating it and linking it. Nick Number (talk) 15:45, 30 May 2018 (UTC)
- No Nick Number, what you did was very correct, you just gave me an example of something that I think could give problems elsewhere. I’ll wait/search for such an example, but I guess people get the idea. —Dirk Beetstra T C 16:24, 30 May 2018 (UTC)
- But surely if an editor adds a template to a page with only two watchers and somebody later alters how that template renders in the article, very few editors will see that change. That's true whether the template is connected to Wikidata or not. Why are you touting this as a problem for Wikidata-infoboxes, but not for any other type of template? Are you going to be asking editors who place any sort of template to be responsible for any future changes to the template? As usual, this is a general issue that is being misrepresented as something peculiar to Wikidata-infoboxes. I will assume that anyone observing this thread will get what the actual situation is. --RexxS (talk) 20:38, 30 May 2018 (UTC)
- I agree that this is a problem with using templates in general (and why I don’t think we should use them) ... however, I also think that the problem is harder to spot and fix when WD is involved. Blueboar (talk) 23:44, 30 May 2018 (UTC)
- a) templates transcluded on pages generally have way more watchers, b) edits to the new BLP here are to the page here, and indeed will not have many watchers. BLP violations through edits on WD are just way harder to spot, WD does not have the vandal fighting capabilities that en.wikipedia has. —Dirk Beetstra T C 03:15, 31 May 2018 (UTC)
- The point that I am trying to make here is though: there are arguments being made basically saying: "if I add an infobox to en.wikipedia, then it is my responsibility to check whether all the values that are being transcluded are properly referenced, factual, correct, etc. etc.". The point is, that I can add an infobox with transcluded information that is thoroughly checked by me, but that someone else can later, on WikiData, add data that results in it being transcluded without having been checked by me. The editor who added the infobox does not have responsibility for what is being added later on WikiData, and the editor who adds data to WikiData has no responsibility for what is transcluded on en.wikipedia. I know that the situation is the same as with changing a template on en.wikipedia that results in changing articles, but that editor has an en.wikipedia responsibility, and has to follow en.wikipedia policies (or at least, en.wikipedia standards), whether he edits the article or the template. I let others extrapolate what that means in areas where there are legal policies applicable (BLPs), or where we have sanctions applied (arbitration enforcement). --Dirk Beetstra T C 07:39, 31 May 2018 (UTC)
Most widely used templates on enwiki are restricted to very few users (admins plus template editors), meaning that the chance that someone changes e.g. something in "template:infobox biography" as vandalism is very, very low (accounts get compromised, and editors do stupid stuff, but these are exceptions). On the other hand, if you have a template in your article which is protected here, but fetches data from Wikidata, then the values displayed here can be vandalized much more easily (see the examples of changes to the labels of countries). Basically, on enwiki you can change template data locally (directly in the article), which will show up in the page history, watchlist and recent changes immediately. For most templates, you (where "you" is a random vandal) cannot change the template directly though. But with a Wikidata-filled template, random vandals again can easily vandalize articles, and it doesn't show up in the page history, it often doesn't show up in recent changes (due to the lag, e.g. at the moment WD changes take more than 6 minutes to appear here), so at best it shows up in the watchlist. And after it has eventually been reverted on Wikidata, it may still linger in enwiki for days. Fram (talk) 08:46, 31 May 2018 (UTC)
- Stronger than that, we even protect single use, fully prefilled templates (infoboxes) so that the data they display does not get vandalized. With implementation of WikiData on that, we would lose that capability. --Dirk Beetstra T C 10:33, 31 May 2018 (UTC)
Post RFC discussion
First, my thanks to the closers. Now... the next question is how to implement what they say. If the consensus is that Wikidata can be used ... but only as long as Wikipedians can be assured that Wikidata’s info is reliable... how do we achieve that assurance? I would say the most obvious way would be to require an in-line citation to a reliable source along with any info that is populated via Wikidata. It would be up to the developers at Wikidata to figure out how to make that happen. Blueboar (talk) 22:29, 13 June 2018 (UTC)
- @Blueboar: Most of the Wikidata templates already have the code built in to show the in-line references (set refs=yes), it's easy enough to change the default for that back to showing them if we want that. Thanks. Mike Peel (talk) 22:41, 13 June 2018 (UTC)
- @Blueboar: What have the developers at Wikidata got to do with how we import data into Wikipedia? The code to do that job resides here, under our control, not on another project. Why should we have to have an inline citation for "sky is blue"-type statements or for images that are imported from Wikidata? We already have the guarantee that each piece of information likely to be challenged has a reference, and that it can be verified by following the link to that reference. What more must be required to meet Wikipedia's policies regarding factual information? I understand that data subject to MEDRS will need stronger sourcing, but the means of verification, even of that, can rely on the same mechanism. --RexxS (talk) 23:42, 13 June 2018 (UTC)
- Well, according to the closers, the consensus of the community is: “what WP does now is not enough” ... so my question is essentially asking: “OK... then what more needs to be done? What do we here at WP want WD to do so we can be assured of reliability?” Blueboar (talk) 02:20, 14 June 2018 (UTC)
- I would say there are two main problems: 1) unsourced and badly sourced data and 2) vandalism on Wikidata. For 1) we need to require that all data we pull out of Wikidata is sourced reliably for our standards, as discussed above. 2) is at this point more difficult to address. I would say we need (possibly with the help of Wikiprojects) to identify the items which get vandalized too often and are not safe at this point to show here - for example, country names (or, possibly, the other way round - which items are relatively safe to show, or, conversely, where the benefit of having them on Wikidata outweighs the risks related to vandalism), and also we need some technical discussion about the Google caching issues identified during this RfC.--Ymblanter (talk) 05:37, 14 June 2018 (UTC)
- I am afraid that this is an unclarity in the closing statement (or in the RfC). I guess a follow up question is, whether the community thinks that the material sourced from wikidata is sufficiently sourced on wikidata (when it carries sources, or when it does not need to be sourced). —Dirk Beetstra T C 06:01, 14 June 2018 (UTC)
- @Ymblanter: I've taken two significant measures with Module:WikidataIB following the close of this RfC: (1) because the Wikidata label has shown itself vulnerable to casual vandalism, where a Wikidata item has a sitelink, that is now used instead of the label (thanks to Was a bee for suggesting that); (2) the code now only fetches the Wikidata statement required, rather than the whole entry, which should help eliminate a lot of the irrelevant changes reported in watchlists. The aim is to eventually make it easier for more editors here to use their watchlists to spot vandalism on Wikidata. If anyone has any suggestions for further improvements in how we import from Wikidata, please express them. --RexxS (talk) 10:34, 14 June 2018 (UTC)
- Thanks, it definitely is a good direction.--Ymblanter (talk) 10:38, 14 June 2018 (UTC)
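The first measure RexxS describes above (use the sitelink where one exists, fall back to the label otherwise) can be illustrated with a minimal sketch. This is not the Lua code of Module:WikidataIB; the function name `display_name` is hypothetical, and the dictionary layout is only assumed to resemble the `labels` and `sitelinks` sections of a Wikidata entity.

```python
from typing import Optional


def display_name(item: dict, site: str = "enwiki", lang: str = "en") -> Optional[str]:
    """Pick the text to show for a linked item (hypothetical helper).

    Assumption: `item` is a simplified dict shaped like
    {"labels": {"en": {"value": ...}}, "sitelinks": {"enwiki": {"title": ...}}}.
    """
    # Prefer the local article title: changing it requires a page move,
    # which is harder to do casually than editing a label.
    sitelink = item.get("sitelinks", {}).get(site)
    if sitelink:
        return sitelink["title"]
    # No local article, so fall back to the label in the requested language.
    label = item.get("labels", {}).get(lang)
    if label:
        return label["value"]
    # Nothing usable: let the caller decide what to display.
    return None
```

With this ordering, a vandalised label only reaches the reader when the item has no article on the local wiki.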
There appears to be a problem with the overall summary. In the fourth paragraph, after finding no consensus on other issues, the closers find
Despite this clear bifurcation, there is a consensus on one point: if Wikipedia wants to use data from Wikidata, there needs to be clear assurances on the reliability of this data.
In the final summary, the first of these conditions is dropped:
There is a consensus that data drawn for Wikidata might be acceptable for use in Wikipedia if Wikipedians can be assured that the data is accurate, and preferably meets Wikipedia rules of reliability.
That seems to have a quite different connotation. Kanguole 10:59, 14 June 2018 (UTC)
- How so? They seem to have the same connotations to me. Blueboar (talk) 11:58, 14 June 2018 (UTC)
- No, the problem is that with question 3, there was no "no data, even sourced, is acceptable" option, so everyone(?) who answered 1A and 2A also answered 3A, not meaning "data drawn for Wikidata might be acceptable for use in Wikipedia if Wikipedians can be assured that the data is accurate, and preferably meets Wikipedia rules of reliability." but "we need the most strict rules about using data from Wikidata, and preferably don't use it at all". Basically, some 40 voters who supported "Wikidata is never acceptable" have now been counted with the "data might be acceptable" consensus simply because no stronger option existed for Q3.
- This doesn't mean that it perhaps for now isn't the best possible consensus position, but it is too simplistic to look at 3A in isolation and say that most people supported it. Fram (talk) 12:05, 14 June 2018 (UTC)
- That is reflected in the fourth paragraph, where there is the qualification "if Wikipedia wants to use data from Wikidata". However this was omitted from the overall summary, which says that there is a consensus that using data drawn from Wikidata might be acceptable under some conditions. Such a consensus is not supported by the main body of the closure (or the RFC). Kanguole 12:13, 14 June 2018 (UTC)
- I think the point is that if it "might" be acceptable under some conditions, it is certainly unacceptable without them. In other words, the consensus is unclear if we include the conditions, but clearly against the use otherwise. No inconsistency arises from that reading. Tamwin (talk) 18:33, 14 June 2018 (UTC)
- so what the naysayers should do now is spell out what those conditions actually are. Give the good folks at Wikidata some goals to achieve. Blueboar (talk) 20:01, 14 June 2018 (UTC)
- Blueboar most wikidata-proponents have already been willing to do virtually anything&everything technically possible to "improve" how wikidata is used. The problem is that all of those efforts don't help. Those "improvements" do exactly zero to address the fundamental objections of most critics. The fundamental problem is the automatic-import of content. The only improvement that actually resolves those concerns is one that shuts off automatic-import. That is effectively equal to 1A. That leaves us at an impasse. We can't resolve the situation by proponents offering to work harder at what they're already doing. We need X% of people "in the middle" to either get on board with the side saying what we've already got is good enough, or we need X% of people "in the middle" to get on board with the side saying that automatic-import creates issues that are literally impossible to fix. Alsee (talk) 23:33, 14 June 2018 (UTC)
- Items appear to be populated from wikidata with no indication in the article that they are being brought in from an external source surely each and every item should be indicated as such. Keith D (talk) 20:16, 14 June 2018 (UTC)
- @Keith D: By default, items using the WikidataIB module have an "edit-icon" (a pen) at the end, which provides a link to the relevant Wikidata statement, like this example for Douglas Adams's spouse:
{{wdib |P26 |qid=Q42 |fwd=ALL}}
→ Jane Belson. The link gives a quick way to check the reference or correct an error. It also provides the indication you wanted that the information is from Wikidata by using a tooltip visible if you hover over it. Does that help? --RexxS (talk) 23:22, 14 June 2018 (UTC)
- There are reasonable arguments for any wikidata-items to be tagged with an icon. However my impression is that the prevailing view is that cluttering infoboxes with a hundred million wikidata icons is unacceptably gross. It's useless and visually disruptive for the 99.9 % merely trying to read the article. Alsee (talk) 23:53, 14 June 2018 (UTC)
- The item is not shown with an "edit-icon"; an IP editor blanked out a field in Wikipedia then complained that the item still appeared. I had to look into the code for the infobox to find that the item was being dragged in from wikidata. This is no way to be allowing wikidata to appear without showing it is from wikidata. Keith D (talk) 00:08, 15 June 2018 (UTC)
- Keith D yeah... I've raised the same concern myself. I basically see four options on that. Not use wikidata, accept that we're going to confuse the hell out of editors with this sort of mysterious magical value appearing, crap-up infoboxes with wikidata-edit icons on each field, or 2C-Opt-in where we crap-up the infobox source with parameters like |Year=wikidata so users can figure out why the value magically appears. Alsee (talk) 00:22, 15 June 2018 (UTC)
- (edit conflict × 2) @Alsee: And yet, for those infoboxes that have rather less than a hundred million icons, we find that 99.999% of readers never complain about them at all. My impression is that the prevailing view is they perform a useful function (as Keith D's request indicates). Of course, the icons can be turned off with a single parameter on a per-article basis, or you can even have a single "edit at Wikidata" link at the bottom of the infobox if that is what tickles your aesthetic sensibilities.
- @Keith D: Of course it's an "edit icon". The pen is used ubiquitously to indicate editing, especially in the Wikimedia interfaces. Show me a link to where "an IP editor blanked out a field in Wikipedia then complained that the item still appeared". If you mean that an IP editor removed a locally-supplied value from an infobox that was enabled for fetching Wikidata, then your best course is to ask them why they wanted to remove the locally-supplied value because another editor had obviously made the effort to place that value there. If the value that then appeared was fetched from Wikidata, it would by default show the edit icon indicating that the value came from Wikidata as I've explained to you. --RexxS (talk) 00:25, 15 June 2018 (UTC)
- I am saying there was no edit icon shown on the item. The IP was removing a locally supplied value for a dead URL as the company is now defunct and so the URL value should not be displayed. Keith D (talk) 00:32, 15 June 2018 (UTC)
- The other option is to spot the parameter at the top of the infobox that says |fetchwikidata=ALL. It should be a bit of a give-away that the information can be fetched from Wikidata. --RexxS (talk) 00:27, 15 June 2018 (UTC)
- RexxS the problem is that you're applying the perspective of a wikidata-enthusiast. Despite the oft-repeated claims that "wikitext is confusing for new users", wikitext is generally not too difficult for a half-way-competent new user to figure out what's going on.... or at least to be able to identify the part that they don't understand. For example they see a value in an infobox-wikitext, and they see the same or similar content in the rendered infobox. They add, alter, or remove a value from infobox-source, and it's added, altered, or removed from the rendered infobox. A new user has literally never heard of wikidata, and a lot of not-new editors know exactly zero about wikidata. Here we have an editor who gets the idea of how infoboxes work, they delete a value from the source, and WTF an utterly mysterious magical value is still appearing in the infobox! It is completely unreasonable to expect that this user is supposed to understand how that value is appearing in the infobox. It is unreasonable to expect that user to figure out what they need to do to get rid of that value. (Either they would have to delete it at wikidata, or they have to find the exact magical-incantation parameter-string that tells the infobox not to fetch that specific value. You know that magic incantation parameter string, but you can't expect them to know it, you can't expect them to find it.) Alsee (talk) 01:21, 15 June 2018 (UTC)
- P.S. The issue can be addressed by something like |year=wikidata to actively request that the value be retrieved. The user can reasonably guess/try deleting "wikidata" and the value disappears. They then understand "wikidata" somehow provides a value, and if they want to understand more they know exactly what they need to learn. Alsee (talk) 01:33, 15 June 2018 (UTC)
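The opt-in behaviour Alsee proposes above can be sketched as a small resolution rule. This is only an illustration of the idea, not any existing infobox code: the function name `resolve_field` and the constant `SENTINEL` are hypothetical, and the rule assumes that a blank parameter means "show nothing" rather than "fetch from Wikidata".

```python
from typing import Optional

SENTINEL = "wikidata"  # hypothetical keyword an editor would type, e.g. |year=wikidata


def resolve_field(local_value: Optional[str], wikidata_value: Optional[str]) -> Optional[str]:
    """Decide what an infobox field displays under an opt-in rule (sketch)."""
    # Blank or missing parameter: display nothing, so deleting the value
    # in the wikitext really does remove it from the rendered infobox.
    if local_value is None or local_value.strip() == "":
        return None
    # The editor explicitly asked for the Wikidata value.
    if local_value.strip().lower() == SENTINEL:
        return wikidata_value
    # Any other text is a locally supplied value and is shown as written.
    return local_value
```

Under this rule the source of every displayed value is visible in the article's own wikitext, at the cost of one extra parameter per fetched field.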
Some followup discussion
See also Wikipedia:Village pump (policy)#Wikipedia:Wikidata/2018 Infobox RfC: closed (will eventually be in archive page 143 or 144, probably). — SMcCandlish ☏ ¢ 😼 01:39, 15 June 2018 (UTC)
- I am confused. Where is the formal Closure text of this RfC? Is it the top block of #Discussion?
- I found in archive VPP#143:
- - DePiep (talk) 17:06, 12 July 2018 (UTC)
- Yes, this is the top block of #Discussion. --Ymblanter (talk) 17:24, 12 July 2018 (UTC)
Looking for a Summary
llywrch, Swarm, Fish and karate or anyone, can the conclusion be summarized? I'm starting at #Discussion, I can't make heads or tails of this massive wall of text. As soon as I try, I find it seems to me it's being misrepresented. [edit: After posting that, I found the summary of the summary, and made it more prominent at the top of this page.] --RudolfoMD (talk) 11:30, 26 November 2023 (UTC)