Talk:List of languages by number of native speakers/Archive 10


Semi-protected edit request on 15 February 2014

"Partially mutually intelligible with Ukrainian[2] and Belarusian[2]" I find this comment racially offensive. The Ukrainian language dates back to the 8th century AD, when Russia did not even exist. The source quotes Soviet propaganda literature, which tried to create a single "united nation" from Ukrainians and Russians. The Ukrainian language was forbidden in schools and even in conversation on the streets. You could be sent to a concentration camp for using it. So most Ukrainians were forced to speak Russian, which they did with a Ukrainian accent, and were ridiculed for it. This racist "interchangeable" history is contradicted by the fact that Ukrainians of the older generation understand Russian, since they were forced to learn the language in school, yet Russians don't understand Ukrainian. I find this comment racially offensive.

Ahereuser (talk) 18:03, 15 February 2014 (UTC)

Not done. It's not clear what change you propose. The sources for the claim you don't like satisfy WP:RS. Kindly note that this is not a forum (WP:FORUM). Jeppiz (talk) 19:36, 15 February 2014 (UTC)

Catalan

The Catalan language is missing from the list. — Preceding unsigned comment added by 188.119.216.112 (talk) 18:38, 30 December 2013 (UTC)

Catalan/Spanish are as mutually intelligible as French/Italian. The article doesn't include all languages. — Preceding unsigned comment added by 145.36.235.4 (talk) 14:14, 20 January 2014 (UTC)

Catalan is very close to Spanish, being in fact an Ibero-Romance language. As a Spanish speaker with no formal instruction in Catalan, I can read it with no problems and understand its spoken form pretty well. By the way, Italian and Spanish are also fairly similar and communication is quite possible between them, which is not the case for French with either Spanish or Italian. Of the large languages, Portuguese, Spanish and Italian are mutually intelligible to different degrees. Pipo. — Preceding unsigned comment added by 76.26.48.77 (talk) 16:06, 25 January 2014 (UTC)

The former comment about Catalan can only come from a Spanish speaker and is motivated by political reasons, not objective ones, as there's a bitter controversy about this subject in Spain. Catalan is a recognised language and is official in many places. It is, for example, the sole official language of Andorra. It is also listed 5 times in the CIA World Factbook (https://www.cia.gov/library/publications/the-world-factbook/fields/2098.html). Catalan is as close to Spanish as any other language of Latin origin, and if we apply this rule we have to put all those languages under one common 'Latin' denomination. Figures for Catalan speakers vary from 7 to 12 million if one includes its various dialects (there's also a bitter argument about this in Spain, and some deny that the language spoken in Valencia or the Balearic Islands is the same as Catalan). — Preceding unsigned comment added by 88.105.83.153 (talk) 20:29, 26 January 2014 (UTC)

Catalan was removed from the list in April 2013 for no discernible reason. It's generally considered a language of its own. I've restored it. --Florian Blaschke (talk) 00:22, 4 March 2014 (UTC)

Merging two sources together.

I find it a bit wasteful to have two sections, one for the Nationalencyklopedin and the other for the Ethnologue. Placing the estimates in separate columns under the same header would make the information less cluttered. Besides, the Ethnologue has more recent information. It should be up top rather than below. — Preceding unsigned comment added by AlexTeddy888 (talk • contribs) 16:13, 7 March 2014 (UTC)

Canada?

I believe Canada should be added to the English-speaking countries category. I don't understand why Australia and New Zealand are mentioned separately yet Canada, which is majority English-speaking and has more native speakers than either of those countries, is not. 75.156.109.116 (talk) 08:37, 8 March 2014 (UTC)

Italian language

Half of Italians are bilingual????

Yes, very many Italians speak both Standard Italian and regional languages. --Florian Blaschke (talk) 00:27, 4 March 2014 (UTC)
I'm sorry, but this is the most ridiculous thing I've ever read. I've been living in Italy for many years, and I can say with certainty that bilinguals in Italy are just a small minority (mainly foreign immigrants). If one includes dialects, of course almost everyone in the world is at least bilingual. The British, for instance, are bilingual because they can speak English and, say, Cockney, Brummie, etc. Americans are trilingual, because they speak English, American English and the various regional dialects. As for me, I speak English, American English, Texan, Californian, Italian, Roman, and Milanese, so I am heptalingual. I found out I'm a genius. — Preceding unsigned comment added by 79.32.177.207 (talk) 23:46, 8 March 2014 (UTC)

Semi-protected edit request on 14 March 2014

Gujarati is also spoken in Daman & Diu as well as Dadra and Nagar Haveli, and is the state language of both Union territories. It may please be added.[1] Nirav 06:35, 14 March 2014 (UTC)

  Done - Arjayay (talk) 09:09, 14 March 2014 (UTC)

Semi-protected edit request on 18 March 2014

Tian Belawati (talk) 01:54, 18 March 2014 (UTC) please insert into the table: Bahasa Indonesia, which is spoken by at least 240 million Indonesian people. Thank you, Tian Belawati Tian Belawati (talk) 01:54, 18 March 2014 (UTC)

  Not done for now: Currently the Indonesian language (aka Bahasa Indonesia) is listed as Malay/Indonesian. I would suggest that you open a new discussion if you think it should be listed in another way. Check out the talk page archives first. Sam Sailor Sing 08:27, 18 March 2014 (UTC)

PORTUGUESE LANGUAGE

The Portuguese language is not mutually intelligible with Spanish. Yes, a very cultured Portuguese and a very cultured Spaniard may be able to hold a conversation and understand each other without any training. But we can't say that a Portuguese can understand a Mexican Spanish speaker or that a Brazilian will understand a Spaniard. I am going to erase that nonsense.

ZZTop — Preceding unsigned comment added by 79.169.211.119 (talk) 18:11, 20 March 2014 (UTC)

  • Your first line is almost the definition of mutually intelligible languages, if your "very cultivated people" are people with a good knowledge of their own language in the standard version. --Megustalastrufas (talk) 09:08, 22 March 2014 (UTC)

You are not right, my friend. I am Spanish and live on the Portuguese border. Both languages are extremely similar. In reading, most people have no problem understanding both languages without formal training. In speaking, the Portuguese understand Spanish better than vice versa because of the sound system. Pipo. — Preceding unsigned comment added by 76.26.48.77 (talk) 00:38, 27 March 2014 (UTC)

Semi-protected edit request on 28 March 2014

Guinea-Bissau is missing in the Portuguese speaking countries' list. This can be confirmed in the Community of Portuguese Language Countries: https://en.wikipedia.org/wiki/Community_of_Portuguese_Language_Countries Elmariachi pt (talk) 00:53, 28 March 2014 (UTC)

  Done. Not sure how exhaustive the list of countries is meant to be, but if Timor-Leste is not contested with its 600 native speakers of Portuguese, I don't see why this country would be. Cannolis (talk) 02:33, 28 March 2014 (UTC)

Semi-protected edit request on 28 March 2014

Please change the number of French native speakers from 74 million to 112 million. See http://fr.wikipedia.org/wiki/Distribution_du_français In France alone there are 65 million native French speakers. In the Congo DR there are more than 24 million, in Canada more than 7 million, in Belgium more than 4 million. Your figure is grossly inaccurate. 66.183.31.239 (talk) 03:39, 28 March 2014 (UTC)

  Not done - As clearly stated in the tables, the first table is taken from Nationalencyklopedin, the later tables from Ethnologue. This ensures that exactly the same parameters are used when collecting and collating the data.
Please note that the second table gives "128 million "native and real speakers" (includes 65 million French people)" under "other estimates" - which is rather more than the 112 million you suggest.
Furthermore your proposed figure is taken from Wikipedia - which is not a reliable source. Arjayay (talk) 09:04, 28 March 2014 (UTC)

Regarding use of flags

There is some debate over the addition of flag icons to countries in the "Mainly Spoken In" column of the Nationalencyklopedin (2007) section. It is said that the use of flags is only allowed for states with official languages, per the manual of style for flags. I contested, citing the lack of such a claim in the MOS and the fact that the column is for Mainly Spoken, not Official Languages. Thoughts? AlexTeddy888 (talk) 09:40, 28 March 2014 (UTC)

That's exactly the point. Flags for official countries are inappropriate in a list of mainly-spoken countries.
MOS:FLAG, under 'appropriate use', says,
Flag icons may be relevant in some subject areas, where the subject actually represents that country, government, or nationality – such as military units, government officials, or national sports teams.
Unofficial languages do not "represent the country". An official diplomat or national sports team is equivalent to an official or national language.
It continues,
In lists or tables, flag icons may be relevant when the nationality of different subjects is pertinent to the purpose of the list or table itself.
Languages do not have nationalities (only language standards do), and if they did, they wouldn't be pertinent in a table on the numbers of native speakers. A parallel is given for sports figures, since this is where most of the abuse has taken place. They say,
Flags should never indicate the player's nationality in a non-sporting sense; flags should only indicate the sportsperson's national squad/team or representative nationality
In the context of languages, "representative nationality" would mean an official or national language.
Compare our language info box, where we have a well-established consensus on this following MOS:FLAG: In the list of countries where the language is spoken, we do not use flags. We only use flags in the list of where the language is official. The template documentation instructs us,
Do not use flag icons except for national or official status.
kwami (talk) 21:10, 28 March 2014 (UTC)
Yeah, definitely a MOS:ICONS problem to add flags here, and introduces a massive WP:V/WP:NOR problem, since every single claim that x language is mostly spoken in y place is a fact that has to be reliably sourced. It doesn't help our readers in any way (and may give incorrect implications), while it adds a pointless huge burden on editors.  — SMcCandlish ¢ ⚞(Ʌⱷ҅̆⚲͜^)≼  07:46, 29 March 2014 (UTC)
You are inferring the wrong ideas. The use of the flag does not have to mean the subject is official - it is only when the flag is attached to the subject. For example,   England. That implies English is a language unique to England, or has a nationality, which is wrong. In this case, the flags are attached directly to the country, showing that for this language, the following countries have a large percentage of the population speaking it. As for the language infobox, that is because the field the flags are under is called "Official Language in...". This is not the case for the article. Of course, there is the issue of reliability, but this has to do with the countries, not the flag. If we attached flags to the countries, it would not insinuate any official connotation, but rather that the country is significant enough for there to be a flag, which it is since it's under the "Mainly Spoken In" field.
Nonetheless, I'll yield. Since you are so insistent on not having flags for the article, I assume your intentions are ultimately to enhance the clarity of the article, similar to mine. AlexTeddy888 (talk) 13:09, 29 March 2014 (UTC)

Intelligibility among Romance languages.

I have posted above an issue about Spanish and Italian. In fact, most Romance languages are mutually intelligible to different degrees. When spoken, the only exception among the major languages is probably French, but languages like Portuguese, Spanish, Italian and Romanian are intelligible, as I said, to different degrees. Here you have references and articles: http://robertlindsay.wordpress.com/2009/02/08/mutual-intelligibility-in-the-romance-languages/ http://www.sulaandjohn.com/files/users/e/535D6469E2612048E040A8C0AC002D4E/Mutual Comprehension.pdf

In short, there is ample evidence and sources here to include that these languages are partially mutually intelligible, not just Portuguese and Spanish. I am not changing the article myself because as a new user I have no access, but somebody should.

Then here you can find blogs with people and their experiences on this issue: http://www.antimoon.com/forum/t7448.htm http://www.antimoon.com/forum/t12042.htm http://linguaphiles.livejournal.com/4017710.html

And I could add scores of links. Pipo. — Preceding unsigned comment added by 76.26.48.77 (talk) 20:07, 28 March 2014

Please sign your posts with four tildes (~~~~). I don't see that this is particularly relevant here at all. The partial mutual intelligibility you're talking about is true of the languages within any family of closely related languages; it's not something special about Romance languages. Even if it were, it doesn't have anything to do with this article.  — SMcCandlish ¢ ⚞(Ʌⱷ҅̆⚲͜^)≼  08:02, 29 March 2014 (UTC)

Your comment is quite inconsistent with the article itself, where we can see comments about the intelligibility between various languages, and with a lot of other facts (English, for example, is not mutually intelligible with any other language). Have you really checked the article before making that comment? Pipo. — Preceding unsigned comment added by 76.26.48.77 (talk) 18:25, 29 March 2014 (UTC)

As you were told above, please sign your posts with four tildes (~~~~). Apart from the issue of relevance, there is also the issue of accuracy. Romance languages are not as mutually intelligible as you claim; Romanian is not understood by speakers of any other Romance language. Sure, we who speak Italian or French can understand the occasional word, but we most definitely do not understand spoken Romanian. Most of us don't understand European Portuguese either, even if we manage to read it. There's quite a high degree of mutual intelligibility between Italian and Spanish, but that's basically where it stops. French speakers are not understood by speakers of any other major Romance language, nor are Romanian speakers or speakers of European Portuguese or Caribbean Spanish. And even if we could understand each other, what would that have to do with this article? Jeppiz (talk) 00:22, 30 March 2014 (UTC)

1. I have provided sources. I think Wiki is about providing sources, right? 2. I basically agree with you: There is an important degree of intelligibility between Spanish and Italian. That is why I assert that this fact should be in the article. 3. Once again: have you read the article? There is a section in the first table that mentions the languages that are mutually intelligible or partially mutually intelligible. For example, in the case of Russian it reads: Partially intelligible with Belarusian and Ukrainian. I think in the case of Spanish it should read: Partially intelligible with Italian and Portuguese. Do you guys read the article before posting your comments here? 4. I will try to sign as you say. Pipo. 76.26.48.77 (talk) 02:48, 30 March 2014 (UTC)

Semi-protected edit request on 9 April 2014

The article says "Half of the world's population speak the 13 most spoken languages, the other half speak the rest." Could this not be shortened to "Half of the world's population speak the 13 most spoken languages" or even "The top 13 languages are spoken by half the world's population" ???···98.220.132.115 (talk) 21:25, 9 April 2014 (UTC)

  Not done: it's not clear what changes you want made. Please mention the specific changes in a "change X to Y" format. Why? — {{U|Technical 13}} (tec) 01:25, 10 April 2014 (UTC)

Semi-protected edit request on 9 April 2014

The Arabic language is spoken by 420 million native speakers (as of 2013) with varying levels of intelligibility, and every country has a dialect, in the same way that Latin American countries and Spain have different dialects. Arabic ranks 3rd in number of native speakers and should be ranked above English.···98.220.132.115 (talk) 21:25, 9 April 2014 (UTC)

  Not done: please provide reliable sources that support the change you want to be made. Cannolis (talk) 21:48, 17 April 2014 (UTC)

The only conclusive source I could find is from 2012 and states 362.5 million, which even at this figure ranks Arabic as the third most spoken language, overtaking English.

http://data.worldbank.org/country/ARB 75.62.20.174 (talk) 09:08, 18 April 2014 (UTC)

Hindustani language (Hindi-Urdu)

The Hindustani language is the world's second most spoken language by number of native speakers. It has 490 million native speakers. If total speakers are counted, then the whole population of the subcontinent can speak it, and it has 1.6 billion total speakers. — Preceding unsigned comment added by 119.63.143.27 (talk) 08:15, 21 April 2014 (UTC)

Source? --Zyma (talk) 11:43, 21 April 2014 (UTC)

The only conclusive source I could find is from 2012 and states 362.5 million, which even at this figure ranks Arabic as the third most spoken language, overtaking English.

http://data.worldbank.org/country/ARB ···98.220.132.115 (talk) 21:25, 22 April 2014 (UTC)

That link only gives the population, not the number of speakers. Elockid (Talk) 22:54, 22 April 2014 (UTC)

Semi-protected edit request on 8 May 2014

Please change the number of French native speakers from 74 million to 110 million because the number you quoted is from 1999 and is very misleading. See http://www.ucl.ac.uk/clie/learning-resources/sac/french 198.62.158.205 (talk) 17:21, 8 May 2014 (UTC)

  Not done: The source you cited cites Wikipedia as the source of that information. Give us a better source and we might be amenable to changing it. —KuyaBriBriTalk 01:54, 9 May 2014 (UTC)
... if it includes a comparable number of languages. — kwami (talk) 04:35, 9 May 2014 (UTC)

Edit war

I see we're broiling down to a head here. How do you know that it is spurious? Russian is not limited to Russia, Portuguese is not limited to Portugal! Also, can't you just provide references for changes? Ack! Ack! Pasta bomb! (talk) 23:24, 30 April 2014 (UTC)

References? We're talking about the references:
Portuguese: There are 10 million L1 speakers and 15 million L2 speakers in Portugal. Portugal has a population of 10 million. A 1st-grader could tell you that's impossible.
Russian: There are 137 million L1 speakers and 110 million L2 speakers in Russia. Russia has a population of 143 million. Ibid.
Now, given that Ethnologue can't get the really obvious stuff right, how can we rely on them for the more obscure stuff?
Meanwhile I've removed languages and figures that aren't supported by Ethnologue.
kwami (talk) 00:10, 1 May 2014 (UTC)

Does it explicitly say in Portugal or Russia? Ack! Ack! Pasta bomb! (talk) 11:12, 2 May 2014 (UTC)

Yes: "10,000,000 in Portugal (ELDIA 2012) ... L2 users: 15,000,000 in Portugal." "137,000,000 in Russian Federation (2010 census). ... L2 users: 110,000,000 in Russian Federation." It would help if you read the sources before edit-warring over them. Read the thread above: The numbers for Hindi, Spanish, Portuguese, Russian, French, Punjabi, and Arabic were all spurious, and that was just among the most populous languages.
Also, it is not a coherent list, because the population estimates are not from similar years. I've come across estimates, supposedly from the 1990s, that were actually from the 1950s (the 50s account was just reprinted in the 90s), and in one case from the 1920s. Even within a single language, the population estimates for different countries may come from different decades, often 30 or 40 years apart, in parts of the world where the population may double every 25 or 30 years. You can't legitimately compare languages like that. We used Ethnologue, despite its deficiencies, when we had nothing better, but now we have the Nationalencyklopedin, where all language populations have been estimated for 2007. It would be nice if we had another source for comparison, but Ethnologue is not it. You'll note that Ethnologue itself does not try to equate its population estimates with each other – that's something we impose on it. They used to have a list of most populous languages, but gave up on it. — kwami (talk) 16:39, 2 May 2014 (UTC)
I reported the Portuguese/Portugal inconsistency to Ethnologue to see if they had an explanation. The response, paraphrased, was "Thanks for reporting the problem. We'll update our database. The website will be updated early next year." Which, of course, doesn't give us any further info on what the error is.
Can someone give me some more bad country/language pairs to feed them to maybe get some idea of what the problem is and which numbers we can trust? (I see the language list above, but no countries) —[AlanM1(talk)]— 17:16, 2 May 2014 (UTC)
All we have is OR. We have guidelines for judging whether a ref is a reliable source, but not for which data within a source is reliable data. With Russia, I suspect that the L2 population was for the USSR, and that someone blindly substituted "Russia" for "USSR". But that's just a guess – that is, OR. For Portugal, I have no idea: That figure was never possible, and it is too small for the total number of L2 speakers. Similarly with French: "60,000,000 in France (ELDIA 2012). ... L2 users: 50,000,000 in France [population: 64,613,000]." And Arabic: "L2 users: 246,000,000 in Saudi Arabia [population: 27,137,000]." For Tagalog, they assign L1 and L2 speakers different ISO codes. For Punjabi, they assign different ISO codes to either side of the border, equivalent to claiming that the language switches from "American" to "Canadian" at the US–Canadian border. For Hindi, they rely on the census data, but omit the caution inherent in the layout of the census, that none of the Hindi figures are for coherent languages. For Italian and Hebrew, they simply don't know the number of native speakers; for Belorussian, they give the ethnic population, when only half speak the language natively. This isn't something we can fix. Doing so would effectively mean creating our own reference.
The principal problem with Ethnologue is that they're a pastiche of uncited data. When they give a dated figure and then a second figure, whether that's the international total, or number of L2 speakers, or whatever, it's not generally from the same date. It may not even be the same language, or may be the same people counted twice. A lot of the info in Ethnologue is inherited from older catalogues, such as Voegelin & Voegelin (1977), which in turn inherited from still other catalogs, and much of that is uncited. That's why so many of the Ethnologue languages turn out to not actually exist, or to be mere naming variants of other languages. The more recent data in Ethnologue is much more reliable, but you can't tell which is the more recent data without going back and comparing the entry to older editions. And when languages have been split up or merged, it can be impossible even for the editors of Ethnologue to figure out what's going on – often they need to delete the data, because they don't know how to fix it. (I suspect that will be the case for Portuguese. It's quite possible they didn't tell you what the error is because they don't know.) If they can't fix it, despite having all their editing notes, records, and references, how could we?
Again, Ethnologue never claimed the demographic data was directly comparable between language entries. In fact, when they did use to have a list of the most populous languages, they used different population data than they had in the language articles! They never made a list this extensive, and have apparently given up on what they did have. Now that we have a RS to cite for this list, we should abandon the OR that we used before then. — kwami (talk) 17:30, 2 May 2014 (UTC)
I've invited the Ethnologue managing editor to contribute here. —[AlanM1(talk)]— 08:44, 15 May 2014 (UTC)
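
The population checks argued in this thread can be reproduced with a few lines of Python. This is only a sketch using the figures as quoted above (in millions). It assumes that the L1 and L2 figures count disjoint groups of people living in the named country, which is an assumption of the sketch and not something Ethnologue states. The dictionary and function names are hypothetical.

```python
# Figures in millions, exactly as quoted in the thread above.
# Assumption: L1 and L2 speakers are disjoint groups within the country.
QUOTED = {
    "Portugal": {"l1": 10.0, "l2": 15.0, "population": 10.0},
    "Russia": {"l1": 137.0, "l2": 110.0, "population": 143.0},
    "France": {"l1": 60.0, "l2": 50.0, "population": 64.613},
}


def exceeds_population(entry):
    """Return True when L1 + L2 speakers outnumber the country's population."""
    return entry["l1"] + entry["l2"] > entry["population"]


# Collect every country whose quoted figures cannot all be true at once.
flagged = sorted(name for name, entry in QUOTED.items() if exceeds_population(entry))

for name in flagged:
    entry = QUOTED[name]
    print(f"{name}: {entry['l1'] + entry['l2']:.1f}M speakers vs {entry['population']:.1f}M people")
```

A check like this only flags an inconsistency. It cannot say which of the three numbers is wrong, which is the OR problem described above.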

Semi-protected edit request on 17 May 2014

It says, "French—Francais" whereas it should be "French—Français". 99.232.44.176 (talk) 19:32, 17 May 2014 (UTC)

  Done Thank you for pointing it out! cymru.lass (talk • contribs) 22:00, 17 May 2014 (UTC)

Tamazight and Kimbundu

Where are Tamazight and Kimbundu? These languages exist and are spoken by living people, and are not dialects. I think they need to be included in this list too. — Preceding unsigned comment added by 82.155.224.197 (talk) 20:23, 22 May 2014 (UTC)

Russian language total speakers number lies

The source cited for the figure of 272 million total speakers of Russian shows only 160 million. Please change that.

Source says: 167,332,230 (native). --Zyma (talk) 17:31, 29 April 2014 (UTC)
I corrected the numbers: 167 for native, 277 for total. --Zyma (talk) 17:52, 29 April 2014 (UTC)
The L2 figure they give is meaningless: There cannot be 247 million speakers of Russian in a country of only 143 million people, even assuming the language data and population for the country are off by a couple decades. The L2 number might be a figure for the USSR, spuriously "corrected" to Russia.
You often can't tell if the L2 figure in Ethnologue is for L2 speakers or for all speakers including L2. Some don't make any sense either way, for example 15M L2 speakers of Portuguese in Portugal (population 10M). — kwami (talk) 22:28, 29 April 2014 (UTC)

I propose we delete the Ethnologue section as being so garbled it's useless. We can't blindly repeat E's figures, because they're often obviously garbage. But correcting them is OR. Also, sometimes it's not clear what the figures are supposed to mean, so we wouldn't even know how to blindly copy them. — kwami (talk) 22:34, 29 April 2014 (UTC)

Do you consider Swedish-language general Nationalencyklopedin more reliable than English-language linguistics-targeted Ethnologue?--Lüboslóv Yęzýkin (talk) 22:48, 29 April 2014 (UTC)
Definitely. That's why as a community we decided to abandon Ethnologue for languages covered by the Swedish encyclopedia. — kwami (talk) 01:01, 30 April 2014 (UTC)
Done. Among the more populous languages, the numbers for Hindi, Spanish, Portuguese, Russian, French, Punjabi, and Arabic were spurious. — kwami (talk) 20:34, 30 April 2014 (UTC)
We cannot get rid of a well-renowned source specialised in language statistics just because you, for some unknown reason, do not like it. Obviously, a general Swedish encyclopedia is much less reliable; no one knows where it got its data from at all.--Lüboslóv Yęzýkin (talk) 04:33, 26 May 2014 (UTC)

Questioning 360 million figure for native English speakers

The number of native English speakers is closer to 380 million than 360 million. — Preceding unsigned comment added by Justrollingalong (talk • contribs) 09:49, 31 May 2014 (UTC)

Bengali

Bengal: 90 million. Bangladesh: 150 million. That's 240 million right there. Adding other regions makes it close to 250 million for Bengali.

114.143.124.67 (talk) 14:45, 3 June 2014 (UTC)

Semi-protected edit request on 13 June 2014

Please change (1st item on the table):

Mandarin 官话

Into:

Mandarin 普通話/國語/華語 普通话/国语/华语

because 官话 is an old term for Mandarin used only during the mid and late Qing Dynasty. Nowadays no one uses 官话 anymore, and you can only find it in historical documents (e.g. the late 19th century and early 20th century Chinese translations of the Bible by the Protestant missionaries). Nowadays the Mainlanders call it 普通话 ("the common language"), the Taiwanese call it 國語 ("the national language"), and Chinese in Singapore and Malaysia call it 華語 ("the language of the Chinese"). The three terms are often used interchangeably too, for example by people in Hong Kong and Macao.

Stubborn Loner (talk) 16:08, 13 June 2014 (UTC)

  Not done: The current version matches the characters used on Mandarin. Even if you provided a reliable source for these other terms, what value do they have in this article of the English Wikipedia? The current table uses the language's name for itself in the character set of that language as an illustration of that language, but having multiple different names does not improve that illustration and might confuse a reader unfamiliar with Mandarin. Thanks, Older and ... well older (talk) 17:09, 13 June 2014 (UTC)

Petition to change Countries in Mainly Spoken In to regions.

The list "Mainly Spoken In" features countries, with regions in parentheses. Personally, I find it an eyesore because I feel it justifies the use of flags, but it also gives an undefined and vague picture of where the language is mainly spoken, in contrast with the "Native to" section, which gives a more detailed analysis of the areas in which each language is spoken. I would like to replace the disjointed list of countries under "Mainly Spoken In" with organised prose that gives a more in-depth description of the areas, not countries alone, where the language is spoken. AlexTeddy888 (talk) 14:45, 14 June 2014 (UTC)

Southern Luo Language

Isn't the Southern Luo language missing from this table? Carstensen (talk) 21:24, 22 June 2014 (UTC)

Should we switch to Ethnologue?

I personally prefer Ethnologue because it's more up to date and can be more easily verified than the Nationalencyklopedin. I also feel that some major languages are missing, such as Lao, which should be included in a major source. AlexTeddy888 (talk) 12:05, 6 July 2014 (UTC)

Dropping one problematic but still legitimate source (we have what we have) and replacing it with a local, non-English, hardly verifiable encyclopedia whose authors, methodology and sources are totally unknown and unreliable is simply ridiculous.--Lüboslóv Yęzýkin (talk) 23:19, 6 July 2014 (UTC)

Semi-protected edit request on 8 July 2014

I want to add a bit of information about the Urdu language. The language is the first official language of Pakistan and is also spoken in the USA, Canada and the UK, as a large number of its speakers live in these countries. English is also an official language of Pakistan, and every person who went to school can speak English. Safdaronwiki (talk) 16:21, 8 July 2014 (UTC)

  Not done: please provide reliable sources that support the change you want to be made. — {{U|Technical 13}} (etc) 16:27, 8 July 2014 (UTC)

Byelorussian

'беларусы' (currently in the article) is an ethnonym, the language is called 'беларуская мова'.

  Done Thanks for the suggestion - Arjayay (talk) 16:12, 6 August 2014 (UTC)

Clarification

19 Marathi मराठी 73 1.10%

20 Tamil தமிழ் 74* 1.06%

Why this difference? Tamil has more native speakers but a lower percentage? --kavin (talk) 10:21, 13 August 2014 (UTC)
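
One way to see the inconsistency is to work out the world population each row implies. The Python sketch below does only that arithmetic on the two rows quoted above. It assumes the speaker counts are in millions and that each percentage is a share of world population. The function name is hypothetical.

```python
def implied_world_population(speakers_millions, share_percent):
    """World population (in millions) implied by a speaker count and its percentage share."""
    return speakers_millions / (share_percent / 100.0)


# Rows as quoted above: Marathi 73 at 1.10%, Tamil 74 at 1.06%.
marathi_base = implied_world_population(73, 1.10)
tamil_base = implied_world_population(74, 1.06)

print(f"Marathi row implies about {marathi_base:.0f} million people")
print(f"Tamil row implies about {tamil_base:.0f} million people")
```

If both rows used the same base, the two results would match. They differ by several hundred million, so the two rows cannot both be computed from the same speaker count and world total as shown.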

Arab League

Say what? List out countries please. 217.194.190.86 (talk) 21:10, 16 August 2014 (UTC)

Semi-protected edit request on 5 September 2014

In the table, under Arabic - Notes, it says "Arabic also is a liturgical language of 1.6 billion Muslim speakers." "Muslim" is not a language, and people don't speak Muslim. So the word "speakers" should be removed so that it just reads "Arabic also is a liturgical language of 1.6 billion Muslims."

Toomath (talk) 13:05, 5 September 2014 (UTC)

  Done Good point, well made - Arjayay (talk) 14:13, 5 September 2014 (UTC)

360 M English speakers?

Hi. According to Wikipedia, the United States of America has 318.7 M people; the United Kingdom has 64.1 M; Canada has 35.4 M; Australia has 23.6 M; and New Zealand has 4.5 M. Total: 446.3 M people, which is 86.3 M more than the number given in this article for English speakers. There can't be that many people in the US who do not speak English, and there sure aren't that many in the UK either. Something is wrong in the article... 88.219.191.93 (talk) 17:43, 17 September 2014 (UTC)

 — Preceding unsigned comment added by 88.219.191.93 (talk) 18:22, 17 September 2014 (UTC) 

To begin with, and just speaking of Spanish, there are about 50 million Hispanics in the US, of whom at least 30 million have Spanish as their native tongue. Take into account that this is a list of native speakers. The same applies to other languages like Spanish. The sum of the populations of all Spanish-speaking countries is significantly bigger, but in many Latin American countries, for example, a lot of people have Amerindian languages as their first language. Pipo. — Preceding unsigned comment added by 99.73.133.221 (talk) 17:33, 21 September 2014 (UTC)

Semi-protected edit request on 2 October 2014

Kindly add information about the Konkani language (Goa, India). Jiituu (talk) 04:19, 2 October 2014 (UTC)

  Not done: it's not clear what changes you want to be made. Please mention the specific changes in a "change X to Y" format. Cannolis (talk) 04:51, 2 October 2014 (UTC)

Serbo-Croatian

Serbo-Croatian has another script called Arebica. Another way to type "Serbo-Croatian" is بۉسآنسقاى


Please see http://en.wikipedia.org/wiki/Arebica#Tehran_Sample for more information and http://www.isa-sari.com/osmanlica/?id=en to verify the script. Arebica is mainly a Bosnian script, so "Bosnian" (bosanski) = بۉسآنسقاى — Preceding unsigned comment added by 198.146.58.150 (talk) 22:31, 4 October 2014 (UTC)

Hungarian is spoken also outside of Hungary

Please note that the Hungarian language is spoken not just within Hungary (by 95% of the country's ~10 million inhabitants), but also in the neighbouring countries:

  • Romania (mainly in Transylvania): ~1.3 million Hungarian native speakers (Hungarian nationality as minority of the country);
  • Slovakia: 458 000 Hungarian native speakers (Hungarian nationality as minority of the country);
  • Serbia: 254 000 Hungarian native speakers (Hungarian nationality as minority of the country);
  • Ukraine: 156 000 Hungarian native speakers (Hungarian nationality as minority of the country);
  • Croatia: 15 000 Hungarian native speakers (Hungarian nationality as minority of the country);
  • Slovenia: 7000 Hungarian native speakers (Hungarian nationality as minority of the country);
  • Austria: 55 000 Hungarian native speakers (Hungarian nationality as minority of the country).

Hungarians have been living in the above countries for hundreds of years; most of them have been there since some time after the Hungarian entrance into the Carpathian Basin (~AD 895), so for close to 1000 years, and live in relatively homogeneous Hungarian-speaking communities, so the use of the Hungarian language is a daily practice.

A high number of Hungarians emigrated to the countries and regions below in the last 100 years (typically before the First World War [AD 1914], around the Great Depression (~1929-1933), after the Second World War, and after the Revolution of 1956), so they have lived there permanently for decades. Also note that the 2nd, 3rd, etc. generations of these emigrants are continuously losing the ability to speak Hungarian, living in a relatively homogeneous non-Hungarian language environment (e.g. English in the US, Spanish in Argentina, etc.).

  • USA: 1.5 million
  • Canada: 315 000
  • Western EU: ~half a million (e.g. Germany 135 000, France 100 000, etc.)
  • North EU (Scandinavia): 45 000
  • South America (including Argentina, etc.): 200 000

[2] [3]


In the last 10-15 years (mostly the last 5 years), the number of Hungarians working abroad has gone up to close to 600 000. [4] — Preceding unsigned comment added by 89.223.155.214 (talk) 12:13, 17 October 2014 (UTC)

Semi-protected edit request on 30 October 2014

sanskrit?? 14.140.104.238 (talk) 07:26, 30 October 2014 (UTC)

  Not done: Sanskrit only has 14 thousand native speakers. This article is for the top 100, which all have millions of native speakers. Stickee (talk) 09:27, 30 October 2014 (UTC)

Numbers

I have the impression that some numbers are incorrect. For example, Hindi seems overestimated (in number of native speakers, of course), whereas French or Polish seem underestimated. Shouldn't this list be verified somehow? Bests! Ventic (talk) 08:00, 2 December 2014 (UTC)

Semi-protected edit request on 10 December 2014

The order of the numbers of native speakers is old and incorrect. As we know, English has more speakers than mentioned, and its number of native speakers keeps growing! It should be like this:

  • 1st language: Mandarin with 1.051 billion speakers, out of 1.3 billion people in China.
  • 2nd language: Spanish with 450 million speakers, a collection of native speakers from many countries.
  • 3rd language: English with 420 million speakers, a collection of native speakers from many countries.
  • 4th language: Arabic with 370 million speakers, out of 422 million living in the Arab world and neighbouring countries.
  • 5th language: Hindi with 350 million speakers, out of 1.26 billion people in India and Fiji.

Alhitham Khalid (talk) 18:02, 10 December 2014 (UTC)

  Not done: as you have not cited reliable sources to back up your request, without which no information should be added to, or changed in, any article.
Furthermore, as explained in the article lead, the figures are deliberately taken from one source, to avoid "cherry-picking" figures from multiple sources.
- Arjayay (talk) 18:57, 10 December 2014 (UTC)

Semi-protected edit request on 5 February 2015

There is an error in the row for German, column Notes. The whole list of dialects is formatted as a link, when it shouldn't be.

But it goes a little deeper than that. Going through the history, the paragraph started out by mentioning Low German as an example of a dialect that is not mutually intelligible with the standard language. Then the paragraph about Low German being an independent language was added, then Low German in the first paragraph was replaced by a list of German dialects. This whole thing continues to grow because the original notes mentioned Low German as a dialect of German, which rubs some people the wrong way. And they're probably right about it being a separate language, but this really isn't the place for that information.

I suggest cutting the whole thing back to the essentials: removing the part about Low German and the list of dialects, and just referring to German dialects for more details. For example:

Wide dialect variety, within which many dialects are not mutually intelligible with the standard language.

178.193.182.184 (talk) 20:31, 5 February 2015 (UTC)

  Done Stickee (talk) 01:09, 6 February 2015 (UTC)
I think we should still indicate that (standard High) German is used as a Dachsprache for all the Germanic languages of Germany, Switzerland and Austria. Would it be OK to say 'some perceived "dialects", such as Low German and Swiss German, are in fact mutually unintelligible with the standard. In this sense Standard German serves as a Dachsprache'? Kielbasa1 (talk) 22:25, 8 February 2015 (UTC)

Low German is no dialect but a language!

Our Low German is no dialect but an independent German language which itself has some dialects. Standard German is a means for both Low German and High German dialect speakers to communicate together if their dialects are not mutually intelligible. --JFritsche (talk) 21:53, 3 February 2015 (UTC)

It's part of the Continental Germanic continuum, though, and policy on Wikipedia (c.f. German dialects) seems to refer to it neither as a definite dialect nor as a definite language. The original statement - that it's a '*dialect' but not mutually intelligible - seemed fine before your edits. It gets across the idea that LG is a 'dialect' in the sociopolitical sense but a 'language' in the linguistic sense.
I'd be willing to concede a 'supposed' before the problem word but that sentence gave as much information with much less confusion before. It's strange that you'd be fine conceding that Swiss German and Bavarian are also mutually unintelligible but still 'dialects'. Why do you award Low German a status these other dialects don't deserve? Swiss German is probably less intelligible with Hochdeutsch than LG is. Kielbasa1 (talk) 22:41, 8 February 2015 (UTC)

Rank column

A rank (position) column would be useful. Tavilis (talk) 09:40, 12 March 2015 (UTC)

We used to have one. We deleted it because it misinformed: language #20 might have more speakers than #15. — kwami (talk) 23:43, 12 March 2015 (UTC)

Semi-protected edit request on 21 March 2015

| Ukrainian
Українська || 30 || 0.46%|| Ukraine || Partially mutually intelligible with Russian and Belarusian.|-


Please change "українська мова" to correct form "Українська" in Ukrainian language field.
Source: uk:Українська мова
Ukrainian - Українська, Ukrainian language - Українська мова.
Thank you. Deluxeman (talk) 00:05, 22 March 2015 (UTC)


Well, I am not a Ukrainian, but as a Slavic speaker I find that it sounds extremely odd to use only "Українська" as the name of the language.

"Polski" instead of "język polski" also sounds odd to native speakers. — Preceding unsigned comment added by 185.56.211.206 (talk) 20:35, 26 March 2015 (UTC)


  Not done: as your request is contradictory:-
First you asked to change "українська мова" to correct form "Українська"
Then you say Ukrainian - Українська, Ukrainian language - Українська мова.
Furthermore, you have not cited reliable sources to back up your request, Wikipedia is not a reliable source and without such a source no information should be added to, or changed in, any article. - Arjayay (talk) 11:15, 27 March 2015 (UTC)

Semi-protected edit request on 30 March 2015

Where is Hebrew on your list of languages?

Carol 24.66.90.170 (talk) 03:35, 30 March 2015 (UTC)

It is some way off the bottom of the list, which only includes the top 100 languages.
Konkani at No 100 has 7.4 million speakers, but according to Ethnologue, Hebrew only has 5.3 million - Arjayay (talk) 08:21, 30 March 2015 (UTC)

Bring back languages with < 7 million speakers?

I suppose this is something of an edit request, but really more a general comment/question. I teach linguistics and have used this list in the past for a project where students must research a language with between 1 and 10 million speakers. I understand some of the reasoning for restructuring and trimming this list, but I have been unable to find any comparable resource online that clearly lists these smaller (but not endangered) languages. What I have ended up doing is sending my students to an archived version of the page that still contains the more comprehensive list of languages. While I am not a very active or prolific Wikipedia editor, I wonder if there is any interest among those who more actively monitor/update this page in including more languages on this list. I would be happy to add them myself based on Ethnologue or another suitable source but do not want to make edits without consulting with others who are interested in this topic. — Preceding unsigned comment added by Bibliotecaria iluminada (talk • contribs) 17:16, 2 April 2015 (UTC)

It's a good question. My spontaneous reaction would be that there is too little reliable information. Ethnologue is, in my personal opinion, about as trustworthy as the average soothsayer. I exaggerate a bit, but the typical Ethnologue entry can date to anytime between 1975 and 2015, making any comparison moot (and that is without even entering into the question of severe factual errors). So is there a reliable source for languages below 7 million? Jeppiz (talk) 17:24, 2 April 2015 (UTC)

Semi-protected edit request on 14 April 2015

| Tamil || தமிழ் || 210 || 661,500.0000000%||India (Tamil Nadu, Karnataka, Puducherry), Sri Lanka, Singapore, Malaysia, Mauritius||Schedule 8 official language of India.

106.51.239.154 (talk) 07:24, 14 April 2015 (UTC)

  Not done as explained at the top of the list, it ONLY uses the figures in Nationalencyklopedin so that all the figures are compatible. - Arjayay (talk) 08:01, 14 April 2015 (UTC)

Is it time to revisit the preference for Nationalencyklopedin, which seems to be full of incredible figures for various languages? -- WeijiBaikeBianji (talk, how I edit) 14:03, 27 April 2015 (UTC)
Sure, but what do you suggest instead? Ethnologue is filled with errors; Nationalencyklopedin may well have errors as well. If you have a good, reliable source without errors, I'm all ears. Jeppiz (talk) 14:24, 27 April 2015 (UTC)

All the other language speakers can also speak English

The list says 955 million people speak Chinese, hence making it the most spoken language. But here's the thing: out of those 955 million people, I'm damn sure that at least a million or two can also speak English. It's the same with all the other languages. Every other language-speaking population can also speak English, so technically speaking, shouldn't English be the most spoken language rather than anything else? This is weird and insane. Gujaratis can speak English; people who speak Gujarati can also speak Hindi; people who speak Hindi can also speak another regional Indian language, whose speakers in turn can also speak English. So by this count, shouldn't the number of people who actually speak English increase manyfold?

I might be wrong here, please correct me if I am. Thank you. D437 (talk) 06:36, 2 May 2015 (UTC)

I can reassure you that people outside of the former British colonies (you seem to live in one) most probably do not speak English at all or hardly speak broken "tourist's" or "bazaar English".--Lüboslóv Yęzýkin (talk) 07:04, 2 May 2015 (UTC)
There is no reason to trade personal anecdotes here, when it should be possible to look up language survey figures in reliable sources. This is the article about counts of native speakers, so that is the focus of this article. -- WeijiBaikeBianji (talk, how I edit) 10:59, 2 May 2015 (UTC)

75 million for French speakers !

This is just ridiculous! France alone has 65 million people and speakers. If you add Belgium, Luxembourg, Switzerland, Quebec and all the African ex-colonies like Ivory Coast, Cameroon, Gabon, the two Congos, Algeria etc., you get a figure which is well over 100 million. See this link for example: http://www.washingtonpost.com/blogs/worldviews/wp/2015/04/23/the-worlds-languages-in-7-maps-and-charts/?tid=sm_fb And also the statement that English has fewer speakers than Spanish is just as ridiculous. The same link confirms that. This whole article is devoid of scientific meaning.

 — Preceding unsigned comment added by 80.246.106.4 (talk) 09:01, 27 April 2015 (UTC) 
Yes, this article is certainly devoid of scientific meaning. So is all of Wikipedia, try citing Wikipedia in a scientific article and you'll be rejected right away (and rightly so). Wikipedia is not about advancing science, which is WP:OR.Jeppiz (talk) 14:26, 27 April 2015 (UTC)
Not everyone in France has French as their first language (L1), several million immigrants, I've seen figures from 5-10 million, have other languages as their first language and French as their second (L2) language, and in francophone Africa very few people have French as their first language, using it as their second language for business etc. And the statistics regarding the number of speakers of different languages only includes L1, that is people who use it as their first language. Thomas.W talk 15:14, 27 April 2015 (UTC)
French-speaking Belgians, Swiss and Canadians may add up to 12 mln in total to the 63 mln of France. Luxembourg and the African countries we may freely ignore, as French is a second language there. So the number may not be too precise but is very close: up to 80 mln, but obviously slightly less than German.--Lüboslóv Yęzýkin (talk) 13:16, 10 May 2015 (UTC)

Semi-protected edit request on 13 January 2015

Hello, I noticed that the number of Malay/Indonesian speakers is far too low. There are about 270 million native speakers; the Wikipedia page on the Malay language shows the same number. Almost everybody in Indonesia alone, which is approximately 250 million people, is able to speak Bahasa Indonesia. Could you therefore change 77 (for Malay/Indonesian speakers) to 250? This way this page and the Wikipedia page on the Malay language will not contradict each other.

great resource Stickee. The thing that's hard to explain is in Indonesia, people grow up speaking 2 langs: Indonesian, and the island dialect. At home they'll use the dialect, but for all official transactions (reading, writing, or speech), and in big cities, it's all in Indonesian. So really there are two bahasa sehari-hari. It's hard to explain if you come from America, where you only grow up with English. I'd like to point out you interpreted a dichotomy between daily and outside languages, while actually they're the same. Finally, there are actually 197m speakers, page 427/732. I see this is a losing battle, so please, for the fourth time asking: how do I add to the Other Estimates section? If I can't change the number of native speakers (250m), I really really would like to at least add a note about growing up with the dialect/national langs simultaneously.137.113.230.3 (talk) 04:24, 15 May 2015 (UTC)

How about this as the note: (start)Having only been officialized in 1945, Indonesian is one of the youngest languages in the world[5]. Because of this, Western censuses decided to classify hundreds of millions of its speakers as non-native speakers. However, most Indonesians, when asked if they consider themselves native speakers of Indonesian, answer with a clear "yes"; therefore this number may possibly be much higher at around <bold>268 million</bold> native speakers[6].(end) Is that ok? Please read especially the second ref carefully.Bookracoon (talk) 02:45, 18 May 2015 (UTC)

that's because, once again, you editors didn't answer my simple yes-no questions, nor does it appear do you read my questions carefully. I'm asking to add to Other Notes, yet Jeppiz is adamantly against that? Of course it matters if Indonesia has bilingual citizens, the current main estimate is hundreds of millions off. 82.171.34.191 (talk) 18:10, 13 January 2015 (UTC)

  Not done: it's not clear what changes you want to be made. Please mention the specific changes in a "change X to Y" format.  B E C K Y S A Y L E 23:40, 13 January 2015 (UTC)
  Thanks for the clarification but still not done: The information here is about native speakers, not about how many are able to speak the language.Jeppiz (talk) 17:18, 20 February 2015 (UTC)
  Not done as clearly explained at the top of the article we ONLY use the figures in the Swedish Nationalencyklopedin (2007, 2010), so the figures are comparable - no other sources, or opinions, will be used. - Arjayay (talk) 08:06, 14 May 2015 (UTC)
Still   Not done This is a list about the number of native speakers and it is based on the figures in Nationalencyklopedin. It is not "random", it is the most neutral and accurate estimation available to us, as explained by kwami in the discussion at the bottom of this talk page.Jeppiz (talk) 17:50, 14 May 2015 (UTC)
While you're right that most Indonesians can speak Indonesian (155 million), only 43 million Indonesians are native speakers according to the 2010 census (table 30.9). Stickee (talk) 22:23, 14 May 2015 (UTC)
  Not done: please establish a consensus for this alteration before using the {{edit semi-protected}} template. — Technical 13 (etc) 18:10, 15 May 2015 (UTC)
  Not done: And please stop this repetitive use. Your request has been answered multiple times, always with "no". What you're suggesting is original research WP:OR. If Indonesia has a different definition of "native speaker", then that changes nothing at all. If the whole world adopts that definition, we will change the article. As for now, we won't. Jeppiz (talk) 12:29, 18 May 2015 (UTC)

Hungarian

Hungarian is also natively spoken in neighbouring countries: Romania, Slovakia, Serbia, Austria, Slovenia, Ukraine. It also has official status in local governments. Shouldn't that be on the list?

Not if the column is described as "Mainly spoken in" and is focused on countries (or large provinces like Zhejiang and Andhra Pradesh).
Peter Isotalo 15:01, 18 May 2015 (UTC)

Why use Nationalencyklopedin as a source?

I see this is one of several articles about languages of the world that use Nationalencyklopedin as a source, but why? The statements in that source are in many cases plainly not comparable between one language and another, and a lot of other sources strenuously disagree with that source. (I speak both English and Chinese, and my sense of how those two languages compare in total number of speakers or in number of native speakers makes me doubt Nationalencyklopedin quite a lot.) When did editors here begin using Nationalencyklopedin as a source, and why? -- WeijiBaikeBianji (talk, how I edit) 20:02, 9 May 2015 (UTC)

WeijiBaikeBianji, you have made this same comment several times, I think. The answer is still the same: we can very well use another source, but please say which source. If you find a better source to use than Nationalencyklopedin, and other users agree it's better, then of course we will use it. Jeppiz (talk) 21:20, 9 May 2015 (UTC)
Thanks for your thoughtful reply. I wasn't sure if there was a lot of water over the dam about this issue in previous years, or if this was just a handy source for some editors (it is not handy for me) that has been heavily used on the basis of convenience. I'll start digging more deeply into sources that I have reason to think are more reliable, now that I have a better sense of the consensus here. -- WeijiBaikeBianji (talk, how I edit) 04:20, 10 May 2015 (UTC)
The main advantage of using Nationalencyklopedin as a source, compared to most other sources, is that all figures are comparable, i.e. all figures are based on the same criteria. So if you suggest a new source make sure that it includes all languages, or at least all major languages, and not just one or two. Thomas.W talk 08:04, 10 May 2015 (UTC)
Thanks for your thoughtful reply too. I gathered that that was the main rationale for using Nationalencyklopedin as a source. With respect to any editors who have used Nationalencyklopedin as a source in the past, I'd have to say that that source's treatment of Chinese, English, and Hindi shows that it is not using comparable criteria to rank the different languages, but rather very much comparing apples to oranges. (I speak Chinese and English fluently, and have studied a little bit of Hindi and of course know many speakers of that language.) But I will have to dig into other reliable sources to have any substantive edits to do here, and meanwhile will continue to observe the discussion here on the talk page and the edits to article text by other editors to guide my understanding of the editing issues here. -- WeijiBaikeBianji (talk, how I edit) 15:58, 10 May 2015 (UTC)
It was discussed here and here. The reason is simple: some person who considers himself "responsible" for the "linguistic sector" of Wikipedia and thinks he knows better than everyone else just decided to delete all other sources (particularly "Ethnologue"). And he ignored any objections, as you can see.--Lüboslóv Yęzýkin (talk) 12:57, 10 May 2015 (UTC)
Thank you for drawing my attention to the links showing previous discussion of sources. I'll study the earlier discussion and ponder that as I look up sources. -- WeijiBaikeBianji (talk, how I edit) 15:58, 10 May 2015 (UTC)

Comment I must protest at Lüboslóv Yęzýkin's snide personal attack on kwami. Kwami did not decide this unilaterally; it was a consensus decision that was perfectly in line with Wikipedia's policies. To get back to the topic, I doubt anyone thinks Nationalencyklopedin is a fantastic source or that it is 100% correct. However, previous discussions have resulted in a consensus that it is the least bad source. Should a better and more reliable source be found, we would seriously consider it. The problem is that there are very few WP:RS that list a large number of languages. Jeppiz (talk) 16:24, 10 May 2015 (UTC)

What Wikipedia policy argues for preferring a source like Nationalencyklopedin over, say, a source like Ethnologue? (I'm genuinely curious, as that will guide my selection of other sources.) It occurs to me, after searching the Worldcat library database for holdings of Nationalencyklopedin, that the Encyclopedia of language & linguistics may be much more accessible to a wider group of Wikipedians than Nationalencyklopedin is, although of course the facts and figures about particular languages are spread over dozens of different articles in that source. But that may still be a feature, rather than a bug, if the Encyclopedia of language & linguistics "shows the work" for how the figures are calculated for each language. -- WeijiBaikeBianji (talk, how I edit) 17:03, 10 May 2015 (UTC)
No Wikipedia policy recommends Nationalencyklopedin over Ethnologue; I was referring to the policy of discussions and consensuses. The two are not necessarily incompatible; for a long time we presented data from both Nationalencyklopedin and Ethnologue. Ethnologue was not removed because of Nationalencyklopedin; Ethnologue was removed because several users felt it is so bad as to be better left out, due to its many (and I mean MANY) inaccuracies. Jeppiz (talk) 17:44, 10 May 2015 (UTC)
Nationalencyklopedin (NE) does not seem to be a major authority when it comes to language statistics. It seems to have been chosen mostly as a convenience. NE isn't a linguistic authority and it doesn't compile its own statistics. As such, the sources that NE uses should be more relevant. Relying entirely on just one source in this way is clearly a POV problem.
Peter Isotalo 20:32, 10 May 2015 (UTC)
I'm afraid the comment by Peter Isotalo shows a lack of understanding of how articles like this work. Peter Isotalo is here because of WP:HOUNDING me after I opposed an edit of his on another article, so he probably didn't check out the archives, where this has been discussed at great length. For any list of this kind, we need to have a single source for all the data, as that is the only way to gain some accuracy. If we just used different sources for different languages, any user with the faintest nationalist WP:POV would find the data that makes their own language bigger, and there would be no way of controlling if the same methodology and definitions have been used. This is not unique to this article, a similar praxis is usually followed on most List of largest X articles. The question of whether Nationalencyklopedin is the best source or not is already ongoing, and I can only repeat what has been said several times by several users: it is the least bad we have to date, but any constructive suggestion of an alternative source would be welcome for discussion.Jeppiz (talk) 08:59, 11 May 2015 (UTC)
I'm not sure that the statement "For any list of this kind, we need to have a single source for all the data, as that is the only way to gain some accuracy" fully convinces me, especially given the problem you mention that there may or may not be some sort of bias that influences the source. We need to have reliable sources, yes, that "show their work" about where the numbers come from, but I'm not as readily convinced that all the numbers have to come from one source, even as a starting point. Meanwhile, I'll look for other sources. -- WeijiBaikeBianji (talk, how I edit) 12:34, 11 May 2015 (UTC)
For the record, we used to have it the way you propose, and that only resulted in eternal edit wars and POV-pushing. Any Turkish user would come up with a source where Turkish was particularly big, French users with a source where French was really big, and the same for Koreans, Italians, Russians and you name it. Whenever there is a ranking, regardless of what is ranked, the methodology needs to be the same. If we were to take one source that gives a certain number of English speakers and another source that gives a certain number of Spanish speakers and then claim that one is larger than the other, we would be engaging in original research, as neither of those two sources would say that one is bigger than the other. If we do it for 100 languages, we have original research x100. Jeppiz (talk) 12:43, 11 May 2015 (UTC)
So what exactly is NE's methodology and why is it superior to that of any other sources?
Peter Isotalo 12:59, 11 May 2015 (UTC)
NE satisfies WP:RS, and as has been said repeatedly in the discussion, nobody is claiming it's a fantastic source. On the contrary, it has also been repeated several times that everybody is happy to discuss an alternative source. So what about starting to WP:HEAR that now? What source do you support?Jeppiz (talk) 13:04, 11 May 2015 (UTC)
I have made no claims that it doesn't satisfy WP:RS, but I'm concerned that just one source is used. It's a perfectly valid concern in this context.
You've referred to the "methodology" of NE several times and stressed that it's very consistent. What exactly is that methodology, though? Where did NE's figures come from?
Peter Isotalo 13:11, 11 May 2015 (UTC)
I have absolutely no idea, and I could not care less. A national encyclopaedia is put together by experts, and is WP:RS. As usual, your point is purely disruptive and you contribute nothing to this article, just as you don't contribute to other articles either. Using sources does not require a thorough analysis of the exact methodology of those sources. There are countless articles referring to Encyclopaedias such as Encyclopaedia Britannica without the exact methodology. If you have a source to suggest, then suggest it. At Wikipedia, Competence is required and it's clear from your contributions across several articles that you don't have that competence. I quote Thomas.W, "That is, quite frankly, a load of cr*p, Peter. And you know it. I have pointed out your errors, your flawed reasoning/logic and your obvious lack of knowledge" [1]. The same comment applies here. Now, is there any serious user who has a serious comment to make?Jeppiz (talk) 13:21, 11 May 2015 (UTC)
Encyclopedias are useful, but they're not automatically superior to other sources. We evaluate sources all the time, and in this case, NE has obviously taken these statistics from someone else. NE is not a linguistic institute or a widely recognized authority on language demographics, so there's no reason to trust them blindly. So where do the figures come from? And what methodology and definitions have they actually used?
Peter Isotalo 13:48, 11 May 2015 (UTC)
Go and find out. In the meantime, does anyone have an alternative source to suggest? Jeppiz (talk) 14:05, 11 May 2015 (UTC)
Ethnologue for one, or the sources that Ethnologue refers to. Language has several general sources relating to the world's languages that contain figures, like The World's Major Languages (Comrie, ed. 2009) or Concise Encyclopedia of Languages of the World (Brown & Ogilvie, eds. 2008). Other major encyclopedias seem reasonable as well.
The point here is that sticking to literally just one source is seldom, if ever, neutral or appropriate. Different sources usually give different figures and it's perfectly natural to explain this variation to readers. I see no indication that NE is considered an authority on language statistics and there is no indication that the methodology used is uniquely consistent and accurate. It should be complemented with other sources.
Peter Isotalo 14:47, 11 May 2015 (UTC)

It is rather annoying to have to repeat oneself five times just because a user doesn't hear, but nobody argues that we should only use Nationalencyklopedin. On the contrary, if there is another source with another ranking that satisfies WP:RS, we should use that source. So far, no such source has been presented. I have Comrie at hand, a great book that I use almost every week, but it does not propose a ranking for the largest languages. Jeppiz (talk) 15:12, 11 May 2015 (UTC)

I see no reason to limit ourselves to sources that publish figures specifically in the form of rankings. The point of a list like this is to provide stats for native speakers of major languages overall. If different sources give different figures, we should report this, not just a single number; anything else is faux accuracy. This is not list of languages by number of native speakers according to unedited lists from single sources after all. There is nothing in WP:RS that bars us from compiling ranked lists from several sources.
Peter Isotalo 17:06, 11 May 2015 (UTC)
Making a claim that language A is larger than language B without a source supporting it is WP:OR, and I think most users understand that (Competence is required). Even if we were to engage in original research like that (and we should not), the question still remains which source(s) to use and which ones to disqualify. We used to have more than one ranking in this article, I seem to recall, and there is no reason whatsoever we could not have more rankings again. The way to get there is to be constructive and produce a source and compile a ranking and suggest it for discussion. That was how the present ranking came about.Jeppiz (talk) 17:18, 11 May 2015 (UTC)
You've seriously misunderstood what this list is about, then. It's not so much a fixed ranking but a list of languages with the reported number of native speakers; otherwise it would be titled list of rankings of languages by number of native speakers. These figures vary depending on the source, so the ranking can also vary. And unless you can explain how and why NE has compiled these statistics, there's no reason to insist that it is in any way superior to a compilation of several other sources.
As for lists being original research because they use several sources, I suggest checking these out:
You ought to take your attitude down a few notches, btw. Throwing WP:CIR in people's faces after making such an imaginative interpretation of WP:OR isn't terribly convincing.
Peter Isotalo 18:12, 11 May 2015 (UTC)
Well, I have little time for users who stalk other users to export conflicts across articles, concerning my "attitude". An admin already concluded you edit "with a vengeance" which is not the right reason. But back to the actual topic, I already said that you're more than welcome to make your own ranking and present it here, complete with sources. Just go ahead. If it's good, I will gladly support it. If it's not good, no doubt other users will also point that out. Until you have made the ranking, there's little to discuss.Jeppiz (talk) 18:24, 11 May 2015 (UTC)

Comment. I haven't read all of the above, as it started turning into personal attacks rather than an honest debate of the sources. IMO it would be best to have several lists, NE being one, so that the reader can compare them. We should not rank languages when the article is based on a single source, as that would imply that language #20 has more speakers than language #25, which it might not. But we could rank them if we had multiple sources, each in a separate section. Then readers would see that rankings are close to meaningless.

I've spoken to the editor of the NE article, and his methodology was to find the best source he could for the population of a language in each country it is spoken in, then adjust for the population growth of that country, so that the estimates would at least all be normalized to the same year. Thus all the estimates in NE (2007) are for 2006 (I think). This contrasts with Ethnologue, where one language may have an estimate for 2014, another for 1984, and another for 1954, and the dates given by Ethnologue may not be the dates of the estimates, but the dates of publication, which may be 30 or 40 years later. Peter, if you find other sources like NE, please provide them. They will make the article better. One thing we do not want is for each language population to be estimated by nationalists from that population, which is what happens when we allow different sources for each language. That's just as problematic as letting them classify their languages, so that Croatian is a primary branch of Slavic, unrelated to Serbian, Kurdish is a primary branch of Indo-European, unrelated to the Iranian languages, etc. — kwami (talk) 20:00, 11 May 2015 (UTC)
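For illustration only, the year-normalization described above can be sketched as follows. This is a minimal sketch under stated assumptions, not NE's documented method: it assumes each country's speaker share stays constant, so an estimate is scaled by the ratio of that country's population in the target year to its population in the year of the estimate. The class name, function names and all figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CountryEstimate:
    """One per-country speaker estimate (all fields are illustrative)."""
    country: str
    speakers: float         # native speakers reported by the chosen source
    source_year: int        # year the estimate refers to (not the publication year)
    pop_source_year: float  # country population in source_year
    pop_target_year: float  # country population in the common target year


def normalize(est: CountryEstimate) -> float:
    """Scale an estimate to the target year by the country's population growth.

    Assumption: the proportion of speakers in the population is unchanged.
    """
    return est.speakers * est.pop_target_year / est.pop_source_year


def total_speakers(estimates: list[CountryEstimate]) -> float:
    """Sum the normalized per-country estimates for one language."""
    return sum(normalize(e) for e in estimates)


# Made-up figures for a hypothetical language spoken in two countries.
estimates = [
    CountryEstimate("Country A", 4_000_000, 1996, 8_000_000, 9_000_000),
    CountryEstimate("Country B", 500_000, 2001, 2_000_000, 2_200_000),
]
print(round(total_speakers(estimates)))  # 5050000
```

The point of the sketch is only that estimates dated to different years become comparable once they are scaled to one year; which source counts as "best" for each country is a separate judgment that the code does not capture.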

Is it Mikael Parkvall?
Peter Isotalo 21:46, 11 May 2015 (UTC)
Yes, that's right. We could probably do as good a job, but would run into accusations of bias, OR, etc. — kwami (talk) 23:49, 11 May 2015 (UTC)
I've used one of Parkvall's reports in minority languages of Sweden and Yiddish language#Sweden. He seems to be fairly thorough. It would be good if he's actually named as the author of the lists. It would also be preferable if his methods were detailed because choosing the "best source" is pretty vague.
Peter Isotalo 17:10, 12 May 2015 (UTC)
I agree, in case any of that information is available in NE (which I don't have at hand right now). To the best of my recollection, NE entries do not mention authors nor methodologies, though. What kwami says is very interesting, and I agree with Peter that Parkvall is thorough. If his methodologies are in personal communication with kwami, I'm not sure if it's correct to add them, but I could of course be wrong.Jeppiz (talk) 18:07, 12 May 2015 (UTC)
I added his name to the refs auto-generated by the language info box. — kwami (talk) 18:26, 12 May 2015 (UTC)
Jeppiz, we can never add details we don't have reliable sources for. Personal communication is not a reliable source.
Peter Isotalo 19:52, 12 May 2015 (UTC)
Peter, that's what I assumed, thanks for the confirmation.Jeppiz (talk) 20:02, 12 May 2015 (UTC)
Thanks, Kwami, for your statement, "if you find other sources like NE, please provide them," as I didn't want to act against consensus, but I do think that the article needs new and better sources. Your further statement that "We should not rank languages when the article is based on a single source, as that would imply that language #20 has more speakers than language #25, which it might not. But we could rank them if we had multiple sources, each in a separate section. Then readers would see that rankings are close to meaningless" is also a helpful statement, and it sums up how I've been feeling about the current ranking as I have been watching edits on this article over the course of this calendar year. Best wishes to you and to everyone here as I try to dig up other sources. I'm baffled that the Nationalencyklopedin isn't even held by the best academic library I have access to (not as an online resource nor as a print resource), even though it holds a huge collection of Swedish-language books. -- WeijiBaikeBianji (talk, how I edit) 18:36, 12 May 2015 (UTC)
You're right, WeijiBaikeBianji, that is a bit surprising. Foreign encyclopaedias are usually very high in priority at most academic libraries. As I said yesterday, using Nationalencyklopedin is apparently common on Wikipedia in quite a few languages, which would suggest that not that many national encyclopaedias provide a ranking of languages. I would expect any language in which a domestic version is available to prefer that one over a Swedish one, which of course has the double disadvantage of not being widely available and being in a language few people are able to read. Best of luck in your search for sources! Jeppiz (talk) 19:45, 12 May 2015 (UTC)
  • Comment I took a quick look at the corresponding articles in some other Wikipedias to see if there would be a useful and different source among them, but found very little. Most other major languages use Nationalencyklopedin as well, probably influenced by this article, and often combined with Ethnologue, which I tend to dislike for the reasons kwami outlines above. The French article used a third source, from La Francophonie, which was very close to Nationalencyklopedin with the sadly unexpected exception of ranking French much higher. That seems like bias, and I would be reluctant to include that source for that reason.Jeppiz (talk) 00:32, 12 May 2015 (UTC)

We don't use NE in the language info boxes because of the example of this list. The two changes occurred at the same time: We decided, after some discussion, to both add an NE section to this list and to shift from Ethn. to NE in the articles. Many of us had long been dissatisfied with Ethnologue, and when NE came along, we jumped at the chance to have an actual RS. The fact that it took so long to find a reliable source suggests there are not many of them. Ethnologue has long used sources like almanacs, which have no indication of where the data comes from, so they're evidently not finding much either, though recent editions seem to be taking the problem more seriously. Where we do not follow NE is with the Hindi languages, as NE is based on the Indian census, which relies on speaker identification as to whether their language is "Hindi" or not. (The publishers of the census results recognize the problem, and group together the languages whose speakers self-report as Hindi.) — kwami (talk) 02:04, 21 May 2015 (UTC)

Berber

Berber has between 16 and 30 million native speakers (see here). Should it be added? MassachusettsWikipedian (talk) 02:50, 24 May 2015 (UTC)

Spanish-Italian intelligibility.

I have already responded to a comment on Portuguese. The fact is that Italian and Spanish are mutually intelligible, not 100 percent of course, but to a large extent. Any Spanish speaker knows that. Why is that issue not addressed? In fact a native speaker of Spanish can understand formal, standard Italian with a high degree of intelligibility. Things change a lot when we speak about regional dialects, informal language, slang, etc., but the standard, formal Italian language is understood to a large extent by Spanish speakers. I guess the same applies to Italian speakers with formal, standard Spanish. I have taken this example from the Italian article. Even if you do not speak Spanish, I think that you can see the similarities. If on top of that you take into account that the sound systems are almost the same, the fact is that communication between both languages is pretty acceptable. Under A is Italian, under B the Spanish translation:

A) La seguente tabella si basa su dati provenienti dalla pubblicazione.

B) La siguiente tabla se basa en datos provenientes de la publicación.

"Languages of the World".[1]

A) Molte delle stime si riferiscono ad anni precedenti il 2010.

B) Muchas estimaciones se refieren a años precedentes 2010.

A) Sono state considerate le prime cento lingue parlate al mondo.

B) Han sido considerados los primeros cien idiomas hablados en el mundo.

A) ordinate per numero di madrelingua.

B) ordenados por número de hablantes nativos.


I think you can see the extreme similarities between both languages. Often it is a question of style to use the same or other words, but the same words exist in both languages for the same or similar meanings in many cases.

I think you forgot

A) I sindaci dei villaggi si sono detto: mai più!
B) Los alcaldes de las aldeas se han dicho: ¡Nunca más!

Not that similar.

In Portuguese:

C) Os chefes das aldeias se tem dito: Nunca mais!

Almost identical to Spanish.

But joking aside, you are of course right. The problem is that while most of this article is rather well sourced, almost no claim of mutual intelligibility is sourced. It's not about whether it's true or not, it's about whether it's sourced. If nobody objects, I'll remove the claims that aren't sourced.Jeppiz (talk) 18:21, 18 May 2015 (UTC)

Even in that case there are a few Italian words that would be recognized in Spanish. See:

A) I sindaci dei villaggi si sono detto: mai più!
B) Los sindicados de las villas dijeron: jamas mas!

Whereas in Portuguese the sentence is still much more similar to Spanish:

C) Os sindicatos das vilas dicerao: jamais mais! — Preceding unsigned comment added by 199.167.103.222 (talk) 13:33, 20 May 2015 (UTC)

This Italian / Spanish mutual intelligibility is plain nonsense. No Italian and Spanish speakers using everyday common vocabulary in their respective languages would be able to sustain intelligible communication beyond the very basic greetings and small talk. On the street, your average Italian and Spanish speakers simply cannot sustain a fluid conversation - the conversation would begin getting muddled within the very first minute. This is a fact that has been observed and admitted to by the Italians and Spanish speakers themselves. The syntax of the two languages doesn't really line up to any appreciable extent, because the vocabulary and grammar of the two languages vary too much. Similar accents don't amount to much when completely unfamiliar and different vocabulary and grammar are used. A Spanish speaker, unfamiliar with Italian, would never know that 'burro' means 'butter' in Italian, no matter how clearly it is said. In Spanish 'burro' means donkey. There are endless examples of these kinds of differences between Spanish and Italian. There are similarities between the two languages to be sure, but the languages are as different as they are similar. They may be intelligible to some extent, but certainly not to the point where speakers can maintain a sustained intelligent conversation.

On the other hand, communication between educated Portuguese and Spanish speakers flows effortlessly, and can be truly said to be at least almost fully intelligible when spoken clearly and slowly. They can have an intelligent conversation about any subject. There may be slight dips in the conversation here and there, but certainly not to the point that it would impede almost full intelligibility. This is not surprising given that the grammar, vocabulary and sentence structure between Portuguese and Spanish are 89% similar. Countless empirical linguistic studies of Portuguese and Spanish bear this out. After all, Portugal and Spain are Iberian neighbours that share a very similar culture, history, genetic makeup, and of course very similar languages. Ditto for Brazilians and Spanish Americans. Please include information in your articles using credible facts that are based on empirical linguistic research. When you say that Italian is partially intelligible with Spanish, and research only puts the number at about 45-50%, it makes it seem that the intelligibility between Spanish and Portuguese is the same as between Italian and Spanish - this is wrong. The fact of the matter is that the spoken intelligibility between Spanish and Portuguese is around 70-80% depending on who is doing the talking and listening - however, it is still considerably higher than between Italian and Spanish. In other words, don't put apples and oranges in the same basket. Thank you. — Preceding unsigned comment added by 99.234.25.147 (talk) 23:27, 4 June 2015 (UTC)