Filter for New User uploads in Mobile Web
Closed, Resolved (Public)

Description

Too many copyright violations via mobile. See:

https://commons.wikimedia.org/wiki/Category:MobileUpload-related_deletion_requests

https://commons.wikimedia.org/wiki/Category:Mobile_uploads_lacking_EXIF_data

https://commons.wikimedia.org/wiki/Category:Mobile_uploads_lacking_EXIF_data_and_with_multiple_Tineye_matches

Mobile Upload related:
"1,156 open deletion requests"
"4268 closed (as deleted) deletion requests"
"And a lot of speedy deletion requests... (See: https://commons.wikimedia.org/wiki/Commons:Mobile_app/deletion_request_tracking/archive (1, 2, ...)"

Please set up a filter.


See also wikimedia.mingle.thoughtworks.com/projects/mobile/cards/1314... (reported via IRC)

  • Add filter for new user uploads (0 uploads) on images with no exif data
  • Present these with either a nag or with a block and an educational statement.

Version: unspecified
Severity: major
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=66490
https://bugzilla.wikimedia.org/show_bug.cgi?id=68375
https://bugzilla.wikimedia.org/show_bug.cgi?id=68414

Details

Reference
bz62598

Event Timeline

I also think that the current situation of mobile uploads must improve, especially for the community members who review the uploads. But I think that an edit count of 400 is too high. Isn't a minimum edit count of 100 enough? To implement this, I would prefer adding a new configuration variable (with default 0), so that third-party users and Wikipedia projects can set it themselves.

Just in case it helps, Commons' Picture of the Year contest defines a voter eligibility of 75 live edits on any single Wikimedia project. Could this be considered a reasonable requirement to upload pictures from mobile?

@Quim Gil: That sounds good to me, but I think (for the reasons given above) that a fixed threshold in the MF code isn't the best way. For Wikimedia projects (in general, and Commons as an example) the edit count can be 75; that sounds good, too :)

@Florian, yes, I fully agree that this is a Wikimedia-specific problem that deserves a Wikimedia-specific solution. I'm also not implying that requiring a minimum edit count will solve the entire problem. I was just trying to find a reasonable number that the Commons community is already familiar with.

Technical measures alone will not be sufficient to solve this.

A well-thought video clip or slideshow before anybody can start uploading mobile with a simple multiple-choice question that must be answered correctly after the clip was played could have an extraordinary positive impact on this problem. Get away from the idea that people can "quickly" upload files. Most of the time people do it "quickly", they get it wrong.

A well-thought video clip or slideshow before anybody can start uploading mobile

I personally disagree with that :/ I think a lot of mobile users visit Wikipedia and Commons on a limited data plan. That means we should reduce the transferred content as much as possible and not inflate it with videos or slideshows that users don't know are coming. That's just my personal view :)

Maybe you're also interested in my first-time mobile user experiences reported in bug 68375.

Summary:

  • the upload tutorial doesn't work for me on my desktop browser, but that may be a quirk on my end. Please verify all the same.
  • maybe add a "respect third-party copyrights" line on the description page.

Part of the problem really appears to be that the process is so streamlined and simplified, that it is OK *only* for self-taken pictures but completely neglects that people can and do upload other images through it. It is indeed a very quick process, and it is right only in one particular use case, and completely wrong in many other use cases. Which explains the poor success-failure rate.

Use cases:

  • I want to upload an image I have just taken with my smartphone - OK, though the selfies are not really in scope and will be deleted unless used on a user page.
  • I want to illustrate this article with this image I've just taken myself - OK for uploading, but I didn't see any post-upload help to actually use the uploaded image anywhere, i.e., to include it in the article I wanted to illustrate.
  • I want to illustrate this article with this old image I've found at XYZ - process is wrong for that.
  • I want to illustrate this article with this image I saw on that other website - process is wrong for that, and user should be educated not to do this at all.
  • I saw this cool picture and want to share it with Wikipedia - process is wrong for that, and user should be educated not to do this at all.

I did mention in bug 68375 one idea to enforce that the process is really only used for self-taken images: do a server-side Google image search for each Mobile/Web upload and refuse the upload if there is more than one hit. Just a crazy idea, but it would catch most copied-from-somewhere files.

Instagram uploads could probably be caught by looking at the width/height ratio; they're usually square. If the ratio is in the range [0.9, 1.1] or some such, refuse the upload.
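The square-image heuristic could be sketched roughly as follows. This is only an illustration: the function name `looks_square` is hypothetical, and the default bounds are simply the "[0.9, 1.1] or some such" range from the comment, not an existing MobileFrontend check.

```python
def looks_square(width: int, height: int, low: float = 0.9, high: float = 1.1) -> bool:
    """Return True if width/height falls inside [low, high].

    Hypothetical helper: the bounds mirror the range suggested in the
    comment and would need tuning against real uploads.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    ratio = width / height
    return low <= ratio <= high


print(looks_square(640, 640))    # square image: would be flagged
print(looks_square(4000, 3000))  # 4:3 camera photo: would pass
```

A check like this would also flag legitimately square self-taken photos, so it is probably better suited to triggering a warning or manual review than to an outright refusal.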

(In reply to Florian from comment #78)

I think alot of mobile users visit Wikipedia and Commons with a limited data
plan

They probably want to avoid uploading at all because images are usually bigger than people would expect.

(In reply to Lupo from comment #79)

do a server side Google image search

I'd also like to endorse the idea of a Google Image Search for each upload that is claimed to be own work. On Desktop, this could be maybe a warning but on mobile I agree that refusal is okay. On top of that it'd be great if we would have a database of pHashes or similar techniques to allow similarity detection of files already uploaded (to address the duplicate issue).

to address the duplicate issue

Duplicate images will be detected already, maybe there is a problem in detection of scaled images.

But, as I said in the German discussion on Commons (@mobile team, if I'm completely wrong, please say so :)): these suggestions are all good, and yes, I would prefer to improve the upload workflow and functionality to make it more useful for the whole community and to mean less work for admins and "cleaners" in the community. But these are all feature suggestions, and in my opinion MobileFrontend is just that: a frontend. This functionality would be good for desktop as well, so from my point of view it's better to make these changes in core or in an extension, not in MobileFrontend. This isn't a MobileFrontend problem in general; it shows up on MobileFrontend only because it's much easier to ignore the non-existing warnings on mobile than on desktop, and because uploading on mobile is much easier than on desktop. Just my 2 cents :D

(In reply to Florian from comment #81)

Duplicate images will be detected already

If they are exact duplicates (i.e. their data-hash matches), true. Consider an App that allows altering metadata and this whole file hash becomes useless. Perceptual hashing would also allow finding similar images, thus allowing suggesting categories based on file contents.
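To illustrate the difference between an exact data hash and a perceptual hash, here is a minimal sketch using a toy "average hash". Everything in it is an assumption made for illustration: the helper names are hypothetical, SHA-1 merely stands in for "some exact content hash", and a real implementation would first decode and downscale the image with an imaging library.

```python
import hashlib


def average_hash(pixels):
    """Toy average hash: one bit per pixel, set when the pixel is brighter than the mean.

    `pixels` is a small grayscale grid (list of rows of 0-255 ints), assumed
    to be already downscaled, e.g. to 8x8.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return [1 if value > mean else 0 for value in flat]


def hamming_distance(bits_a, bits_b):
    """Number of positions at which two equally long bit lists differ."""
    return sum(a != b for a, b in zip(bits_a, bits_b))


def sha1_of(pixels):
    """Exact-match hash over the raw pixel bytes, for comparison."""
    return hashlib.sha1(bytes(v for row in pixels for v in row)).hexdigest()


# A 4x4 "image" and a slightly brightened copy of it.
original = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [12, 22, 202, 212],
    [18, 28, 208, 218],
]
brightened = [[min(255, value + 3) for value in row] for row in original]

exact_match = sha1_of(original) == sha1_of(brightened)
distance = hamming_distance(average_hash(original), average_hash(brightened))
print(exact_match, distance)
```

The exact hashes differ after the small brightness change, while the average hashes stay identical (distance 0), which is the property that would let similar files be matched.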

These suggestions are all good

Which suggestions? Please clarify.

This functionallity is good for desktop as well

What? Blocking users with less than X edits in any WMF project from uploading? Maybe ... but make sure administrators at Commons are able to exempt users from this restriction.

it's much easier to ignore the non-existing warnings in mobile as on desktop

Huh, can you please clarify? Is it a good or bad thing? Should there be a timeout on mobile, for example before the warning can be dismissed? Please elaborate.

And the upload in mobile is much easier as on desktop

... comparing which aspects? Is this something you want to see changed or adopted to desktop, or ... ?

(In reply to Rainer Rillke @commons.wikimedia from comment #82)

(In reply to Florian from comment #81)

These suggestions are all good

Which suggestions? Please clarify.

E.g. better detection of duplicate images (not only exact copies), or an automatic search in image search engines to check whether images marked as "own work" really have not been uploaded anywhere else.

This functionallity is good for desktop as well

What? Blocking users with less than X edits in any WMF project from
uploading? Maybe ... but make sure administrators at Commons are able to
exempt users from this restriction.

No, look above. Blocking users for less than X edits is in discussion for mobile only?!

it's much easier to ignore the non-existing warnings in mobile as on desktop

Huh, can you please clarify? Is it a good or bad thing?

It is neither a bad nor a good thing. It depends on the devices the frontend is made for. On mobile it's terrible to scroll through a huge amount of text and warning messages on a (mostly) tiny screen. If I remember correctly, that's the reason why there is no warning message above the upload form.
There was a similar discussion about anon warnings in edit forms (if anonymous editing is enabled), see https://bugzilla.wikimedia.org/show_bug.cgi?id=59937

Should there be a
timeout on mobile, for example before the warning can be dismissed? Please
elaborate.

Maybe that, maybe there is only one line which indicates, that there are information/warnings about uploads or something else.

And the upload in mobile is much easier as on desktop

... comparing which aspects? Is this something you want to see changed or
adopted to desktop, or ... ?

On desktop you have much more text and information to read before you can upload (in general you read the text before uploading). On mobile you simply choose the file, give a description and click upload -> finished. On desktop you have much more to do (e.g. with UploadWizard, the default uploader):

  • (can be hidden, but default you see this) Information, what you can upload and what not
  • Choose file
  • release rights
  • add description, category, date...
  • Finished

That is what I mean.

Can someone please set up a patch for the required user group to upload:
diff:

  • autoconfirmed

autopatrolled

AND minimum edit count on Commons: 75

And then let us see if it works. I am sure this will reduce the crap uploads.

Thanks.

Sorry for posting again, I mean:
diff (required user group to upload with -75 edits):

  • autoconfirmed

autopatrolled

OR minimum edit count on Commons is "75"

Thanks.

Note: since we deployed the last patch (to add the autoconfirmed threshold to the left nav Uploads feature as well as the in-article upload) on July 10th, unique mobile web uploaders dropped by half (from ~120-140/day to ~60/day),[1] and the number of deleted uploads this month is also extremely low compared to what we've seen in other months.[2] Let's focus on the actual data, please :)

  1. http://mobile-reportcard.wmflabs.org/#uploads_daily-graphs-tab
  2. http://mobile-reportcard.wmflabs.org/#monthly_reports-graphs-tab

(In reply to Steinsplitter from comment #85)

sorry for posting again, i mean:
diff (required usergroup to upload with -75 edits):

  • autoconfimred

autopatrolled

OR mimimun editcount on commons is "75"

Thanks.

You mean only Editcount on commons or globally?

@Maryana: Do we have statistics on deleted mobile uploads, or is it what I see in the German Commons discussion?

Regarding the Google Image search idea, this is a frequent suggestion that we have looked into. The main problem is that the only sites that offer this as an API service either charge money for it or have strict usage limits (or both). If anyone knows of a service that would be usable for this, please let us know.

(In reply to Florian from comment #87)

You mean only Editcount on commons or globally?

Yes. Commons editcount.

@Maryana: Does we have a statistic of delted mobile uploads or is it what i see on german commons discussion

https://commons.wikimedia.org/wiki/Category:MobileUpload-related_deletion_requests
And a lot of "non-logged" deletions.

(In reply to Ryan Kaldari from comment #88)

Regarding the Google Image search idea, this is a frequent suggestion that
we have looked into. The main problem is that the only sites that offer this
as an API service either charge money for it or have strict usage limits (or
both). If anyone knows of a service that would be usable for this, please
let us know.

Additional requests cost $5 per 1000 queries, up to 10k queries per day.
This is _not_ a lot of money for the wikimedia foundation.

@Florian - sorry, wrong link on #2 - here's live-updating deleted mobile (web and apps) uploads per month: http://mobile-reportcard.wmflabs.org/graphs/deleted-uploads

(In reply to Maryana Pinchuk from comment #86)

and the number of deleted uploads this month is also extremely
low compared to what we've seen in other months.[2] Let's focus on the
actual data, please :)

We do.

Again: A lot of files are in the queue for deletion. We don't have enough admins to delete all the crap.

FWIW, I think it's still too early to see what effect the latest change has on percentage of uploads deleted. There is one reason to be cautiously optimistic though. Typically, mobile web deletions outstrip mobile app deletions by at least a 10:1 margin. So far for July, mobile app deletions actually exceed mobile web deletions:

http://mobile-reportcard.wmflabs.org/graphs/deleted-uploads

Of course it's possible this is due to app uploads skyrocketing for some reason, but there haven't been any significant changes on the app side, so I doubt that's the case. We should have a clearer picture in the next couple weeks.

(In reply to Ryan Kaldari from comment #92)

FWIW, I think it's still too early to see what effect the latest change has
on percentage of uploads deleted. There is one reason to be cautiously
optimistic though. Typically, mobile web deletions outstrip mobile app
deletions by at least a 10:1 margin. So far for July, mobile app deletions
actually exceed mobile web deletions:

http://mobile-reportcard.wmflabs.org/graphs/deleted-uploads

Of course it's possible this is due to app uploads skyrocketing for some
reason, but there haven't been any significant changes on the app side, so I
doubt that's the case. We should have a clearer picture in the next couple
weeks.

We need to wait weeks again? Someone with access to prod can run a query to get actual data / data from the last weeks.

This is _not_ a lot of money for the wikimedia foundation.

Argh :/ What about third-party wikis? Yeah, maybe they don't have this problem, but in my opinion it's not the best way to introduce a feature in free and open source software that may later cost you something :) Maybe I'm the only one who thinks so :P

The main problem is that the only sites that offer this as an API service

Unhappily Google does not provide an API for that, has anyone a contact in Mountain View to request one? :P

That's a real problem :(

Yes. Commons editcount.

Basically this is much easier to implement than a global edit count, mobile team? Give it a try? (look at https://gerrit.wikimedia.org/r/#/c/143751/ it's only a little change) :)
(Just for saying this ;))

@Florian - sorry, wrong link on #2 - here's live-updating deleted mobile

That's much clearer now, thx Maryana :)

(In reply to Maryana Pinchuk from comment #90)

@Florian - sorry, wrong link on #2 - here's live-updating deleted mobile
(web and apps) uploads per month:
http://mobile-reportcard.wmflabs.org/graphs/deleted-uploads

That graph is not up to date.

(In reply to Maryana Pinchuk from comment #86)

Let's focus on the actual data, please :)

Right, let's. Up-to-date manually collected statistics are at

https://commons.wikimedia.org/wiki/Commons:Forum#A_propos_.22mobile_upload.22

for days since July 16, 2014. From July 16 to July 21 (including), there were at least 330 speedy deletions of Mobile/Web uploads. Plus another 110 in various deletion queues (no permission, no source, no license, deletion requests).

Your limn statistics show only 149 right now, so they are clearly not up to date or plain incorrect.

Thank you for your attention.

(In reply to Ryan Kaldari from comment #92)

FWIW, I think it's still too early to see what effect the latest change has
on percentage of uploads deleted.

I disagree. It's not too early. Just see the link I gave in comment 95. Also see my comment 65.

Here's a rough summary of the situation since July 10:

We have roughly 100 Mobile/Web uploads a day. Roughly 60% are speedy deleted, another 20% are in 7-day-deletion queues (of which most *will* be deleted), and of the remaining 20% about half is arguably not useful (unused selfies, digitally altered retrica images, some instagram stuff that we can't pinpoint since Google doesn't index it and because it's usually also digitally altered, etc).

We get at most 10% halfway acceptable images, and 90% crap.

That's the same ratio as before July 10; only the volume dropped by half. And having played with it myself, it's also clear why this is so: mobile/web upload is a single-purpose process (upload self-taken photos) that is being used outside its specification boundaries (it's also, and mostly, being used to upload images copied from arbitrary websites).

From July 16 to July 21 (including), there were at least 330 speedy deletions of
Mobile/Web uploads.

Where do you get 330? I count 93. Does that list include both app and web uploads or just web?

(In reply to Ryan Kaldari from comment #97)

From July 16 to July 21 (including), there were at least 330 speedy deletions of
Mobile/Web uploads.

Where do you get 330? I count 93. Does that list include both app and web
uploads or just web?

Just mobile/web. They are tagged in the upload log as "mobile web edit".

July 16: 87 uploads; next morning 33 remained: 44 speedy deletions
July 17: 84 uploads; next day 44 remaining: 40 deletions
July 18: 91 uploads; next day 44 remaining: 47 deletions
July 19: 123 uploads; next day 42 remaining: 81 deletions
July 20: 104 uploads, next day 40 remaining: 64 deletions
July 21: 96 uploads; next day 42 remaining: 54 deletions

44 + 40 + 47 + 81 + 64 + 54 = 330 speedy deletions out of 585 total mobile/web uploads in these 6 days.
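For anyone who wants to re-check the totals, the sums can be reproduced from the daily figures as posted. This is just the arithmetic from the list above, using the stated upload and deletion counts; the variable names are illustrative.

```python
# Daily figures as posted: (date, uploads, speedy deletions).
daily = [
    ("July 16", 87, 44),
    ("July 17", 84, 40),
    ("July 18", 91, 47),
    ("July 19", 123, 81),
    ("July 20", 104, 64),
    ("July 21", 96, 54),
]

total_uploads = sum(uploads for _, uploads, _ in daily)
total_deleted = sum(deleted for _, _, deleted in daily)
share_deleted = total_deleted / total_uploads

print(total_uploads, total_deleted, f"{share_deleted:.1%}")  # 585 330 56.4%
```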

Of the remaining 255, 110 are in deletion queues. Or rather, were when I collected the statistics and posted them at

https://commons.wikimedia.org/wiki/Commons:Forum#A_propos_.22mobile_upload.22

The precise numbers may have changed a little in the meantime. For instance, some of the remaining files were already tagged for speedy deletion. And starting tomorrow or the day after, the ones from July 16 still in deletion queues should slowly go away, too.

Sorry for being dense, but where are you getting those numbers from? Here's what I see by counting what's in User:Didym/Mobile_upload/2014_July_17-21:

July 17: 104 uploads; 40 deleted
July 18: 71 uploads; 14 deleted
July 19: 107 uploads; 10 deleted
July 20: 87 uploads; 18 deleted
July 21: 82 uploads; 6 deleted

(In reply to Ryan Kaldari from comment #99)

Sorry for being dense, but where are you getting those numbers from? Here's
what I see by counting what's in User:Didym/Mobile_upload/2014_July_17-21:

July 17: 104 uploads; 40 deleted
July 18: 71 uploads; 14 deleted
July 19: 107 uploads; 10 deleted
July 20: 87 uploads; 18 deleted
July 21: 82 uploads; 6 deleted

These pages are not entirely correct, only files still existing at the bot run are listed, but a lot of files are already deleted at this point.

Thanks for the explanation! So where do Lupo's numbers come from?

I think it is OKAY to implement the things suggested in comment 85.
I hope we can resolve this in a few days (and not weeks)

:-)

I think it is OKAY to implant the things suggested in comment 85.
I hope we can resolve this in a few day (and not weeks)

That change would eliminate virtually all mobile uploaders. Only a minuscule percentage of mobile users have a significant edit count on Commons. I would be very reluctant to make such a dramatic change without waiting for more data to back it up with. I'm not saying I doubt your data, but it is a very tiny data set, and as soon as we would implement such a change, someone would inevitably file a new bug saying it is too hard to upload on mobile and we would go through the entire discussion again. I would favor:

  1. Waiting until the end of the month so that we have more of a basis for comparison and can solidly justify crippling the feature.
  2. Implementing a less dramatic change. For example, requiring a minimum 75 edit count locally rather than on Commons.

This is just my personal opinion though.

@Lupo @Rillke:

I think we can do something with mobile.js. It is annoying to ask the devs and then wait months every time.

(In reply to Ryan Kaldari from comment #104)

That change would eliminate virtually all mobile uploaders.

Why is that worse than wasting even more community members' time needed to delete the rather constant percentage of useless uploads?

What's the exact reason to not make a more drastic change and afterwards experiment with changing the threshold again, instead of the other way round?
Is it "only" about gathering more data wanted by developers, with the obvious trade-off that the involved community wastes more time deleting stuff though folks have patiently provided data here several times now to describe the underlying problem?

How about: ask uploaders where they took the image from?

  • With options like "I made it myself", "my grandma made it", "found it somewhere on the interwebz".
  • Until they've selected something, don't allow uploading.
  • If they selected something other than "I made it myself", show a very short "copyright for idiots" style tutorial and disallow the upload.

This should filter out people who care but are about to make a mistake and most of the drones, leaving only persistent drones and malicious uploaders. Representatives of both of these categories can be blocked fairly liberally.
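The proposed flow can be written down as a small decision function. This is a sketch of the idea only: the `Source` enum, the `upload_decision` name and the returned labels are hypothetical, and the option texts are taken from the list above.

```python
from enum import Enum


class Source(Enum):
    OWN_WORK = "I made it myself"
    FAMILY = "my grandma made it"
    FOUND_ONLINE = "found it somewhere on the interwebz"


def upload_decision(choice):
    """Return 'wait', 'allow' or 'educate' for the selected source option."""
    if choice is None:
        return "wait"     # nothing selected yet: keep the upload disabled
    if choice is Source.OWN_WORK:
        return "allow"    # only self-made images may proceed
    return "educate"      # show the short copyright tutorial, disallow the upload


print(upload_decision(None))
print(upload_decision(Source.OWN_WORK))
print(upload_decision(Source.FOUND_ONLINE))
```

Note that this only filters honest answers; an uploader who picks "I made it myself" regardless passes straight through.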

(In reply to Max Semenik from comment #107)

How about: ask uploaders where they took the image from?

This does not help. That is our experience.

What's the exact reason to not make a more drastic change and afterwards
experiment with changing the threshold again, instead of the other way round?

There are two reasons:

  1. To get enough data to actually understand what's going on. The latest change only went into effect 12 days ago. Even if there was only a small effect on upload quality, it would be good to know exactly what that effect was. This will help to inform future decisions about things like anonymous editing on mobile.
  2. Because we are a wiki. Allowing people to share isn't just a motto we're supposed to give lip service to. We should exhaust as many plausible solutions as possible before we effectively disable a core feature like uploading. In many parts of the world, mobile phones are the predominant way people access Wikipedia.

Also, it would be helpful to have an actual goal in mind. What percentage of bad mobile web uploads can be tolerated? 75%? 50%? 25%? Does volume affect that number?

(In reply to Ryan Kaldari from comment #104)

2. Implementing a less dramatic change. For example, requiring a minimum 75
edit count locally rather than on Commons.

Can somebody summarize what the current restrictions are? It's just autoconfirmed on local wiki, right?

Can somebody explain to me why this user

https://en.wikipedia.org/w/index.php?title=Special:Log&user=Owndesd

(account created 07:05 on 22 July 2014)

could then 9 minutes later upload a file through mobile/web

https://commons.wikimedia.org/wiki/Special:Log/Owndesd

??

Autoconfirmed at the English Wikipedia means 4 days AND 10 edits. (At the Commons, it is just 4 days AFAIK.) Yet this user had 1 edit when he made the upload, and his account was brand-new.

https://en.wikipedia.org/wiki/Special:Contributions/Owndesd

What's wrong?

Thanks, Ryan – totally agreed.

So, I just pulled the data for all deleted uploads in July, not just mobile web:

Total uploads to Commons contributed from 7/1-today (7/22) that have been deleted as of today: 2117
Total uploads to Commons contributed from 7/1-today *from mobile web* that have been deleted as of today: 103

So, mobile web is currently responsible for less than 5% of all deleted uploads on Commons. That really doesn't seem like an unfair burden on the community to me, and I don't see why we're treating this specific contribution funnel differently from any other that we have on Commons. If Commons admins truly feel overworked dealing with inappropriate uploads, why aren't they focused on improving the desktop upload workflow, which is contributing far more inappropriate/deleted images to the project?

(In reply to Ryan Kaldari from comment #109)

2. Because we are a wiki. Allowing people to share isn't just a motto we're
supposed to give lip service to. We should exhaust as many plausible
solutions as possible before we effectively disable a core feature like
uploading. In many parts of the world, mobile phones are the predominant way
people access Wikipedia.

While I would tend to agree with you in general, in this case it's not "switching off a core feature". It'd be switching off a "relatively recently added feature that has proven problematic". It's not "Smartphones are evil, and poor countries be damned; we shut you out", it'd be more like, "Sorry guys, we tried, but this isn't it yet. We're going back to the drawing boards and hope to come back again later".

Also, it would be helpful to have an actual goal in mind. What percentage of
bad mobile web uploads can be tolerated? 75%? 50%? 25%? Does volume affect
that number?

The goal is actually quite simple: the ratio must be such that you don't get complaints from the Commons community. Only half joking.

The percentage of speedy deletions and deletion requests for mobile/web must not exceed that of other uploads. Imagine if we had a 80-90% rejection rate on normal desktop uploads... if mobile/web uploads blend in with other uploads in that respect, then they're acceptable and you won't get complaints about mobile/web uploads in particular. Maybe it is then discovered that the crap rate is too high in general, but that would then be a different problem.

(In reply to Maryana Pinchuk from comment #111)

Total uploads to Commons contributed from 7/1-today (7/22) that have been
deleted as of today: 2117
Total uploads to Commons contributed from 7/1-today *from mobile web* that
have been deleted as of today: 103

I don't know where you get your numbers from, but if you look at the upload logs (deletions are redlinked there), this is simply not true.

Stop denying the problem by presenting fake or wrong numbers.

@Lupo, I'm getting the numbers directly from the Commons database. Here's my SQL if anyone with access to the database wants to double-check:

select fa_name, log_timestamp, fa_deleted_timestamp from filearchive, logging where fa_name = log_title and log_action = "upload" and log_timestamp >= '20140701000000'

If you really feel that I'm misrepresenting the numbers, I'm afraid we can't continue this discussion productively.

I agree with Lupo. Stop denying the problem by presenting fake or wrong numbers.

Why is it so difficult to get correct statistics? Why do volunteers need to explain the same thing a thousand times, and why do volunteers need to teach you how to produce correct statistics?

No, no, no... You get paid for doing mobile-related work, but I am wasting my time here explaining things to you again and again... I have deleted THOUSANDS of uploads. Enough is enough. Really.

AND please use common sense. About 80% (or more, see Lupo's stats) of the uploads are crap... AND the volunteers have to waste a lot of time sorting this out.

Important note:
Maybe I will find time tomorrow to switch this crap off as an emergency measure using this mobile.js

Can somebody summarize what the current restrictions are? It's just
autoconfirmed on local wiki, right?

That is correct.

Can somebody explain to me why this user
https://en.wikipedia.org/w/index.php?title=Special:Log&user=Owndesd
(account created 07:05 on 22 July 2014)
could then 9 minutes later upload a file through mobile/web

Looks like the restriction is not working on lazy-loaded pages (pages loaded immediately after editing). I'll file a separate bug for this.

I don't know where you get your numbers from, but if you look at the upload logs
(deletions are redlinked there), this is simply not true.

Looks like there's definitely some discrepancy here. Let's all assume good faith and figure out how we can make sure everyone's looking at the same data. It may be that there's an upload pathway on mobile that is not hooked up for eventlogging.

mariadb> select count(log_timestamp) from filearchive, logging
where fa_name = log_title and log_action = "upload" and log_timestamp >=
'20140701000000';

1 row(s) returned

count(log_timestamp)

'14257'

"Total uploads to Commons contributed from 7/1-today (7/22) that have been deleted as of today" should be that number, right?

http://bots.wmflabs.org/~wm-bot/logs/#wikimedia-mobile/20140722.txt and http://bots.wmflabs.org/~wm-bot/logs/#wikimedia-mobile/20140723.txt

Mainly focusing on what I said (trimmed for relevance):

[16:51:53] <legoktm> MariaDB [commonswiki_p]> select count(*) from filearchive join logging on log_title=fa_name where log_action = "upload" and log_timestamp >= '20140701000000' group by log_title ;
[16:51:57] <legoktm> 11949 rows in set (7.40 sec)
[16:57:10] <legoktm> MariaDB [commonswiki_p]> select count(*) from filearchive join logging on log_title=fa_name join change_tag on ct_log_id=log_id where log_action = "upload" and ct_tag="mobile edit" and log_timestamp >= '20140701000000' group by log_title ;
[16:57:13] <legoktm> it's running
[16:57:18] <legoktm> 1583 rows in set (9.82 sec)
[16:57:47] JamesR (~JamesR@wikipedia/JamesR) left IRC. (Quit: Goodbye.)
[16:57:52] <legoktm> [16:57:47] <EarwigBot> legoktm: 1583/11949 = 1583/11949 (approx. 0.13247970541467905)
[16:57:57] <legoktm> 13% is a lot.
^ mobile deletions / total deletions

[16:58:35] <Maryana> but my point still stands :)
[16:58:47] <legoktm> how?
[16:58:55] <Maryana> the vast majority of deleted uploads are coming from desktop
[16:59:14] <legoktm> that's because the vast majority of uploads are from desktop
[16:59:24] <Maryana> yes
[16:59:34] <legoktm> what's the upload to deletion ratio of mobile versus desktop?
[16:59:37] <Steinsplitter> mobile uploads are only a fre usable
[16:59:44] <legoktm> that's a more interesting stat IMO

[17:01:00] <legoktm> there were 3060 mobile web uploads in July.
[17:01:04] <legoktm> over half were deleted.

[17:02:24] <legoktm> desktop had 360236 uploads
[17:02:39] <legoktm> [17:02:35] <EarwigBot> legoktm: 11949/360236 = 11949/360236 (approx. 0.033169921940061515)
[17:02:50] <legoktm> so 3% deletion rate on desktop, compared to over 50% on mobile.

Someone should also sanity check my SQL...
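As one possible sanity check, the sketch below shows why an ungrouped join between `filearchive` and `logging` can return more rows than there are distinct deleted titles, which may explain part of the gap between counts such as 14257 and 11949. It runs against an in-memory SQLite database with a deliberately simplified, assumed schema (only the columns used in the queries above) and made-up rows; it is not the real Commons database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE filearchive (fa_name TEXT);
    CREATE TABLE logging (log_title TEXT, log_action TEXT, log_timestamp TEXT);

    -- 'A.jpg' was uploaded twice (upload plus re-upload) and both versions
    -- were deleted; 'B.jpg' was uploaded and deleted once.
    INSERT INTO filearchive VALUES ('A.jpg'), ('A.jpg'), ('B.jpg');
    INSERT INTO logging VALUES
        ('A.jpg', 'upload', '20140702000000'),
        ('A.jpg', 'upload', '20140703000000'),
        ('B.jpg', 'upload', '20140704000000');
""")

# Ungrouped join: every archive row pairs with every upload log row of the same title.
join_rows = conn.execute("""
    SELECT COUNT(log_timestamp)
    FROM filearchive, logging
    WHERE fa_name = log_title
      AND log_action = 'upload'
      AND log_timestamp >= '20140701000000'
""").fetchone()[0]

# Counting distinct titles instead counts each deleted file name once.
distinct_titles = conn.execute("""
    SELECT COUNT(DISTINCT log_title)
    FROM filearchive JOIN logging ON log_title = fa_name
    WHERE log_action = 'upload'
      AND log_timestamp >= '20140701000000'
""").fetchone()[0]

print(join_rows, distinct_titles)  # 5 2
```

Whether re-uploads actually account for the difference on Commons would have to be confirmed against the real tables.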

[17:05:11] <Maryana> legoktm: can you run those numbers for after july 10?
[17:05:36] <legoktm> sure.

[17:06:49] <legoktm> 219237 uploads overall
[17:07:04] <legoktm> 1308 in mobile
[17:07:26] <legoktm> 672 mobile deletions, still over half
[17:08:06] <legoktm> 6319 deletions overall

2.8% deletion rate overall
51% deletion rate on mobile
2.6% deletion rate on desktop (calculated by overall-mobile)
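The three rates can be recomputed from the counts in the log. This is plain arithmetic on the numbers quoted above, with "desktop" taken as overall minus mobile, as stated; the variable names are illustrative.

```python
# Counts for uploads after July 10, as quoted in the IRC log.
overall_uploads, overall_deleted = 219237, 6319
mobile_uploads, mobile_deleted = 1308, 672

# "Desktop" is derived here as everything that is not mobile.
desktop_uploads = overall_uploads - mobile_uploads
desktop_deleted = overall_deleted - mobile_deleted

overall_rate = overall_deleted / overall_uploads
mobile_rate = mobile_deleted / mobile_uploads
desktop_rate = desktop_deleted / desktop_uploads

print(f"overall {overall_rate:.2%}, mobile {mobile_rate:.2%}, desktop {desktop_rate:.2%}")
```

The results agree with the figures above to within rounding (roughly 2.9% overall, 51% on mobile, 2.6% on desktop).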

Patch for lazy-loading bug (bug 68414) checked in:
https://gerrit.wikimedia.org/r/#/c/148560/

There's a decent chance this loophole is responsible for the discrepancy in uploading statistics as well.

(In reply to Maryana Pinchuk from comment #114)

@Lupo, I'm getting the numbers directly from the Commons database. Here's my
SQL if anyone with access to the database wants to double-check:

select fa_name, log_timestamp, fa_deleted_timestamp from filearchive,
logging where fa_name = log_title and log_action = "upload" and
log_timestamp >= '20140701000000'

If you really feel that I'm misrepresenting the numbers, I'm afraid we can't
continue this discussion productively.

@Maryana: I wrote in comment 113 that you were presenting fake or wrong numbers. I did *not* say "you faked" the numbers. Small but important grammatical difference. I just pointed out that the numbers you were using have nothing to do with reality. I didn't say you were misrepresenting your numbers; I said your numbers were bad to begin with.

Besides, from the fact that you kept using and defending these numbers it's evident that you didn't read comment 95 and comment 97 and following, where I pointed out this discrepancy already.

It is not helpful to ignore bug reporters' comments.

BTW, since we're all so focused on "statistics" right now, a cautionary note:

If the statistics say X% were deleted, that does *not* mean that the remaining (100-X)% were OK.

I just went over the mobile/web uploads from July 5 and July 6, where a lot had already been deleted, and I had no trouble at all finding numerous copyright violations that had slipped through the first screening, including very obvious ones such as [[:commons:File:Photo of Hill 2014-07-06 19-59.jpg]] (which is a low-res jpg copy of the "fair use" TV screengrab https://it.wikipedia.org/wiki/File:Augustus_Hill.png ), or [[:commons:File:Scott kazmir 2014-07-06 09-52.jpg]], which is an AP photo: http://www.apimages.com/metadata/Index/Mariners-Athletics-Baseball/e89985f7b99645b48188914a5a60ee7b/47/0 , or [[:commons:File:Perišić with VFL Wolfsburg 2014 2014-07-06 11-54.jpg]], which is a crop of a Getty photo: http://gty.im/478110559 .

Take this just as anecdotal evidence that the curators at the Commons _are_ indeed overworked and stretched beyond their limits.

I just want to emphasize that hosting copyright violations is a crime in certain countries. There are a few bills (DMCA safe harbour) that allow WMF to operate, but this liberal legislation should not be exploited or overstressed. Not to mention how hosting copyright violations harms Commons, as it poses threats to re-users. Every single blatant copyright violation is one too many.

(In reply to Gerrit Notification Bot from comment #63)

Change 143822 merged by jenkins-bot:
Show Uploadbutton only when user has the permission

https://gerrit.wikimedia.org/r/143822

This patch (don't show the upload button on Special:Uploads to users without autoconfirmed status, e.g. if they navigate directly to the page) was merged on July 15. That means it was not part of the July 10 deployment with wmf13. It was deployed to Commons yesterday with wmf14, so that may matter, too, when we discuss the next steps :)

@Mobile-Team: Maybe I overlooked something, but this change isn't listed in the MobileFrontend changelog for MW1.24wmf14. Is that only a documentation error?
https://www.mediawiki.org/wiki/MediaWiki_1.24/wmf14#MobileFrontend

Florian: Actually, that change doesn't really take effect until tomorrow. It was deployed to Commons yesterday, but very few mobile users go to Special:Uploads directly on Commons. Most use it from the Wikipedias, which get the change tomorrow. (Just to clarify, uploads from Special:Uploads on the Wikipedias actually get uploaded to Commons, not the Wikipedias.)

I have no idea why that isn't listed in the changelog. I've confirmed, however, that the change is in effect in wmf14.

Bug 66762 has been fixed, so this bug is no longer needed (by me at least). Reclosing.

Had a quick discussion with the PM, design, and the other developers. The outcome of the discussion was to go ahead and implement some minimum local edit count threshold across all projects (in addition to requiring autoconfirmed status). The threshold will be configurable. We're also considering having a sprint devoted to improving the mobile upload workflow in general (ala Lupo's bug 68375).

Thanks for your undying patience. And yes, Rainer, we know that hosting copyvio images is a serious problem. I'm a Commons administrator myself, so I understand your pain (believe it or not).

Thank you, Ryan, and all others involved. Max already said so at bug 68375.

If you indeed do a sprint on that rather general "improvement" bug, don't forget the (quickly drawn up and surely incomplete) list of use cases in comment 79.

Change 143751 merged by jenkins-bot:
Add Uploadrestriction using edit count

https://gerrit.wikimedia.org/r/143751

@Ryan (and others): Do you think it's a good idea to backport change https://gerrit.wikimedia.org/r/143751 to wmf15, so that it will be deployed on July 29?

@Florian: I think it's fine to let it get deployed normally. FWIW, the uploads graph shows that unique uploaders have been halved again since the lazy-loading loophole was closed last week:
http://mobile-reportcard.wmflabs.org/#uploads_daily-graphs-tab
Also I just noticed that that graph shows "uploaders" not "uploads", which may have been the source of earlier confusion.

Change 150075 had a related patch set uploaded by Kaldari:
Add Uploadrestriction using edit count

https://gerrit.wikimedia.org/r/150075

Change 150075 merged by jenkins-bot:
Add Uploadrestriction using edit count

https://gerrit.wikimedia.org/r/150075

@Florian: Actually, Maryana asked me to go ahead and deploy it to wmf15 so it should be live on the Wikipedias on July 31. We'll also have to deploy a config update sometime between now and then for it to actually have an effect though.

@ryan: Thanks for info!

We'll also have to deploy a config update sometime between now and then for it to actually have an effect though.

Working on it...

Change 150145 had a related patch set uploaded by Florianschmidtwelzow:
WIP: Add Uploadrestriction for Commons in MF

https://gerrit.wikimedia.org/r/150145

Since the LIMN graphs are broken, I ran some numbers manually:

2014-01: 70% deleted (2598 of 3697)
2014-02: 74% deleted (2732 of 3705)
2014-03: 72% deleted (2698 of 3767)
2014-04: 56% deleted (2321 of 4174)
2014-05: 67% deleted (4200 of 6229)
2014-06: 63% deleted (3794 of 5976)
2014-07: 61% deleted (2160 of 3526) (will increase further)

Personally, I think we should shoot for less than 50% deletion rate as a short-term goal.
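
As a quick cross-check of the percentages listed above, the following Python sketch recomputes them from the quoted (deleted, uploaded) pairs and reports which months are at or above the 50% short-term target. The helper names are made up for illustration; the data is copied from the list as posted.

```python
# (month, deleted, uploaded) exactly as listed in the comment above.
monthly = [
    ("2014-01", 2598, 3697),
    ("2014-02", 2732, 3705),
    ("2014-03", 2698, 3767),
    ("2014-04", 2321, 4174),
    ("2014-05", 4200, 6229),
    ("2014-06", 3794, 5976),
    ("2014-07", 2160, 3526),
]


def percent_deleted(deleted: int, uploaded: int) -> int:
    """Deletion rate as a whole-number percentage."""
    return round(100 * deleted / uploaded)


def months_at_or_over(threshold_percent: float) -> list[str]:
    """Months whose deletion rate is at or above the given percentage."""
    return [
        month
        for month, deleted, uploaded in monthly
        if 100 * deleted / uploaded >= threshold_percent
    ]


for month, deleted, uploaded in monthly:
    print(f"{month}: {percent_deleted(deleted, uploaded)}% deleted ({deleted} of {uploaded})")

print("Months at or above 50%:", months_at_or_over(50))
```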

SQL queries used...

Images deleted:
SELECT EXTRACT(YEAR_MONTH FROM log_timestamp) AS month, count(*) AS count FROM logging INNER JOIN tag_summary ON log_id = ts_log_id WHERE tag_summary.ts_tags LIKE 'mobile edit%' AND log_action = 'upload' AND NOT EXISTS ( SELECT img_name FROM image WHERE log_title=img_name ) GROUP BY month;

Images uploaded:
SELECT EXTRACT(YEAR_MONTH FROM log_timestamp) AS month, count(*) AS count FROM logging INNER JOIN tag_summary ON log_id = ts_log_id WHERE tag_summary.ts_tags LIKE 'mobile edit%' AND log_action = 'upload' GROUP BY month;

Change 150145 merged by jenkins-bot:
Add Uploadrestriction for Commons in MF

https://gerrit.wikimedia.org/r/150145

mwalker wrote:

I don't know if I can resolve this fixed or not... but I did just deploy the configuration change.

I don't know if I can resolve this fixed or not

I think we can keep it on patch to review until we have feedback from Commons users and/or new statistics after this change.

Thanks @Matt Walker for the merge.

*On Commons, mobile upload should be only possible for users with an editcount
higher than 75* should fix this definitively. Closing this for now as *resolved*.

Thanks to all for commenting and contributing patches.

Perhaps I missed something, but is this live? For example, recent entry:

(Upload log); 17:49:57 . . R3k813 (talk | contribs) uploaded "File:R3K Official Grafitti Logo 2014-08-03 17-49.jpg" ‎(Contributed image from Special:Uploads) (Tags: Mobile edit, Mobile web edit)

That is the user's first edit to commons. My understanding is that the change presented by this bug should have prevented uploads like that (Or was the change only for the callout from article viewing, and not special:Uploads?).

Or was the change only for the callout from article viewing, and not special:Uploads?

No, with this[1] patch the upload restriction should be on Special:Uploads, too. I have tested it with my account (20 edits on Commons) and I don't have the link to Special:Uploads (and no button to contribute an image if I open Special:Uploads directly) and no button to add an image to an article.

In this particular case, I think the image wasn't uploaded from Commons to Commons. I think he uploaded it from Special:Uploads on his home wiki (en.wiki). As Ryan said in comment 125:
(In reply to Ryan Kaldari from comment #125)

(Just to clarify, uploads from Special:Uploads on
the Wikipedias actually get uploaded to Commons, not the Wikipedias.)

For wikis other than Commons we set a minimum edit count (local, so here on en.wiki) of 10, and the user has 12 edits, so he can use Special:Uploads and upload an image for an article on en.wiki.

[1] https://gerrit.wikimedia.org/r/#/c/143822/

(In reply to Florian from comment #146)

Ah. I misunderstood the change. Thanks for clarifying.

(In reply to Bawolff (Brian Wolff) from comment #147)

Np :)

Maybe @Lupo can give us (after several days) new upload/deletion statistics. We may have to adjust the 10-edit threshold for non-Commons wikis, if needed :)

There is the question whether the restriction will only affect the web interface or also the Apps: [[:c:Commons:Village pump#Mobile upload restriction]]

If I understand it right, https://gerrit.wikimedia.org/r/#/c/150145/ should be active now? So people who are either not autoconfirmed locally or who have less than 10 edits at the local wiki (whether autoconfirmed or not, and 75 if the local wiki is Commons) should not be able to upload through the Mobile/Web interface.

Right so far?
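
To make the rule as summarized above concrete, here is a small Python sketch of the intended eligibility check. It is not MobileFrontend code: the class, the constant names and the wiki ids are hypothetical, and it assumes "at least N edits" (the thread does not say whether the threshold itself is inclusive).

```python
from dataclasses import dataclass

# Thresholds as described in this thread. The names are hypothetical and
# are not actual MobileFrontend configuration variables.
MIN_EDITS_COMMONS = 75
MIN_EDITS_OTHER_WIKIS = 10


@dataclass
class LocalAccount:
    wiki: str            # illustrative ids such as "commonswiki" or "enwiki"
    autoconfirmed: bool  # autoconfirmed status on that wiki
    edit_count: int      # local edit count on that wiki


def may_upload_from_mobile_web(account: LocalAccount) -> bool:
    """Intended rule: autoconfirmed locally AND enough local edits."""
    if not account.autoconfirmed:
        return False
    if account.wiki == "commonswiki":
        required = MIN_EDITS_COMMONS
    else:
        required = MIN_EDITS_OTHER_WIKIS
    # Assumption: the threshold is inclusive ("at least N edits").
    return account.edit_count >= required


# Cases mentioned in the thread (autoconfirmed status is assumed here).
print(may_upload_from_mobile_web(LocalAccount("enwiki", True, 12)))
print(may_upload_from_mobile_web(LocalAccount("commonswiki", True, 20)))
print(may_upload_from_mobile_web(LocalAccount("trwiki", True, 1)))
```

Under this reading, an account with a single edit on its home wiki should be refused, which is why such an upload points to the restriction not being applied.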

Then why can [[:commons:Special:Contributions/Pmirosee]] upload [[:commons:File:Kristen stewart 2014.jpeg]] (a clear copyvio from Getty) in his first edit, when he has exactly one single edit at the Turkish wikipedia ([[:tr:Special:Contributions/Pmirosee]]), and the account was created the same day??

Appears to me that this feature doesn't work yet. Or I don't understand it.

I also do not see any effect on the uploads at all. See the continued statistics at https://commons.wikimedia.org/wiki/Commons:Forum#A_propos_.22mobile_upload.22

(In reply to Lupo from comment #150)

Then why can [[:commons:Special:Contributions/Pmirosee]] upload
[[:commons:File:Kristen stewart 2014.jpeg]] (a clear copyvio from Getty) in
his first edit, when he has exactly one single edit at the Turkish wikipedia
([[:tr:Special:Contributions/Pmirosee]]), and the account was created the
same day??

Hm, that upload does not have the characteristic edit comment, so maybe it came in via Special:Upload (without "s").

However, [[:commons:File:Jessiann Gravel Beland 2014-08-05 11-21.png]] was uploaded by a user whose home account at tr-Wikipedia was created yesterday and who has no edits anywhere except the Commons, where he had 1 edit (before the file got deleted).

So I still think this feature isn't quite working as it should.

No, it couldn't have gone live since July 31.

I'm reopening this because this user [1], who doesn't have 10 edits on any Wikimedia wiki, uploaded this image [2] yesterday (the same day the account was registered). I tested Special:Uploads on en.wiki (opening it directly; the link in the sidebar isn't there, which is OK). There I'm not autoconfirmed and don't have 10 edits, but I can still contribute an image if I want. After rereading the code [3], it's because of $wgMFPhotoUploadEndpoint, which is set to Commons by default [4]. For now, a user can still upload (using Special:Uploads directly) on every wiki except Commons (wherever $wgMFPhotoUploadEndpoint is not empty). This needs to be rewritten.

[1] https://commons.wikimedia.org/wiki/Special:CentralAuth/Chetwilliams
[2] https://commons.wikimedia.org/wiki/File:Alan_Wall_Photography_2014-08-06_22-14.jpg
[3] https://gerrit.wikimedia.org/r/#/c/143751/9/includes/specials/SpecialUploads.php
[4] https://github.com/wikimedia/operations-mediawiki-config/blob/master/wmf-config/InitialiseSettings.php#L11764
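
The loophole described above can be modelled in a few lines of Python. This is a deliberately simplified illustration, not the actual PHP in SpecialUploads.php: both function names are hypothetical, and the "before" variant only encodes the behaviour reported in this comment (a non-empty upload endpoint meant the local permission check was not applied).

```python
def can_contribute_before_fix(upload_endpoint: str, user_can_upload: bool) -> bool:
    """Model of the reported behaviour: with a non-empty endpoint
    (as on wikis that upload to Commons), the permission check is skipped."""
    if upload_endpoint:
        return True
    return user_can_upload


def can_contribute_after_fix(upload_endpoint: str, user_can_upload: bool) -> bool:
    """Model of the intended behaviour: the permission check always applies.
    The endpoint argument is kept only so both functions share a signature."""
    return user_can_upload


# Hypothetical endpoint value, standing in for a non-empty setting.
endpoint = "https://commons.example/api.php"

# A new user who fails the edit-count/autoconfirmed check:
print(can_contribute_before_fix(endpoint, user_can_upload=False))
print(can_contribute_after_fix(endpoint, user_can_upload=False))
```

On Commons itself the endpoint is empty, which would explain why the restriction appeared to work when tested there but not from the Wikipedias.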

Change 152748 had a related patch set uploaded by Florianschmidtwelzow:
Check userCanUpload when wgMFPhotoUploadEndpoint is set

https://gerrit.wikimedia.org/r/152748

Review needed at https://gerrit.wikimedia.org/r/152748 . Could somebody please go take a look?

This should be merged as soon as possible. The success ratio for mobile/web uploads is still very bad. I'd like to have that in 1.24wmf18. See continued statistics at

https://commons.wikimedia.org/wiki/Commons:Forum#A_propos_.22mobile_upload.22

Or -- since the limn statistics have been fixed -- see those:

http://mobile-reportcard.wmflabs.org/graphs/month-uploads 711 uploads in August
http://mobile-reportcard.wmflabs.org/graphs/deleted-uploads 962 deletions

at the time of this writing. (Looks like those statistics are based on the deletion date, not the upload date. At the beginning of August a lot of uploads from July were still in deletion queues. Otherwise, how could there be more deletions than uploads?)

(In reply to Lupo from comment #155)

Otherwise, how could there
be more deletions than uploads?)

I think so, yes: the statistics show how many images were uploaded and how many were deleted per month (no matter when they were uploaded).

Change 152748 merged by jenkins-bot:
Check userCanUpload when wgMFPhotoUploadEndpoint is set

https://gerrit.wikimedia.org/r/152748

Change 154364 had a related patch set uploaded by Florianschmidtwelzow:
Check userCanUpload when wgMFPhotoUploadEndpoint is set

https://gerrit.wikimedia.org/r/154364

Change 154364 merged by jenkins-bot:
Check userCanUpload when wgMFPhotoUploadEndpoint is set

https://gerrit.wikimedia.org/r/154364

The last patch is now backported to MW1.24wmf17 and works for me on commons and wikivoyage.

Closing again as resolved. @Lupo (or/and others) can you reopen this, if there is still no change? :)

Info: Patch successfully deployed on all WMF Wikis (including Wikipedias)

It seems that there are still a lot of useless images. Statistics for the last few days:

  1. Aug: 90% useless
  2. Aug: 82% useless
  3. Aug: 100% useless

The rule implementation itself works well, but it seems that the rules themselves don't help to improve the quality of uploads :/

(In reply to Florian from comment #162)

That is absolutely correct.

All the measures implemented so far have helped reduce the volume of uploads through mobile/web, but they have not improved the success ratio at all.

The hope had been, if I understood this correctly, that by introducing minimum limits for this functionality we would get less clueless people using it, and of those remaining more would have a clue what this is about.

However, this did not happen. Most people using this feature still have no clue about copyrights, free licenses, or the Commons, and still upload the same kind of crap as before July 10. The only difference is that we now get around 25-60 problematic uploads per day instead of 200.

I see nearly no experienced uploaders using the feature. I don't know why that is so, but I might think that more experienced people typically upload more than just one or two pictures at a time and prefer to use the desktop upload wizard or one of the external batch upload tools for that. So mobile/web upload is completely uninteresting for experienced users. (That's my personal speculation, though.)

Most uploaders using the feature are still clueless; many are drive-by uploaders.

As a result, we still have an upload channel through which nearly no useful but many problematic images come in, and that still has to be monitored closely.

Increasing the minimum limits for using the feature will only reduce the volume even more. It will _not_ attract experienced people who know about copyrights and free licenses and the Commons to use mobile/web uploads, and it will also _not_ solve the systemic problem that the workflow is utterly wrong for anything but really self-taken photos. (The workflow caters well to that one use case, but it does nothing at all to discourage the undesirable use cases that are, unfortunately, the norm.)

I therefore would like to take up Max Semenik's promise in bug 68375 comment 7:
Please switch off mobile/web uploads altogether until a better approach has been developed.

Change 156523 had a related patch set uploaded by MaxSem:
Disable mobile uploads

https://gerrit.wikimedia.org/r/156523

I wouldn't disable it completely, if you have autopatrolled rights (or higher) on Commons you should be able to use mobile uploads. Could be extended to similar rights on home (or at least one?) wiki.

(In reply to Denniss from comment #165)

Please provide statistics about the number of autopatrolled users who actually used this feature in the past two months, and about the number of images they uploaded, and the number of images retained from those.

And what are the rules about becoming autopatrolled anyway? At the Commons? At the English Wikipedia? At the Arabic Wikipedia?

It makes no sense to throw still more technical hacks and patches and warts at this very fundamentally broken feature. (It's not broken technically -- but its process is very flawed and it's broken socially.)

Change 156523 merged by jenkins-bot:
Disable mobile uploads

https://gerrit.wikimedia.org/r/156523

As Max said, we've decided to remove all uploading features from the mobile site until the team can devote more focused time to improving the workflow. Please see the thread on the mobile mailing list for more details on the rationale and next steps: https://lists.wikimedia.org/pipermail/mobile-l/2014-August/007927.html

(In reply to Max Semenik from comment #107)

How about: ask uploaders where they took the image from?

  • With options like "I made it myself", "my grandma made it", "found it somewhere on the interwebz".
  • Until they've selected something, don't allow uploading.
  • If they selected something other than "I made it myself", show a very short "copyright for idiots" style tutorial and disallow the upload.

This should filter out people who care but are about to make a mistake and
most of drones, leaving only persistent drones and malicious uploaders.
Representatives of both of these categories can be blocked fairly liberally.

At minimum, it would be good to look to the desktop UploadWizard for guidance on how to communicate what is wanted, and what is required.

It's worthwhile to imagine yourself in a user's shoes (as I hope software designers already do!) Imagine this:

  • You have never heard of Wikimedia or Wikimedia Commons, but you like Wikipedia, and downloaded the Commons app for Android because you saw it was made by the same people.
  • The app offers you, essentially, one feature: upload media.
  • Where in the app's interface, even if you desperately WANTED to, could you gain any insight into WHERE it is uploading to, WHY, or what is desired by that mysterious entity?

I don't think there's any opportunity to learn these things. So are we surprised that the uploads are mostly junk?

Are we concerned about the amount of volunteer effort it takes to process these uploads?

Will the Wikimedia Commons app be taken out of the Android marketplace along with the other Mobile Upload capability?