
Better protect against data loss and corruption during file uploads
Open, Medium, Public

Description

Recording file uploads is a complex process involving updates to different systems that need to be kept consistent. The current transaction logic is flawed and can lead to data loss and corruption under certain circumstances; see T263301#6487019.

This could be prevented by re-structuring the upload process as follows:

Stage one:

  1. start db transaction (do not use a deferred update)
  2. determine the archive name of the file and insert a row into oldimage, based on data in the image table
  3. copy the current version to the archive name. Do not use "quick" operations.
  4. commit db transaction (really flush! we must know this is permanent before overwriting the current version of the file!)
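
A rough sketch of stage one, in Python-style pseudocode. The `db` and `storage` objects, their method names, the archive-name format, and the exact column handling are all illustrative assumptions, not MediaWiki's actual upload code:

```python
from datetime import datetime, timezone

def archive_current_version(db, storage, name):
    """Stage one: record and copy the current version before it is overwritten."""
    db.begin()  # explicit transaction, not a deferred update
    current = db.select_row("image", {"img_name": name})

    # Archive name derived from the current time plus the file name
    # (illustrative; the exact format does not matter for the ordering).
    ts = datetime.now(timezone.utc).strftime("%Y%m%d%H%M%S")
    archive_name = f"{ts}!{name}"

    # Insert the oldimage row, built from the image row we just read.
    db.insert("oldimage", {
        "oi_name": name,
        "oi_archive_name": archive_name,
        "oi_timestamp": current["img_timestamp"],
        "oi_sha1": current["img_sha1"],
        # ... remaining oi_* fields copied from the corresponding img_* fields
    })

    # Copy (not move) the current file into the archive location. Use a durable,
    # non-"quick" operation so a failure surfaces before we commit.
    storage.copy(f"public/{name}", f"archive/{archive_name}")

    # Commit and really flush: the oldimage row and the archived copy must be
    # known to be permanent before stage two overwrites the current version.
    db.commit(flush=True)
    return archive_name
```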

Stage two:

  1. start db transaction (do not use a deferred update)
  2. determine the metadata for the new version and insert or update the row in the image table
  3. copy the new version into the primary location. Do not use "quick" operations.
  4. commit db transaction (really flush!)
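
Stage two could then look like this, under the same assumptions (hypothetical `db`/`storage` interfaces, simplified metadata handling):

```python
def publish_new_version(db, storage, name, new_file_path, new_meta):
    """Stage two: record the new version's metadata, then overwrite the file."""
    db.begin()  # explicit transaction, not a deferred update

    # Update the image row (or insert it for a brand-new file) with the new
    # version's metadata: size, dimensions, sha1, timestamp, and so on.
    db.upsert("image", {"img_name": name, **new_meta})

    # Copy the new version into the primary location, again with a durable,
    # non-"quick" operation. The previous bits are already safe in the archive,
    # so a crash here loses nothing.
    storage.store(new_file_path, f"public/{name}")

    db.commit(flush=True)  # really flush
```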

Stage three:

  1. schedule jobs for thumbnail generation (in a deferred update?)

This should prevent any data loss. However, if we fail before stage two is committed, we end up with an extra row in oldimage, which is visible to users. It would point to a copy of the current version and have the same metadata. We could try to detect this during the next upload and remove such a row. This would be even easier with a unique index over oi_name and oi_timestamp (plus perhaps oi_sha1).
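
The cleanup of such an orphaned row could look roughly like this. The matching heuristic (same oi_timestamp and oi_sha1 as the current image row) follows directly from the description above; the `db`/`storage` interfaces and the DDL in the trailing comment are assumptions, not existing schema or code:

```python
def remove_orphaned_oldimage_rows(db, storage, name):
    """Detect and remove an oldimage row left behind by a failure before
    stage two committed (heuristic sketch)."""
    current = db.select_row("image", {"img_name": name})

    # An orphaned row points at an archive copy of the *current* version and
    # carries the same metadata, so matching on timestamp and sha1 finds it.
    orphans = db.select("oldimage", {
        "oi_name": name,
        "oi_timestamp": current["img_timestamp"],
        "oi_sha1": current["img_sha1"],
    })
    for row in orphans:
        storage.delete(f"archive/{row['oi_archive_name']}")
        db.delete("oldimage", {
            "oi_name": name,
            "oi_archive_name": row["oi_archive_name"],
        })

# A unique index, as suggested above, would make such duplicates easier to
# detect or prevent outright, e.g. (assumed DDL):
#   CREATE UNIQUE INDEX oi_name_timestamp ON oldimage (oi_name, oi_timestamp);
```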

Event Timeline

Hey @daniel, thanks for creating this!

As part of my work on T264189, I detected a few drifts between the image (file) metadata database and the file backend (Swift). While this is a bit off-topic for this ticket, I wonder if we could work together to try to bring those old drifts down to zero. I can do most of the work identifying them, but I may need assistance correcting them.