
Wikipedia talk:WikiCup/Archive/2021/3



Guide to restoring images

Apropos of nothing, I kind of want to talk about my latest image restoration, as I know I work in a field most of you don't, and it might help give some idea of how to find images for articles and such.

The image we're going to be looking at is not the most difficult restoration; sheer size made it take a while (3 days' work, about 4-6 hours a day), but it had a nice set of challenges that might help show how I work.

So, let's start with a quick before and after.

Fun fact: Most damage isn't that visible at thumbnail level. You still need to fix it, though! You want your work widely used, right? That means that it might be printed at poster size, and it needs to look good at that size. The original poster was two feet wide, so a lot of the detail is lost at thumbnail size.

Step one happens before the image is uploaded: Finding it. This was a chance find: Gallica shows related images when you're looking at things, and this came up when I was looking through my file of preselected images next to a poster for the opera Louise. Specifically, File:Poster Louise Opera By Charpentier.jpg - and that's not a bad poster, but it's nowhere near as visually compelling, so I jumped ship.

However, this one wasn't very well documented. I know all of Georges Rochegrosse's work is out of copyright in France, as he died in 1938. However, Gallica doesn't list a date for this poster, and I needed to establish its copyright status in America. So I started off by checking when Rochegrosse retired. No luck. Well, it's a poster for an opera. Those are all over the internet, so I searched for "Pénélope Fauré poster" or words to that effect, and found plenty of sites claiming it to be 1913. Which is promising, but they weren't exactly reliable sites. Then I found Getty Images had a copy of it with Théâtre des Champs-Elysées on the top.

That's enough to identify it. You see, the Théâtre des Champs-Elysées nearly went bankrupt during its run of Pénélope in 1913, and when Pénélope finally came back, it was in other opera houses, and it's one of those operas that weren't revived a lot. This Master's Thesis details its history from the end of the Champs-Elysées production to WWII, at which point we can ignore it because Rochegrosse was dead. Long story short, if this poster is for the Théâtre des Champs-Elysées, it must be from the Paris première, and as that opened and closed in 1913, it must be from 1913. American copyright status is safe.

There's a limit to how much of an image you can fit on a screen while still seeing all the pixels. On my laptop, leaving space for the toolbar right of the image, I get a working area about 1200 pixels wide. Especially with lithographs, where spots and specks in the image itself are... basically the definition of a lithograph... the worst part of the image tends to be the paper outside the lithograph. This image has drawn borders, and elements like Odysseus' hand and the publisher's name that go outside of them, so there needs to be a certain margin outside the borders. As the final image is 7012x9532 px, the borders are not insubstantial, measuring about 300px wide (which is pretty close to the original printed effect of the poster). Any marks or stains in the border will be visible, whereas a lot of the ones in the lithograph will just... blend in. That's not to say there weren't several microtears, scuffs and other damage to the image, but the borders were the first priority.

I uploaded the original at 12:24, 14 January 2022, and we can follow my work from then on through the upload history of File:Georges Rochegrosse - Poster for Gabriel Fauré's Pénélope (1913).png. PNGs are lossless files, so, whereas repeatedly saving my work as a JPEG would cause progressive degradation, uploading PNGs as I go keeps an accurate copy of my work if, for example, GIMP manages to crash while saving my file, causing it to be corrupted. That only needs to happen once for you to never want it to happen again.

The first upload, at 16:38, 14 January 2022, is representative of your standard first preparations. I'd corrected some large patches of damage (some long tears and scuffs in the paper, which are a pain to fix when you're doing a more systematic pass through) and cleaned up the whole top border (with some bits of the image I could see under that). As I said, borders... are the most annoying part, so getting them done first helps a lot. This time was also spent getting a feel for the image, getting an idea of what's normal (a certain amount of white speckling for lighter colours, for instance), and what needs to change. Grounding yourself before you start is very important, or you will be throwing work out.

I was having some really weird sleep issues that day, so my next upload happened after I had slept for a bit, at 02:31, 15 January 2022. This one has the comment "Done for x<2400 and x>6200." I tend to do my general editing in long, overlapping vertical strips. It keeps me focused and makes sure all of the image receives work. I started by doing the left border, and I believe the bottom border while I was at it, roughed in the right border, and did a few stripes on the left hand side. (x as in x axis: x<2400 means that I cleaned up the leftmost 2400 pixels - actually, probably more like 2700, as I always remove 300 px or so to make the next vertical strip overlap a bit more.) I think there was a short break, and when I came back, I did the right-hand border, and worked left from there. This was actually working fairly well - I like focusing on the right hand side of the monitor a little more, so you'll see that on the upload two hours later, I was still working my way left, and continued doing so, with the left hand side of the image getting no more work for a while after that, but three hours later I had done everything right of x=4800. I then stopped for the night. Yes, my sleep schedule is terrible.
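The overlapping-strip approach described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the tool or method actually used: the function name `vertical_strips` and its parameters are hypothetical, and the defaults (a 1200 px working width and a 300 px overlap) are taken from the figures mentioned in this post.

```python
def vertical_strips(image_width, view_width=1200, overlap=300):
    """Return (left, right) pixel ranges of overlapping vertical strips.

    view_width and overlap are assumptions based on the figures in the
    post: a working area about 1200 px wide, with about 300 px of overlap
    between neighbouring strips.
    """
    if not 0 <= overlap < view_width:
        raise ValueError("overlap must be non-negative and smaller than view_width")
    step = view_width - overlap  # how far each new strip moves to the right
    strips = []
    left = 0
    while True:
        right = min(left + view_width, image_width)
        strips.append((left, right))
        if right >= image_width:  # this strip reaches the right edge
            break
        left += step
    return strips


# Strips for an image 7012 px wide, the width of the final poster file.
for left, right in vertical_strips(7012):
    print(f"x from {left} to {right}")
```

Working through the strips in order (or from both edges inward, as happened here) guarantees every column of pixels is looked at at least once, and the overlap gives a second look at anything that sat right at the edge of the screen.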

Somewhere in that, by the way, I noticed that Media Viewer was still stripping my name from the credit information of images, and finally wrote Wikimedia Legal, kind of pointing out that the coders have been stalling on this for literally 7 years, and, given some of my images are Attribution-required, maybe they need to get on to fixing this? Since I live in Britain, which gives new copyrights at the drop of a hat, I can't help but gain a new copyright, and while I want my work used, it's kind of weird that the only person who might have an active copyright in these things, much as I'm willing to waive it, is the only person who's getting stripped out of the attribution credit by Media Viewer. We'll see how that goes. Seriously, though... I care more about my work being available to people than my moral rights being violated, but it'd be nice if Wikipedia cared a fucking jot about them.

When I woke up, I started working from the left hand side again, because the figure of Odysseus had a lot of the worst damage, and, psychologically, it's kind of easiest to do the most difficult work last, as it's one of those "Once I've done this, I'm done" things that helps motivate you. So, sometime before 5 pm on the 15th, I went back to work, and - with a few breaks - largely finished at 9pm. The last things to do were a rotation - the image was slightly off - and a colour adjustment to remove some of the aging of the poster's paper so it looked more fresh and new. I had mostly worked out the colour adjustment and saved it when uploading a mockup JPEG early in this process - a good idea if you want to get things to articles before you're done with the work.

That said, this part took ages. 7000x9500 pixels is a very, very big image, and I have a five year old laptop. It took half an hour to process it each time. And I had to do it multiple times because I wanted to get it perfect. So, babysitting it and going away, leaving it to process, making some changes, going back to the start, re-rotating it and doing a slightly different colour palette because I decided I had over-adjusted but hadn't saved the rotated version.... I was pretty much having my laptop process some sort of image adjustment constantly between 9pm and a bit before the 3:47 am time today when I got the final image uploaded. I'd say the last hour or so was the final few fixes.
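To give a rough sense of why an image this size is slow to process, here is a back-of-the-envelope estimate of its uncompressed size. It assumes 8-bit RGB (3 bytes per pixel), which is an assumption rather than something stated above; the helper name `raw_size_bytes` is hypothetical, and an image editor will typically need several times this much memory once layers and undo history are involved.

```python
WIDTH, HEIGHT = 7012, 9532  # final image dimensions quoted in the post


def raw_size_bytes(width, height, channels=3, bytes_per_channel=1):
    """Uncompressed size of one flat image layer.

    channels=3 and bytes_per_channel=1 assume 8-bit RGB with no alpha
    channel; the actual working format is not stated in the post.
    """
    return width * height * channels * bytes_per_channel


pixels = WIDTH * HEIGHT
size = raw_size_bytes(WIDTH, HEIGHT)
print(f"{pixels:,} pixels, about {size / 2**20:.0f} MiB per uncompressed layer")
```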

Lastly, check the original site and copy over any missing information. Publisher and size of the poster hadn't been put on the image page, so I added them.

And that is my process for doing an image restoration. Adam Cuerden (talk)Has about 7.5% of all FPs 05:08, 16 January 2022 (UTC)

@Adam Cuerden: You're awesome and I'm glad you're back! Kingsif (talk) 05:20, 16 January 2022 (UTC)
Ach, thanks. As long as this frustration doesn't get to me - and it shouldn't - I should be here a fair bit until the next inevitable burnout. Really can't help burning out sometimes - it's a lot of work for seemingly ever-decreasing reward, but I'll always be back eventually. Adam Cuerden (talk)Has about 7.5% of all FPs 06:14, 16 January 2022 (UTC)
Thanks for your most interesting account. Amazing. Cwmhiraeth (talk) 06:50, 16 January 2022 (UTC)
A really interesting account—surprised by the order of a few things, like the colour adjustment coming last. In a way, lots of the steps/motivation/feelings are the same as how I feel in a solid 7-hour chunky article creation or bringing-to-GAN session, and yet it's such a different set of actions. Though in a flight of fancy I might consider some of my writing a work of art, carefully shaped and refined and generously bequeathed to readers across the world. ;) — Bilorv (talk) 18:38, 16 January 2022 (UTC)
@Bilorv: Color adjustments are quite subjective, and if someone disagrees with them, it's nice to have a restored copy without the adjustment. Adam Cuerden (talk)Has about 7.5% of all FPs 02:15, 17 January 2022 (UTC)
The image in question

So, I'm not going to cover the image restoration for this one. But I think it shows how one can find images for articles.

So, the spark here was the process of distributing the Pénélope image I discussed before. Since I knew the near bankruptcy of the Théâtre des Champs-Elysées was what ended Pénélope's run, I visited there, and, reading only slightly further, learned it was the site of the Congress of Allied Women on War Service.

Now, that sounded interesting. Feminist activity during the First World War is one of those things not heavily reported nowadays, but interesting and arguably consequential: there are reasons a lot of women's suffrage happened after the First World War.

It was in France, so Gallica is the obvious archive. Now, one thing about these sorts of meetings is that history tends to give them definitive titles, whereas at the time, they have loads of variations, especially in translation. But we have the French name: "Congrès des Femmes alliées au service des œuvres de guerre". Plug that in and... Nothing. Okay. Let's try "Congrès des Femmes alliées 1918". Still a load of nothing. Well, it sounds like that was the only notable thing happening in the Théâtre des Champs-Elysées at the time. "Théâtre des Champs-Elysées 1918"? Looks semi-promising, limit to images....

Bingo. https://gallica.bnf.fr/ark:/12148/btv1b53019500h/f1.item.r=Théâtre des Champs-Élysées 1918.zoom. "Allied Women on War Service Conference". Date matches, place is right, the poster's vague "watch this week" isn't quite accurate, but there were apparently delays. Running it by a subject expert to confirm.

The signature is much easier to identify when you can just look at the filename.

The next problem is the signature. Gallica doesn't list an artist, and the signature is a little hard to read. I first read it as "A. Mein" and tried that. No-one from the right era. It looked a bit like "McMein", but I thought I was just imagining things. So I just googled some text from the poster, found a credit for "McMein". Oh. Well, there we are then. Search for "McMein WWI" and Neysa McMein comes up. A moment looking at her incredibly distinctive signature on a few other works by her, and I knew I had solved the riddle. Adam Cuerden (talk)Has about 7.5% of all FPs 07:34, 18 January 2022 (UTC)

Very interesting. Cwmhiraeth (talk) 07:26, 19 January 2022 (UTC)
I wouldn't count on this one immediately, by the way. Remember how Pénélope was almost too much for my laptop when I did colour adjustments? Well... This is about twice as big. Adam Cuerden (talk)Has about 7.5% of all FPs 13:56, 19 January 2022 (UTC)