Wikisource:Scriptorium

From Wikisource
Latest comment: 3 hours ago by Richard Arthur Norton (1958- ) in topic No redirects from Portals to Author
Scriptorium

The Scriptorium is Wikisource's community discussion page. Feel free to ask questions or leave comments. You may join any current discussion or start a new one; please see Wikisource:Scriptorium/Help.

The Administrators' noticeboard can be used where appropriate. Some announcements and newsletters are subscribed to Announcements.

Project members can often be found in the #wikisource IRC channel webclient. For discussion related to the entire project (not just the English chapter), please discuss at the multilingual Wikisource. There are currently 467 active users here.

Announcements


Proposals


Bot approval requests


For meta:Global reminder bot - the bot will rarely run here, but this wiki requires explicit authorisation, so putting it here. Please ping me in a response. The bot flag is NOT required. Leaderboard (talk) 09:34, 7 November 2024 (UTC)Reply

Repairs (and moves)


Designated for requests related to the repair of works (and scans of works) presented on Wikisource

See also Wikisource:Scan lab

The existing scan is incomplete, so I will be replacing it with a complete version. To support this, please carry out the following page moves.

  • Index page name = Index:Mathematical collections and translations, in two tomes - Salusbury (1661).djvu
  • Page offset = 1 (i.e. /10 moves to /11)
  • Pages to move = "10-456"
  • Reason = "inserted missing pages"

Thanks Chrisguise (talk) 16:28, 4 September 2024 (UTC)Reply

@Chrisguise: Done Xover (talk) 16:47, 4 September 2024 (UTC)Reply
Thanks. I've uploaded the new file. Chrisguise (talk) 18:25, 4 September 2024 (UTC)Reply
Hi, really sorry about this but I missed a couple of variations when I requested the move above. Could you please do the following:—
  • Index page name = Index:Mathematical collections and translations, in two tomes - Salusbury (1661).djvu
  • Page offset = 1 (i.e. /115 moves to /116)
  • Pages to move = "115-274"
  • Page offset = -1 (i.e. /409 moves to /408)
  • Pages to move = "409-454"
  • Reason = "realigned pages"
  • Delete = /705 & /706
  • Reason = "pages not in work"
Thanks Chrisguise (talk) 14:21, 5 September 2024 (UTC)Reply
Hi (again). Could you hold fire on this request. There's something odd going on with the index page. When I click on some pages they show a page image from the new file (gold trim to covers is visible) and some show the original file (plain brown cover edges visible). I've tried purging the Commons page where the file resides and the individual pages on WS, and things seem to be improving (slowly) but everything has clearly not properly updated yet. Chrisguise (talk) 14:42, 5 September 2024 (UTC)Reply
@Xover Whilst there are still one or two pages of the old scan showing, they do not affect the pages needing to be moved. Could you do the two moves and deletion set out above? Thanks, Chrisguise (talk) 15:41, 10 September 2024 (UTC)Reply
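For a shift like the ones requested above, the order of the individual moves matters: with a positive offset the highest page has to move first, otherwise each move would land on a page that has not been vacated yet. The following Python sketch only illustrates that ordering. `plan_page_moves` is a hypothetical helper, not an existing Wikisource tool or bot API, and it assumes the usual `Page:<file name>/<number>` title pattern and that any target outside the moved range is free.

```python
def plan_page_moves(file_name, first, last, offset):
    """Return (source, target) page titles in a collision-safe order.

    Hypothetical helper for illustration; it performs no edits.
    """
    if offset == 0 or first > last:
        return []
    numbers = range(first, last + 1)
    # Shifting up: start from the last page so each target is already vacated.
    # Shifting down: start from the first page for the same reason.
    ordered = reversed(numbers) if offset > 0 else numbers
    return [
        (f"Page:{file_name}/{n}", f"Page:{file_name}/{n + offset}")
        for n in ordered
    ]


if __name__ == "__main__":
    name = ("Mathematical collections and translations, "
            "in two tomes - Salusbury (1661).djvu")
    # First three moves of the "+1" request for pages 115-274.
    for source, target in plan_page_moves(name, 115, 274, 1)[:3]:
        print(source, "->", target)
```

For the "-1" request (pages 409-454) the same function walks upward from /409, so each page moves into the slot just freed by its predecessor.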

Recently found that a near-complete scan of the fanzine this story appeared in (and confirming that it was printed with no copyright notice) was on the internet, so the existing partial scan can be moved to the new index.

-ei (talk) 00:38, 26 October 2024 (UTC)Reply

Spectacles and eyeglasses, their forms, mounting, and proper adjustment


I foolishly renamed the PDF on Commons while transcribing, and it broke stuff.

What should I have done? HLHJ (talk) 15:12, 11 November 2024 (UTC)Reply

Thanks to MarkLSteadman for fixing it. HLHJ (talk) 15:53, 11 November 2024 (UTC)Reply
Done with transclusion updated. Let me know if I missed anything or you have any issues. MarkLSteadman (talk) 16:30, 11 November 2024 (UTC)
Will do, but it all seems to be working perfectly now. HLHJ (talk) 03:47, 12 November 2024 (UTC)Reply

I think these files can be moved to Commons as they should be in the public domain in the UK too. Among all the editors and authors for volume 1, the latest death year is 1938 which puts it in the public domain in the UK and its publication date of 1902 puts it in the public domain in the US. The list of authors by chapter can be found here. Can someone validate this and move the files to Commons?

For volume 2, the latest death year is 1948, which should also be in the clear. Ciridae (talk) 17:59, 14 November 2024 (UTC)Reply

Other discussions


Looking for Indiana/Indianapolis pages to transcribe or proofread


In preparation for the WikiConference North America 2024 editing challenge, I'm looking for any pages within the Indiana or Indianapolis (where the conference will be held) to transcribe or proofread. Can someone point me to outstanding tasks or categories which fall within this scope? OhanaUnitedTalk page 14:33, 1 September 2024 (UTC)Reply

Two other ones, more specifically about Indianapolis: Early Indianapolis, 1919 (27 p.), and Centennial History of Indianapolis, 1920 (72 p.). — Alien333 (what I did · why I did it wrong) 22:34, 1 September 2024 (UTC)
Works by Jacob Piatt Dunn could be of interest. See c:Category:Jacob Piatt Dunn. —Justin (koavf)TCM 19:26, 2 September 2024 (UTC)Reply
Thanks for these ideas. I think I will use Index:A standard history of Lake County, Indiana, and the Calumet region (IA standardhistoryo01howa).pdf as demonstration and tutorial. Can someone proofread a few pages for this book so that I can also demonstrate validation to the audience? @Alien333: I like your suggestion for Early Indianapolis, 1919 as the total number of pages is small enough that it can be tackled by all conference participants. Can you set up that publication? OhanaUnitedTalk page 22:25, 2 September 2024 (UTC)Reply
Done, here it is: Index:Early Indianapolis.djvu. (You may want to familiarize yourself with WS:SG and H:T for formatting.) Cheers, — Alien333 (what I did · why I did it wrong) 23:19, 2 September 2024 (UTC)
The text doesn't seem to be automatically transcribed. Are there additional steps needed to OCR the text? OhanaUnitedTalk page 02:54, 3 September 2024 (UTC)Reply
There's a button on the top right, marked "Transcribe text". — Alien333 (what I did · why I did it wrong) 11:23, 3 September 2024 (UTC)
If you update the file description page with the actual IA identifier I can regenerate the DjVu with a OCR text layer (I have custom tooling for that). Xover (talk) 12:27, 3 September 2024 (UTC)Reply
Done, assuming you only wanted it to be written somewhere. — Alien333 (what I did · why I did it wrong) 12:48, 3 September 2024 (UTC)
@Alien333: Done Xover (talk) 14:11, 3 September 2024 (UTC)Reply
(@Koavf: Early Indianapolis was supposed to be for an editing contest next month. Don't do the next one we set up, will you :) ?)
@OhanaUnited: We're going to have to set up another one, as Koavf already did this one. Fine with Centennial History of Indiana?
@Xover: For curiosity's sake, what do you use for OCR? — Alien333 (what I did · why I did it wrong) 15:55, 3 September 2024 (UTC)
Oh sheesh. What a moron. :/ Sorry guys, I just totally misread this. If you really want, I wouldn't be offended if you deleted my work. What an idiot. —Justin (koavf)TCM 15:57, 3 September 2024 (UTC)Reply
I think A standard history of Lake County, Indiana, and the Calumet region should have sufficient number of pages for proofreading and Early Indianapolis for validating. It actually makes the contest verification process simpler by splitting tasks into different books.
@Koavf: it's fine. We will let the editing contest participants focus validation on Early Indianapolis.
@Alien333: I don't see the button that says "transcribe text". Do you need specific userrights to use this tool? Or only on pages with no text transcribed? We are (ok, it's just me at the moment) planning to do an introduction workshop to different sister projects and the newbies are going to ask the same kind of questions as myself. OhanaUnitedTalk page 17:30, 3 September 2024 (UTC)Reply
Once again, I have failed upward. Thanks for your graciousness: I'll be sure to not touch that other work. —Justin (koavf)TCM 17:32, 3 September 2024 (UTC)Reply
Errm, are you sure that when you edit a page in Page: namespace, you don't see anything that looks like what's described there? It's not a very visible button, but normally it should be there. This is supposed to work with all skins, and to need nothing special.
If we're lucky, Xover or someone else will graciously do the OCR beforehand, but this really, really isn't supposed to happen. — Alien333 (what I did · why I did it wrong) 17:40, 3 September 2024 (UTC)
Nope. I don't see anything like that on the top right. I remember seeing this button during demonstration session at Wikimania Singapore last year. I see buttons like page logs, analysis, search and subpages. I'm on Firefox and even tried Chrome (and god forbid, Microsoft Edge). But definitely nothing related to OCR on my end! That's why I thought I didn't have the userrights to do OCR yet. OhanaUnitedTalk page 05:16, 4 September 2024 (UTC)Reply
Would you mind uploading (locally) a screenshot of what the editing window looks like for you in Page: namespace? If the OCR button is missing you might have other problems. — Alien333 (what I did · why I did it wrong) 07:33, 4 September 2024 (UTC)
Here is how it looks on my end. OhanaUnitedTalk page 21:53, 4 September 2024 (UTC)Reply
"facepalm" Aaand in the end it was just a question of enabling the editing toolbar (also called 2010 wikitext editor, in the Editing section of preferences). I'd forgotten that that toolbar was not default. Note: while we're at it, gadgets that are not default but that I think greatly facilitate editing experience and should be recommended to new editors (by label):
  • Preload useful templates such as header, textinfo and author in respective namespaces.
  • Add a toolbar button to check for and insert a paragraph-breaking {{nop}} at the end of the previous page.
  • Running headers: Load running headers from surrounding pages
Alien333 (what I did · why I did it wrong) 22:25, 4 September 2024 (UTC)
Yes. It is now showing on my end. Why isn't the editing toolbar enabled by default? And is there a particular "preferred" OCR? Or pick the one that does the best job for that page?
I think what you described regarding headers and textinfo may be too advanced for complete newbies to the project. The purpose of the tutorial session and the editing challenge is to get existing editors to try their hands on editing in other sister projects. OhanaUnitedTalk page 16:56, 5 September 2024 (UTC)Reply
For normal text, use the Google OCR mode; it has the best accuracy most of the time. It has a few drawbacks, and in those cases it's better to use Tesseract:
  • It's quite bad at locating columns of text, e.g. it gives "texta textb textc textd" for a two-column layout like this:
        texta    textb
        textc    textd
    rather than the correct "texta textc textb textd". This is also a problem for TOCs.
  • It always transcribes Small Caps as CAPITALS, which means you have to retype it, whereas Tesseract at least tries to render it in the correct case, so if you've got lots of smallcaps you might want tesseract.
Alien333 (what I did · why I did it wrong) 17:38, 5 September 2024 (UTC)
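The column problem described above comes down to reading order. The short Python sketch below is only an illustration of that idea and not how either OCR engine works internally: the word boxes, their coordinates, and the `column_split` value are all made-up assumptions.

```python
def row_major(boxes):
    """Read boxes left to right, top to bottom, ignoring columns."""
    return [text for text, x, y in sorted(boxes, key=lambda b: (b[2], b[1]))]


def column_major(boxes, column_split):
    """Read the left column top to bottom, then the right column."""
    left = sorted((b for b in boxes if b[1] < column_split), key=lambda b: b[2])
    right = sorted((b for b in boxes if b[1] >= column_split), key=lambda b: b[2])
    return [text for text, x, y in left + right]


if __name__ == "__main__":
    # (text, x, y) for a two-column page: texta/textc on the left,
    # textb/textd on the right. Coordinates are invented for the example.
    boxes = [
        ("texta", 0, 0), ("textb", 100, 0),
        ("textc", 0, 20), ("textd", 100, 20),
    ]
    print(" ".join(row_major(boxes)))                       # order that ignores columns
    print(" ".join(column_major(boxes, column_split=50)))   # order that respects columns
```

The first call reproduces the unwanted "texta textb textc textd" order, the second the intended "texta textc textb textd".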
Just finished the sister projects presentation, which included Wikisource. 3 new editors joined during the session, including one who's fluent in multiple languages and has participated in non-English Wikisource language projects. There will be more activity by new editors between now and Sunday as they participate in the editing challenge. OhanaUnitedTalk page 18:51, 4 October 2024 (UTC)
The first 48 pages of Index:Constitution of the state of Indiana and of the United States (IA constitutionofst00indi 0).pdf have the 1851 constitution of Indiana, with footnotes. We are always looking to back more copies of constitutional documents with scans of published copies. --EncycloPetey (talk) 21:17, 4 October 2024 (UTC)

We have finished the challenge. About 6-8 editors contributed to Wikisource part of the challenge. And out of those contributors, 2-4 are brand new editors. During the tutorial session, I have observed Japanese, Korean and Spanish Wikisource being edited as well. Next year's event will take place in New York City, and you will find me bugging you guys again this time next year :) OhanaUnitedTalk page 21:24, 21 October 2024 (UTC)Reply

Bars and manicules and other old timey items

☞

Do we have a page that shows the bars and manicules and other old timey page flourishes, so I can match the closest ones to use in a transcription? They are easier to match by sight rather than by name. Do we have a Help:Flourishes or Help:Visual Elements, or something similar? Is there a collective name for all these types of visual elements? RAN (talk) 01:52, 16 October 2024 (UTC)Reply

I'm not aware of such a page, and it sounds like a good idea! The following pages might be helpful:
CalendulaAsteraceae (talkcontribs) 22:19, 16 October 2024 (UTC)Reply
I'm not exactly sure how they could be organized. But, it'd be an extremely helpful page. I'd say, let's be bold and create it. SnowyCinema (talk) 23:03, 16 October 2024 (UTC)Reply
I just looked these up. They don't look very old timey though. I made a list of things I needed for a while. User:RaboKarbakian/Symbols. There was an extra special challenge ( ) to get the text (and not emojified) astrological symbols.--RaboKarbakian (talk) 01:35, 17 October 2024 (UTC)Reply
It's a shame that Unicode character charts aren't necessarily license-compatible with Wikisource (or possibly in scope, for that matter). ShakespeareFan00 (talk) 20:36, 18 October 2024 (UTC)
There are now lots of (partial) license-compatible Unicode implementations. They would have charts we can use. HLHJ (talk) 15:41, 18 November 2024 (UTC)Reply
See also Dingbat and Commons:Category:Typographic ornaments. The alternate name "printers' ornament" may also be useful for searching. HLHJ (talk) 15:48, 18 November 2024 (UTC)Reply
IIRC there's also a BIG chart/list of symbols attached to a US GPO style manual that various contributors here tried to put the relevant Unicode symbols in? ShakespeareFan00 (talk) 13:42, 17 October 2024 (UTC)
ShakespeareFan00, you mean this: U.S. Government Printing Office Style Manual/Signs and Symbols, also, thank you for fixing the sun and moon!--RaboKarbakian (talk) 14:21, 17 October 2024 (UTC)Reply
The "floral heart" in Unicode is termed an "aldus leaf" on Commons, and there is a category of various orientations there. --EncycloPetey (talk) 20:37, 18 October 2024 (UTC)Reply
Side note, but while searching I found both {{manicule}} and {{finger}}. These two probably need to be merged. — Alien333 05:16, 17 October 2024 (UTC)
Good catch! —CalendulaAsteraceae (talkcontribs) 06:03, 17 October 2024 (UTC)Reply
I recently created {{fleuron}}, which may be of interest to this discussion. —Beleg Tâl (talk) 22:33, 22 October 2024 (UTC)Reply
Also of note: Category:Special character templates —Beleg Tâl (talk) 22:34, 22 October 2024 (UTC)

Due to formatting issues and other problems. Help would be appreciated due to its gargantuan size. Booklover09097 (talk) 19:26, 18 October 2024 (UTC)

Can you clarify? —Justin (koavf)TCM 23:38, 18 October 2024 (UTC)Reply
Moved from Wikisource:News/2024-10 where it was placed incorrectly. --Jan Kameníček (talk) 22:00, 18 October 2024 (UTC)
This project, being very close to my heart, begs the question. What are you talking about, and what are the specific issues? — ineuw (talk) 04:20, 24 October 2024 (UTC)Reply

how accurate is transcribe?


Whenever I'm on a page and the text is juttery and clunky, I press transcribe and the text looks pretty good. But how accurate is the transcribe button in relation to the text? Booklover09097 (talk) 09:29, 19 October 2024 (UTC)

That is mostly dependent on the quality of the image. With lower resolutions, it gets garbled.
It also depends on the OCR engine used. For on-site OCR, Google OCR is the best one for character accuracy, although it has trouble with columns.
What you see when you create a page is the OCR that was embedded in the file before upload. The best engine off-site is probably Tesseract. The people who did that might have not used the best OCR (or it was not available at the time).
Even though it is sometimes pretty good, there is no guarantee of accuracy, and editors are expected to check. — Alien333 09:51, 19 October 2024 (UTC)
@Booklover09097: Which book are you talking about? There are a lot of PDF files exported from https://archive.org (IA), which had been optimized for extreme size reduction at the expense of their quality. But thankfully IA typically also has the original high quality images of the scanned book pages for download, and they can be used to create a better quality PDF or DjVu files. --Ssvb (talk) 13:10, 22 October 2024 (UTC)Reply
In my experience with IA and familiarity with OCR technology, five factors influence the quality. Image clarity, scanning method, scanning equipment, scanning software, and optics. Many early documents at IA were scanned manually. With automation, the technical particulars became available on the IA download page.
One additional note. Initial IA scanning equipment used one OCR software for English, and another for accented Latin languages. Since English academic documents reference other languages, look closely before applying Tesseract OCR. It is not always wanted. — ineuw (talk) 05:12, 24 October 2024 (UTC)Reply
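One rough way to put a number on "how accurate" for a given page is to compare the raw OCR output with the text after proofreading. The Python sketch below computes a character error rate from the Levenshtein edit distance. It is an illustrative assumption about how one might measure this, not something the Transcribe button or any Wikisource tool reports, and the function names are made up for the example.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute or keep
            ))
        previous = current
    return previous[-1]


def character_error_rate(ocr_text, proofread_text):
    """Edit distance divided by the length of the proofread text."""
    if not proofread_text:
        return 0.0 if not ocr_text else 1.0
    return edit_distance(ocr_text, proofread_text) / len(proofread_text)


if __name__ == "__main__":
    # Invented sample strings; a real check would use a page's OCR and its
    # proofread wikitext.
    print(character_error_rate("tbe quick brown f0x", "the quick brown fox"))
```

A lower value means the OCR was closer to the proofread page; it says nothing about formatting, which still has to be checked by eye.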

I added Show Boat as a book to Wikisource


Please help me format it. Thankfully it's in the public domain because it was published 1926. Blahhmosh (talk) 04:18, 21 October 2024 (UTC)Reply

I added a few others. Please proofread. Blahhmosh (talk) 04:35, 21 October 2024 (UTC)Reply
Also, is the Amerika: The Missing Person translation in the public domain? I know the original German version is, but the American one was published 1938 or something. Blahhmosh (talk) 04:50, 21 October 2024 (UTC)
Also, how does "Hunting for Hidden Gold" work? How do you just record the original version as well as the other versions? Blahhmosh (talk) 04:52, 21 October 2024 (UTC)Reply
Note that Wikisource no longer accepts second-hand transcriptions, e.g. from Project Gutenberg. All new works must be proofread to a source text. See Wikisource:What_Wikisource_includes#Second-hand_transcriptions. For example, Show Boat is available to proofread here Index:Show boat - 1926.djvu and An American Tragedy here: Index:An American Tragedy Vol 1.pdf. MarkLSteadman (talk) 05:12, 21 October 2024 (UTC)Reply
I see. How do I submit new PDF files for transcription? Blahhmosh (talk) 07:12, 21 October 2024 (UTC)Reply
You can upload them to Wikimedia Commons and then follow the procedure described there. — Alien333 07:30, 21 October 2024 (UTC)
So are digital versions of the books pdf banned? Blahhmosh (talk) 08:02, 21 October 2024 (UTC)Reply
If that's your question (I'm not sure I understood), PDF isn't banned, it's just that it works less well, and has many issues, so DjVu is more convenient. You can perfectly well upload a PDF and work on it, it's your choice. — Alien333 08:10, 21 October 2024 (UTC)
No, I'm saying sometimes the PDF doesn't contain the actual images of the book and instead contains just plain text of the book in a standard font (Times New Roman, Arial, Courier, etc.) that isn't the font used in the original book. Should we use those? Blahhmosh (talk) 13:46, 21 October 2024 (UTC)
No, "source text" here means a scan of the physical edition of that book. — Alien333 13:47, 21 October 2024 (UTC)
What if it's the physical edition of that book but you can copy the text of the book? Blahhmosh (talk) 13:54, 21 October 2024 (UTC)Reply
Source text means that it is proofread against a known published version, which is almost always a physical copy (or for recent government works, a PDF release). Although scans are strongly preferred, policy does not strictly require them, only a clear record of what the source is. Also, given the ubiquity of high-quality digital cameras, it shouldn't be too hard to image the pages (even if not able to process them) so that there is a record for someone in the future. MarkLSteadman (talk) 04:31, 22 October 2024 (UTC)
Re taking a physical work --> typing it up --> releasing it as say a PDF on say Internet Archive --> retranscribing it at WS, that typically counts as secondhand / self-published. Several reasons why they are problematic as they effectively create a new "edition": 1. They can introduce issues with any omissions / decisions / additions causing divergences with the physical work, even if just in things like pagination 2. editor's copyright not being released / verified 3. source text uncertainty, are any divergences in the text caused by lack of record in exactly which is the source edition, which may itself introduce copyright concerns. 4. They typically end up being duplicative of a future proofread against the scans version of the text anyways (which then can be used for referencing page numbers, authority control, source text comparison etc.), and will be deleted. MarkLSteadman (talk) 05:02, 22 October 2024 (UTC)Reply

Tech News: 2024-43


MediaWiki message delivery 20:52, 21 October 2024 (UTC)Reply


Hello everyone, I previously wrote on the 27th September to advise that the Wikidata item sitelink will change places in the sidebar menu, moving from the General section into the In Other Projects section. The scheduled rollout date of 04.10.2024 was delayed due to a necessary request for Mobile/MinervaNeue skin. I am happy to inform that the global rollout can now proceed and will occur later today, 22.10.2024 at 15:00 UTC-2. Please let us know if you notice any problems or bugs after this change. There should be no need for null-edits or purging cache for the changes to occur. Kind regards, -Danny Benjafield (WMDE) 11:29, 22 October 2024 (UTC)Reply

I messed up a title


When I made the page Index:The Last Post (1928), it should've been Index:The Last Post (1928).pdf. What do I do? Blahhmosh (talk) 18:33, 22 October 2024 (UTC)Reply

I've moved it for you. It's easiest if an admin does this because we can suppress the automatic redirect that would happen if you moved it. Beeswaxcandle (talk) 18:42, 22 October 2024 (UTC)Reply
I see. Also, based on the nature of the .pdf file, is it valid for Wikisource? Blahhmosh (talk) 18:45, 22 October 2024 (UTC)Reply
It is, again, a second-hand transcription, as it says on page 3: "This ebook is the product of [...] Standard Ebooks, based on a transcription by Faded Page Canada", so no. What would be valid in this case, for example, would be to go get the original page scans from Google Books (mentioned still on page 3). — Alien333 18:50, 22 October 2024 (UTC)

Wikisource: We preserve publishers typos


I think that any person, bot or other software reply mechanism here should never say the words "as published" until current policy is at least softened; as it is a lie.

Table of exceptions

"We preserve publishers typos" exceptions:
  • None

"As published" exceptions:
  • paragraph indentation
  • type family
  • images that start and end chapters
  • long s
  • margins

Feel free to add to either list.

Perhaps there are more. While it sounds (and reads as) so 'leet to say "as the publisher" it is simply not true and over the border which makes it a lie. A simple softening of the policy, so that the occasional editor cannot drop in, validate a page that has one image on it and then ravage the style sheet, would perhaps give you back that 'leet feeling you get when you utter that lie. Without the softening of policy on those points, it is simply a lie.--RaboKarbakian (talk) 20:21, 22 October 2024 (UTC)

Also, I beg of you. Please find for me an English text from between 1650 and 1750 that is not using a serif type face!!--RaboKarbakian (talk) 20:23, 22 October 2024 (UTC)Reply
There's nothing like holding an actual book from the 17th century. However, that's quite different from holding it as published, which no one has done in centuries. Good color PDF scans can preserve some of the qualities of an old book, but miss out on a lot of others. It seems quite weird to say "type family" as opposed to "type face"; you think you can just replace Caslon or Baskerville with Times New Roman? Given that Caslon was the old-school and Baskerville was the new wave when they were competing, how does even replacing one with the other qualify as "as the publisher"?
We are not making digital facsimiles. If you want a digital facsimile, use the PDF. I don't see your second column of exceptions as basically changing anything as to the truthhood or falsehood of "as the publisher".--Prosfilaes (talk) 21:38, 22 October 2024 (UTC)Reply
You are completely correct about that, if we are talking about a text file -- which is the one format the exporter does not do! I am asking simply that the policy be changed to be more "in general" and not so "against". I am also not insisting that the policy be changed so that everything on that list has to be reflected. I would prefer the occasional editor to be a little less enabled. My style sheet just said "serif" because it is not a facsimile.--RaboKarbakian (talk) 22:40, 22 October 2024 (UTC)
What did I say about a text file? (Which, by the way, drops stuff that's integral to most works, like italics.) Serif/sans-serif is an irrelevant distinction. As you say, a work published in 1750 is going to be in a serif typeface. But you're ignoring many of the other features a work published in 1750 would have, e.g. [10], like the font size, the very different looking fonts, the signatures and tail word. You're also ignoring choices of publishers that are distinct choices, like which serif font to use, in exchange for removing the default font of the reader.
I'm against making working on pages more complex; I've found PGDP's total separation of proofing from formatting to simplify things a lot, and adding more formatting is just going to make it worse. I'm also against making more per-project idiosyncrasies. I see preserving publisher typos as more making a standard, undisputable format for pages, and not terribly important in and of itself.--Prosfilaes (talk) 20:32, 25 October 2024 (UTC)
(For images that start and end chapters, I always add them, and have trouble understanding why apparently no one does.)
Leaving ls apart as that's another debate, to me what you listed is perfectly compatible with the fact that we transcribe the work as published, not the physical book as published.
After type families, paragraph indentation, and margins, the same arguments would also lead us to replicate the relative height and width of letters, the width of the page, heck, even the color of the paper. At that point, it'd be much more reasonable to get a 600 or more DPI scan, feed it to the OCR, which respects layout as it places the text on the page, and at this resolution if we take the right engine it's going to be near-perfect.
Also, what would you mean by a softening of policy, which would make the occasional editor "a little less enabled"? You said above that you do not think policy should make those things mandatory, but then, do what else? — Alien333 06:49, 23 October 2024 (UTC)
(struck above after seeing recent messages) There is, indeed, a fairly long list of things we don't do, but so is the list of things we do; it's not only typos. A non-exhaustive list: images, page layout, TOC dot leaders, text styling in general ({{sc}}, {{sm}}, {{bl}}, ...), {{di}}s, {{***}}s, &c. — Alien333 16:03, 23 October 2024 (UTC)
love the passion about transcription; don't like the "as it is a lie." in any project, there will be compromises between verisimilitude and usability, and calling compromises lies is unhelpful. --Slowking4digitaleffie's ghost 13:21, 23 October 2024 (UTC)Reply
Slowking4, or should I say "font-family:UnifrakturMaguntia"? Personally, I miss the rants of Rama's revenge; and perhaps I am just filling in for the lack of those. That said, I was indenting paragraphs when in elementary school and I was not the only one doing that. The indents help for reading. They are a challenge in markup tho', no doubt. What I did try to say was that "As the publisher published it" is a lie, because it is really just "all of the publishers typos" and anything else might get an editor harassed because there is policy against it. So, I am suggesting that if we are to continue on as is, that we stop lying and simply proclaim "we preserve publishers typos and misspellings".
And I don't want to be misunderstood that new policy should insist upon that list; I want the option without the potential harassment. 'Tis a huge and capable layout engine; policy wants it to be used like a bulldozer to go the 20 feet to get the mail.--RaboKarbakian (talk) 15:52, 23 October 2024 (UTC)Reply
@RaboKarbakian: From the technical perspective, a configurable layout with switchable paragraph indentations (on/off) and switchable typos preservation (on/off) in the browser is very realistic and relatively easily doable. For example, see the Wikisource:Scriptorium/Archives/2024-08#Dynamic_Layouts_and_Template:SIC_/_Template:Errata_possible_interaction discussion topic. Currently we don't have these features in the browser because the consensus of the Wikisource contributors is firmly against having them. Exporting to EPUB/PDF is another part of the puzzle, because right now there's only one non-configurable way to do that as well. But this is again not set in stone and it's the community's desire to preserve the status quo that is the decisive factor. --Ssvb (talk) 11:21, 24 October 2024 (UTC)Reply
Ssvb: Too many images, tables, and formulas appear in the middle of paragraphs for indentation to be considered "easy" or "realistic" technically. You would still have to leave a mark where the "new paragraph" does not indent, and that puts the "technical doable" into the same problems that people have with this. Automatic paragraph indentation is confusing to see. Heck, sentences get interrupted for image and the like. I just cannot agree with the "easily doable" part.--RaboKarbakian (talk) 12:16, 24 October 2024 (UTC)Reply
(Also, not all paragraph starts are indented, see for example this.) — Alien333 12:18, 24 October 2024 (UTC)
This doesn't seem difficult to solve: just needs one template that marks up non-indented paragraphs; this could just add a css class that does nothing if the CSS displays it as paragraphs with gaps between, but prevents any indentation if the reader selects "original paragraph indentation mode". In fact, we already have {{No indent}} and {{Nodent}} for just these situations. Pretty sure there's also a template that prevents the gap between paragraphs for continuations of the same paragraph (e.g. that have been interrupted by poems, tables, etc.) – but I can't find it at the moment! --YodinT 16:26, 24 October 2024 (UTC)Reply
Putting every non-indented paragraph in a template would make for quite a lot of work, wouldn't it?
Also, indentation is not always the same, depends on the period, publisher, etc, so we'd have to add something to the index styles too, which would make more stuff to do.
The most problematic part would probably be updating all we've done so far to make it compatible with the changes. — Alien333 16:29, 24 October 2024 (UTC)
Most books I've come across that have indented paragraphs only have a few exceptions to that rule as far as I've seen, so not a huge amount of work while proofreading to add {{ni}} in those cases. And I think the idea would be to make this opt-in, so in most cases (including all the books currently transcribed), they'd just display as they currently are. If an editor wants to give readers the options to view it with the original paragraph indentation (or other options, like long-s, original margins, etc. etc.), the editor could add those options to the Index CSS, and just add the {{ni}} exceptions as they were proofreading (again, not too much more work, and entirely their choice). Editors could choose to go back through the works they've already done, and add indentation options, etc. if they wanted, but again this would be completely optional, just allowing those who want to to do so, but no obligation for editors who aren't interested in this. And, as mentioned below, if editors added this option, readers could still choose whether to view it either as it currently is (i.e. modern paragraph spacing, no long-s – this could be the default option for logged-out users), or with something closer to the original typography (could even give more granular toggles, so font style as one option, page margins another, etc.). --YodinT 17:09, 24 October 2024 (UTC)Reply
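The opt-in idea discussed above can be modelled in a few lines. The Python sketch below is purely illustrative and not how MediaWiki, Index CSS, or the {{ni}} template are implemented: the `Paragraph` class, its `no_indent` flag (standing in for a {{ni}}-style marker), and the two mode names are assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Paragraph:
    text: str
    no_indent: bool = False  # stands in for a {{ni}}-style marker (assumption)


def render(paragraphs, mode="modern", indent="    "):
    """Render paragraphs in one of two reader-selectable modes."""
    if mode == "modern":
        # Current default: no indentation, a blank line between paragraphs.
        return "\n\n".join(p.text for p in paragraphs)
    if mode == "original":
        # Opt-in: first-line indent, except where the editor marked no_indent.
        return "\n".join(
            p.text if p.no_indent else indent + p.text for p in paragraphs
        )
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    page = [
        Paragraph("Chapter opening paragraph.", no_indent=True),
        Paragraph("An ordinary paragraph."),
    ]
    print(render(page))
    print("---")
    print(render(page, mode="original"))
```

The point of the sketch is that the marker costs nothing in the default mode: it only changes the output once a reader selects the "original" layout.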

A dissenting opinion here: I'm personally more interested in producing an accurate digital version of the text itself than this approach, but I think it's both technically feasible, and also not a terrible idea to allow editors to create essentially "vectorised facsimiles" (i.e. the precise fonts and typography used, page margins, etc. etc.) – if this was provided for as a separate stylesheet for example (so /styles.css for the normal web edition, and /facsimile.css for this), it would be straightforward for a parser to let the reader choose which version they wanted to see (another option could be annotated versions; again all using the same Page:s). This would let editors produce whichever version they wanted (facsimile, text, or annotated), without having to revert/ban/tell editors that it has to be done in a certain way, and producing standardised texts that the majority of readers would find useful regardless of which approach the editor uses. --YodinT 14:16, 23 October 2024 (UTC)

Yodin When I export my highly stylized works to epub, most of the style goes away, and it is usually a good experience to read these things there. Having the exporter export to text would allow picky readers to impose their own style on it, or not. Exporting to text would also (hold your horses here!!) preserve publishers' typos, which is what we do here (by consensus). The 18th century Arabian Nights I have been working on: there is a late 19th century version that is so much more readable, so to me, having the earlier one "modernized" and streamlined for reading is silly. Having it look like the museum piece that it is is kind of nice in a documentation sense.
A howto for setting your personal browser's style would settle most concerns, without the need for multiple style sheets.
Also, the long-s option. Really, people should be required to log in to turn them into s. That way, we get the email addys for getting the donations.--RaboKarbakian (talk) 15:52, 23 October 2024 (UTC)Reply
There's already a howto at Help:Layout. But such howto for setting your personal browser's style is beyond the abilities of the vast majority of the Wikisource users. Moreover, many of the existing wiki templates would benefit from becoming a bit more CSS-aware to enable such customization. --Ssvb (talk) 11:41, 24 October 2024 (UTC)Reply
Would be great to have something along the lines of French Wikisource, which has a tab at the top of the page, next to "Page | Source | Discussion" that allows readers to automatically switch between original spelling and modernised spelling (e.g. this page), and even a toggle to highlight the changes that have been made. In our case it could be things like original typography (long S etc.) instead; could even have an option to toggle between original typos and SIC corrected spellings. --YodinT 13:28, 24 October 2024 (UTC)Reply
Yodin At French wikisource, the "Source" just links to the Index page, and this wiki has the same link. "modern" is also dated, like tomorrow it will be different things that "modern" describes; so in some ways, modernization is an editorialization of the spelling and its punctuation and such of that time it was transcribed. I really really like "As it was published", which was probably thoroughly modern at its time.--RaboKarbakian (talk) 15:33, 24 October 2024 (UTC)Reply
Yep, the modernisation option is next to Source (the Index: link) and the talk page (Discussion) tab at the top of the page. It absolutely is editorialisation, but follows predictable rules (this isn't the same in English), and they update the "modernisation" algorithm when there's spelling reform. But the main thing is that it still completely preserves the original "as it was published" version of the text as well, and just allows an automatic option for people who want to read the texts using current spelling conventions. That's what I'd like to see here: an option for readers to easily choose whether they want to see the long-s, original fonts, etc. etc., and original as-is typos, or switch these off. Handling annotations the same way (rather than copy-pasting the text, and adding hyperlinks/footnotes to this copy, which will be extremely difficult to sync with the original if further proofreading/validation improves the quality of the original text) – it seems to me it would be much easier to use templates to markup annotations in Page: space, and switch them off by default – but that's another discussion! --YodinT 16:03, 24 October 2024 (UTC)Reply
Yodin I was wrong and I would strike my paragraph except that I enjoyed the rant about "modern". Also, that French module is very cool. If we use it here, maybe I might still be around for the "Post-modernization" module!! As it is for me here {{ls}} never displays long s; no matter the preference toggle, no matter the namespace; so I find myself being very firmly on the other side of "No options, this way" pasting the long s so that I can see it that way. I think that in the page namespace it always displayed the s, and that was also not helpful for editing. Also, once, I used one of the wikimedia fonts (via @font in the stylesheet) and since then, my browser displays the wrong font size, always; well, not at first (with a vanilla configuration) but at second; just like something is grabbing it and using its configuration instead of mine. I think these and (many, many) other problems are all related, but the long s one did me in. Another thing, I really hate using those words "I was wrong" just so you know.--RaboKarbakian (talk) 19:51, 24 October 2024 (UTC)Reply
Generally I support the as-it-was-published attitude, but I am sceptical we will be able to reach an agreement or change the en.ws approach towards all the mentioned subtopics within one discussion. Maybe we should discuss individual problems like indentation, long s, fonts, etc. one by one. BTW: I do miss paragraph indentation here very much, and do not like the modern inter-paragraph spacing that replaces it at all. --Jan Kameníček (talk) 15:50, 24 October 2024 (UTC)Reply
Jan Kameníček: For group projects, especially those that beginners have been directed to start with, the simpler the better. Individual projects or those having just a few contributors should not have to suffer policy intended (heh, I typoed "indented" first here) for beginners. Another thing, How and where to discuss things where capable and interested hackers might be that can enable things. Poor CalendulaAsteraceae will be coding until the post modern module is needed and maybe still won't be done with everything that is wanted. Also, some of the best coders I know have little interest in policy discussions and might even run from anything using the word "consensus". Phab tickets seem to sit there; although it might just be the tickets I look at. I'ma gonna call what we have now .--RaboKarbakian (talk) 19:51, 24 October 2024 (UTC)Reply
  • We also do not include at all times decorative elements/flourishes that may appear in news articles, because we do not have stock svg versions of all of them. That would be the same as "images that start and end chapters", but with news articles, especially from the 1800s. We have some simple rule elements, but not all. We also do not include boxes. Some news articles or advertisements appear in a box. --RAN (talk) 23:04, 12 November 2024 (UTC)Reply

Where to start and end a book's file


I've noticed that traditionally, files of books are assembled from cover to cover. I'm considering deviating from this tradition by creating a djvu file of this hathitrust book that starts from the first page that contains transcribable material and ends at the last page containing such. So I wonder if there is any purpose to having empty pages at the start and end besides to act as a placeholder. Is there a need to preserve a book's integrity that necessitates having the entire book? Prospectprospekt (talk) 23:43, 22 October 2024 (UTC)Reply

Having all pages from cover to cover shows that nothing was omitted. If you start chopping off pages, then how can the readers be sure that you haven't arbitrarily decided to delete something important? Such as the toc or preface or errata notes or anything else. Some books include advertisement pages of questionable value, but if you remove them, then this would look suspicious. --Ssvb (talk) 05:54, 23 October 2024 (UTC)Reply
To me, the empty pages in themselves are of no particular interest (though I'd leave them anyway for integrity), but it's much easier to check whether something else has been removed in violation of WS:NPOV if they have been left there. If the empty pages have been left, verification is as simple as comparing the number of pages; if we remove them, it gets much more complicated. It's not even just subtracting the number of empty pages at the beginning and end, as some may also remove the backs of plates, so either way you have to check all the pages. -- Alien333 06:59, 23 October 2024 (UTC)
Agree, empty pages should not be removed from the scans of the book. Besides the reasons above I will add some more: 1) the scans uploaded to Commons serve not only Wikisource but anybody, and I can imagine that somebody might like to create an exact facsimile of the original publication including the cover etc., and so they would miss the omitted pages. 2) Although we do not transcribe e.g. the library tags attached to books, for somebody it might be useful to know in which library this particular specimen was stored, so we should not cut it off from the scan. --Jan Kameníček (talk) 09:20, 23 October 2024 (UTC)
  • Prospectprospekt: That file is somewhat unusual. The actual, printed item is /3 to /52; /1, /2, /53, and /54 are all a cover which was added to the pamphlet by the library which owns the item. In this case, properly, those four pages should be excluded; but I generally do not exclude them because I do not think it is necessary to do so. Jan Kameníček, does it change your opinion to know that the covers (of this work) were not original to the publication? TE(æ)A,ea. (talk) 17:43, 23 October 2024 (UTC)Reply
    Well, there still remains my "library argument", which, I admit, is not too strong, so although I personally would not remove these pages, I would not object too much if somebody else would. --Jan Kameníček (talk) 17:50, 23 October 2024 (UTC)Reply

Manual news article aggregation (manual indexing) versus automatic news article aggregation (automatic indexing)


See: Jersey Journal (manually curated, always missing entries) versus The Washington Post (newspaper) (automatically curated, always complete) to see the difference. I have identified at least 6 different ways that news articles are manually aggregated, in formats ranging from calendars to various table formats to lists by year. Is there a hard rule that prevents us from having both manual and automatic curation? The best analogy would be Commons, which has Commons:Category:Abraham Lincoln for automatic aggregation and Commons:Abraham Lincoln for manual aggregation. I don't see why we cannot have both methods to satisfy both needs. We could have Portal:Jersey Journal or Periodical:Jersey Journal for manual indexing and a link to Jersey Journal for the automated list, just as is done at Commons; or, we could have The Jersey Journal versus Jersey Journal with one automatic and the other hand curated, and a link between the two. A third option would be a hybrid where both appear on the same page like here: New York Tribune. RAN (talk) 17:12, 23 October 2024 (UTC)

Page access request


Hello, I have a small request. I've been addressing some specific priority syntax errors here on Wikisource, and have dropped two error types down to near zero: the Tidy Font Bug (78 remain) and Misnested tags (42 remain). 77 and 41 of these are on fully protected pages, and I wondered if I could have access to these Tidy font and these misnested pages for a brief time to address these issues. I have 2 years of experience on Wikipedia with handling these (and other) tracked syntax errors in a respectful and knowledgeable manner, and currently have a temporary adminship (Sept-Dec) on Wikivoyage, where I addressed 99.99% of their 30k syntax errors in 5k edits (Aug-Sept). I've asked Xover and Encyclopety on their talk pages about the possibility of my accessing these few pages, but neither has been very active here since my messages, and they have not replied, so I figured the next step was to ask here since it had been a few days. I am happy to discuss or answer any questions admins may have. Thanks, and hope you have a great weekend. Zinnober9 (talk) 19:54, 25 October 2024 (UTC)

Crossposted to WS:AN since no reply here after a week, and only an admin could grant this request. Zinnober9 (talk) 05:41, 3 November 2024 (UTC)Reply

Qid


Is there any place we can add a Wikidata qid so a bot adds the portal or the news article to the corresponding Wikidata entry? RAN (talk) 03:53, 26 October 2024 (UTC)Reply

Wouldn't that be adding a sitelink to the wikisource page to the item? -- Alien333 08:15, 26 October 2024 (UTC)
  • Yes, we could have: {{author | firstname = Tirey Lafayette | lastname = Ford | last_initial = Fo | '''wikidata = Q7809200''' | description = American lawyer and politician; California Attorney General, District Attorney from California }}

That way a bot at the Wikidata end could add the Wikisource link automatically, and they would be paired at both projects.

What I'm saying is that this is already done; you just have to add it at the other end, at WD. When an item has, for example, a sitelink to one of our authors, {{author}} picks it up, takes the data, and puts a link to the item. -- Alien333 14:39, 26 October 2024 (UTC)
  • I found that "wikidata=" is already in every header template for both portals and individual books/news articles, so the first part is already in place. Any function that can be performed by a bot is superior to doing it by hand. I found that there are already >50 entries that have Wikidata ids but do not appear in Wikidata. "Pi bot" performs this function linking wikidata to Commons categories when it finds matching Qids. --RAN (talk) 22:58, 12 November 2024 (UTC)Reply
    (I previously didn't get that you meant having a bot search for the ids, rather than giving them to it up front.) We could do that. -- Alien333 12:24, 17 November 2024 (UTC)
  • I should have written it more clearly, I contacted the creator of pi_bot, but they have not responded. One thing I noticed is that an occasional wikidata number at Wikisource is incorrect, probably from a cut and paste error, would the bot fix an error if we replace the Wikidata number here with the correct one after it already has added the incorrect one? --RAN (talk) 18:05, 17 November 2024 (UTC)Reply
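
For anyone who wants to experiment with the matching step discussed above, here is a minimal offline sketch in Python. It only shows the comparison logic: it reads a `wikidata =` parameter out of template wikitext and lists pages whose id has no matching sitelink. The `pages` and `sitelinks` dictionaries are hypothetical local snapshots, not real API calls, and the function names are made up for illustration; this is not how Pi bot or any existing bot is known to work.

```python
import re

# Matches a "| wikidata = Q123" parameter inside a template call.
# The parameter name is taken from the thread; the pattern itself is an assumption.
QID_PARAM = re.compile(r"\|\s*wikidata\s*=\s*(Q[1-9]\d*)\s*(?=[|}])")


def extract_qid(wikitext):
    """Return the first id given via a wikidata= parameter, or None."""
    match = QID_PARAM.search(wikitext)
    return match.group(1) if match else None


def pages_missing_sitelink(pages, sitelinks):
    """List (title, qid) pairs whose item does not link back to that title.

    pages:     dict of page title -> wikitext (hypothetical local snapshot)
    sitelinks: dict of qid -> title the item currently links to (hypothetical)
    """
    missing = []
    for title, text in pages.items():
        qid = extract_qid(text)
        if qid is not None and sitelinks.get(qid) != title:
            missing.append((title, qid))
    return missing
```

Because the check compares the linked title as well as the id, a page carrying a mistyped id would show up in the same list as a page with no sitelink at all, so a human would still need to review each entry before anything is written to Wikidata.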

I'm kind of new to wikipedia and need some help


I would like to know how to better edit this code and what exactly it is for:

{{header
| title = {{subst:}}
| author =
| section =
| previous =
| next =
| year =
| notes =
}}

WikiEducationalVol (talk) 03:19, 27 October 2024 (UTC)Reply

Hi @WikiEducationalVol: It seems that you might be confusing us for Wikipedia. We are actually not Wikipedia—we are their sister site Wikisource.
You recently submitted an article about "Adolescence", which I deleted because it is not within our project scope. We host a collection of transcriptions of already-existing texts, mostly old books and government documents. But your article about the definition of adolescence might be better suited for Wikipedia itself or maybe even Wikibooks. Try those communities next. SnowyCinema (talk) 04:45, 27 October 2024 (UTC)Reply
to answer your question, this is the header template for completed transcribed works, for example Beethoven (Rolland). we have a works namespace rather than article namespace. it is backed by a side by side transcription stitched together at an index page, for example Index:Rolland_-_Beethoven,_tr._Hull,_1927.pdf. --Slowking4digitaleffie's ghost 00:09, 28 October 2024 (UTC)Reply
@WikiEducationalVol: If you're used to the Wikipedia way of thinking, try thinking of this template as a sort of little infobox for each book or article.
  • The title parameter is for the full title of the work as originally published. (This might be different from the title of the page in some instances!)
  • The author parameter is for the author of the work.
  • The year parameter is for the year the work was originally published.
  • The section parameter is for the title of the chapter (if you are working on a chapter subpage; don't fill this out on the main page).
  • The previous parameter holds a wikilink to the previous chapter, so you can go back to the previous chapter if you want.
  • Similarly, the next parameter holds a wikilink to the next chapter, so you can skip ahead.
  • Lastly, the notes parameter holds any other info you might want the reader to know. This might be a brief summary of the work, or a comment describing how the formatting of this version differs from that of the original.
Duckmather (talk) 21:06, 29 October 2024 (UTC)Reply
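
To make the parameter list above concrete, here is a small Python sketch that fills in the blank template quoted at the top of this thread. The work, author and chapter names are hypothetical, the relative-link style for `previous` and `next` is only one common way of writing them, and `build_header` is an illustrative helper, not an existing tool.

```python
# Parameter order follows the blank template quoted in this thread.
HEADER_PARAMS = ["title", "author", "section", "previous", "next", "year", "notes"]


def build_header(**values):
    """Render a {{header}} call; parameters not supplied are left blank."""
    unknown = set(values) - set(HEADER_PARAMS)
    if unknown:
        raise ValueError(f"unknown header parameters: {sorted(unknown)}")
    lines = ["{{header"]
    for name in HEADER_PARAMS:
        value = values.get(name, "")
        # rstrip() keeps empty parameters as "| notes =" like the blank template
        lines.append(f"| {name} = {value}".rstrip())
    lines.append("}}")
    return "\n".join(lines)


print(build_header(
    title="An Example Work",        # hypothetical work
    author="Jane Doe",              # hypothetical author
    section="Chapter 2",
    previous="[[../Chapter 1/]]",   # illustrative relative links
    next="[[../Chapter 3/]]",
    year="1900",
))
```

On a chapter subpage the printed wikitext would go at the very top of the page; on the main page of a work, `section`, `previous` and `next` would normally be left empty.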

Scans are now migrated to the talk page


Scans are now migrated to the talk page, when did that start? See: Talk:The_Indianapolis_News/1937/4_American_Pilots_Quit_Spanish_War_as_Loyalists_Fail_to_Pay
Also, is every news article here at Wikisource supposed to get an entry at Wikidata? RAN (talk) 20:24, 28 October 2024 (UTC)Reply

@Jan.Kamenicek: Any comments? SnowyCinema (talk) 01:51, 29 October 2024 (UTC)Reply
  • If you do not want the scan to appear on the text page anymore, the best thing would be to create a Wikidata entry for each news article and the image will appear there and we will link to the text here from Wikidata, the link will then appear in the upper right corner. See for instance: Wikidata:Q86172138 --RAN (talk) 05:02, 29 October 2024 (UTC)Reply
    Main namespace is supposed to contain the transcribed text, sometimes accompanied by original illustrations of the text. The best place for scans is the page namespace. It is redundant to have both the transcribed text and the scan in the mainspace page. It is not being done with other works and there is no reason why it should be done with news articles. Such practice is not supported by any of our rules or help pages. For example, Help:Digitising texts and images for Wikisource#Images and illustrations contradicts this approach by stating that images should be extracted from the work and uploaded as separate files, not like here. Proper work with scans is described at Help:Proofread; work with .jpg scans is described at Help:Index pages#Using individual image files. I have moved thumbnails of a few such scans to the talk pages so that they are not lost if anybody wants to use them for proper scanbacking.
    As for Wikidata entries, they are not required but their creation is certainly supported. --Jan Kameníček (talk) 11:23, 29 October 2024 (UTC)Reply
    BTW: One more thing should be said, and that is general appreciation for the work with transcribing interesting and useful news articles. --Jan Kameníček (talk) 11:30, 29 October 2024 (UTC)Reply

Tech News: 2024-44


MediaWiki message delivery 20:56, 28 October 2024 (UTC)Reply

Final Reminder: Join us in Making Wiki Loves Ramadan a Success


Dear all,

We’re thrilled to announce the Wiki Loves Ramadan event, a global initiative to celebrate Ramadan by enhancing Wikipedia and its sister projects with valuable content related to this special time of year. As we organize this event globally, we need your valuable input to make it a memorable experience for the community.

Last Call to Participate in Our Survey: To ensure that Wiki Loves Ramadan is inclusive and impactful, we kindly request you to complete our community engagement survey. Your feedback will shape the event’s focus and guide our organizing strategies to better meet community needs.

Please take a few minutes to share your thoughts. Your input will truly make a difference!

Volunteer Opportunity: Join the Wiki Loves Ramadan Team! We’re seeking dedicated volunteers for key team roles essential to the success of this initiative. If you’re interested in volunteer roles, we invite you to apply.

  • Application Link: Apply Here
  • Application Deadline: October 31, 2024

Explore Open Positions: For a detailed list of roles and their responsibilities, please refer to the position descriptions here: Position Descriptions

Thank you for being part of this journey. We look forward to working together to make Wiki Loves Ramadan a success!


Warm regards,
The Wiki Loves Ramadan Organizing Team 05:11, 29 October 2024 (UTC)

Commision/Commission bug


If you look at Van Cise exhibits to the Commission on Industrial Relations regarding Colorado coal miner's strike and click the "Source" button, you will get to Index:Van Cise exhibits to the Commission on Industrial Relations regarding Colorado coal miner's strike.djvu, which doesn't exist. The actual index is at Index:Van Cise exhibits to the Commision on Industrial Relations regarding Colorado coal miner's strike.djvu (note the missing "s" in "Commision" [sic]). There's a similar problem on some of the pages. For example, clicking the up arrow on Page:Van Cise exhibits to the Commision on Industrial Relations regarding Colorado coal miner's strike.djvu/1 also leads you to the nonexistent "Commission" index page.

I see two ways out of this:

  • Move the file, the index page, and all individual pages to use the "Commission" spelling, and make sure that no pages are still using the old "Commision" spelling.
  • Use the "Commision" spelling, and somehow fix all the redlinks (maybe they're due to ProofreadPage going wonky somehow?).

Duckmather (talk) 21:01, 29 October 2024 (UTC)Reply

The problem was caused in Commons about 2 years ago when User:Armbrust moved the file to the new name without taking care of our index page. I have moved the index and all the individual pages to the new title so now it should be fixed. --Jan Kameníček (talk) 22:12, 29 October 2024 (UTC)Reply
@Jan Kameníček: Thanks! Now I can get back to validating it. Duckmather (talk) 02:27, 30 October 2024 (UTC)Reply

Android app for Wikisource


Hi, is there an Android app for Wikisource? How does it work? I have been advised that there is no infrastructure for push notifications for Android apps for sister wikis and I would be interested to know more. Related: phab:T378545. Thanks! Gryllida (talk) 23:14, 29 October 2024 (UTC)Reply

ſ to Template:ls

  • purpose: I want to use a bot to replace the ſ with {{ls}}
  • scope: Arabian Nights Entertainments (1706)
  • programming language or tools: I've never done a wiki bot before so I'm not sure yet, I'm open to ideas.
  • degree of human interaction involved: semi-automated I think?

Eievie (talk) 05:57, 31 October 2024 (UTC)Reply

Eievie:
  1. use of {{ls}} is not mandatory; some extra things come from using it.
  2. there is a bot that runs here whose purpose is to release drag on templates: {{ae}} and {{black-letter}} are instances that I know of. {{ae}} gets reverted to its utf equivalent æ and black-letter just gets removed.
  3. the project has been accomplished more than 90% by one person. It is the custom at en.wikisource to follow the precedent set by the main contributor, unless it is a book that is within a collection of works that should have similar typographic customization. An example of this is a recent conversion of '' to ‘’ in the Lang Coloured Fairy Books.--RaboKarbakian (talk) 17:49, 11 November 2024 (UTC)
(I'm not aware of a bot removing {{bl}}. Which one would it be?) -- Alien333 18:01, 11 November 2024 (UTC)
The style guide says to use {{ls}} (see Wikisource:Style guide/Orthography). If its usage isn't actually encouraged, then the guide needs to be changed. Eievie (talk) 19:44, 11 November 2024 (UTC)
 Oppose—without inserting into this discussion any personal opinion of mine on the issue of {{ls}} versus ſ, it is a quite contentious issue in the enWS community. It would be better not to fuel that flame. If you must, maybe a broader discussion on the issue (across all texts) would be in order instead of on an individual work. SnowyCinema (talk) 01:25, 12 November 2024 (UTC)Reply
Wikisource:Style guide/Orthography makes it look like a settled issue, like there's policy, or at least guidelines, that say to use {{ls}}. If that's not true, if it's actually contentious, then the style guide should be updated. Otherwise it's super misleading. I'm newish to this site and I read that guide and was left thinking, "Ok, so that's a preestablished policy/guideline, so I should implement it when possible." Eievie (talk) 02:02, 12 November 2024 (UTC)
This is a bit offtopic, but if standardizing characters to follow a single precedent while editing page-by-page, see WS:Regex. You can type long-s and short-s both as a "s", and then automatically convert the ones that should be {{ls}}, or convert all the {{ls}}s to ordinary "s". Either can be done with two clicks (one to open the tool). HLHJ (talk) 03:23, 12 November 2024 (UTC)Reply
I'm familiar with it; I did Arabian Nights Entertainments volume 1 that way. There are a lot of volumes though, which is why I asked about bots. Eievie (talk) 03:31, 12 November 2024 (UTC)Reply
  • Eievie: At the very least, your edits broke volume 1 (did you not even notice all of the newly-introduced red links)? In any case, because you have not “fixed” the rest of the Arabian Nights, please revert your changes to the first volume so that all of the text has a consistent style. TE(æ)A,ea. (talk) 03:48, 12 November 2024 (UTC)Reply
    The main person behind that work and I are discussing the use of {{ls}} privately, and we will settle this between us. But the main person behind the page did not automatically want the {{ls}} removed. Eievie (talk) 05:04, 12 November 2024 (UTC)
Eievie I wanted to wait until I was not angry to address this. By that time, I believe I could intelligently state my reasons here. If you paste any user name (example: [[User:Eievie|Eievie]]) they will get a notification. Also, see Wikisource:Scriptorium#Wikisource:_We_preserve_publishers_typos where I was probably annoyed, mostly from pasting all of those darn ſ.--RaboKarbakian (talk) 17:44, 12 November 2024 (UTC)Reply
I'm fine dropping this whole thing; I'm just asking that Wikisource:Style guide/Orthography#Phonetically equivalent archaic letter form be changed then. I maintain that trying to implement an explicitly stated site style guideline is not unreasonable. If it's not something people are actually supposed to do on this site, I need there to not be instruction pages saying that's how things ought to be done. Since the question of bot usage is long over, can this thread also be ended and someone point me to the thread that handles questions of altering policy and making it clear? Eievie (talk) 21:21, 12 November 2024 (UTC)

So, even though this has been dropped, I just learned from Eievie that purpose was not to script a wikibot (as indicated here) but to make it easier to proof already existing pages. See User_talk:Eievie#rh_vs._c I suggest that an admin deny this request, point Eievie to Category:Proofread and recommend that if there is anything on one of those pages, to simply pick a different one.--RaboKarbakian (talk) 19:33, 13 November 2024 (UTC)Reply

Completely aside from the content discussion here, a regex that operates on a whole work would be very useful. I've been dealing with OCRs that make errors so consistently that an autoreplace on the entire work would save a lot of time, and sometimes you format a work one way, and then realize that you really ought to replace all instances of one template with another, which must also happen when old templates get replaced and deprecated; there are lots of non-controversial use cases. It would be necessary to have a way to revert the whole thing with a click, though. HLHJ (talk) 19:42, 17 November 2024 (UTC)
For non-controversial requests (e.g. € to e), you can always ask at WS:BR. -- Alien333 19:47, 17 November 2024 (UTC)
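
As a rough illustration of the "replace across a whole work, with a one-step revert" idea, here is a minimal Python sketch. It works on an in-memory dictionary of page texts rather than on the wiki itself, the page titles are hypothetical, and it uses the uncontroversial € to e example from this thread. It is a sketch of the bookkeeping only, not an existing gadget or bot.

```python
def replace_in_work(pages, old, new):
    """Apply a literal replacement to every page of a work.

    pages: dict of page title -> wikitext.
    Returns (updated, backup); backup holds the original text of only
    the pages that actually changed, so the run can be undone later.
    """
    updated, backup = {}, {}
    for title, text in pages.items():
        changed = text.replace(old, new)
        updated[title] = changed
        if changed != text:
            backup[title] = text
    return updated, backup


def revert(pages, backup):
    """Restore the pages recorded in backup, leaving the others untouched."""
    restored = dict(pages)
    restored.update(backup)
    return restored


# Hypothetical OCR output where "e" was misread as "€" on one page
work = {
    "Page:Example.djvu/1": "Th€ first pag€",
    "Page:Example.djvu/2": "nothing to fix here",
}
fixed, undo = replace_in_work(work, "€", "e")
```

Keeping the backup limited to changed pages means a revert touches exactly the pages the run touched, which is roughly what a "revert the whole thing with a click" button would need.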

Help with handwritten letter


I'm working on Index:T. C. E. Laugesen to Carl Laugesen, am mostly done but would appreciate someone validating my work and helping decipher a word on page 3 I couldn't work out. —CalendulaAsteraceae (talkcontribs) 09:04, 31 October 2024 (UTC)Reply

Zyephyrus


Unfortunately, bad news arrived: Zyephyrus, our long-term contributor and admin, passed away last September 8th. -- Jan Kameníček (talk) 16:57, 2 November 2024 (UTC)Reply

More in French Wikisource. --Jan Kameníček (talk) 19:04, 2 November 2024 (UTC)Reply
Very sorry to hear this. I remember them being kind and encouraging when I joined Wikisource (as well as having a wonderful username!). Rest in peace Zyephyrus. --YodinT 13:26, 3 November 2024 (UTC)Reply

Portals in headers


The portals were traditionally listed in the portal parameter and divided by slashes. Now CalendulaAsteraceae started replacing this with individual portal1, portal2... parameters, see e. g. here, and plans to stop splitting portals at the slashes in the long run completely. As this is going to influence a really large number of pages, I think it should be discussed first, and so I am posting it here. Jan Kameníček (talk) 12:01, 3 November 2024 (UTC)Reply

Very long run; I don't actually want to take that project on anytime soon because (as you mention) it would be a lot of work. —CalendulaAsteraceae (talkcontribs) 12:04, 3 November 2024 (UTC)Reply
Well, you have already taken some steps, and the discussion should have preceded them.
As for the replacement itself, in my opinion it is not only unnecessary, but also unnecessarily more complicated for contributors. Slashes work well and are easy and quick to write. --Jan Kameníček (talk) 12:07, 3 November 2024 (UTC)Reply
A possible advantage of this would be to allow the roughly a thousand portals that include slashes in their names to be used, though I don't know if that's a major loss. (After all, at this point there are probably more pages that use / to include multiple portals than portals that are incompatible with that.) In case anyone else is interested in the technical side of it, it's done with this edit at module:plain sister. -- Alien333 14:08, 5 November 2024 (UTC)
True. It is definitely not necessary to deprecate it, it can stay optional, but should not replace the older way. And unless there is a reason in specific cases, like this one, it should not be being replaced massively by a bot. --Jan Kameníček (talk) 15:38, 5 November 2024 (UTC)Reply
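
To show why the two styles behave differently, here is a short Python sketch of how a header might collect portals from both. The splitting rules are assumptions based on the description in this thread, not a copy of what module:plain sister actually does, and the portal names are made up.

```python
def portals_from_params(params):
    """Collect portal names from header parameters.

    Assumed behaviour, based on this thread: the legacy `portal` value is
    split at slashes, while numbered `portal1`, `portal2`, ... values are
    taken whole, so a portal name containing a slash survives intact.
    """
    portals = []
    legacy = params.get("portal", "")
    portals.extend(p.strip() for p in legacy.split("/") if p.strip())
    n = 1
    while f"portal{n}" in params:
        value = params[f"portal{n}"].strip()
        if value:
            portals.append(value)
        n += 1
    return portals


# Slash-separated list: two portals
print(portals_from_params({"portal": "Alpha/Beta"}))
# Numbered parameter: one portal whose name happens to contain a slash
print(portals_from_params({"portal1": "Alpha/Beta"}))
```

Under these assumptions both styles can coexist on the same page, which matches the suggestion that the numbered parameters stay optional rather than replace the slash form.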

Tech News: 2024-45


MediaWiki message delivery 20:50, 4 November 2024 (UTC)Reply

Switching to the Vector 2022 skin: the final date

A two minute-long video about Vector 2022

Hello everyone, I'm reaching out on behalf of the Wikimedia Foundation Web team responsible for the MediaWiki skins. I'd like to revisit the topic of making Vector 2022 the default here on English Wikisource. I did post a message about this in March, but we didn't finalize it back then.

What happened in the meantime? We built dark mode and different options for font sizes, and made Vector 2022 the default on most wikis, including all other Wikisources. With the not-so-new V22 skin being the default, existing and coming features, like dark mode and temporary accounts respectively, will become available for logged-out users here.

If you're curious about the details on why we need to deploy the skin soon, here's more information
  • Due to releases of new features only available in the Vector 2022 skin, our technical ability to support both skins as the default is coming to an end. Keeping more than one skin as the default across different wikis indefinitely is impossible. This is about the architecture of our skins. As the Foundation or the movement in general, we don't have the capability to develop and maintain software working with different skins as default. This means that the longer we keep multiple skins as the default, the higher the likelihood of bugs, regressions, and other things breaking that we do not have the resources to support or fix.  
  • Vector 2022 has been the default on almost all wikis for more than a year. In this time, the skin was proven to provide improvements to readers while also evolving. After we built it and deployed it on most wikis, we added new features, such as the Appearance menu with the dark mode functionality. We will keep working on this skin, and deployment doesn't mean that existing issues will not be addressed. For example, as part of our work on the Accessibility for Reading project, we built out dark mode, changed the width of the main page back to full (T357706), and solved issues of wide tables overlapping the right-column menus (T330527).
  • Vector legacy's code is not compatible with some of the existing, coming, or future software. Keeping this skin as the default would exclude most users from these improvements. Important examples of features not supported by Vector legacy are: the enriched table of contents on talk pages, dark mode, and also temporary account holder experience which, due to legal reasons, we will have to enable. In other words, the only skin supporting features for temporary account holders (like banners informing "hey, you're using a temp account") is Vector 2022. If you are curious about temporary accounts, read our latest blog post.

So, we will deploy Vector 2022 here in three weeks, in the week of November 25. If you think there are any remaining significant technical issues, let us know. We will talk and may make some changes, most likely after the deployment. Thank you! SGrabarczuk (WMF) (talk) 15:46, 6 November 2024 (UTC)Reply

To any admins passing by: Could someone take a look at MediaWiki talk:Gadget-Preload Page Images.js? (with V10, since the last codex change, the green border's broken so the arrow shifts down but it's still a noticeable change, whereas in V22 it will be plain undistinguishable, so it'd be nice to fix it.)
@SGrabarczuk (WMF): Why would dark mode and temporary accounts need V22? I already use dark mode on V10, and if we have a banner for IPs editing I don't see why we couldn't have a banner for temp editing.
I can only think of one significant technical issue, and that is paragraph spacing, also mentioned in March without an answer.
On one hand, why? what is the supposed advantage of spacing paragraphs further from each other?
On the other hand, here at ws we often need to make text fit into fixed boxes, and making the height of text that different across skins is a bad idea. Out of my hat, the most common issue I can think of is {{overfloat image}}s that make some kind of border around multiline text that does not already override paragraph spacing, e.g. Page:Salomé- a tragedy in one act.djvu/7, Page:Poems Tree.djvu/9, Page:Poems Jackson.djvu/7, &c. — Alien333 18:49, 6 November 2024 (UTC)
  • As someone who has seen Vector 2022 in action, I don’t know how you can say this. The use of Vector 2022 is not possible here; it makes Wikipedia much worse at it is, and at Wikisource it is completely untenable. There is no reason to make potential contributors make an account and change their setting configurations to be able to edit here without great difficulty. We have a lot of highly specialized formatting here, and if recent “fixes” are anything to go by, whoever makes technical changes thinks of Wikisource last in making them. Our site was rendered practically unusable because of an “accessibility” change recently, and it took days to get that patched—and it was only partially patched, at that. You mention “new features” for your shiny new toy, but I’m not sure why they’re necessary (or even not harmful here on Wikisource); the big push towards “dark mode” mirrors the tech industry’s general push towards AI, in that it is being done without consideration of the actual userbase (who, of course, has no need for such a feature). Your list of “[i]mportant … features” showcases the lack of connection to our community (despite your evident desire to force this unwanted and harmful change upon us): tables of contents are usually produced manually here, with templates; dark mode is a fad, and in any case would clash with any of the many texts here with images; and “temporary accounts” are a terrible idea that I can’t even imagine a justification for. I’ve only heard of them now, but I do remember the suggestion from a few years back; this change will make vandalism significantly worse without any demonstrable benefits whatsoever. Luckily, we don’t have much vandalism here, (and we have good administrators to deal with it,) but it seems (to me, at least) obvious that changes should not be made which will encourage and facilitate vandalism while making the prevention of vandalism harder (and in many cases fruitless). 
Of course, you’ve saved the best for last: changes will happen “most likely after the deployment.” You people, who do no good to Wikisource, Wikipedia, or any other project that actually drives traffic (beyond the moral good of writing articles, transcribing texts, &c.) see fit to make changes—without our consent—to the detriment of our work, and when problems inevitably arise force their solutions on the people you so ungraciously “helped” in the first place. I shouldn’t have bothered writing this, but your attitude in “suggesting” this change was enough to encourage me to write this quick statement down. TE(æ)A,ea. (talk) 22:51, 6 November 2024 (UTC)Reply
  • Just to be clear SGrabarczuk: If you think there are any remaining significant technical issues, let us know. We will talk and may make some changes, most likely after the deployment. – are you saying that you're planning to deploy to a live production website with over half a million views per day, without having addressed any of the issues that prevented you from deploying in April, without carrying out any user testing, and with plans only to possibly fix any breaking changes after carrying this out? What on earth is your deployment process (please link if you have one)? And what is the WMF policy about pushing changes on some communities that have serious unaddressed concerns, but not others (such as de.Wikipedia) – again, please link this. Very concerned that you're rushing this through without realising that it will greatly impact the website. --YodinT 11:25, 7 November 2024 (UTC)Reply
@Alien333, @Jan.Kamenicek, @Slowking4, @TE(æ)A,ea., @Yodin - thank you for taking the time to share your concerns and apologies for the late reply. Many of the team working on this were traveling for a work event this week. My name is Olga and I’m the product manager for the Web team (the team that built the skin). Hopefully I can help answer some of your questions.
In the short term, we’re reviewing the more explicit requests we’ve received from Wikisource wikis to see which, if any, we can address prior to deployment. We’ll try to let you know next week on which fixes (if any) we’re planning on making and what the timeline for those fixes is. It’s possible that some of them might come after the deployment itself.
More generally, I want to reassure you that we do read through the requests and questions here, and also underline that the deployment of Vector 2022 won’t be the end of the conversation here. We’ll continue working with you as people begin using the skin - answering questions, filing tickets, fixing bugs, and improving the skin based on your feedback. Our plan is not to deploy and then leave immediately. I can’t promise that we’ll fix or work on every request - that depends on what specific issues Wikisource users have, how large they are, and how many people are exposed to those issues - but we will try to at least reply to everything and give a status update (we specifically want to look into and continue discussing the accessibility concerns you’ve raised above).
In general, we understand that Wikisource has unique needs. This is why we’ve introduced some Wikisource-specific customizations (such as the full width for the main namespace) in the first place. In addition - while each Wikisource community is different, they are oftentimes almost if not perfectly identical in terms of design. Almost every other Wikisource community has been using Vector 2022 for quite some time now (years for some) and we haven’t seen major issues flagged there by communities in terms of the usability of the site. Hopefully that can help ease worries around bringing the skin to a production Wikisource - it’s already live on most of them.
More specifically I wanted to address:
  • Dark mode: the dark mode gadget available in Vector legacy relies on an invert method, unlike the feature-level dark mode in Vector 2022. This means that it’s easy to break or represent information inaccurately, especially in the cases of graphics, templates, or any manual color selections. This could potentially lead to common issues like content being displayed as white text on a white background, disappearing images, inaccurate graphs and data visualizations, etc. Either way, dark mode is an optional feature - we are not turning on dark mode for anyone, even if they have their browsers set to use dark mode (although for those interested that is an option that can be turned on using the setting called “automatic” in the dark mode menu)
  • Temporary accounts: While we are not representing the temporary accounts team, we can connect you to folks on that team that can provide a lot more detail on why the change is important, especially as it concerns the safety and privacy of editors and communities
  • Timing: As we mentioned above, this announcement is in part due to the technical burden in supporting two different default skins for logged-out users from a maintenance perspective. We are accelerating the timeline for the remaining wikis because we are no longer able to provide this support across wikis for logged-out users (logged-in users, who do not use cached pages, can continue to access any skin as before)
Thanks again for sharing your thoughts - hope some of this was helpful! OVasileva (WMF) (talk) 15:46, 15 November 2024 (UTC)Reply
And what about the breaking technical issues mentioned? Specifically, the paragraph spacing? — Alien333 16:43, 15 November 2024 (UTC)
@OVasileva (WMF), @SGrabarczuk (WMF): That is what I am really interested in too: the spacing problems which break our pages were mentioned as early as March, so why have they still not been solved, more than 7 months later? Why did you not answer this concern in March, and why are you avoiding answering it again now? Why is the skin, which we did not ask for, planned to be deployed without solving this issue? If your team was not able to pay any attention to this until now, could you rectify it and solve the issue now before the deployment, or postpone the deployment until you solve it? --Jan Kameníček (talk) 14:27, 16 November 2024 (UTC)
@OVasileva (WMF), @SGrabarczuk (WMF): And what is most frightening is the statement that "...I can’t promise that we’ll fix or work on every request ... but we will try to at the least reply..." It sounds like you are just making fun of us, and your answers above are in fact the embodiment of this approach: instead of solving our concerns you just "try to reply" to calm people down without any real action taken. It is not the first time I have met with this approach here, and I have seen too often that it drives various zealous contributors out of WMF projects. --Jan Kameníček (talk) 14:40, 16 November 2024 (UTC)

Translations

After I do a few translations am I supposed to create an author page for myself and list the entries I translated? RAN (talk) 01:11, 7 November 2024 (UTC)Reply

As far as I know, they should just be marked as translated by Wikisource.
I think you should use {{translation header}}, that does this automatically. — Alien333 06:06, 7 November 2024 (UTC)
Exactly. Wikisource translations are created in a similar way as Wikipedia articles, anyone can later edit them and change/improve the translation, so the translations are marked just as translated "by Wikisource". BTW: Before starting such translations, take a close look at WS:T#Wikisource original translations, especially the part stating that "A scan supported original language work must be present on the appropriate language wiki, where the original language version is complete at least as far as the English translation." --Jan Kameníček (talk) 17:27, 7 November 2024 (UTC)Reply
  • I don't think that creating an original translation is the same as editing an original translation, at least from a legal and copyright perspective, otherwise Stephen King would have to share a copyright credit with all the editors at Simon & Schuster that changed a word here and there. When I use the translation header it gives credit to the original translator then it adds "and Wikisource". At Wikipedia the entire biography may be rewritten many times over years so nothing remains of the original text, but a translation would only have a few words changed over the years, if at all. --RAN (talk) 19:57, 9 November 2024 (UTC)Reply

IA Upload Status?

Since the Internet Archive has come back this seems not to be working, with no recent uploads, and it is apparently unable to find the metadata from the Internet Archive, even though it seems to be available (e.g. [17] returns a JSON response). MarkLSteadman (talk) 14:39, 7 November 2024 (UTC)

@MarkLSteadman: Yep the IA is mostly back now it looks like (even uploads are working again), but it looks like they're blocking the Toolforge IP address and so IA Upload is unable to fetch any items. I emailed the IA about it yesterday, so will see if they have any ideas. I'm assuming they're tightening up their systems for blocking single IPs that use lots of resources, and I can imagine that ours looks a bit odd on the surface. Sam Wilson 01:46, 12 November 2024 (UTC)Reply
Thanks for mentioning this, I ran across it myself and I'm glad of an update. HLHJ (talk) 03:42, 14 November 2024 (UTC)Reply

Surname categories

Is there anything preventing us from having surname categories like "Category:Smith (surname)" for portals to match Commons? It would make it much easier to find news articles and portals for someone where we know their last name but they may appear as James Smith or Jack Smith; once you see them in the list you will figure out the correct person. RAN (talk) 20:14, 9 November 2024 (UTC)
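
A minimal sketch of what this could look like, assuming the category name proposed above and a hypothetical portal page for a James Smith (neither is confirmed to exist here). The portal's wikitext would end with an ordinary category link:

```
[[Category:Smith (surname)]]
```

Every portal tagged this way would then be listed on the category page, whatever given name it is filed under.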

Tech News: 2024-46

MediaWiki message delivery 00:07, 12 November 2024 (UTC)Reply

Template:Image frame

I have a use case for Wikipedia:Template:Image frame on Wikisource. It would let me center a caption under two figures with their own sub-captions. Is this reasonable? Is there a better way? Would there be any objections to having that template here? HLHJ (talk) 01:59, 12 November 2024 (UTC)Reply

How about
{|
|-
|
{{class figure
 |num= 36
 |image= Spectacles and eyeglasses- their forms, mounting, and proper adjustment 1895 (2nd edition) Fig. 36.jpg
 |alt=A circular lens with a vertical T centered behind, placed so the crossbar of the T is just tangent to the top of the circle. The image of the T seen through the lens is not displaced.
}}
|
{{class figure
 |num= 37
 |image= Spectacles and eyeglasses- their forms, mounting, and proper adjustment 1895 (2nd edition) Fig. 37.jpg
 |alt=A circular lens with a vertical T centered behind, placed so the crossbar of the T is just tangent to the top of the circle. The image of the vertical of the T, seen through the lens, is displaced to the side.
}}
|-
| colspan="2" | {{c|{{sc|Method of Finding the Apex of a Prism.}} (''After Maddox.'')}}
|}

that gives

[Rendered output: Fig. 36 and Fig. 37 side by side, with the shared caption "Method of Finding the Apex of a Prism. (After Maddox.)" centered beneath them.]

— Alien333 09:19, 12 November 2024 (UTC)
Thank you. That worked perfectly. I got stuck thinking on semantics of nested captions. It should read decently with a screenreader, too. HLHJ (talk) 00:41, 13 November 2024 (UTC)Reply

Crediting across editions

Template:copied is good for crediting copying between individual pages, but I'm involved in two projects with multiple editions with substantial overlap. I just copied the css stylesheet from one edition to another wholesale, as the text has changed but the formatting conventions seem identical. I credited in the edit summary, but I'm liable to be copying bits of formatting in one case, and music scores in another, extensively. Is there a more general (work-level, not page-level) way to say that the contributors of work X indirectly contributed to work Y? HLHJ (talk) 03:28, 12 November 2024 (UTC)Reply

The Wikipedia version allows multiple "from" fields (see Wikipedia:Template:Copied#Examples); the version copied (with acknowledgement of the irony) to Wikisource by MJL seems not to. Neither seems to allow multiple "to" fields, but one could use {{FULLPAGENAME}}. This is a bit of a kludge, harder to write and read than a work-level copied template, but it would do it. Or should I make a work-level template? HLHJ (talk) 15:45, 18 November 2024 (UTC)

Need help with index

At New York Post the calendar-style index does not work, but at The New York Times it works. Can anyone fix the New York Post calendar-style index? RAN (talk) 00:12, 13 November 2024 (UTC)

You need to create the linked pages with a template; see the source of New York Post/1849, which I created, and which is therefore now autolinked from the index. HLHJ (talk) 04:42, 13 November 2024 (UTC)Reply

No redirects from Portals to Author

I was always told there are to be no redirects from Portals to Author, but why? I don't see any valid reason. I only see the value of knowing that someone is an author and not the subject of works. RAN (talk) 23:48, 13 November 2024 (UTC)Reply

Wikisource:Deletion_policy#Miscellaneous would include "unneeded redirects". Yet redirects from Portals to Author, being cross-namespace redirects, do not automatically fall under that heading. This needs wider discussion. --Jusjih (talk) 04:11, 17 November 2024 (UTC)
I can imagine that a redirect from the Author NS to Portal NS can be useful. E. g. an author does not have any works eligible to be hosted in English Wikisource and so works about the author are gathered in a portal. However, some people might be searching for the author in the Author NS, because that is the place where we usually have pages on authors, and so a redirect can be helpful. For example we used to have the page Author:Socrates redirecting to Portal:Socrates, until it was deleted a few years ago as redundant, which was imo not necessary. However, the other way, i. e. redirect from Portal NS to Author NS does not really seem useful to me. --Jan Kameníček (talk) 10:22, 17 November 2024 (UTC)Reply
I see two reasons that this might exist:
  • Subject portals about particular authors. For example, you might want to subclass "Russian Literature" into Portals for Tolstoy, Gogol and Pushkin, or "Biology" into Mendel, Darwin, Aristotle, etc. What is the need for creating these subportals as opposed to just listing the author? And if you wanted a subportal for separation (e.g. "U.S. Presidential Administrations" --> "Obama Administration", separate from works Obama authored), then you are trying to make a distinction for which a redirect is wrong (e.g. a memo from one official to another). And subclassing these portals with an Author creates problems anyway (is "Aristotle" a subportal of Biology, Ethics, Political Science...).
  • Non-human authors, e.g. pseudonyms. I can see some situations where this might be appropriate (e.g. a letter from a fictitious company) but this seems extremely narrow.
MarkLSteadman (talk) 19:10, 17 November 2024 (UTC)Reply
  • I work in news articles about people, mostly obits, so almost everyone has a portal as the subject of a news article, but occasionally someone will have written an editorial or got a letter published in a newspaper. I would like to have the portal redirect to the author page when this occurs. Normally if someone just wrote an editorial, I would keep them as a portal, but several times others have moved them into author space. --RAN (talk) 04:31, 19 November 2024 (UTC)
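
For readers unfamiliar with the mechanics, the kind of redirect under discussion is a one-line page. A minimal sketch, assuming a hypothetical person "Jane Doe" (not an actual page here): the page Portal:Jane Doe would contain only the following.

```
#REDIRECT [[Author:Jane Doe]]
```

The reverse direction mentioned above (Author namespace to Portal namespace) would be the same line placed on the Author page, with the link pointing at the Portal page instead.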

Digital-native article

I'm thinking I might add a digital-native article[22] to Wikisource. Last time I helped someone with that it was extremely tedious. Are there any semi-automated ways to scrape articles off PMC and upload their images? The article diagrams are also monochrome line drawings, for which their jpg encoding is rather unsuitable. Would, say, an svg version, especially if it is the author's original, be a suitable replacement? HLHJ (talk) 03:36, 14 November 2024 (UTC)Reply

News articles with no titles

We can use descriptive names like New York Tribune editorial on the Dred Scott case or, my preference, New York Tribune/1857/It is Impossible to Exaggerate, which uses the first few words of the article. In the 1800s the front pages of most papers consisted of hundreds of small articles with no titles. RAN (talk) 18:12, 17 November 2024 (UTC)

Tech News: 2024-47

MediaWiki message delivery 02:00, 19 November 2024 (UTC)Reply