Top level tags differ between XML/TeX flavors #508

Open
alerque opened this issue Dec 26, 2017 · 5 comments

@alerque
Member

alerque commented Dec 26, 2017

I'm trying to sort out the mess I made with #465 and #505, and I'm also looking at #502. One thing that's been bothering me for a while is:

  • The top level opening tag expected in XML-flavor markup is <sile>.
  • The top level opening tag expected in TeX-flavor markup is \begin{document}.

I don't see any reason these should be different. Am I missing some logic improved by this nomenclature?

If not, I propose normalizing them in two stages.

  1. Deprecate whichever one we don't like as much, but still parse for it and give a warning.
  2. Later after a version or two, stop parsing for it.
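
The two stages above can be sketched roughly as follows. This is only an illustration in plain Lua: `normalize_root`, `canonical_root`, `deprecated_roots` and the `stage` argument are hypothetical names, not SILE APIs, and the choice of `document` as the surviving name is an assumption made for the example, since the thread has not settled which tag to keep.

```lua
-- Hypothetical sketch: none of these names are SILE APIs.
local canonical_root = "document"          -- assumed to be the name that is kept
local deprecated_roots = { sile = true }   -- assumed to be the name that is phased out

local warnings = {}

-- Returns the normalized root tag name, or nil plus an error message.
local function normalize_root(tag, stage)
  if tag == canonical_root then
    return canonical_root
  end
  if deprecated_roots[tag] then
    if stage == 1 then
      -- Stage 1: still parse the old name, but record a warning.
      warnings[#warnings + 1] =
        ("root tag <%s> is deprecated, use <%s>"):format(tag, canonical_root)
      return canonical_root
    end
    -- Stage 2: the old name is no longer parsed.
    return nil, ("root tag <%s> is no longer supported"):format(tag)
  end
  return nil, ("unexpected root tag <%s>"):format(tag)
end
```

With `stage` set to 1 the old name still resolves and a warning is collected; with `stage` set to 2 the same input returns `nil` and an error message.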
@alerque alerque added the enhancement Software improvement or feature request label Dec 26, 2017
alerque added a commit to alerque/sile that referenced this issue Dec 26, 2017
The spaces added on these lines are unwanted side effects related to sile-typesetter#105.
See also sile-typesetter#508 for discussion of whether top level tags should even be a thing
in included files.
@alerque
Member Author

alerque commented Dec 27, 2017

If, per my comment on #505, we didn't have to guess the markup flavor and could just assume it based on context, we could drop the requirement for top level tags entirely for includes and other fragments:

SILE.doTexlike = function (doc)
  doc = "\\begin{document}" .. doc .. "\\end{document}"
  SILE.process(SILE.inputs.TeXlike.docToTree(doc))
end

Would become:

SILE.doTexlike = function (doc)
  SILE.process(SILE.inputs.TeXlike.docToTree(doc))
end

And depending on the calling context the parsed tree would just make sense no matter what it started with. Basically it would be a tree appropriate to splice into the calling context.

One of the issues with requiring a wrapping document tag is that it's not intuitive what extra space will come with it. Really this should be fixed in #105, but as an example a TeX-like document for inclusion in another document will bring in leading and trailing space that would have been dropped had it been processed on its own. See also 53807e6 for a case where I had to work around this.
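
To make the whitespace problem concrete, here is a minimal sketch of one possible workaround: trimming the fragment before it is wrapped. `wrap_fragment` is a hypothetical helper, not part of SILE, and this is not necessarily what 53807e6 does; it only shows where the stray space enters.

```lua
-- Hypothetical helper, not part of SILE.
-- Strips leading and trailing whitespace from a fragment before adding the
-- wrapping document tags, so the wrapper does not carry the fragment's
-- surrounding blank lines or spaces into the parsed tree.
local function wrap_fragment(doc)
  local trimmed = doc:match("^%s*(.-)%s*$")
  return "\\begin{document}" .. trimmed .. "\\end{document}"
end

print(wrap_fragment("\n  Some included text. \n"))
```

Whitespace inside the fragment is left alone; only the edges are touched.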

@simoncozens
Member

XML flavour should allow arbitrary top-level tags, because SILE is fundamentally a typesetting system for arbitrary XML files.

alerque added a commit to alerque/sile that referenced this issue Dec 29, 2017
The spaces added on these lines are unwanted side effects related to sile-typesetter#105.
See also sile-typesetter#508 for discussion of whether top level tags should even be a thing
in included files.
@alerque alerque self-assigned this Dec 29, 2017
@alerque
Member Author

alerque commented Sep 10, 2020

Just a note that #1052 has significant examples and discussion about what use cases need this to be fixed.

@alerque
Member Author

alerque commented Sep 10, 2020

How about this for an idea:

  1. Allow any root tag in both formats.
  2. Set up the classes to handle whatever the root tag is, using whatever function they have for it. At the moment we parse for document in the case of SIL or sile in the case of XML input, and in some cases we also attach attributes to those tags. How this is handled between input formats and fragments (SILE.doTexlike()) is completely inconsistent. In this proposal all options on the root tag would be passed to the class's root tag handler, which is determined by the class, not by any preconceived idea about what that tag will be called.
  3. Allow fragments in TeX-like SIL format to not have any root tag at all. Any tag they do have would be processed as a command. To ease the pain of this being a breaking change we could probably shim the document command to be a no-op. This would make preprocessors that include file fragments 100% compatible with SILE itself processing fragments.
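
A toy model of the three points above, in plain Lua. Everything here is hypothetical (`commands`, `process`, `class.handle_root`, and the node shape with `command`, `options` and children in the array part); it is not SILE's real class or command API. The "no-op" `document` shim is interpreted as "process the content and do nothing else", which is an assumption.

```lua
-- Hypothetical sketch: a toy tree walker, not SILE's real class or command API.
local output = {}

-- Commands available anywhere, including inside fragments.
local commands = {
  -- Shim: a "document" tag found inside a fragment just passes its content through.
  document = function (options, content, process)
    process(content)
  end,
  em = function (options, content, process)
    output[#output + 1] = "<em>"
    process(content)
    output[#output + 1] = "</em>"
  end,
}

-- Walks the children of a node (or a rootless fragment) in order.
local function process(tree)
  for _, node in ipairs(tree) do
    if type(node) == "string" then
      output[#output + 1] = node
    else
      local handler = commands[node.command]
      assert(handler, "unknown command: " .. tostring(node.command))
      handler(node.options or {}, node, process)
    end
  end
end

-- The class decides what to do with the root, whatever it is called.
local class = {
  root_options = nil,
  handle_root = function (self, root)
    self.root_options = root.options or {}
    process(root)
  end,
}

-- A full document whose root happens to be called "sile".
class:handle_root({ command = "sile", options = { papersize = "a5" },
  "Hello ", { command = "em", "world" } })

-- A rootless fragment; a leftover document tag inside it is harmless.
process({ " and ", { command = "document", "goodbye" } })

print(table.concat(output))
```

The root tag name never appears in the dispatch logic: the class receives the root node and its options directly, and fragments go straight to `process`.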

@Omikhleia
Member

Omikhleia commented Jul 14, 2023

Let's resurrect this old issue.

  • How about document is the only top-level tag for SIL (as \begin{document} or \begin[... options...]) and its XML flavor (<document ...>)? KISS...

  • Regarding other XML files, how about all tags are prefixed with the xmlns of the root if set, or the root tag if unset, at least by default? Say my document starts with <TEI.2> or <article xmlns="http://docbook.org/ns/docbook">...

    This point would be "a bit" breaking, but is kind of necessary when handling random XML formats that would have tags in their schema overriding SILE commands with different semantics (and likely different attributes even when semantics sort of match by mere luck)

    • It would make things easier for implementers not to have to "save" commands (which could also depend on loaded packages) that would conflict with the schema, and guess when they have to restore them (which is tricky). Consider DocBook's <info> vs. SILE \info nodes (different semantics) or DocBook's <footnote> (same semantics, but different attributes and different type of content, white space handling is an issue there)...
    • The XML inputter(s) could also have a way to load necessary packages based on that information. In the above example, it could automatically load necessary packages to support the appropriate tag set...
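
    A rough sketch of the prefixing idea, in plain Lua. The node shape, the `prefix_for` and `qualify` helpers, and the ":" separator are all assumptions made for illustration; this is not how the current XML inputter behaves.

```lua
-- Hypothetical sketch: node shape and naming scheme are assumptions.
-- A node is a table with a `command` name, optional `options`, and its
-- children in the array part.

-- Use the root's xmlns if set, otherwise fall back to the root tag name.
local function prefix_for(root)
  local xmlns = root.options and root.options.xmlns
  return xmlns or root.command
end

-- Rewrites every tag name in the tree to "<prefix>:<tag>" so that schema
-- tags can no longer collide with built-in command names.
local function qualify(root)
  local prefix = prefix_for(root)
  local function walk(node)
    if type(node) ~= "table" then return end
    node.command = prefix .. ":" .. node.command
    for _, child in ipairs(node) do
      walk(child)
    end
  end
  walk(root)
  return root
end

local docbook = qualify({
  command = "article",
  options = { xmlns = "http://docbook.org/ns/docbook" },
  { command = "info", "Some metadata" },
})
print(docbook[1].command)
```

    With names qualified this way, a handler registered for the DocBook info tag and a built-in command called info would live under different keys, so nothing has to be saved and restored.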

I have dabbled in supporting several XML schemas since I started using SILE (TEI dictionary subset, TEI critical apparatus subset, USX, USFX, to name a few) and it's pretty messy with the current XML inputter. Of course I can also replace that with my own inputter and address the above. Yet that feels like the wrong way to go...
