Font Unicode Range #3331

Open
peng1999 opened this issue Feb 3, 2024 · 18 comments · May be fixed by #5305
Labels
proposal You think that something is a good idea or should work differently. styling About set and show rules or style properties text Text layout, shaping, internationalization, etc.

Comments

@peng1999
Contributor

peng1999 commented Feb 3, 2024

I propose a new mechanism, similar to CSS unicode-range, to allow more precise control over which font is used for each character.

Proposed syntax

Allow text(font) to receive a dictionary (family, unicode-range), or an array thereof.

E.g.:

#set text(font: (
  (family: "Architects Daughter", unicode-range: (0x2d, 0x3d)),
  "Just Another Hand"
))
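
As a rough illustration of the intended behavior, here is a minimal Python sketch of per-character font selection with a restricted range. It is only a model, not Typst's implementation: it assumes `unicode-range` is an inclusive (start, end) pair of code points, that a plain string entry covers every character, and it ignores whether a font actually contains the glyph. `pick_font` is a hypothetical helper name.

```python
# Model of the proposed font list: a plain string covers every character,
# while a dict entry only applies inside its (assumed inclusive) range.
FONTS = [
    {"family": "Architects Daughter", "unicode-range": (0x2D, 0x3D)},
    "Just Another Hand",
]


def pick_font(ch, fonts):
    """Return the family of the first entry whose range contains `ch`."""
    for entry in fonts:
        if isinstance(entry, str):
            return entry  # unrestricted entry: matches everything
        start, end = entry["unicode-range"]
        if start <= ord(ch) <= end:
            return entry["family"]
    return None  # no entry applies


for ch in "a-1=":
    print(repr(ch), "->", pick_font(ch, FONTS))
```

Under these assumptions, "-", "1" and "=" all lie in the range 0x2D to 0x3D and resolve to the first family, while "a" falls through to the second.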

Use Case

Why not show regex

show regex has many caveats. It breaks CJK punctuation adjustments, and it also interferes with other show rules. This deserves a better solution.

@peng1999 peng1999 added the feature request New feature or request label Feb 3, 2024
@Enivex
Collaborator

Enivex commented Feb 3, 2024

I would go a step further and move many of the text parameters into font. Especially things like stylistic sets, which don't make sense as a global setting for all fonts.

@Enivex Enivex added styling About set and show rules or style properties text Text layout, shaping, internationalization, etc. proposal You think that something is a good idea or should work differently. and removed feature request New feature or request labels Feb 3, 2024
@Myriad-Dreamin
Contributor

I would go a step further and move many of the text parameters into font. Especially things like stylistic sets, which don't make sense as a global setting for all fonts.

@Enivex is an early breaking change okay/better? I guess not many users are using things like stylistic sets yet.

But this is unrelated to the font unicode range proposal.

@Myriad-Dreamin
Contributor

An urgent use case is to correctly set font fallback for these marks:

  • the left single quote mark: ‘, U+2018.
  • the right single quote mark: ’, U+2019.
  • the left double quote mark: “, U+201C.
  • the right double quote mark: ”, U+201D.

An English-first font fallback works well in most cases:

#set text(font: (
  "Linux Libertine",
  "Source Han Serif SC",
), lang: "zh", region: "cn")

However, it does not help with font fallback for quote marks, because both fonts contain these characters.
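
To make the failure concrete, the following Python sketch models fallback as "the first font in the list that has a glyph wins". This is a simplification of real fallback, and the coverage sets are made-up stand-ins for the two fonts' character maps; `first_with_glyph` is a hypothetical helper.

```python
# Toy character maps (assumptions): the Latin font has Latin letters and
# the curly quotes; the CJK font has a few Han characters and the same quotes.
QUOTES = set("\u2018\u2019\u201c\u201d")
COVERAGE = {
    "Linux Libertine": set("abcdefghijklmnopqrstuvwxyz") | QUOTES,
    "Source Han Serif SC": set("引号文字") | QUOTES,
}
FALLBACK = ["Linux Libertine", "Source Han Serif SC"]


def first_with_glyph(ch, families):
    """Return the first family in the list whose toy coverage contains `ch`."""
    for family in families:
        if ch in COVERAGE[family]:
            return family
    return None  # no listed font covers the character


for ch in "a引\u201c":
    print(repr(ch), "->", first_with_glyph(ch, FALLBACK))
```

In this model the Han character falls through to the CJK font as intended, but the quote mark never does, because the first font in the list already claims it.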

@PgBiel
Contributor

PgBiel commented Feb 4, 2024

I would go a step further and move many of the text parameters into font. Especially things like stylistic sets, which don't make sense as a global setting for all fonts.

I don't think we have to go that far. After all, oftentimes you'll only want to apply certain changes within a certain scope, and having to change font settings for that would be annoying (you'd have to keep track of the current font settings and stuff).

Instead, we could use something like

#show text.where(font: "X"): set text(stylistic-set: 1)

However I'm not sure if the current progress on the styling rework allows for this. If not, it'd be an interesting addition to consider.

@laurmaedje
Member

@PgBiel What you wrote does work on main, but it will be rather inefficient if the font is used a lot in the document.

@laurmaedje
Member

I do sort of agree with @Enivex though. Maybe we need a reusable font type that wraps a buffer or string resource with settings rather than a dictionary.

@Enter-tainer
Contributor

Enter-tainer commented Feb 5, 2024

I do sort of agree with @Enivex though. Maybe we need a reusable font type that wraps a buffer or string resource with settings rather than a dictionary.

I think this is orthogonal to the original proposal. Even if we are OK with both of them, these two features will very likely be done in two separate PRs.

@Enter-tainer
Contributor

An urgent use case is to correctly set font fallback for these marks:

To better explain the problem, I made this little doc:

#set page(height: auto, width: 25em)

Quotation marks should be 1em wide per #link("https://www.w3.org/TR/clreq/#glyphs_sizes_and_positions_in_character_faces_of_punctuation_marks")[clreq]. This is usually achieved by fonts: CJK fonts have fullwidth quotation marks, and Latin fonts have proportional quotation marks.

== When using NCM and Source Han Serif SC

Note that the quotation marks are not 1em wide. This is because NCM also has glyphs for quotation marks, so it doesn't fall back to Source Han Serif SC.

#set text(font: ("New Computer Modern", "Source Han Serif SC"), lang: "zh", region: "cn")
引号#highlight[“]引号”引号

== When using Source Han Serif SC

#set text(font: ("Source Han Serif SC"), lang: "zh", region: "cn")

引号#highlight[“]引号”引号

test

@peng1999
Contributor Author

peng1999 commented Feb 6, 2024

Maybe we need a reusable font type that wraps a buffer or string resource with settings rather than a dictionary.

What is the difference between a dict and a "font object"?

Compare (using dict)

#set text(font: (family: "Stix Two Math", stylistic-set: 1))

and (using a function font)

#set text(font: font(family: "Stix Two Math", stylistic-set: 1))

@laurmaedje
Member

laurmaedje commented Feb 6, 2024

It is true that the difference isn't big at the moment, mostly that it could have methods, can be documented better and similar minor things. My statement from above isn't really substantiated.

In the future, a font type could potentially (I am not yet sure whether set rules for types that don't end up in the content stream are really achievable) support set rules directly on it. Finding the right place to create a new type rather than a dictionary (also e.g. for strokes which have both) is a tricky design problem across Typst.

@mattfbacon
Contributor

Also with a font type I think Typst would allow writing #set text(font(...)) based on special behavior of that type. Not sure if we would want this though.

@peng1999
Contributor Author

peng1999 commented Feb 24, 2024

Maybe we need a reusable font type that wraps a buffer or string resource with settings rather than a dictionary.

I now realize that a font function does provide a better API and opens up more possibilities. I've opened #3488 to discuss that.

However adding a font function requires more code changes, which is unfortunate given the urgent need for mixed CJK-Latin documents as described in #3331 (comment).

@refparo

refparo commented Mar 16, 2024

This is a workaround for #3331 (comment) that works on 0.11.0:

#let sans-font = ("TeX Gyre Heros", "Noto Sans CJK SC").map(lower)
#let serif-font = ("Tex Gyre Termes", "Noto Serif CJK SC").map(lower)
#set text(font: serif-font)
#show strong: set text(font: sans-font)
#show regex("[·“”‘’…]"): it => {
  show text.where(font: sans-font): it => text(font: "Noto Sans CJK SC", it)
  show text.where(font: serif-font): it => text(font: "Noto Serif CJK SC", it)
  it
}
文字“文字”文字\
*文字“文字”文字*

(assuming your document doesn't use different font lists everywhere, which would be bad practice anyway...)

@Enter-tainer
Contributor

AFAIK this will break punctuation squeezing, so it's not perfect at the moment.

@peng1999
Contributor Author

peng1999 commented Mar 16, 2024

AFAIK this will break punctuation squeezing.

FYI the underlying problem is discussed here. Maybe worth its own issue.

@Enter-tainer
Contributor

I would go a step further and move many of the text parameters into font. Especially things like stylistic sets, which don't make sense as a global setting for all fonts.

From the efforts in #4093, it looks like including everything, such as stylistic sets and ligatures, can be hard. I think we can start designing and implementing unicode range incrementally, without depending on a font object. We can keep it simple and focused for now.

I personally feel the proposed API is good.

#set text(font: (
  (family: "Architects Daughter", unicode-range: (0x2d, 0x3d)),
  "Just Another Hand"
))

@peng1999
Contributor Author

Half a year has passed since #4093, and still no one has found the Unified Font Solution. Given the situation, I propose to move forward by implementing the proposed unicode-range functionality first. A few reasons:

  • This is just a small and compatible user API change: in addition to strings, we also allow dictionaries in the font list. It does not break any existing code.
  • Since it is not a breaking change, it will not be a burden for a better font solution in the future, which will likely be a breaking change.
  • In Microsoft Word and LaTeX, setting Chinese and English fonts separately is a basic feature, and thus most serious Chinese formatting guides require this. I believe that the improvements brought by unicode-range provide a sufficiently significant enhancement.
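
The compatibility argument in the first bullet can be sketched as a normalization step: existing string entries and new dictionary entries are mapped to one internal shape, so old font lists keep their meaning. This Python sketch is illustrative only; `normalize_font_list` and the `None` "no restriction" marker are hypothetical, and the key names follow the syntax proposed in this issue.

```python
def normalize_font_list(value):
    """Turn a string, a dict, or a list of either into uniform dict entries."""
    # A single family (string or dict) is treated as a one-element list.
    items = value if isinstance(value, list) else [value]
    entries = []
    for item in items:
        if isinstance(item, str):
            # Existing syntax: a bare family name, no range restriction.
            entries.append({"family": item, "unicode-range": None})
        elif isinstance(item, dict):
            # Proposed syntax: the range is optional and defaults to None.
            entries.append({
                "family": item["family"],
                "unicode-range": item.get("unicode-range"),
            })
        else:
            raise TypeError(f"unsupported font entry: {item!r}")
    return entries


old_style = ["Linux Libertine", "Source Han Serif SC"]
new_style = [
    {"family": "Architects Daughter", "unicode-range": (0x2D, 0x3D)},
    "Just Another Hand",
]
print(normalize_font_list(old_style))
print(normalize_font_list(new_style))
```

Because a bare string normalizes to an unrestricted entry, a font list written before the change would behave exactly as it does today in this model.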

@laurmaedje Do you have other concerns about the dict-in-font-list proposal?

@peng1999 peng1999 linked a pull request Oct 27, 2024 that will close this issue
@laurmaedje
Member

I responded in #5305
