Language-dependant font configuration #794

Open
zylthinking opened this issue Apr 14, 2023 · 7 comments
Labels
feature request (New feature or request) · text (Text layout, shaping, internationalization, etc.)

Comments

@zylthinking

If a .typ file uses Unicode, it is possible to know which language each character belongs to.
Could Typst then provide a way to set a different font for each language?

The following code does not seem to work, even for text whose language is specified explicitly:

#show text.where(lang: "zh"): text.with(font: "Hiragino Sans GB",  size: 6pt)
#text(lang: "zh")[ 汉语 ]

@laurmaedje added the "feature request" and "text" labels on Apr 17, 2023
@poetlife

It seems we can use a regular expression to set the font for Chinese characters.

// Set the font for Chinese characters (or CJK?)
#show regex("[\u4e00-\u9fa5]"): set text(
  font: "SimSun",
  lang: "zh",
)

This is the rendering result:
image

@alerque
Contributor

alerque commented Nov 15, 2023

Won't that result in typst setting the font explicitly for every character separately? Won't that screw with shaping and possibly bloat the output?

@laurmaedje
Member

Indeed. Adding a + to the regex, so that it matches whole runs of characters instead of single ones, would help somewhat, but it's probably still a bit brittle.
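
A minimal sketch of matching runs rather than single characters, assuming the `+` quantifier is what is meant here and reusing the font and code point range from the earlier example:

```typst
// Match each run of consecutive characters in the range once,
// instead of applying the set rule to every character separately.
#show regex("[\u4e00-\u9fa5]+"): set text(font: "SimSun", lang: "zh")

English text mixed with 汉语文字 in one paragraph.
```

Punctuation or spaces between Chinese characters still split the runs, which is one reason this may remain brittle.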

@poetlife

Maybe using a list of font families is a better choice.

#set text(
  font: ("Times New Roman", "SimSun"),
)

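SimSun contains Latin letters, but Times New Roman doesn't contain any Chinese characters. Listing Times New Roman first therefore sets Latin text in Times New Roman and falls back to SimSun for Chinese, so this is an alternative way of setting both the Chinese and the English font.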

@Leedehai
Contributor

Leedehai commented Dec 22, 2023

Language-dependent font settings work for the language pair illustrated above because the Unicode code points for Chinese characters do not overlap with the Latin alphabet at all. It could be challenging, or even infeasible, for Typst to scale this solution to pairs like Chinese/Japanese, whose Unicode code points overlap. Other language pairs may have similar issues.

To address this, maybe Typst needs to go a level lower: provide Unicode code range matching utilities (like CSS for web) so that users or package authors can configure fonts for different Unicode code points as they wish.
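
Until such a utility exists, explicit code point ranges in a show rule can approximate it. A rough sketch, assuming the kana blocks (U+3040 to U+30FF) are enough to identify Japanese-only text and that the named fonts are installed; Han characters still fall through to the base font because they are shared with Chinese:

```typst
// Base font and language, used for Han characters.
#set text(font: "Noto Serif CJK SC", lang: "zh")

// Hiragana (U+3040 to U+309F) and Katakana (U+30A0 to U+30FF) are used
// for Japanese but not Chinese, so this range can get a Japanese font
// without ambiguity.
#show regex("[\u3040-\u30ff]+"): set text(font: "Noto Serif CJK JP", lang: "ja")

汉字 and ひらがな / カタカナ
```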

@mkpoli
Contributor

mkpoli commented Jan 8, 2024

I just wrote a detailed article (in Japanese) describing a temporary workaround for this issue:
https://zenn.dev/mkpoli/articles/6234c1d2a595bd

To summarize, it's the same as poetlife's approach, but uses Unicode properties.

Because Japanese uses three scripts, Han (Chinese characters, 漢字 kanji), Hiragana, and Katakana, we need to select all of them.

#set text(font: "Noto Serif CJK JP") // Set base font
#show regex("[\p{scx:Han}\p{scx:Hira}\p{scx:Kana}]"): set text(font: "Noto Sans CJK JP") // Set Japanese font
*Typstにおける和欧混植のフォント設定法*

For Chinese, we only need Han.

#set text(font: "Noto Serif CJK SC") // Set base font
#show regex("[\p{scx:Han}]"): set text(font: "Noto Sans CJK SC") // Set Chinese font
*利用Typst中西文混排不同字体*

image

@mkpoli
Contributor

mkpoli commented Jan 8, 2024

After some research, I found that the real problem is #1024: selectors do not work on set rules, so text.where(...) does not select anything.

The ideal declarative approach to language selection is to first set the default language for each script used, then specify the language explicitly where a script is shared by multiple languages. For now, this does not work because of the issue mentioned above.

#set text(font: "Noto Serif CJK SC", lang: "zh", script: "HanS")

// #show regex("\p{sc:Latn}"): set text(lang: "en") // A regex rule cannot be used here, for the reason explained below

#let en = text.with(lang: "en")
#let fr = text.with(lang: "fr")
#let ja = text.with(lang: "ja")
#let HanT = text.with(script: "HanT")

#show text.where(lang: "en"): set text(fill: red)
#show text.where(lang: "fr"): set text(fill: blue)
#show text.where(lang: "ja"): set text(fill: yellow)
#show text.where(lang: "zh", script: "HanT"): set text(fill: green)

中文中混有#en[English]和#fr[français]以及#ja[日本語]的时候,#HanT[我們默認拉丁文字是英文,漢字是簡體字中文,除非例外指定]。

For now, as an imperative approach, you can use the let keyword to specify the style directly:

#set text(font: "Noto Serif CJK SC", lang: "zh", script: "HanS")

#let en = text.with(lang: "en", fill: red)
#let fr = text.with(lang: "fr", fill: blue)
#let ja = text.with(lang: "ja", fill: yellow)
#let HanT = text.with(script: "HanT", fill: green)

中文中混有#en[English]和#fr[français]以及#ja[日本語]的时候,#HanT[我們默認拉丁文字是英文,漢字是簡體字中文,除非例外指定]。

image

As for using the regex approach to set a default language or style: show regex / show str rules have a higher priority than set rules (somewhat similar to specificity in CSS). So even if you make English the default with a regex rule and write English和#fr[français] with #let fr = text.with(lang: "fr", fill: blue), français still matches /\p{sc:Latn}/ and will always be red like English rather than blue. I wonder whether it is possible to override this priority (for example, by ignoring text/regex matches inside a function).
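
The conflict can be reproduced with a sketch like the following, which assumes a regex show rule is used to give Latin-script text its default style:

```typst
// Default style for Latin-script text, applied through a regex show rule.
#show regex("\p{sc:Latn}+"): set text(lang: "en", fill: red)

#let fr = text.with(lang: "fr", fill: blue)

// As described in this comment, the regex rule also matches the text
// inside #fr[...], so "français" is reported to come out red, not blue.
English和#fr[français]
```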

@laurmaedje changed the title from "setting different font with different language?" to "Language-dependant font configuration" on Jul 15, 2024