Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support ligature using harfbuzz #1659

Draft
wants to merge 15 commits into
base: master
Choose a base branch
from
Draft

Conversation

Mandarancio
Copy link

Support to ligature (#66) rendering using harfbuzz

20231029_20h49m40s_grim

@Guldoman
Copy link
Member

As a note, I have a library in the works that's quite a bit smaller than harfbuzz.
It obviously doesn't do as much as harfbuzz (and is simpler to use too), but ligature support is almost 100% complete (basically only Lookup Flags are missing).

Feel free to continue working on this tho, might be a nice comparison point.

@jgmdev
Copy link
Member

jgmdev commented Oct 29, 2023

Looking good! I would be interested on integrating this change into Pragtical once it is complete. There is a meson.wrap for harfbuzz that can be easily added for CI builds.

@Mandarancio
Copy link
Author

Peek.2023-10-30.15-00.mp4

The recording is pretty crappy, but it shows that everything seems to work, I am not sure if I am missing something, maybe someone else could try it and let me know if every is fine?
Otherwise on my side it seems pretty good, everything is working smoothly and I do not see any drop in performances.

@Mandarancio Mandarancio reopened this Oct 30, 2023
@Mandarancio
Copy link
Author

I added the wrap to harfbuz, now it should work (not 100% sure as I never used wraps or github CI).

meson.build Outdated Show resolved Hide resolved
@jgmdev
Copy link
Member

jgmdev commented Nov 1, 2023

Otherwise on my side it seems pretty good, everything is working smoothly and I do not see any drop in performances.

Tested a little bit and it is working pretty nice, the only performance issue I see so far is with font loading. Try ctrl wheelup or ctrl wheeldown and you will notice it is slower than before.

@Mandarancio
Copy link
Author

Mandarancio commented Nov 1, 2023

Otherwise on my side it seems pretty good, everything is working smoothly and I do not see any drop in performances.

Tested a little bit and it is working pretty nice, the only performance issue I see so far is with font loading. Try ctrl wheelup or ctrl wheeldown and you will notice it is slower than before.

First of all thanks for the help with the meson dependencies! Second of all now I slighty changed the way I load the hb fonts and changed the caching parameters. On my machine I can not see any significant difference between the lite-xl/lite-xl and mine, but it could be my configuration. Can you try it now and let me know if it feels better?

@jgmdev
Copy link
Member

jgmdev commented Nov 1, 2023

Second of all now I slighty changed the way I load the hb fonts and changed the caching parameters.

I think you forgot to push that change, I only see the meson build options (eagerly waiting for the improvement 😃). Also, I noticed that now fallback fonts are not working. For example I have the following fonts:

  1. NotoSansMono-Medium.ttf
  2. NotoEmoji-Medium.ttf
  3. unifont-15.0.01.ttf

Emojis and other stuff is not rendering because only the first font is been used.

@jgmdev
Copy link
Member

jgmdev commented Nov 1, 2023

Just in case this is what I mean with slower font loading:

Non Ligatures:

non-ligatures.mp4

With Ligatures

with-ligatures.mp4

@jgmdev jgmdev closed this Nov 1, 2023
@jgmdev jgmdev reopened this Nov 1, 2023
@Mandarancio
Copy link
Author

Mandarancio commented Nov 1, 2023

Second of all now I slighty changed the way I load the hb fonts and changed the caching parameters.

I think you forgot to push that change, I only see the meson build options (eagerly waiting for the improvement 😃). Also, I noticed that now fallback fonts are not working. For example I have the following fonts:

1. NotoSansMono-Medium.ttf

2. NotoEmoji-Medium.ttf

3. unifont-15.0.01.ttf

Emojis and other stuff is not rendering because only the first font is been used.

You are right! I forgot to push the commit! Ops! I will look into the fallbacks fonts issue (I believe to know the source of the problem and I will look into it).

@jgmdev
Copy link
Member

jgmdev commented Nov 1, 2023

You are right! I forgot to push the commit! Ops! I will look into the fallbacks fonts issue (I believe to know the source of the problem and I will look into it).

Whoa now loading performance is much better 🥳, now the only missing issue is restoring font fallbacks support and I would say this is pretty good to go :), would be nice if ligatures could be made optional in the same vain of bold/strike/smooth, etc...

@jgmdev
Copy link
Member

jgmdev commented Nov 1, 2023

Was experimenting with GLYPHSET_SIZE and it seems that setting it to #define GLYPHSET_SIZE 8 is enough now with the use of harfbuzz, loading is even faster than originally.

@jgmdev
Copy link
Member

jgmdev commented Nov 2, 2023

Playing a bit more with the code and was able to get fallback fonts working again here is the patch:

pragtical/pragtical@ed28c3a

Edit: had to repush removal of debugging artifact

@Mandarancio
Copy link
Author

Mandarancio commented Nov 2, 2023

Playing a bit more with the code and was able to get fallback fonts working again here is the patch:

pragtical/pragtical@ed28c3a

Edit: had to repush removal of debugging artifact

I also succeeded to add back the fallback fonts support (let me know what do you think about my solution) :)
Screenshot_20231102_150156

@jgmdev
Copy link
Member

jgmdev commented Nov 2, 2023

I also succeeded to add back the fallback fonts support (let me know what do you think about my solution) :) Screenshot_20231102_150156

Great, this way seems more efficient. And yay it is working:

fallbacks-working

The other issue is character fallbacks are not working, this is a screenshot of master without an emoji font added to the list of fallbacks:

character-fallback

And this is what I'm getting with the harfbuzz changes:

no-character-fallback

Also some fonts like Noto Sans Mono Medium seem to have half baked ligature stuff on them which causes wrong rendering with ligatures enabled (notice the extra spacing on the &&):

noto-sans-mono-ligatures-enabled

With ligatures disabled it looks normal:

noto-sans-mono-ligatures-disabled

Maybe making ligatures a toggleable style property as in Bold/Italic/Underline would be useful in such cases.

@Mandarancio
Copy link
Author

spacing

About the missing characters squares is my fault, I removed the code to draw the empty boxes when debugging my code (I will put it back).
About the wired && ligature I noticed that some fonts (such as Noto) has strange behaviors with it, but is very font dependents, I will look into it and try to understand the source of the problem!

hb_buffer_add_utf8(buf, text, -1, 0, -1);

RenFont * font = fonts[0];
hb_shape(font->font, buf, NULL, 0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question, in case that a fallback font is used for some codepoints not available on main font, shouldn't that portion get shaped also with the fallback font instead of using the main font?

Copy link
Author

@Mandarancio Mandarancio Nov 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question, as it is implemented now, ligatures are supported only for the main font and so the shaping is only applied to the main one, for the fallaback case I am using the fb_codepoint that is computed in src/renderer.c:413 that is then converted if needed to the correct font index using FT_Get_Char_Index at src/renderer.c:181.
I hope the explanation is clear.
It would be possible to shape for all the available fonts but I suspect will be more computational intensive and you could end up with half a ligature of a fallback font and half a normal character form another.
What do you think?

edit: also I am not sure that the mapping between text and glyph would be the same and could be possible that some kind of mapping would be needed between different fonts and the string.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function shared on pragtical/pragtical@ed28c3a iterates a string giving you the range of a string (thru the char** next out param) and returns the font that has all of its codepoints for rendering:

static RenFont* utf8_iterator(
  RenFont** fonts, const char* start, const char* end, char** next
) {
  if (start == end) return NULL;

  unsigned int codepoint;
  RenFont* current_font = fonts[0];

  bool iterated = false;
  *next = (char*) start;
  char* prev_next = *next;
  while(*next < end) {
    *next = (char*) utf8_to_codepoint(*next, &codepoint);
    if (!FT_Get_Char_Index(current_font->face, codepoint)) {
      if (!iterated) {
        bool codepoint_found = false;
        for (short i = 1; i < FONT_FALLBACK_MAX && fonts[i];   i) {
          if (FT_Get_Char_Index(fonts[i]->face, codepoint)) {
            current_font = fonts[i];
            codepoint_found = true;
            break;
          }
        }
        if (!codepoint_found)
          return fonts[0];
      } else {
        // set next to the end of previously iterated text range
        *next = prev_next;
        return current_font;
      }
    }
    // Allow emojis to be rendered with different fonts, this is needed because
    // sometimes a single glyph can be queried, which could be on the first
    // font and also on previously used fallbacks when rendering.
    if (!iterated && (*next - start) > 3) {
      return current_font;
    }
    prev_next = *next;
    iterated = true;
  }

  return current_font;
}

This way you can shape string segments using the assigned font (which is the strategy used on that patch). When tested I didn't notice a performance drop but I didn't benchmark the thing :)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I start to have a look at your commit but I did not finish yet as I am quite busy now. However I am still not 100% sure that is a good idea to support ligatures in fallback fonts, maybe can @Guldoman gives us his opinion?

@Mandarancio
Copy link
Author

I also succeeded to add back the fallback fonts support (let me know what do you think about my solution) :) Screenshot_20231102_150156

Great, this way seems more efficient. And yay it is working:

fallbacks-working

The other issue is character fallbacks are not working, this is a screenshot of master without an emoji font added to the list of fallbacks:

character-fallback

And this is what I'm getting with the harfbuzz changes:

no-character-fallback

Also some fonts like Noto Sans Mono Medium seem to have half baked ligature stuff on them which causes wrong rendering with ligatures enabled (notice the extra spacing on the &&):

noto-sans-mono-ligatures-enabled

With ligatures disabled it looks normal:

noto-sans-mono-ligatures-disabled

Maybe making ligatures a toggleable style property as in Bold/Italic/Underline would be useful in such cases.

Both issues should be solved now! :D

@jgmdev
Copy link
Member

jgmdev commented Nov 2, 2023

Both issues should be solved now! :D

Whoa, I can confirm it works now! The fixes to tab size handling code now cause tabs to not be rendered as spaces (notice on the screenshot 2 tabs rendered with the glyph provided by fallback font unifont):

tabs-not-as-spaces

This could be fixed on the lua side by replacing tabs with the amount of spaces set on the document tab size when sending text to renderer but it would perform slower I guess because of all the strings allocations and lookups that would be performed by string.gsub. Or just perform a glyph replacement of tab -> space to the amount of required spaces, maybe by dividing the size_of_tab_set_with_set_tab_size / width_of_single_space, don't know...

You will come up with a better solution :)

Edit:

Also forgot, one have to take into consideration the drawwhitespace plugin.

@jgmdev
Copy link
Member

jgmdev commented Nov 3, 2023

Just in case this change fixes the tab issue mentioned above, which I didn't properly look before commenting 😞

diff --git a/src/renderer.c b/src/renderer.c
index bd6bf993..ba3398d3 100644
--- a/src/renderer.c
    b/src/renderer.c
@@ -176,14  176,18 @@ static RenFont* font_group_get_glyph(GlyphSet** set, GlyphMetric** metric, RenFo
   if (!metric) {
     return NULL;
   }
   bool is_tab = false;
   if (fb_codepoint == '\t') { is_tab = true; fb_codepoint = '\0'; }
   if (bitmap_index < 0)
     bitmap_index  = SUBPIXEL_BITMAPS_CACHED;
   for (int i = 0; i < FONT_FALLBACK_MAX && fonts[i];   i) {
     unsigned cp = i == 0 ? codepoint : FT_Get_Char_Index(fonts[i]->face, fb_codepoint);
     *set = font_get_glyphset(fonts[i], cp, bitmap_index);
     *metric = &(*set)->metrics[cp % GLYPHSET_SIZE];
-    if ((*metric)->loaded || fb_codepoint == 0)
     if ((*metric)->loaded || fb_codepoint == 0) {
       if (is_tab) (*metric)->xadvance = fonts[i]->tab_advance;
       return fonts[i];
     }
   }
   if (*metric && !(*metric)->loaded && fb_codepoint > 0xFF && fb_codepoint != 0x25A1)
     return font_group_get_glyph(set, metric, fonts, 0x25A1, 0x25A1, bitmap_index);

@Mandarancio
Copy link
Author

Mandarancio commented Nov 6, 2023

@jgmdev the colliding codepoint issue between \t and other char (such as & for the Noto sans example) it has been fixed (at least to me look like its working fine). Can you check?
Screenshot_20231106_145510

@jgmdev
Copy link
Member

jgmdev commented Nov 6, 2023

Can you check?

On a default install when pressing tab with detected indentation and when pressing tab after setting indentation to tabs:

tabs-on-default-install.mp4

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants