Make string handling unicode-preserving #25

Open: m7a (Contributor) wants to merge 1 commit into base: master

@m7a m7a commented Apr 13, 2024

Hello,

I tried to get cecho to work with non-ASCII characters, but they always came out mangled. After some digging into the code, I came up with a solution for the string-processing functions that keeps UTF-8 data as-is, resulting in correct display on my terminal (urxvt).

Feel free to include it upstream if it makes sense to you 😄

I think this patch is good, but it could still be incomplete: for example, I have not yet tested it with non-ASCII border-drawing characters. A similar issue might come up there, too, but it could then be resolved in a separate commit.

Thanks in advance
Linux-Fan (@m7a)

Commit Message

Previously, it was impossible to output non-ASCII characters using cecho for multiple reasons:

  • Strings were processed as flattened lists, so encoding information was not preserved properly.
  • Data was transmitted as a string from the Erlang side to the C side, so encoding information did not arrive correctly.
  • The library linked against libncurses rather than libncursesw and, as a result, no “wide character” support was available.

This commit fixes this by basing the string handling on iolists and binaries. String values are now transferred to the C side as binaries rather than strings.
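To illustrate why the representation matters, here is a minimal C sketch. It is not taken from the cecho sources, and the helper names `truncate_to_bytes`, `utf8_encode` and `encode_utf8` are hypothetical. It assumes the mangling resembled what happens when a list of code points (which is what an Erlang string is) gets squeezed into one byte per element, and contrasts that with encoding the same code points as UTF-8 bytes, which is roughly what a binary carries.

```c
#include <stddef.h>
#include <stdint.h>

/* Lossy path (assumed failure mode, for illustration only):
 * keep just the low 8 bits of every code point. */
static size_t truncate_to_bytes(const uint32_t *cps, size_t n,
                                unsigned char *out)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (unsigned char)(cps[i] & 0xFF);
    return n;
}

/* Encode one code point as UTF-8.
 * Returns the number of bytes written (1..4), or 0 if the value
 * is a surrogate or lies beyond U+10FFFF. */
static size_t utf8_encode(uint32_t cp, unsigned char *out)
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0; /* surrogates are not valid scalar values */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp < 0x110000) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}

/* Preserving path: encode every code point as UTF-8.
 * The caller must provide room for up to 4 bytes per code point. */
static size_t encode_utf8(const uint32_t *cps, size_t n,
                          unsigned char *out)
{
    size_t len = 0;
    for (size_t i = 0; i < n; i++)
        len += utf8_encode(cps[i], out + len);
    return len;
}
```

With the code points of "Grüße€" as input, the lossy path yields six bytes that are no longer valid UTF-8 (the euro sign collapses to a single byte), while the preserving path yields the ten bytes a UTF-8 terminal expects.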

Additionally, at compile time, the C part is now linked against libncursesw instead of the previously used libncurses.

The API remains backward compatible: all of the string functions still accept strings.