Make string handling unicode-preserving #25

m7a · 2024-04-13T19:32:38Z

Hello,

I tried to get cecho to work with non-ASCII characters, but they always came out mangled. After some digging into the code, I came up with a solution for the string-processing functions that keeps UTF-8 data as-is, which results in correct display on my terminal (urxvt).

Feel free to include it upstream if it makes sense to you 😄

I think this patch is good, but it may still be incomplete: for example, I have not yet tested it with non-ASCII border-drawing characters. A similar issue might come up there too, but could that be resolved in a separate commit?

Thanks in advance
Linux-Fan (@m7a)

Commit Message

Previously, it was impossible to output non-ASCII characters using cecho for multiple reasons:

* Strings were processed as flattened lists, so encoding information was not preserved properly.
* Data was transmitted from the Erlang side to the C side as a string, so encoding information did not arrive correctly.
* The library was linked against `libncurses` rather than `libncursesw`, so no “wide character” support was available.

This commit fixes these problems by basing the string handling on iolists and binaries. String values are now transferred to the C side as binaries rather than strings.

Additionally, at compile time, the C part is now linked against `libncursesw` instead of the previously used `libncurses`.

The API remains backward compatible: strings can still be passed to all of the string functions.
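
To illustrate why the transfer format matters, here is a minimal C sketch of the UTF-8 byte sequence the C side needs to receive for a given code point. It is not code from cecho: the helper name `utf8_encode` is hypothetical and only serves the illustration. For example, `ä` is code point 228 (0xE4). Sent as the single byte 0xE4, it is not valid UTF-8 on its own, whereas a UTF-8 binary carries it as the two bytes 0xC3 0xA4.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of cecho): encode one Unicode code point
 * as UTF-8 into out[]. Returns the number of bytes written (1 to 4), or 0
 * if cp is not a valid Unicode scalar value.
 */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    assert(out != NULL);

    if (cp < 0x80) {
        /* ASCII: one byte, identical to the code point. */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        /* Two bytes: 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        /* Surrogates are not valid scalar values. */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;
        /* Three bytes: 1110xxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        /* Four bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0; /* Beyond the Unicode range. */
}
```

Only ASCII code points map to a single byte that equals the code point itself. Everything else needs a multi-byte sequence, which a binary preserves byte for byte on its way to the C side.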
