jed-users mailing list


Re: slang: UTF-8 and strlen


On Sun, May 11, 2003 at 01:27:37PM -0400, John E. Davis wrote:

>    At the same time, I am adding support for UTF-8 to jed, which will
> serve to test the library.  (See http://www.jedsoft.org/images/jedutf8.png 
> for an image) 

Nice! I really can't wait to have it. Right now I have to do strange
contortions between jed and yudit when I want to answer in the linguistics
newsgroup (using IPA, etc.).

A nice thing would be to always have the encoding you are using in view:
a clear "UTF-8" or "ISO-...-15" somewhere in the main window.
And a function to switch encodings, too.


> In doing so, I came across the following
> "issue".  What should the interpreter's strlen function return?



> Currently, it knows nothing about the encoding and returns the number
> of bytes making up the string.  However, it could be modified to
> return one of the following:
> 
>    1.  The number of bytes in the string.
>    2.  The number of characters in the string, including combining
>        characters.
>    3.  The number of characters in the string, not counting the
>        combining characters.
> 

Well, the problem is: if strlen is mainly used to count "how much visual
space" the string occupies on screen, option #3 is the correct one; not only
that, but you should also take into account wide characters that occupy 2
columns. But I do not know how this would mix with searching, etc.
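
To make the three options quoted above concrete, here is a small sketch in
Python (not S-Lang; the function names are hypothetical and only for
illustration). It assumes "combining character" means any code point in a
Unicode Mark category, which is one reasonable reading but not the only one.

```python
import unicodedata

def count_bytes(s: str) -> int:
    # Option 1: number of bytes in the UTF-8 encoding of the string.
    return len(s.encode("utf-8"))

def count_chars(s: str) -> int:
    # Option 2: number of code points, combining characters included.
    return len(s)

def count_base_chars(s: str) -> int:
    # Option 3: number of code points, skipping combining marks.
    # Assumption: a combining character is any code point whose general
    # category starts with "M" (Mn, Mc, Me).
    return sum(1 for ch in s if not unicodedata.category(ch).startswith("M"))

# 'e' + COMBINING ACUTE ACCENT, then 't', then precomposed 'é' (U+00E9)
sample = "e\u0301t\u00e9"
print(count_bytes(sample), count_chars(sample), count_base_chars(sample))
```

For this sample the three options give three different answers, which is
exactly why a single strlen is ambiguous once the string is UTF-8.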

I would like to suggest borrowing "wcswidth", "wcslen" (man 3 wcswidth)
and company, that is, the POSIX wide-char functions for string length and
visual width. Or adding an optional "encoding" parameter to strlen.
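
As a rough idea of what a wcswidth-like "visual width" function computes,
here is a sketch in Python (the name display_width is hypothetical). It is
only an approximation under stated assumptions: non-spacing and enclosing
marks count as 0 columns, East Asian Wide and Fullwidth characters count as
2, and everything else counts as 1. Unlike the real wcswidth, it does not
consult the locale and does not return -1 for non-printable characters.

```python
import unicodedata

def display_width(s: str) -> int:
    """Approximate the number of terminal columns needed to show s."""
    width = 0
    for ch in s:
        if unicodedata.category(ch) in ("Mn", "Me"):
            # Assumption: non-spacing/enclosing marks take no column.
            continue
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            # Wide and Fullwidth characters take two columns.
            width += 2
        else:
            width += 1
    return width

print(display_width("abc"))       # plain ASCII
print(display_width("e\u0301"))   # 'e' plus a combining accent
print(display_width("\u65e5\u672c\u8a9e"))  # three wide CJK characters
```

The point of the sketch is that width is a separate question from both the
byte count and the character count, so it probably deserves its own function
rather than a changed strlen.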

The really neat thing would be to deprecate strlen and force the
user/programmer to use the correct function, but I understand this is
not viable in practice.

Thanks,
           Romano 


-- 
Romano Giannetti             -  Univ. Pontificia Comillas (Madrid, Spain)
Electronic Engineer - phone +34 915 422 800 ext 2416  fax +34 915 411 132


