jed-users mailing list


Re: iconv module documentation or examples?


John E. Davis spake unto us the following wisdom:
> [4:16pm] /tmp>gcc foo.c
> [4:17pm] /tmp>./a.out
> UTF-8
> [4:17pm] /tmp>printenv LANG
> en_US.UTF-8
> 
> For this reason, I think that something like this might work:
> 
>    define get_encoding ()
>    {
>       if (_slang_utf8_ok) return "UTF-8";
> 
>       variable lang = getenv ("LANG");
>       if (lang == NULL)
>         return NULL;
>       variable fields = strchop (lang, '.', 0);
>       if (2 == length (fields))
>         return fields[1];
>       return NULL;
>    }

Unfortunately, this won't be reliable, for several reasons.  One,
locale names are up to the system to at least some extent, so a system
can choose to stuff other information in that space (look at the list
of locales on a non-Linux, non-386BSD-derived system; they're often
weird and wonderful).  Two, even on systems with regular locale name
syntax like the above, the character set is not always present.  The
"C" locale, for example, is required to exist, has no "." part in its
name, and its associated character set will always (if I'm not
mistaken) be whatever the current system calls ASCII
("ANSI_X3.4-1968" on recent glibc, "646" on Solaris, etc.).  Finally,
even when that string does represent the character set in some way, it
may not be in the canonical form required by the system iconv.  (GNU
iconv is pretty liberal in what it accepts, but many other systems are
much more strict.  UTF-8, UTF8, and utf8 may not all be valid encoding
names on all systems, for example.)
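
To make the difference concrete, here is a rough sketch in C that puts
the two approaches side by side.  The function names
codeset_from_name and codeset_from_libc are made up for illustration;
the first one approximates the string-splitting idea quoted above
(with an extra, assumed, cut at any "@modifier" suffix), and the
second asks the C library via the POSIX calls setlocale() and
nl_langinfo(CODESET).  I'm assuming a POSIX system with <langinfo.h>,
and I make no promise that the name it returns is one the local iconv
accepts.

```c
#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>

/* Naive approach: take whatever follows the first '.' in a locale
 * name.  Returns 1 and fills `out` on success; returns 0 if the name
 * has no '.' part or the result does not fit in `out`. */
int codeset_from_name(const char *name, char *out, size_t outlen)
{
    assert(out != NULL && outlen > 0);
    out[0] = '\0';
    if (name == NULL)
        return 0;

    const char *dot = strchr(name, '.');
    if (dot == NULL)
        return 0;               /* e.g. "C" or "POSIX": nothing to parse */

    /* Assumption: drop an "@modifier" suffix, as in
     * "de_DE.ISO-8859-15@euro". */
    size_t len = strcspn(dot + 1, "@");
    if (len == 0 || len >= outlen)
        return 0;

    memcpy(out, dot + 1, len);
    out[len] = '\0';
    return 1;
}

/* Ask the C library instead.  Side effect: switches the process's
 * LC_CTYPE category to whatever the environment requests.  If that
 * locale is unavailable, setlocale() fails, the previous locale stays
 * in effect, and its codeset is what gets reported. */
const char *codeset_from_libc(void)
{
    (void) setlocale(LC_CTYPE, "");
    return nl_langinfo(CODESET);
}
```

With LANG=C the first function has nothing to return, while the
second should report the system's own name for ASCII.  Whether that
name can then be handed to iconv unchanged is still system-dependent.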

Ethan

-- 
The laws that forbid the carrying of arms are laws [that have no remedy
for evils].  They disarm only those who are neither inclined nor
determined to commit crimes.
		-- Cesare Beccaria, "On Crimes and Punishments", 1764
