jed-users mailing list


RE: iconv module documentation or examples?


> -----Original Message-----
> From: G. Milde
> Sent: Wednesday 27 August 2008 16:53
> Subject: Re: iconv module documentation or examples?
> 
> On 26.08.08, Dino wrote:
> > While I'm very very busy both at work (my company has been merged with
> > another) and at home (I have a baby to play with, so no more free time to
> > hack Jed), 
> 
> Congratulations!

Thanks!
> 
> > I took a little bit of time to write a simple script using the iconv
> > module. This is an 'iconv' command line program implementation in
> > Slang, and tries to be compatible with the iconv program present in
> > almost all linux distributions.
> 
> Thanks, this will serve me as a starting point to use the iconv module
> for utf8helper.sl (http://jedmodes.sf.net/mode/utf8helper/). Before
> re-inventing the wheel, I will have to check your charset.sl script, too.
> 
> > Using it is very simple: run it as 
> 
> > slsh testiconv.sl -f encoding_in -t encoding_out [filein] [fileout]
> 
> According to the man page here, the interface is slightly different:
> 
>   iconv -f encoding [-t encoding] [inputfile]...
>   
> i.e. the "-t encoding" is optional, defaulting to the current locale's
> encoding. 
> 
> Just curious: how could I find out the (iconv-compatible) name of the
> current locale's encoding in a S-Lang script?

Well, this is why I didn't implement exactly the same interface as the
standard 'iconv' program :-) I don't know how to get the iconv-compatible
name of the character set used by the current locale (neither in SLang nor
in C), but if somebody finds a way to do this, it should be easy to update
the script.

> > Also, I would like to have the 'testiconv.sl' script added to the
> > 'slsh/scripts' directory in the Slang repository (maybe renamed simply
> > to 'iconv').
> 
> This would clash with iconv.sl the already existing module wrapper.
> How about slsh/examples/iconv-cli.sl?
> 
> > BTW, I still think that always working in UTF-8, and converting when
> > loading or saving files, is the way to go. 
> 
> This is what utf8helper.sl provides (for latin1 vs. UTF-8 encoded
> files) with high configurability.
> 
> > A detailed conversation about this can be found in message
> > http://ruptured-duck.com/jed-users/msg01836.html, and other messages
> > linked there. There is also an attachment containing my 'charset.sl'
> > script; this may be useful, as extracting the files from a Windows
> > installer may not be so easy.
> 
> I only found a zip file containing a patch with 
> 
> --- ..\john\jed/lib/_charset.sl	1970-01-01 01:00:00.000000000 +0100
> +++ jed/lib/_charset.sl	2005-08-01 14:13:35.795000000 +0200
> 
> so I would appreciate if you could send me a copy of your charset.sl.

Well, if you look a bit harder you will see that the patch also contains the
file 'charset.sl' and changes some lines in 'site.sl'.

But maybe I'm too used to reading patches, so here is a new zip file
containing:
- jed-prepare-input-charset-support.patch: patch to site.sl.
- charset.sl
- _charset.sl

Let me try to explain how all this works:

- the patch to site.sl simply adds a new empty function
('charset_guess_buffer_encoding') and calls this function before exiting
mode_hook(). So the function is called every time a file is opened in Jed,
after the correct mode has been selected.

This can perhaps be avoided by using the "_jed_set_mode_hooks" hook. I tried
it a bit (you can see some commented-out parts in _charset.sl), but I don't
remember whether it worked.

As it stands, the function 'charset_guess_buffer_encoding' does nothing, so
I need to override it to do something useful. As many things can fail, I
must be careful, so:

- somewhere in my .jedrc (jed.rc on Windows), I put these lines:

#ifexists set_import_module_path
set_import_module_path("C:\\Program Files\\JED\\slsh\\modules");
load_file("charset.sl");
#endif

First I test that Jed has the function I'm about to use (not every version
of Jed is compiled with module support). Then I add the slsh modules path
to Jed (I think and hope that recent versions of Jed do this automatically,
so maybe this is not needed). Then I load the 'charset.sl' script.
'load_file()' is this simple function:

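% Load a file from the Jed library path; do nothing if it is not found.
public define load_file(file)
{
	% expand_jedlib_file() searches the Jed library path for the file
	variable fullfile = expand_jedlib_file(file);
	
	% Evaluate the file only if a usable path came back
	if (andelse { fullfile != NULL } { fullfile != "" })
		() = evalfile(fullfile);
}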

This is similar to using '() = evalfile("charset.sl");' but does not
generate an error if the file does not exist.

Now let's look at the content of charset.sl. As I don't know yet at that
point whether importing will work, this file creates stubs for every
function.

If importing "iconv" works, it loads the '_charset.sl' script, where the
functions are finally implemented.
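
The stub-then-import layout described above could look roughly like the
following. This is only a minimal sketch, not the actual content of
charset.sl: it assumes S-Lang 2 style try/catch and that a failed import
raises ImportError, and it shows a single stub where the real file has one
per function.

```slang
% charset.sl (sketch): define harmless stubs first, then try to
% replace them with the real implementation.

% Stub: called from mode_hook() by the patched site.sl; does nothing.
public define charset_guess_buffer_encoding ()
{
}

try
{
   % Assumed to raise ImportError if the iconv module cannot be loaded.
   import ("iconv");

   % Only reached when the import succeeded: load the real functions,
   % which redefine the stubs above.
   () = evalfile ("_charset.sl");
}
catch ImportError:
{
   % No iconv available: keep the stubs, so files are loaded unconverted.
}
```

With this layout a Jed without a working iconv module still starts normally;
it just never converts anything.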

The magic is done using the Jed hooks. To override file reading, the
"_jed_insert_file_hooks" and "_jed_read_file_hooks" hooks are set. To write
in the correct format, "_jed_write_region_hooks" and
"_jed_append_region_hooks" are set.
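
Registering the four hooks could be sketched as below. The handler names are
hypothetical placeholders, and both the one-argument signature and the
"return 0 means not handled" convention are assumptions here; the hooks
documentation shipped with Jed is the authority on what each hook receives
and must return.

```slang
% Hypothetical placeholder handlers.  Each is assumed to receive the file
% name and to return 0 when it did not handle the request, so that Jed
% falls back to its default behaviour.
static define charset_insert_file (file)   { return 0; }
static define charset_read_file (file)     { return 0; }
static define charset_write_region (file)  { return 0; }
static define charset_append_region (file) { return 0; }

% Reading: convert from the on-disk encoding to UTF-8.
add_to_hook ("_jed_insert_file_hooks", &charset_insert_file);
add_to_hook ("_jed_read_file_hooks", &charset_read_file);

% Writing: convert from UTF-8 back to the on-disk encoding.
add_to_hook ("_jed_write_region_hooks", &charset_write_region);
add_to_hook ("_jed_append_region_hooks", &charset_append_region);
```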

The only change to the user interface is a new menu entry to change the
on-disk encoding of a file. For example, I can read a file in UTF-16, which
will be converted to UTF-8 on loading; then, by opening the 'Buffers' menu,
selecting 'Change Charset', typing 'ISO-8859-1' and saving the file, it
will be converted to latin1.
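
The conversion step itself, stripped of all the hook handling, comes down to
something like this. It is a sketch only: 'charset_convert' is a
hypothetical helper name, not a function from _charset.sl, and the argument
order of iconv_open() is assumed to mirror the C iconv_open(3) call
(target encoding first).

```slang
import ("iconv");

% Hypothetical helper: convert STR from encoding FROM to encoding TO.
define charset_convert (str, from, to)
{
   % Assumed argument order, as in C: iconv_open (tocode, fromcode).
   variable cd = iconv_open (to, from);

   % Convert the whole string in one call; the result is a binary string.
   variable out = iconv (cd, str);

   iconv_close (cd);
   return out;
}
```

Saving a UTF-8 buffer as latin1 would then amount to a call such as
'charset_convert (text, "UTF-8", "ISO-8859-1")' on the text being written.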

Right now the default is to load as UTF-8. Far better would be to load using
the character set from the user's locale settings, so an answer to your
question above would be useful for charset.sl too :-)

I hope the implementation details are easy enough for an experienced SLang
coder; if not, ask me for more info.

Dino





Attachment: charset.zip
Description: Zip compressed data

