jed-users mailing list


UTF-8 mode and WJed

Subject: UTF-8 mode and WJed
From: SANGOI DINO <SANGOID@xxxxxxxxxxxxxxxxx>
Date: Wed, 27 Apr 2005 11:11:50 +0200

Hi, 
 
While looking at wterm.c, I realized that enabling *simple* Unicode
drawing (limited to non-combining, single-width characters with values below
0xFFFD, skipping surrogates) should be easy: simply take all UTF-32
characters below 0xFFFD, build a WCHAR (2 bytes long) array from them, and
call TextOutW() instead of TextOut().
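A minimal sketch of that filtering step, in portable C so it compiles without
the Windows headers. The names wchar16, is_simple_bmp() and utf32_to_wchar16()
are hypothetical and do not come from wterm.c, and substituting U+FFFD for
characters that cannot be drawn is an assumption. The sketch does not test for
combining or double-width characters, which would need a width lookup.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the Win32 WCHAR type (assumption: 16-bit code units). */
typedef uint16_t wchar16;

#define REPLACEMENT_CHAR 0xFFFDu

/* Returns 1 if cp fits in one 16-bit unit under the restrictions above:
 * below 0xFFFD and not a surrogate (0xD800..0xDFFF). */
static int is_simple_bmp(uint32_t cp)
{
    if (cp >= 0xFFFDu)
        return 0;
    if (cp >= 0xD800u && cp <= 0xDFFFu)
        return 0;
    return 1;
}

/* Converts n UTF-32 code points into 16-bit units. out must have room for
 * n units. Unsupported code points become U+FFFD (an assumption, the
 * original post does not say what to draw for them). Returns n. */
static size_t utf32_to_wchar16(const uint32_t *in, size_t n, wchar16 *out)
{
    size_t i;

    for (i = 0; i < n; i++)
        out[i] = is_simple_bmp(in[i]) ? (wchar16)in[i]
                                      : (wchar16)REPLACEMENT_CHAR;
    return n;
}
```

On Windows the resulting array could then be handed to TextOutW() together
with its length.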

My problem is how to update vterm: I have a SLsmg_Char_Type or a WCHAR
array, and should call vterm_write_nchars() or the new vterm_write_nbytes(),
but in both cases I should convert the string to another format...

Maybe we can add an easy vterm_write_slsmg_chars() (after all, it's the
format used by vterm...). But I am starting to think that keeping slsmg,
vterm and video in sync is too hard, for too little gain. Vterm is needed
only to be able to refresh the window on expose events. But we already have
all the information we need in SLsmg, so why not take it from there?

I know that vterm works the same way a hardware terminal (or a terminal
emulator) works, but the SL_Screen array is not going away, and we already
have exported functions to peek at characters on SL_Screen: SLsmg_char_at()
and SLsmg_read_raw().

So this is a proposed patch:
- remove every reference to vterm.
- implement two functions (msw_write_smgchars() and msw_write_smgchar()).
These do everything needed to draw the SLsmg character(s) correctly on
screen. They also handle the differences between slang1 and slang2, and
between UTF-8 enabled and disabled.
- every other function handles only SLsmg chars, and calls only
msw_write_smgchars() or msw_write_smgchar() to draw text. (mainly
cover_exposed_area() and msw_smart_puts(), but also hide_cursor() and
show_cursor()).
- cover_exposed_area() reads data directly from SLsmg using SLsmg_read_raw().
- create a function (send_key_sequence()), used to send sequences generated
by wterm (e.g. for arrow and function keys).
- change _putkey() (now called only for "pure" text) to map UTF-16
characters to UTF-8 when in utf-8 mode.
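The UTF-16 to UTF-8 mapping mentioned in the last item can be sketched as
follows. The helper name utf16_unit_to_utf8() is hypothetical and is not taken
from the patch. It encodes a single 16-bit unit and rejects surrogates, which
matches the "simple" restriction above but is an assumption about how
_putkey() treats them.

```c
#include <stddef.h>
#include <stdint.h>

/* Encodes one UTF-16 code unit as UTF-8 into out (room for 3 bytes).
 * Returns the number of bytes written (1 to 3), or 0 if u is a surrogate
 * half, which cannot be encoded on its own. */
static size_t utf16_unit_to_utf8(uint16_t u, unsigned char *out)
{
    if (u >= 0xD800u && u <= 0xDFFFu)
        return 0;

    if (u < 0x80u) {
        /* ASCII: one byte, unchanged. */
        out[0] = (unsigned char)u;
        return 1;
    }
    if (u < 0x800u) {
        /* Two bytes: 110xxxxx 10xxxxxx. */
        out[0] = (unsigned char)(0xC0u | (u >> 6));
        out[1] = (unsigned char)(0x80u | (u & 0x3Fu));
        return 2;
    }
    /* Three bytes: 1110xxxx 10xxxxxx 10xxxxxx. */
    out[0] = (unsigned char)(0xE0u | (u >> 12));
    out[1] = (unsigned char)(0x80u | ((u >> 6) & 0x3Fu));
    out[2] = (unsigned char)(0x80u | (u & 0x3Fu));
    return 3;
}
```

A key handler would call this once per character it receives and push the
resulting bytes into the input buffer, so the editor sees the same byte
stream it would get from a UTF-8 terminal.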

The only function I don't know how to implement is msw_write_string(). This
is called only for the tt_send() slang function, and is mostly intended as a
hack to bypass SLsmg and send data directly to the terminal. I can interpret
its argument as UTF-8 encoded when UTF-8 is enabled, convert it to WCHARs and
send it to _tt_writeW.
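A rough sketch of that conversion, under the same "simple" restriction as
above. The function name utf8_to_wchar16() is hypothetical, and mapping
malformed, overlong or out-of-range input to U+FFFD is an assumption, not
something the patch specifies.

```c
#include <stddef.h>
#include <stdint.h>

/* Decodes len bytes of UTF-8 into 16-bit units. out must have room for
 * len units, since each input byte yields at most one unit. Malformed
 * sequences, overlong forms, surrogates and anything at or above 0xFFFD
 * become U+FFFD. Returns the number of units written. */
static size_t utf8_to_wchar16(const unsigned char *s, size_t len,
                              uint16_t *out)
{
    size_t i = 0, n = 0;

    while (i < len) {
        unsigned char b = s[i++];
        uint32_t cp, min;
        size_t need, k;

        if (b < 0x80u) {
            cp = b; need = 0; min = 0;
        } else if ((b & 0xE0u) == 0xC0u) {
            cp = b & 0x1Fu; need = 1; min = 0x80u;
        } else if ((b & 0xF0u) == 0xE0u) {
            cp = b & 0x0Fu; need = 2; min = 0x800u;
        } else if ((b & 0xF8u) == 0xF0u) {
            cp = b & 0x07u; need = 3; min = 0x10000u;
        } else {
            /* Stray continuation byte or invalid lead byte. */
            out[n++] = 0xFFFDu;
            continue;
        }

        /* Consume up to `need` continuation bytes (10xxxxxx). */
        for (k = 0; k < need && i < len && (s[i] & 0xC0u) == 0x80u; k++, i++)
            cp = (cp << 6) | (s[i] & 0x3Fu);

        if (k < need || cp < min || cp >= 0xFFFDu ||
            (cp >= 0xD800u && cp <= 0xDFFFu))
            cp = 0xFFFDu;

        out[n++] = (uint16_t)cp;
    }
    return n;
}
```

The returned count is the length to pass along with the buffer to whatever
wide-character output routine is used.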

Anyway, attached is a heavily patched wterm.c. Note that this surely breaks
compilation on Windows 3.1.

To see it at work, go to http://www.paneura.com/~dino/wjed-unicode.html. The
screenshots were taken on Windows 95A (the oldest version I have access to).

I also tried hacking xterm.c in the same way (deleting all references to
vterm, and reading data with SLsmg_read_raw()), and it works, but the patch
was a hack, so it's not included. But it can be done.

Comments?

								Dino

Attachment: wjed-utf8-support.diff.gz
Description: Binary data
