Linux Users' Group of Davis (LUGOD)
 
Page last updated:
2003 Jun 09 16:18

The following is an archive of a post made to our 'vox-tech mailing list' by one of its subscribers.

Re: [vox-tech] vim and utf-8 support (newbie alert)



On Mon, Jun 09, 2003 at 03:15:58PM -0700, Peter Jay Salzman wrote:
> note: in what follows, i'm a bit schizophrenic about "iso 10646" and
> "unicode".  since the tables and encodings are compatible in the most
> recent versions of the standard, i'm using them interchangeably.

Yeah, I pretty much always use "Unicode" to mean both of those. As far
as I'm concerned, there's not enough difference between them to
warrant careful distinction (in informal conversation, anyway), and
"Unicode" is so much easier to remember/pronounce than "ISO 10646"...

> 
> On Mon 09 Jun 03,  2:35 PM, Micah J. Cowan <micah@cowan.name> said:
> > On Mon, Jun 09, 2003 at 04:06:01PM -0500, Jay Strauss wrote:
> > 
> > OOC, Pete, are you planning on doing Hebrew homework or something like
> > that with vim?
>  
> i have some notes on vocabulary and grammar in dead tree format that i'd
> like to convert into magnetic format.   ;-)
> 
> >   2. I don't believe you can get the Hebrew vowels; but I haven't
> >      tried.
>  
> i only learned what ISO 10646 and utf is a few hours ago, but i thought
> that was the whole point of the ISO standard and unicode.

I was speaking of Emacs specifically... (check OM for context).

> i read that some of the characters in the 31 bit characterset were
> designated "combination characters" which provide accents for
> characters.

Yeah; what's great is that there are the "combining characters" and,
for reasons of compatibility with existing encoding standards, there
are also characters which already have the vowels combined in
(IIRC). I have a copy of Unicode 3.0 in dead-tree format, but not with
me. (I believe the latest version is 4.0, released very recently.)
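To make the two spellings concrete, here is a minimal Python sketch using
the standard unicodedata module. The choice of alef with patah as the
sample letter is mine, not something from the thread, and the variable
names are purely illustrative.

```python
import unicodedata

# Assumption: alef + patah is just a convenient sample pair.
base_plus_point = "\u05d0\u05b7"  # HEBREW LETTER ALEF, then combining POINT PATAH
precombined = "\ufb2e"            # single code point: ALEF WITH PATAH

# Print each code point with its name and combining class
# (a nonzero class marks a combining character).
for ch in base_plus_point + precombined:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)} "
          f"(combining class {unicodedata.combining(ch)})")

# Canonical decomposition (NFD) turns the precombined form back into
# base letter + combining point.
decomposed = unicodedata.normalize("NFD", precombined)
print(decomposed == base_plus_point)
```

Whether an editor or font renders the two forms identically is a separate
question from how they are encoded.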

> a) these are included in unicode for backwards compatibility
> b) you can always use two characters (combination characters) to
> represent pre-composed characters.

True. However, some formats will insist on one or the other, wherever
possible. For example, XML 1.1 asks that characters be precombined to
the extent possible (strictly speaking it's a "should", not a "must").
The main reason is that this happens to be the form most documents are
already in (at least for Latin-script languages, which were probably
converted over from ISO 8859). They also wanted to settle on a single
canonical representation, so that a byte-by-byte comparison still
works without having to worry about whether there are two versions of
"resume" (sorry, the station I'm at doesn't have mule, so pretend
there are accents). (Technically, in XML 1.0, it is quite possible to
have two completely separate names that are equivalent when normalized
but differ in byte representation; i.e., one might use combining
characters, the other precombined.)
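The byte-comparison problem is easy to reproduce. The following is only a
sketch with Python's standard unicodedata module, not anything an XML
parser actually runs; the variable names and the helper same_bytes are
made up for the illustration. NFC is the Unicode normalization form that
precombines wherever possible.

```python
import unicodedata

# Two spellings of the same word, accents included this time.
precombined = "r\u00e9sum\u00e9"      # e-acute as a single code point
combining = "re\u0301sume\u0301"      # plain e + COMBINING ACUTE ACCENT

def same_bytes(a: str, b: str) -> bool:
    """Hypothetical helper: naive byte-by-byte comparison of the UTF-8 forms."""
    return a.encode("utf-8") == b.encode("utf-8")

# Without normalization, the two spellings are different byte strings.
print(same_bytes(precombined, combining))

# After normalizing both to NFC, the byte comparison succeeds.
nfc_a = unicodedata.normalize("NFC", precombined)
nfc_b = unicodedata.normalize("NFC", combining)
print(same_bytes(nfc_a, nfc_b))
```

If every document is already in the precombined form, the normalization
step can be skipped and the plain byte comparison is enough, which is the
point of settling on one representation.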

> > Doesn't help you much, though, does it? ;)
>  
> heh.  well, before all this, i had zip, zero, nada knowledge of unicode,
> iso 10646, encodings, character tables, utf-2, utf-4, utf-8 and all
> sorts of non-english non-sense.

Unicode rocks, doesn't it? :)

-Micah
_______________________________________________
vox-tech mailing list
vox-tech@lists.lugod.org
http://lists.lugod.org/mailman/listinfo/vox-tech


