Input methods: Difference between revisions

Revision as of 22:58, 31 July 2006

To input text in a particular language and writing system, we need a Unicode font to display it, a rendering engine that knows how to lay it out, and a keyboard layout or Input Method Editor (IME) that gives access to all of the needed characters. Most languages written in alphabetic or syllabic scripts can be typed on fairly simple keyboard layouts that produce one Unicode character per key combination, using the ordinary typing keys together with Meta (usually Alt) and Compose keys (further description needed). Any accented letter that Unicode encodes in precomposed form falls within this capability. This includes every letter that occurs in a widely used pre-Unicode character set, such as Latin-1 (ISO-8859-1), which supports French, German, Spanish, Italian, the Scandinavian languages, and other languages that use only the accented letters in Latin-1.
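The one-character-per-key-combination idea can be sketched in a few lines of Python. This is only an illustration: the table `COMPOSE_TABLE` and the helpers `compose` and `in_latin1` are hypothetical names invented here, and the key sequences are chosen for the example rather than taken from any real keyboard layout.

```python
import unicodedata

# Hypothetical Compose-style table (an assumption for illustration, not a
# real layout): each two-key sequence yields exactly one precomposed
# Unicode character.
COMPOSE_TABLE = {
    ("'", "e"): "\u00e9",  # e with acute
    ("`", "a"): "\u00e0",  # a with grave
    ('"', "u"): "\u00fc",  # u with diaeresis
    ("~", "n"): "\u00f1",  # n with tilde
    ("/", "o"): "\u00f8",  # o with stroke
}


def compose(first, second):
    """Return the single character for a two-key sequence, or None if unmapped."""
    return COMPOSE_TABLE.get((first, second))


def in_latin1(ch):
    """Return True if the character can be encoded in ISO-8859-1 (Latin-1)."""
    try:
        ch.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


# Print code points and names only, so the output stays ASCII.
for keys, ch in COMPOSE_TABLE.items():
    print(keys, f"U+{ord(ch):04X}", unicodedata.name(ch), in_latin1(ch))
```

Every character in this toy table is a single code point that also exists in Latin-1, which is the situation the paragraph above describes.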

Letters that carry multiple diacritics and have no precomposed form can be entered as a sequence of keystrokes on simple keyboards of this type, while more elaborate input methods can enter more than one Unicode character code into the input buffer for each key combination. Yoruba is an example of a language that poses this choice, because it has vowel letters with both an acute accent above and a dot below, and these combinations are not available precomposed in Unicode.
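The difference between a precomposed letter and one that needs several code points can be checked with Python's standard `unicodedata` module. The sketch below is illustrative; the helper name `nfc_code_points` is made up for this example. It normalizes each sequence to NFC, the composed normalization form, and lists the resulting code points.

```python
import unicodedata

DOT_BELOW = "\u0323"  # COMBINING DOT BELOW
ACUTE = "\u0301"      # COMBINING ACUTE ACCENT


def nfc_code_points(text):
    """Normalize to NFC and return the resulting code points as U+XXXX strings."""
    return [f"U+{ord(c):04X}" for c in unicodedata.normalize("NFC", text)]


# "e" plus an acute accent composes to a single precomposed character.
e_acute = nfc_code_points("e" + ACUTE)

# "e" plus dot below plus acute has no single precomposed form, so NFC
# leaves a base character followed by a combining mark.
e_dot_acute = nfc_code_points("e" + DOT_BELOW + ACUTE)

print(e_acute)
print(e_dot_acute)
```

An input method for Yoruba therefore has to deliver at least two code points for such a letter, either through successive keystrokes or through a single key combination that inserts the whole sequence.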

The most elaborate IMEs are those for CJKV characters, used for Chinese, Japanese, Korean, and the historical Vietnamese Chữ Nôm writing. Each of these languages requires several thousand characters at a minimum, and many users want much more extensive CJKV sets, including a number of Hong Kong characters and other recent additions, or the tens of thousands of historical characters important for scholarship.

Several hundred methods for entering CJKV characters have been invented over several decades. Among the most important (because of efficiency of use, ease of learning, or in a few cases both) are language-specific phonetic conversion systems for Chinese, Japanese, or Korean, and shape-based systems that are in principle independent of language, but so far have in practice been specific to particular countries.
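As a rough illustration of phonetic conversion, the following Python sketch buffers typed letters, looks up candidate characters for the buffered syllable, and commits the one the user picks. Everything here is an assumption made for the example: the class `ToyPhoneticIME`, its methods, and the two-entry table `CANDIDATES` are hypothetical, and the sketch does not describe how any of the IMEs listed on this page work internally.

```python
# Toy candidate table (an assumption for illustration): a Pinyin syllable
# without tone marks maps to a few characters with that reading.
CANDIDATES = {
    "ma": ["妈", "麻", "马", "骂"],
    "zhong": ["中"],
}


class ToyPhoneticIME:
    """Minimal sketch of a phonetic IME: buffer keys, list candidates, commit one."""

    def __init__(self, table):
        self.table = table
        self.buffer = ""     # letters typed but not yet converted
        self.committed = ""  # text already delivered to the application

    def type_key(self, key):
        """Append one typed letter to the pre-edit buffer."""
        self.buffer += key

    def candidates(self):
        """Return the candidate characters for the current buffer, if any."""
        return self.table.get(self.buffer, [])

    def select(self, index):
        """Commit the chosen candidate and clear the buffer."""
        choices = self.candidates()
        if not choices:
            raise ValueError(f"no candidates for {self.buffer!r}")
        self.committed += choices[index]
        self.buffer = ""
        return self.committed


ime = ToyPhoneticIME(CANDIDATES)
for key in "ma":
    ime.type_key(key)
choices = ime.candidates()  # several characters share this reading
text = ime.select(2)        # the user picks the third candidate
```

The selection step is what separates this kind of input from a simple keyboard layout: one typed syllable corresponds to many characters, so the user has to choose among candidates.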

Tools for keyboard layouts, to come.

Tools for IMEs, to come.

List of Input methods by language, with links, to come.

Traditional Chinese

- Zhuyin conversion

- Romanization conversion

- Cangjie

- Four Corners

etc.

Simplified Chinese

- Pinyin

- Wubi

etc.

Japanese

- Romaji conversion

- Kana conversion

etc.

Korean

- Romaja conversion

- Hangeul conversion

etc.

Descriptions of IMEs, to come.

xcin

IIIMF

Chinput

etc.

Tests of IMEs, need help from expert users.

External links

To come.

Revision as of 22:56, 31 July 2006 by 65.205.251.51 (New article); revision as of 22:58, 31 July 2006 by 65.205.251.51 (Fix links)