7. Mongolia - Computing Issues

7.1 Is there some kind of "Mongolian ASCII" or commonly acknowledged encoding standard for Mongolian language data processing?

Unlike American ASCII, the Chinese GuoBiao code or the Japanese JIS code, Mongolian writing does not yet have a national code system, in either its Classical or its Cyrillic form. As a consequence, no international standards organization (like ISO) has been able to adopt a national standard and turn it into an international one.

The problems we find in this field are of a complex nature and frequently have strong mutual dependencies.

Let's look at Cyrillic encoding first. It is not far-fetched to suggest using an existing Cyrillic encoding scheme for Mongolian, but not even such a simple idea is without its traps. There is more than one Cyrillic encoding, and some encodings are incomplete: they do not include the Cyrillic yo, or ë. In addition, these tables (or code pages) usually have no space to accommodate the additional Mongolian vowel symbols ö and ü, which must then be placed somewhere outside the natural order of the alphabet. Several modified code pages of this type exist; available implementations are mentioned below.
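As a rough illustration of the point above, the following Python sketch checks a few legacy Cyrillic code pages that ship as codecs with Python's standard library. The selection of code pages and the helper name `can_encode` are assumptions made for this example only; the modified Mongolian code pages mentioned in this FAQ are not among them. In these particular tables ë is present, while ө and ү (the Cyrillic letters transcribed here as ö and ü) are not.

```python
# A few widely used 8-bit Cyrillic code pages (an illustrative selection).
CODE_PAGES = ["koi8_r", "iso8859_5", "cp866", "cp1251"]


def can_encode(char, codec):
    """Return True if the code page has a slot for this character."""
    try:
        char.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    # yo, plus the two additional Mongolian vowel letters
    for ch in "ёөү":
        supported = [c for c in CODE_PAGES if can_encode(ch, c)]
        print(ch, "is available in:", supported)
```

Any code page that fails the test for ө or ү would need to be modified, or replaced by a larger character set, before it could carry Mongolian Cyrillic text.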

With Classical writing, the situation is even more complicated. Historically, there has not been one commonly acknowledged Classical Mongolian alphabet (or cagaan tolgoï); differences can be observed in the number of letters, the sorting order and the treatment of ambiguous letters which have more than one reading for a given shape, like t/d. The situation is further complicated by the fact that a given letter may assume numerous different shapes depending on its position within the word. The designer of an encoding scheme has to decide whether to include only canonical letters (the ones under which one would look up a word in a dictionary) or all shape variants as well.

The next problem arises when thinking of computer technology. The eight-bit (one-byte) code space of commonly used systems cannot hold more than 256 characters, of which 128 have been defined already. If both Cyrillic and Classical writing are to be accommodated in one common code space, this is only possible at the cost of sharing common letter shapes between Latin and Cyrillic characters. There is no other choice if one wants to avoid switching code pages within one document.
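The arithmetic behind this constraint can be sketched in a few lines of Python. The 35-letter Mongolian Cyrillic alphabet is a known quantity; the list of Cyrillic letters assumed to share a shape with a Latin letter is an illustrative guess and not part of any proposal discussed here, and the function name `slots_left_for_classical` is likewise hypothetical.

```python
CODE_SPACE = 2 ** 8        # one byte: 256 code points
ALREADY_DEFINED = 128      # lower half, occupied by ASCII

# The 35 letters of the Mongolian Cyrillic alphabet (upper case).
MONGOLIAN_CYRILLIC = "АБВГДЕЁЖЗИЙКЛМНОӨПРСТУҮФХЦЧШЩЪЫЬЭЮЯ"

# ASSUMPTION: Cyrillic letters whose shape could be borrowed from a
# Latin letter in the lower half. Illustrative only.
SHARED_UPPER = "АВЕКМНОРСТХ"
SHARED_LOWER = "аеорсху"


def slots_left_for_classical(share_lookalikes):
    """Free upper-half slots after placing upper- and lower-case Cyrillic."""
    free = CODE_SPACE - ALREADY_DEFINED
    cyrillic_needed = 2 * len(MONGOLIAN_CYRILLIC)
    if share_lookalikes:
        cyrillic_needed -= len(SHARED_UPPER) + len(SHARED_LOWER)
    return free - cyrillic_needed


if __name__ == "__main__":
    print("without sharing:", slots_left_for_classical(False))
    print("with sharing:   ", slots_left_for_classical(True))
```

Whatever remains must hold the Classical letters, and far fewer of them fit if every positional shape variant is given its own code point.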

Another problem intimately related to writing is the field of transcriptions and transliterations. The design of rules for transliterating Classical or Cyrillic Mongolian has many consequences for data exchange, automatic text processing, the building of library catalogues, etc. Some popular systems (e.g. the so-called Petersburg transliteration) use characters which are not readily available on today's computers, and the systems that work with reduced character sets are sometimes not popular.

Only in recent years (more or less starting with the UNESCO conference on the Computerization of Mongolian script in Ulaanbaatar in August 1992) has there been a genuine international effort to solve these problems and to come up with an encoding scheme that will be accepted world-wide. The Mongolian National Institute for Standardization and Metrology (MNISM), the Chinese National Standard Bureau, standards bodies of other countries, ISO and UNICODE have all held regular meetings in recent years in order to define a standard.

So far, no final agreement exists, and there is no software package which could serve as a demonstrator for this future standard. All available software either defines its own code page or relies on ASCII representations of Mongolian which are then converted into Mongolian writing.
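To make the second approach concrete, here is a minimal Python sketch of converting an ASCII representation into Cyrillic Mongolian. The mapping table is entirely hypothetical and deliberately incomplete; it does not reproduce the scheme of any program listed in this FAQ, and the function name `to_cyrillic` is an assumption of this example.

```python
# HYPOTHETICAL ASCII-to-Cyrillic table, for illustration only.
ASCII_TO_CYRILLIC = {
    "oe": "ө", "ue": "ү", "sh": "ш", "ch": "ч",
    "a": "а", "b": "б", "g": "г", "d": "д", "e": "э",
    "h": "х", "l": "л", "m": "м", "n": "н", "o": "о",
    "r": "р", "s": "с", "t": "т", "u": "у",
}

# Try longer keys first so that "oe" wins over "o" followed by "e".
_KEYS = sorted(ASCII_TO_CYRILLIC, key=len, reverse=True)


def to_cyrillic(text):
    """Convert ASCII input to Cyrillic by greedy longest-match lookup."""
    out = []
    i = 0
    while i < len(text):
        for key in _KEYS:
            if text.startswith(key, i):
                out.append(ASCII_TO_CYRILLIC[key])
                i += len(key)
                break
        else:
            # Unknown character: pass it through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    for word in ("mongol", "oendoer", "shar"):
        print(word, "->", to_cyrillic(word))
```

A real converter also has to resolve ambiguities, for instance a genuine "o" followed by "e", which is one reason why each package ends up defining its own conventions.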

7.2 Are there computer programs for processing Mongolian language documents?

Yes, there are.

Nota Bene: While the editor is happy to offer this information, it must be mentioned as a caveat that in most cases the editor could neither verify the sources of these programs nor review them. In addition, not all of the programs are direct competitors: some of them provide "pure" front-ends for printing systems, others focus on data models which make them useful for text processing, etc. The available programs can be roughly classified as follows: