Mongolia FAQ: Mongolia - Computing Issues

7. Mongolia - Computing Issues

7.1 Is there some kind of ``Mongolian ASCII'' or commonly acknowledged encoding standard for Mongolian language data processing?

Unlike English with the American ASCII code, Chinese with the GuoBiao code or Japanese with the JIS code, Mongolian does not yet have a national code system for its writing, whether in Classical or Cyrillic form. As a consequence, no international standards organization (like ISO) has been able to accept a national standard and turn it into an international one.

The problems we find in this field are of a complex nature and frequently have strong mutual dependencies.

Let's look at Cyrillic encoding first. It is not far-fetched to suggest using an existing Cyrillic encoding scheme for Mongolian, but not even such a simple idea is without its traps. There is more than one Cyrillic encoding, and some encodings are incomplete: they do not include the Cyrillic yo (ë). In addition, these tables (or code pages) usually have no room for the additional Mongolian vowel symbols ö and ü, which must then be placed somewhere outside the natural order of the alphabet. Several modified code pages of this type exist; available implementations are mentioned below.
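The gap can be checked against the general-purpose Cyrillic code pages that ship with Python. This is only a minimal sketch: the codec names are those of Python's standard library, not the Mongolian-specific code pages discussed in this FAQ, and the helper name missing_letters is made up for the illustration.

```python
# Letters a Mongolian Cyrillic text needs beyond the basic Russian set:
# yo (U+0401/U+0451) and the two extra vowels, barred o (U+04E8/U+04E9)
# and straight u (U+04AE/U+04AF), each in upper and lower case.
EXTRA_LETTERS = "\u0401\u0451\u04e8\u04e9\u04ae\u04af"

# Codec names as shipped with Python (an assumption of this sketch; these
# are not the NCC, MOSLAST, SUNCHIR or MONKEGA encodings).
CODECS = ["koi8_r", "cp866", "iso8859_5", "cp1251"]


def missing_letters(codec, letters=EXTRA_LETTERS):
    """Return the letters that the given codec cannot encode."""
    missing = []
    for ch in letters:
        try:
            ch.encode(codec)
        except UnicodeEncodeError:
            missing.append(ch)
    return "".join(missing)


if __name__ == "__main__":
    for codec in CODECS:
        print(codec, "lacks:", missing_letters(codec) or "(nothing)")
```

Any letter reported as missing would have to be assigned a free position by a modified code page, which is what the Mongolian variants mentioned below do in mutually incompatible ways.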

With Classical writing, the situation is even more complicated. For much of its history, there has not been one commonly acknowledged Classical Mongolian alphabet (or cagaan tolgoï); differences can be observed in the number of letters, the sorting order and the treatment of ambiguous letters which have more than one reading for a given shape, like t/d. The situation is further complicated by the fact that a given letter may assume numerous different shapes depending on its position within the word. The designer of an encoding scheme has to decide whether only canonical letters (the ones under which one would look up a word in a dictionary) are to be included or whether all shape variants should be included as well.
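The "canonical letters only" option can be pictured as follows: the text stores one code per letter, and a display layer derives the shape from the letter's position in the word. The sketch below is a deliberately simplified model. The four position classes and the name positional_forms are assumptions made for illustration, and real Classical Mongolian shaping has more cases than position alone.

```python
from enum import Enum


class Position(Enum):
    ISOLATED = "isolated"
    INITIAL = "initial"
    MEDIAL = "medial"
    FINAL = "final"


def positional_forms(word):
    """Pair each canonical letter of a word with the position class
    that would select its shape in a display layer."""
    n = len(word)
    result = []
    for i, letter in enumerate(word):
        if n == 1:
            pos = Position.ISOLATED
        elif i == 0:
            pos = Position.INITIAL
        elif i == n - 1:
            pos = Position.FINAL
        else:
            pos = Position.MEDIAL
        result.append((letter, pos))
    return result


if __name__ == "__main__":
    # Letters are given in transliteration; the word is only an example.
    for letter, pos in positional_forms("mongol"):
        print(letter, pos.value)
```

An encoding that stores every shape variant instead would need a separate code for each (letter, position) pair, which multiplies the number of codes required.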

The next problem arises when thinking of computer technology. The eight-bit (one-byte) code space of commonly used systems cannot hold more than 256 characters, of which 128 have been defined already. If both Cyrillic and Classical writing are to be included in one common code space, this is only possible at the cost of sharing common letter shapes between Latin and Cyrillic characters. There is no other choice if one wants to avoid switching code pages within one document.
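The arithmetic behind this constraint can be sketched in a few lines. The figure of 35 letters is the size of the Mongolian Cyrillic alphabet; the number of Classical shapes is left as a parameter, and the value 60 used in the example is purely hypothetical, as is the function name slots_left.

```python
TOTAL_CODES = 2 ** 8              # one byte gives 256 code positions
ASCII_CODES = 128                 # positions already defined by ASCII
MONGOLIAN_CYRILLIC_LETTERS = 35   # 33 Russian letters plus two extra vowels


def free_slots():
    """Code positions left once ASCII is accounted for."""
    return TOTAL_CODES - ASCII_CODES


def slots_left(classical_shapes,
               cyrillic_letters=MONGOLIAN_CYRILLIC_LETTERS,
               shared_with_latin=0):
    """Free positions remaining after placing upper- and lower-case
    Cyrillic plus the given number of Classical shapes.

    shared_with_latin is the number of Cyrillic characters that reuse
    the code of a look-alike Latin letter instead of getting their own.
    A negative result means the code page overflows.
    """
    cyrillic_codes = 2 * cyrillic_letters - shared_with_latin
    return free_slots() - cyrillic_codes - classical_shapes


if __name__ == "__main__":
    # 60 Classical shapes is a hypothetical figure for illustration.
    print(slots_left(60))
    print(slots_left(60, shared_with_latin=10))
```

With these assumed figures the page overflows unless some Cyrillic letters borrow Latin codes, which is the trade-off described above.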

Another problem intimately related to writing is the field of transcriptions and transliterations. The design of rules for transliterating Classical or Cyrillic Mongolian has many consequences in the fields of data exchange, automatic text processing, the building of library catalogues, etc. Some popular systems (e.g. the so-called Petersburg transliteration) use characters which are not readily available on today's computers, and the ones working with reduced character sets are sometimes not popular.

Only in recent years (more or less starting with the UNESCO conference on the Computerization of Mongolian script in Ulaanbaatar in August 1992) has there been a genuine international effort to solve these problems and to come up with an encoding scheme that will be accepted world-wide. The Mongolian National Institute for Standardization and Metrology (MNISM), the Chinese National Standard Bureau, standards bodies of other countries, ISO and Unicode have all held regular meetings during the last years in order to define a standard.

So far, no final agreement exists, and there is no software package which could serve as a demonstrator for this future standard. All available software either defines its own code page or relies on ASCII representations of Mongolian which are then converted into Mongolian writing.
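The "ASCII representation, then conversion" approach can be illustrated with a small longest-match converter. The mapping table below is hypothetical and covers only a few letters of Cyrillic Mongolian; it is not the scheme used by MLS or by any other package named in this FAQ.

```python
# Hypothetical ASCII-to-Cyrillic table, for illustration only.
TABLE = {
    "sh": "ш", "ch": "ч", "ts": "ц", "kh": "х", "oe": "ө", "ue": "ү",
    "a": "а", "b": "б", "g": "г", "d": "д", "e": "э", "i": "и",
    "l": "л", "m": "м", "n": "н", "o": "о", "r": "р", "s": "с",
    "t": "т", "u": "у",
}
MAX_KEY = max(len(key) for key in TABLE)


def to_cyrillic(text):
    """Convert an ASCII representation to Cyrillic, always trying the
    longest table entry first so that digraphs like 'sh' win over 's'."""
    out = []
    i = 0
    while i < len(text):
        for size in range(MAX_KEY, 0, -1):
            chunk = text[i:i + size]
            if chunk in TABLE:
                out.append(TABLE[chunk])
                i += len(chunk)
                break
        else:
            # Characters without a table entry pass through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    print(to_cyrillic("mongol"))
    print(to_cyrillic("ulaanbaatar"))
```

Because every package defines its own table of this kind, text prepared for one program generally has to be converted before another program can display it.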

7.2 Are there computer programs for processing Mongolian language documents?

Yes, there are.

Nota Bene: While the editor is happy to offer this information, it must be mentioned as a caveat that in most cases the editor could neither verify the sources of these programs nor review them. In addition, not all of the programs are direct competitors: some of them provide `pure' front-ends for printing systems, others focus on data models which make them useful for text processing, etc. The available programs can be roughly classified as follows:

Layout software for Classical Mongolian produced at Inner Mongolia University for MSDOS and UNIX platforms. Maybe this is the most complete package one can dream of since it supports everything from different writing styles (Ulaanbaatar vs. Inner Mongol typeface) to different alphabets (including Oirat, Phags-ba etc.) Availability: Yes, but with a high price tag in the four-digit USD range.
Windows software by American and German producers. These are usually only font sets which are sold in combination with some exotic text processing software. They do not offer full support for correct conversion of text data, etc.
The ``Sudar'' package of the National University of Mongolia was written in 1991/2 by M. Erdenechimeg. This package runs on a DOS platform, can handle both Classical and Modern Mongolian and has import utilities for a number of encodings. The author is developing a new package at the moment, and further development of ``Sudar'' has supposedly been discontinued.
``Cyrillic only'' products for enhancing MSDOS platforms are available at little or no cost in Mongolia. These include printer drivers, screen fonts and keyboard mappers for the extended Cyrillic alphabet. Around three or four different encodings are known under the following program names: NCC, MOSLAST, SUNCHIR and MONKEGA. No commercial code converters available, no support for Classical Mongolian.
Research-type programs for Macintosh machines, produced by the Université de Nanterre but never made publicly available.
One classical font is offered by Ecological Linguistics for Mac systems.
A commercial font package is available for extended Cyrillic by Linguist's Software for both the Mac and PC worlds.
One apparently free Cyrillic font package for Mongolian is available from www.magicnet.mn; it is intended for Windows 3.x users. Numerous reports were received that the system, once automatically installed (there is no manual installation process), replaces system fonts and keyboard drivers in an irreversible manner, so it is difficult to use this font on an occasional basis.
Daniel Kai's XenoType Technologies offers Inner and Outer Mongolian TrueType (and PostScript) fonts for the Mac (as well as Soyombo and Phagspa) among the computer systems for Classical Mongolian. This system gets good reviews.
MBE -- Mongol Bichig Editor. Written in Taiwan and released in 1995, this editor for MS-DOS systems provides true vertical display and editing as well as 48-pixel and 96-pixel bitmap fonts for nice printing results. The awkward editing behaviour and the fact that everything between whitespace is regarded as one input and editing unit (one cannot delete a single letter, only a complete word!) make it a bit difficult to use. For documents of fewer than ten pages, like short letters etc., the system provides a simple interim solution until really powerful systems emerge.
MLS - Mongolian Language Support. Originally developed for IBM compatible PCs, now extended to the Unix world. Availability: free. See the MLS software section of Infosystem Mongolei. MLS is an MS-DOS enhancement featuring support for both Classical and Cyrillic Mongolian. It offers conversion modules and a viewer for text with vertical lines, and allows the continued use of (text mode) applications like dBASE, spreadsheets and text processing packages. Windows support is currently under development. Besides the MLS package itself there is the above-mentioned Mongolian text viewer (MVIEW) with on-line conversion from transliteration to Mongol script and a converter from Mongol text to graphics (MLS2PCX) which generates graphics files out of Mongolian language texts. The free packages do not yet contain printer support, which is overdue and can be expected soon (said the author of MLS a long while ago). It should be mentioned that the focus of MLS lies in processing Mongolian language data and providing Internet support rather than creating beautiful documents. Technology advances rapidly, and the original devices conceived for printing MLS documents were soon superseded due to their numerous limitations. The MLS author then developed generic MLS printing support via LaTeX, and in early summer 1998 a Windows program for printing Mongolian appeared, too, which will soon offer MLS support (see next two items).
MonTeX -- Mongolian for LaTeX2e. Donald Knuth's TeX is certainly the finest document processor available in the digital universe. It enjoys an outstanding reputation in university circles and beyond. Since the original MLS package never provided meaningful printer support, the task of creating hard copy documents was relegated to TeX/LaTeX. MonTeX can typeset portions or complete texts of Cyrillic Mongolian in an acceptable manner. The package allows the use of virtually all popular code page layouts, so typesetting one's texts in one's favourite environment should not pose too much of a problem. MonTeX is available from MLS or from the CTAN servers (Comprehensive TeX Archive Network).
QAGUCIN -- a Mongol Bicig editor for Windows95 and Windows3.xx with an editing window for transliterated Mongolian and an output window for Classical script. The QAGUCIN Download page offers this package for free. QAGUCIN is still in an early development stage but looks very promising. The author of QAGUCIN, Michael Warmuth, is also working on including MLS support.
