Unicode for Romanised Pali Scripts

by Ong Yong Peng, MEngSc, BE, BSc, MIEEE, MIEAust
28 January, 2005

Under the Unicode system, characters are placed in code charts [1]. Each character (or symbol) is assigned a unique number, known as its code point. Romanised Pali uses several characters with diacritics. These characters are located in three separate Unicode code charts, as follows:

Latin-1 Supplement

Latin Extended-A

Latin Extended Additional
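
As a rough illustration of how a character maps to a code point and a code chart, the following Python sketch looks up a few Pali characters. The block ranges are those published for the three charts named above; the helper name `describe` and the sample characters are chosen only for this example.

```python
import unicodedata

# Code point ranges of the three code charts (blocks) listed above.
BLOCKS = [
    ("Latin-1 Supplement", 0x0080, 0x00FF),
    ("Latin Extended-A", 0x0100, 0x017F),
    ("Latin Extended Additional", 0x1E00, 0x1EFF),
]

def describe(char):
    """Return (code point label, Unicode name, block name) for one character."""
    cp = ord(char)  # the character's code point as an integer
    block = next((name for name, lo, hi in BLOCKS if lo <= cp <= hi), "other")
    return f"U+{cp:04X}", unicodedata.name(char), block

# Escapes are used so the example does not depend on the file's encoding:
# n with tilde, a with macron, t with dot below.
for ch in "\u00f1\u0101\u1e6d":
    print(describe(ch))
```

Each of the three sample characters falls in a different chart, which is why a font or browser needs to cover all three ranges to display romanised Pali fully.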

The following table shows these characters, the character codes used to display them in HTML documents, and glyphs for comparison. If your web browser does not correctly display the Unicode characters shown in square brackets [ ... ], consider installing Unicode fonts or using a browser that supports Unicode [4].

                                            HTML character entity escape code
Name                                        Symbolic    Numeric    Hexadecimal  Character

Latin-1 Supplement
Latin capital letter N with tilde           &Ntilde;    &#209;     &#x00D1;     [ Ñ ]
Latin small letter n with tilde             &ntilde;    &#241;     &#x00F1;     [ ñ ]

Latin Extended-A
Latin capital letter A with macron          -           &#256;     &#x0100;     [ Ā ]
Latin small letter a with macron            -           &#257;     &#x0101;     [ ā ]
Latin capital letter I with macron          -           &#298;     &#x012A;     [ Ī ]
Latin small letter i with macron            -           &#299;     &#x012B;     [ ī ]
Latin capital letter ENG [2]                -           &#330;     &#x014A;     [ Ŋ ]
Latin small letter eng [3]                  -           &#331;     &#x014B;     [ ŋ ]
Latin capital letter U with macron          -           &#362;     &#x016A;     [ Ū ]
Latin small letter u with macron            -           &#363;     &#x016B;     [ ū ]

Latin Extended Additional
Latin capital letter D with dot below       -           &#7692;    &#x1E0C;     [ Ḍ ]
Latin small letter d with dot below         -           &#7693;    &#x1E0D;     [ ḍ ]
Latin capital letter L with dot below       -           &#7734;    &#x1E36;     [ Ḷ ]
Latin small letter l with dot below         -           &#7735;    &#x1E37;     [ ḷ ]
Latin capital letter M with dot above [2]   -           &#7744;    &#x1E40;     [ Ṁ ]
Latin small letter m with dot above [3]     -           &#7745;    &#x1E41;     [ ṁ ]
Latin capital letter M with dot below [2]   -           &#7746;    &#x1E42;     [ Ṃ ]
Latin small letter m with dot below [3]     -           &#7747;    &#x1E43;     [ ṃ ]
Latin capital letter N with dot above       -           &#7748;    &#x1E44;     [ Ṅ ]
Latin small letter n with dot above         -           &#7749;    &#x1E45;     [ ṅ ]
Latin capital letter N with dot below       -           &#7750;    &#x1E46;     [ Ṇ ]
Latin small letter n with dot below         -           &#7751;    &#x1E47;     [ ṇ ]
Latin capital letter T with dot below       -           &#7788;    &#x1E6C;     [ Ṭ ]
Latin small letter t with dot below         -           &#7789;    &#x1E6D;     [ ṭ ]
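
The escape codes in the table can be derived mechanically from each character's code point. The following is a minimal Python sketch of that derivation, not part of the original table; the function name `entity_codes` is made up for this example, and the symbolic lookup relies on the standard library's `html.entities.codepoint2name` mapping, which covers only the HTML 4 named entities.

```python
from html.entities import codepoint2name

def entity_codes(char):
    """Return (symbolic, numeric, hexadecimal) HTML escape codes for one character.

    The symbolic code is None when HTML 4 defines no named entity for the
    character, which is the case for every character outside Latin-1 here.
    """
    cp = ord(char)
    name = codepoint2name.get(cp)
    symbolic = f"&{name};" if name else None
    numeric = f"&#{cp};"          # decimal form
    hexadecimal = f"&#x{cp:04X};"  # hexadecimal form, padded to four digits
    return symbolic, numeric, hexadecimal

# Capital N with tilde, small a with macron, small t with dot below.
for ch in "\u00d1\u0101\u1e6d":
    print(ch, entity_codes(ch))
```

The four-digit padding in the hexadecimal form is a presentation choice made here to mirror the code chart notation; browsers accept the unpadded form as well.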

Notes:

[1] Complete Unicode code charts are available here: http://www.unicode.org/charts

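[2] These capital letters are interchangeable: depending on the romanisation convention, each is used to write the same Pali nasal sound (the niggahīta).

[3] These small letters are interchangeable in the same way.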

[4] Information on Unicode fonts and browsers is available from Alan Wood's site: http://www.alanwood.net/unicode
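
Because the characters marked [2] and [3] are interchangeable, text gathered from different sources may mix them, which can cause searches and comparisons to miss matches. The sketch below shows one possible way to map the variants to a single form in Python. Choosing "m with dot below" as the target is an assumption made only for this example, as are the names `VARIANTS` and `normalise`.

```python
# Map each interchangeable variant to one target form.
# Assumption for illustration: "M/m with dot below" is the target.
VARIANTS = str.maketrans({
    "\u014a": "\u1e42",  # capital ENG              -> capital M with dot below
    "\u1e40": "\u1e42",  # capital M with dot above -> capital M with dot below
    "\u014b": "\u1e43",  # small eng                -> small m with dot below
    "\u1e41": "\u1e43",  # small m with dot above   -> small m with dot below
})

def normalise(text):
    """Replace the interchangeable variants with the chosen target form."""
    return text.translate(VARIANTS)

# Two spellings of the same word compare equal after normalisation.
print(normalise("sa\u1e41gha") == normalise("sa\u014bgha"))
```

All other characters, including the remaining Pali diacritics, pass through unchanged.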