Andy's Pali Page
Unicode and Pali
Updated July 24, 2000
Unicode is a computer standard that aims to give one unique number (a
"computer representation") to every character used in human writing
systems. This brings big benefits to software companies that develop
software for international markets, and allows people who work in
several languages to share documents more easily.
Unicode does NOT make it much easier to create multi-language
documents. A person will still need to set the language for
every section of their text, remember keyboard layouts, and remember
special keystroke combinations. The key issue: there simply aren't
enough keys on the keyboard to make typing easier when a person uses
more than one language. There are several approaches that you can
take when using Unicode characters to create Pali documents.
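To make the idea of "one unique number per character" concrete, here is a minimal Python sketch. It only uses Python's standard unicodedata module; the function name describe and the choice of four sample numbers are mine, for illustration. The numbers are the same "Character Numbers" that Word uses.

```python
import unicodedata

# A few Pali character numbers (decimal), as used in Word's Symbol dialog.
SAMPLE_NUMBERS = [257, 7747, 7749, 241]

def describe(number):
    """Return the decimal number, the U+ hex form, the character and its official name."""
    char = chr(number)
    return f"{number:>5}  U+{number:04X}  {char}  {unicodedata.name(char)}"

for n in SAMPLE_NUMBERS:
    print(describe(n))
```

The decimal number and the U+ hexadecimal form are the same value written two ways; character lists on the web usually show the hexadecimal form.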
For completeness, here is a link to the Unicode home site: http://www.unicode.org/
An excellent site that makes it easier to find the characters for any
language and their codes is Alan Wood's Unicode Resources; look for
the link "Character Lists and Test Pages for Unicode Ranges" there.
Get a Unicode Font
The first thing you will need is a font that can display the Unicode
characters in your word processor.
Note that not all Unicode fonts can display all of the Unicode
characters! The huge Unicode standard is organized into subsets
(ranges of related characters, which Word lists as "Subsets"), so
that font creators can supply displayable characters for only certain
parts of Unicode if they wish.
You may already have a Unicode font on your computer! Check to see if
the "Lucida Sans Unicode" font is on your computer. Be aware,
however, that it does not have all of the subsets necessary to
display all of the Pali characters.
Here is an example of a Unicode font that covers a very large part of
Unicode, including everything needed for Pali: Microsoft offers a
font called "Arial Unicode MS". You can use it if you have Word 97
(or higher) on a Windows 98 (or higher) computer.
The download is on the MS Publisher web site, but it is a
general-purpose Unicode font, containing 40,000 Unicode characters.
The font is a 13 MB download (about 1 hour on a 28,800 bps modem), and
needs 28 MB (yes, that's megabytes!) on your disk after you install it.
If you wish, click on this link to download
the Arial Unicode MS font.
Installing the font
You install the font like any other font. Instructions on how to
install a font are at the bottom of my web page Pali
Fonts.
Using the Unicode font with Microsoft Word (I don't know about WordPerfect)
I'll start by saying that "Arial Unicode MS" is only one of many
Unicode fonts. The techniques below will work with ANY Unicode font
that implements at least the "Latin-1", "Latin Extended-A" and
"Latin Extended-Additional" subsets of Unicode.
Secondly, the techniques below will allow you to generate the Pali
characters independent of which "natural languages" you are
working in (as long as you have a Unicode font selected for that part
of your text).
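If you want to check which of the three subsets a piece of Pali text needs from a font, the following Python sketch may help. The ranges are the standard Unicode ranges for these subsets; the names REQUIRED_SUBSETS, subset_of and subsets_needed are my own, hypothetical helpers, not part of Word or of any font tool.

```python
# Inclusive character-number ranges of the three subsets a Pali font must cover.
REQUIRED_SUBSETS = {
    "Latin-1": (0x0080, 0x00FF),
    "Latin Extended-A": (0x0100, 0x017F),
    "Latin Extended-Additional": (0x1E00, 0x1EFF),
}

def subset_of(char):
    """Return the name of the required subset that contains char, or None."""
    code = ord(char)
    for name, (first, last) in REQUIRED_SUBSETS.items():
        if first <= code <= last:
            return name
    return None  # plain unaccented letters, digits, spaces, etc.

def subsets_needed(text):
    """Return the set of required subsets used by the characters in text."""
    found = set()
    for char in text:
        name = subset_of(char)
        if name is not None:
            found.add(name)
    return found

# The sample text is "samsara nana" written with its Pali diacritics.
print(sorted(subsets_needed("sa\u1e43s\u0101ra \u00f1\u0101\u1e47a")))
```

A font that lacks any one of the reported subsets will show a box or a blank for the affected characters.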
You have several choices:
1. Customize your keyboard to insert the symbols
2. Create macros that generate the symbols
3. Get someone to create a Pali keyboard driver (.nls) file if most
of your typing is in Pali.
Customizing Your Keyboard to Insert the Symbols
1. Select the font "Arial Unicode MS" in your document
(like any other font)
2. "Insert"->"Symbol..." will show you the
symbol dialog. On the right there is a drop-down list called
'Subset'. You will need to find the characters (see table below),
select them, and then use the button at the bottom of the dialog
called 'Shortcut Key...' to assign a unique keyboard key sequence to
generate the character. Make sure you write this down!
3. After this, you will be able to get the Pali Unicode characters
from the keyboard.
Create macros that Generate the Pali Characters
1. Create macros ("Tools"->"Macro"->"Macros" or just Alt-F8).
2. Assigning a keystroke to your macro is
"Tools"->"Customize..."->button at bottom
"Keyboard...", then in the 'Categories' list box select
'Macros'. This produces a list of your macros. Select the macro for
the character and assign a unique keystroke sequence. Be sure to
write this down!
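Here is the code that you will need for your macros. This one inserts
the character .m; for each of the other characters, copy it, change
the macro name, and substitute the Character Number from the table
below:
Sub Pali_m_dotunderneath()
'
' Pali_m_dotunderneath Macro
' Macro created 06/09/00 by Andy Shaw
' Inserts character number 7747 (m with a dot underneath) at the cursor.
'
    With Selection
        .Collapse Direction:=wdCollapseStart
        ' The trailing underscore continues the statement on the next line.
        .InsertSymbol CharacterNumber:=7747, _
            Font:="Arial Unicode MS", Unicode:=True
    End With
End Sub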
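Typing one such macro per character by hand is tedious. As a rough sketch (not something Word provides), a few lines of Python can print the macro text for you to paste into the macro editor. The macro name suffixes and the helper make_macro are hypothetical choices of mine; the character numbers come from the table on this page.

```python
# (macro name suffix, character number); the suffixes are illustrative only.
PALI_MACROS = [
    ("a_macron", 257),
    ("m_dotunderneath", 7747),
    ("n_overdot", 7749),
]

# Text of one Word macro; {name}, {number} and {font} are filled in below.
TEMPLATE = """Sub Pali_{name}()
    With Selection
        .Collapse Direction:=wdCollapseStart
        .InsertSymbol CharacterNumber:={number}, _
            Font:="{font}", Unicode:=True
    End With
End Sub
"""

def make_macro(name, number, font="Arial Unicode MS"):
    """Return the source text of a macro that inserts one character."""
    return TEMPLATE.format(name=name, number=number, font=font)

for name, number in PALI_MACROS:
    print(make_macro(name, number))
```

Extend PALI_MACROS with the remaining rows of the table, and pass a different font name if you are not using "Arial Unicode MS".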
Creating a Pali keyboard Driver
It would be sensible for anyone using "mostly Pali" in
their documents to have a Pali keyboard driver. If you are using two
languages (e.g. Pali and English), a program like palitrans is much
easier to use. Note that palitrans is not yet available for Unicode.
If you wish to have the source code, just let me know; palitrans is
"open source" under the terms of the GNU General Public License.
A Handy Little Program for Multi-lingual people
Microsoft has a very useful little program that will show you how the
keys are mapped/assigned when you change keyboards (English, French,
etc). You can use the "visual keyboard" and your mouse to
type in characters. It works with Office 2000. You can download it at:
Microsoft Visual Keyboard
Understanding the Suggested Keyboard Assignments in the Table below
These keyboard assignments have been designed to be "easy to
remember", and as easy as possible to type. They have been
tested on a US English Windows 98 system using Word 97.
The keystrokes are written with capital letters because that is what
you will see when you do the keyboard assignments in Word 97. A
capital letter simply names the key. For instance, "A" means the
"a" key, not capital "A" (Shift-A).
The assignments for the letters ~n, ~N, "n, and "N are a
bit different from the rest of the keys. For their first key I have
used "T" for tilde, and "O" for overdot.
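The naming scheme described above is regular enough to write down as a rule. The Python sketch below is only my restatement of that rule (the function suggested_keystroke and the table PREFIX_KEYS are hypothetical names); it takes a two-character transliteration such as aa, .m or ~N and returns the keystroke text that Word should display.

```python
# First key for each diacritic mark: T for tilde, O for overdot.
# None means "repeat the letter itself" (used for the dot underneath).
PREFIX_KEYS = {".": None, "~": "T", '"': "O"}

def suggested_keystroke(translit):
    """Return the suggested Word shortcut for a two-character transliteration."""
    first_char, second_char = translit[0], translit[1]
    if first_char in PREFIX_KEYS:
        # Mark followed by a letter, e.g. .m  ~n  "N
        letter = second_char
        first_key = PREFIX_KEYS[first_char] or letter.upper()
    else:
        # Long vowel written doubled, e.g. aa or Aa: press the same key twice.
        letter = first_char
        first_key = letter.upper()
    second_key = letter.upper()
    if letter.isupper():
        # Capital letters keep Shift held for both keys.
        return f"Ctrl-Shift-{first_key},Shift-{second_key}"
    return f"Ctrl-{first_key},{second_key}"

for t in ["aa", "Aa", ".m", "~n", '"N']:
    print(t, "->", suggested_keystroke(t))
```

Comparing the printed text with what Word shows in the "Press new shortcut key" box is a quick way to catch a mistyped assignment.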
Note: do not use text formatting keystrokes (like Ctrl-B for bold)
when typing in the Unicode font you have chosen for the keystroke
assignments. Enter the text, then use the mouse and toolbar buttons
for text formatting (bold, underline, indentation, etc.)
Once you've done a couple of keys, the whole setup should take about
15 minutes to create and test.
Now, let's set up and test two keys ("n and "N).
1. From the main menu, choose "Insert".
2. In that drop-down menu, choose "Symbol...".
3. In the upper left corner of the dialog box that just appeared you
will see "Font:" Choose the Unicode font you prefer (for
instance, "Arial Unicode MS").
4. In the upper right corner, you will see "Subset:" Most
of the subsets are at the top of the list, but "Latin
Extended-Additional" is about 1/3 of the way down the list and a
bit hard to find. Locate it, and select it.
5. Now we must find the character "n (n with a dot above, character
number 7749).
6. Once we find the character, we select it with the mouse.
7. At the bottom there is a button called: "Shortcut
Key...". Press the button.
8. The cursor should be automatically positioned in the data entry
area called "Press new shortcut key". If the cursor is not
there, click there.
9. Hold down the Ctrl key, and WITHOUT releasing it, press the
"O" key, press the "N" key, and release the Ctrl
key. You should see exactly the text shown in the table below:
"Ctrl-O,N".
10. Press the button "Assign"
11. Press the button "Close"
Now, find the "N character (character number 7748) and repeat steps
5 to 8.
9.b. Hold down the Shift key AND the Ctrl key at the same time and DO
NOT release them, press "O", press "N", and
release the Shift and Ctrl keys.
You should see exactly this: "Ctrl-Shift-O,Shift-N"
10.b. Press the button "Assign".
11.b. Press the button "Close".
Now, press the button "Close" to close the
"Symbol" dialog, and select the font "Arial Unicode MS" in your
document.
Press the Ctrl key and KEEP it held down, press "O", press
"N", release the Ctrl key. You should see the Pali
character for "n.
Press the Shift key AND Ctrl key and DO NOT release them, press
"O", press "N" and release the Shift and Ctrl
keys. You should see the "N character.
Repeat for the rest of the keys.
Character (as transliteration) | Character Number | Suggested Keystroke Assignment | Unicode Subset
Aa | 256  | Ctrl-Shift-A,Shift-A | Latin Extended-A
aa | 257  | Ctrl-A,A             | Latin Extended-A
Ii | 298  | Ctrl-Shift-I,Shift-I | Latin Extended-A
ii | 299  | Ctrl-I,I             | Latin Extended-A
Uu | 362  | Ctrl-Shift-U,Shift-U | Latin Extended-A
uu | 363  | Ctrl-U,U             | Latin Extended-A
.D | 7692 | Ctrl-Shift-D,Shift-D | Latin Extended-Additional
.d | 7693 | Ctrl-D,D             | Latin Extended-Additional
.L | 7734 | Ctrl-Shift-L,Shift-L | Latin Extended-Additional
.l | 7735 | Ctrl-L,L             | Latin Extended-Additional
.M | 7746 | Ctrl-Shift-M,Shift-M | Latin Extended-Additional
.m | 7747 | Ctrl-M,M             | Latin Extended-Additional
.N | 7750 | Ctrl-Shift-N,Shift-N | Latin Extended-Additional
.n | 7751 | Ctrl-N,N             | Latin Extended-Additional
~N | 209  | Ctrl-Shift-T,Shift-N | Latin-1
~n | 241  | Ctrl-T,N             | Latin-1
"N | 7748 | Ctrl-Shift-O,Shift-N | Latin Extended-Additional
"n | 7749 | Ctrl-O,N             | Latin Extended-Additional
.T | 7788 | Ctrl-Shift-T,Shift-T | Latin Extended-Additional
.t | 7789 | Ctrl-T,T             | Latin Extended-Additional
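The transliterations in the first column can also be converted to Unicode characters mechanically, outside Word. The Python sketch below is a simple illustration of that idea using the numbers from the table. It is NOT the palitrans program and makes no claim to work the way palitrans does; the names TRANSLIT and to_unicode are mine.

```python
# Transliteration -> character number, copied from the table above.
TRANSLIT = {
    "Aa": 256, "aa": 257, "Ii": 298, "ii": 299, "Uu": 362, "uu": 363,
    ".D": 7692, ".d": 7693, ".L": 7734, ".l": 7735,
    ".M": 7746, ".m": 7747, ".N": 7750, ".n": 7751,
    "~N": 209, "~n": 241, '"N': 7748, '"n': 7749,
    ".T": 7788, ".t": 7789,
}

def to_unicode(text):
    """Replace each two-character transliteration in text by its Unicode character."""
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in TRANSLIT:
            out.append(chr(TRANSLIT[pair]))
            i += 2  # both characters of the pair are consumed
        else:
            out.append(text[i])
            i += 1
    return "".join(out)

print(to_unicode("sa.msaara"))
print(to_unicode('sa"ngha'))
```

Because it looks only at pairs of characters, this sketch will also convert an ordinary full stop that happens to be followed directly by d, l, m, n or t, so it is suitable for pure Pali text only.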