I Can Read Aramaic in 10 Transliterations

Not quite as impressive as being able to hold one's breath for 10 minutes, I admit, but after last night re-posting a lecture about transliterating Syriac on SYR101, more transliteration antics sought me out. :-)

As those who have heard me rant about the problems with Aramaic computing already know, finding a way to store Aramaic text is an absolute nightmare of What Dreams May Come proportions.

In essence, if you're working with one or two dialects that share a script, there is little difficulty; however, if you're trying to work with the language on a broader scale, things suddenly become tricky.

Within Unicode, there are no fewer than 4 different blocks to choose from that represent different Aramaic scripts (Hebrew, Syriac, Phoenician, Imperial Aramaic), not to mention a number of others that are either used to write some Neo-Aramaic dialects (Arabic) or are under consideration (Mandaic).

If one were to store the raw Unicode in a database, writing queries would become abominable, as none of the scripts have character equivalence (i.e. a MySQL database wouldn't know that כתב and ܟܬܒ are potentially the same word, as their character values are different).
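To make the mismatch concrete, here is a small Python sketch that prints the code points behind the two spellings above. The helper name code_points is my own invention for illustration; the code points themselves come from the Hebrew and Syriac Unicode blocks.

```python
# The same three-consonant word written in two Unicode blocks.
hebrew = "\u05DB\u05EA\u05D1"   # kaf, tav, bet (Hebrew block)
syriac = "\u071F\u072C\u0712"   # kaph, taw, beth (Syriac block)

def code_points(text):
    """Return the Unicode code points of text as 'U+XXXX' strings."""
    return [f"U+{ord(ch):04X}" for ch in text]

print(code_points(hebrew))   # ['U+05DB', 'U+05EA', 'U+05D1']
print(code_points(syriac))   # ['U+071F', 'U+072C', 'U+0712']
print(hebrew == syriac)      # False: a plain string comparison sees two unrelated words
```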

The easiest way to get around such snags is to store language data in a standard transliteration, casting and typesetting the text into whatever script is necessary to display it (in effect letting the content be content and the script be styling, like the current trend in HTML/CSS standards).
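As a rough sketch of the idea (not any particular system's implementation), the snippet below stores one transliterated string and casts it into either script only at display time. The three-letter table and the names SCRIPTS and render are hypothetical; a real table would cover the whole alphabet, final forms, and vowel marks.

```python
# Hypothetical toy table: transliteration letter -> glyph, one table per script.
SCRIPTS = {
    "hebrew": {"k": "\u05DB", "t": "\u05EA", "b": "\u05D1"},
    "syriac": {"k": "\u071F", "t": "\u072C", "b": "\u0712"},
}

def render(translit, script):
    """Cast a stored transliteration into the requested display script."""
    table = SCRIPTS[script]
    return "".join(table[ch] for ch in translit)

stored = "ktb"                      # the only form kept in the database
print(render(stored, "hebrew"))     # Hebrew-script display
print(render(stored, "syriac"))     # Syriac-script display
```

Because both displays derive from the same stored string, a query only ever has to match "ktb".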

One could invent a transliteration of one's own, but that makes the data troublesome to import into other systems, so one of the best bets is to use something based upon an already accepted standard.

Generally, for Aramaic, three different "flavors" of transliteration are used, each suited towards a particular purpose.

TRANSCRIPTION OF SYRIAC:
========================
Consonants: A B G D H O Z K Y ; C L M N S E I / X R W T
Vowels:     a o e i u
Diacritics: '  dot above, Qushaya
            ,  dot below, Rukkakha
            _  line under
            *  Seyame

First there is the SEDRA encoding, used to encode the Syriac Electronic Data Retrieval Archive, put together by George Kiraz et al. at Beth Mardutho. It is magnificent for transliterating Syriac with its diacritical marks, but the transliteration itself is a bit odd to type and use. Conventions such as O for waw, W for shin, and I for pe are a bit confusing, but they make more sense when one realizes that they are seemingly based upon how the Syriac letters themselves look.

Also, it's stuck to Syriac. There aren't any Tiberian vowel encodings for 'Hebrew'-script dialects, nor is there any Mandaic support.

In any case, the data contained in the SEDRA database (a lexicon to the Syriac Peshitta) is a wealth of information that is consistently encoded and tagged.
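For the curious, the SEDRA consonant line listed earlier can be turned into a lookup table with a few lines of Python. This is only a sketch: it assumes that line runs in alphabetical order from alaph to taw (which agrees with O for waw, W for shin, and I for pe), and it leaves vowels and diacritic markers untouched. The names SEDRA_TO_SYRIAC and sedra_to_syriac are my own.

```python
# SEDRA consonants, assumed to be listed in alphabetical order (alaph .. taw).
SEDRA_CONSONANTS = "A B G D H O Z K Y ; C L M N S E I / X R W T".split()

# The 22 basic Syriac letters in the same order, from the Unicode Syriac block.
SYRIAC_LETTERS = [
    "\u0710", "\u0712", "\u0713", "\u0715", "\u0717", "\u0718", "\u0719",
    "\u071A", "\u071B", "\u071D", "\u071F", "\u0720", "\u0721", "\u0722",
    "\u0723", "\u0725", "\u0726", "\u0728", "\u0729", "\u072A", "\u072B",
    "\u072C",
]

SEDRA_TO_SYRIAC = dict(zip(SEDRA_CONSONANTS, SYRIAC_LETTERS))

def sedra_to_syriac(text):
    """Convert SEDRA consonants to Syriac letters; other characters pass through."""
    return "".join(SEDRA_TO_SYRIAC.get(ch, ch) for ch in text)

print(sedra_to_syriac("CTB"))   # kaph, taw, beth
```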

Michigan Claremont Encoding:
) B G D H W Z X + Y K L M N S ( P C Q R $ T
A F I E U : . - ] [

An earlier solution, actually, was Michigan Claremont encoding, which was a lot more "phonetic" and easier for an English speaker to encode and decode. Its only awkward features were the use of f for zqapa/qamets, and the fact that the character + tends to get lost when typed into URLs manually (as "+" in a URL query denotes a space).

Michigan Claremont is currently one of the most widely used transliterations in the field. The Comprehensive Aramaic Lexicon adapted it with four changes:
  1. Most of the consonants could be written in lower-case rather than in caps.
  2. + (which represented teyt) was changed to T because of the aforementioned + problem.
  3. f was changed to A under the impression that a for "short" a and A for "long" a are more intuitive.
  4. A subset of the encoding was re-engineered so that it could encode Mandaic Aramaic (which has a very, very different script).
The only problem with these adaptations is that if you forget to query your database with a binary comparison (i.e. *strictly* matching the characters against their values), searches may treat "t" and "T" as the same letter.
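Here is a small demonstration of that pitfall. It uses Python's built-in sqlite3 module as a stand-in (not MySQL, whose collation settings differ), and the table name lexicon and the two sample strings are made up for illustration. SQLite's NOCASE collation plays the part of a case-insensitive search, and BINARY the part of the strict compare.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lexicon (word TEXT)")
# Two distinct entries: lower-case t (taw) versus upper-case T (teyt).
conn.executemany("INSERT INTO lexicon VALUES (?)", [("tb",), ("Tb",)])

def lookup(word, collation):
    """Return all stored words equal to word under the given SQLite collation."""
    rows = conn.execute(
        f"SELECT word FROM lexicon WHERE word = ? COLLATE {collation} ORDER BY rowid",
        (word,),
    )
    return [row[0] for row in rows]

loose = lookup("Tb", "NOCASE")    # case-insensitive: both entries come back
strict = lookup("Tb", "BINARY")   # byte-for-byte: only the exact entry
print(loose, strict)
```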

For that reason, when I built the Aramaic Designs database, I used a CAL code with teyt changed back to +, simply having my software escape every + to its URL-friendly form.
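The escaping step might look something like the following sketch, which uses Python's standard urllib.parse rather than whatever my own software does under the hood. The parameter name word and the sample string are illustrative only.

```python
from urllib.parse import parse_qs, quote

word = "q+l"   # illustrative transliteration with + standing for teyt

# Typed raw into a query string, the + is decoded as a space and the teyt is lost.
raw = parse_qs("word=" + word)["word"][0]

# Percent-encoding the value first (+ becomes %2B) lets it survive the round trip.
escaped = quote(word, safe="")
restored = parse_qs("word=" + escaped)["word"][0]

print(repr(raw), escaped, repr(restored))
```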

Between Michigan-Claremont/CAL and SEDRA you can account for the vast majority of transliterations within academia and fonts...

... however a newer one is very popular on the Internet, and that is the keyboard layout for the Estrangela font used on Peshitta Primacy websites, and that is what I bumped into tonight....

[However... now I must get some sleep. :-) I shall reveal the nature of the encounter once I have had more rest!]

Peace,
-Steve
