Erik's notes on
This page describes procedures for converting text from the Macintosh platform to Windows when dealing with East European languages, including Greek.
Language: non-English speakers (English speakers might benefit, too, though)
Geography: in West Europe, Latin America and Sub-Saharan Africa
Work: working with Cyrillics and other languages of East & South East Europe on a professional level: students, professors, translators
IT knowledge: basic text processing, beginners, computer dummies
Operating systems: Classic Macintosh (i.e., Macintosh Operating Systems 9.2.2 and earlier, not OS X)
Wallet size: empty (the tools needed are freeware or shareware)
This document was first published: January 20, 2004
Other pages on the World Wide Web deal with this and similar issues, but I cannot help giving my own advice. It is a jungle out there.
Rivalling sites: Russification of Macintosh
A general approach to text conversion, for which I am much indebted to Mr. Sylvain Jallard, residing in Zagreb (as of June 3, 2002), is the following:
- Save the file as text-only (.txt)
- Make a simple character-to-character find/replace (I use BBEdit for this task)
- Save it for DOS platform
As you may gather, Cyrillic and Greek texts won't convert that easily this way. In the case of Cyrillics (e.g., Russian), the find/replace procedure has to be run about 70 times. Although BBEdit can do things faster (multiple find/replace through several files at a time), the task still seems daunting. BTW, BBEdit has a free cousin, BBEdit Lite, available here.
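The roughly 70 find/replace passes described above amount to a fixed character-to-character table, so they can be collapsed into one decode/encode step. The following is only a minimal sketch, not the procedure used on this page: it assumes the Mac text was saved in the standard Mac OS Cyrillic encoding and that the Windows side expects code page 1251. The function name `mac_to_windows` and the table `CODEC_PAIRS` are made up for illustration, and the Greek and Turkish pairings are likewise assumptions.

```python
# Hypothetical helper: convert text bytes saved on a classic Mac into
# bytes a Windows program can read. The codec names are Python's built-in
# ones; which pair fits a given file is an assumption to check by eye.
CODEC_PAIRS = {
    "cyrillic": ("mac_cyrillic", "cp1251"),
    "greek": ("mac_greek", "cp1253"),
    "turkish": ("mac_turkish", "cp1254"),
}


def mac_to_windows(data: bytes, script: str = "cyrillic") -> bytes:
    mac_codec, win_codec = CODEC_PAIRS[script]
    # Step 1: interpret the Mac bytes as characters (replaces the manual
    # character-to-character find/replace passes).
    text = data.decode(mac_codec)
    # Step 2: "save for DOS": classic Mac OS ends lines with CR,
    # DOS/Windows expects CR+LF. Normalise first so CR+LF is not doubled.
    text = text.replace("\r\n", "\n").replace("\r", "\n").replace("\n", "\r\n")
    # Step 3: write the characters in the Windows code page. Characters
    # without a Windows equivalent become "?" instead of raising an error.
    return text.encode(win_codec, errors="replace")
```

To use it on a file, read the text-only file in binary mode (`open(path, "rb").read()`), pass the bytes through `mac_to_windows`, and write the result back in binary mode under a new name, so that the original stays untouched if the guess about the encoding was wrong.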
Linguist Software offers the utility CrossPlatform Converter for Macintosh. It sells for US$39.95 and can do all sorts of text conversion of .txt and .rtf files of Cyrillic, Greek, Turkish, and even Arabic-based texts written on a Mac. I haven't tried it yet. The Danish company Continental Information offers cross-platform text conversion based on software it developed itself. I have not succeeded in persuading them to market that program.
Very few other people have taken on the task of solving this problem, i.e., developing a program dedicated to the conversion of non-ASCII texts between the Macintosh and Windows platforms. Windows 95 and especially the spread of the WWW have been too focused on the image part of personal computing (pornography!), leaving the text issues miles behind. Who wants to read these days?
Try Apple's Language Kit yourself. It works, but seldom the way you want it to. It also costs money unless you have installed Macintosh Operating System 9 or higher, which offers it as an optional installation. If you can get your hands on a Cyrillic instruction for your operating system, dump it into your System folder. It can help.
The simplest and most bulletproof trick for converting Cyrillic texts written on the Macintosh OS platform to the same texts in Windows is to use the text encoding feature of a Mozilla/Gecko browser.
If you want to upload the text to a server, you may get results of varying quality. The results depend on
Your server is the most important part. Your internet service provider (ISP) has a server (a computer storing the files of your web site). As long as you keep the data on your own Macintosh or on a medium such as a floppy disk, almost any text encoding is possible, even when making it available through Web Sharing. However, once you upload your Cyrillic document to a non-Macintosh server, problems arise. Apparently, it depends on the server software and operating system. The chief distinction is between WebDAV and Apache on Linux.
WebDAV is an open standard (published by the IETF, with Microsoft among its early backers) and is used for uploads by, e.g., the web editing application Microsoft FrontPage. Apple Inc. also uses the WebDAV interface for its .Mac servers. WebDAV is easy to use, but it is expensive for the internet service provider. Instead, the most popular server configuration (because most of the key software is free) is a Linux server running Apache server software (the most popular of the 21st century). In that case, Netscape is unavoidable: you simply must encode the text in the way described above.
Rule of thumb: If you don't know what technology your server is using, test it by typing your site's URL into your browser's address field, adding a slash and a name that doesn't match any of your documents, e.g., http://www.my-domain.com/blablabla.html. Usually, this will trigger an error page. At the bottom of the error page, the server configuration is specified. Otherwise, FTP upload is a good indicator of Apache-on-Linux. Finally, if you pay much less than your friends do for their web hosting, your site may well be on a Linux server.
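The same test can be run without a browser. The sketch below is an assumption-laden variant of the rule of thumb: instead of reading the footer of the error page, it reads the `Server` header that most servers send along with the error, which usually carries the same information. The function names `probe_server` and `looks_like_apache` are invented for this example, and some servers are configured to hide or shorten the header, so an empty result proves nothing.

```python
import urllib.error
import urllib.request


def probe_server(url: str, timeout: float = 10.0):
    """Request a URL and return (HTTP status, Server header or None)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status, response.headers.get("Server")
    except urllib.error.HTTPError as err:
        # An error page (e.g. 404) still arrives with response headers,
        # and that is exactly the case this test is after.
        return err.code, err.headers.get("Server")


def looks_like_apache(server_header) -> bool:
    """Crude check: does the Server header mention Apache at all?"""
    return server_header is not None and "apache" in server_header.lower()
```

Calling `probe_server("http://www.my-domain.com/blablabla.html")` with your own domain in place of the placeholder should return a 404 status together with whatever the server says about itself; passing the second value to `looks_like_apache` gives a rough yes/no answer.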
I have set up some Cyrillic test files on various servers. You can open them in any browser you like to see the differences.
Apache Server at struer.net | Apache Server at b-One.dk | UNIX server at Atevo.dk |
Universal compatibility | Restricted compatibility (using WebDAV, attached file in e-mail, portable media [e.g., floppy disk, Zip drive, memory stick, iPod, memory card for digital cameras, etc.]. Not FTP-to-Apache-on-Linux)
The central point of this procedure is the HTML conversion. You might as well skip the text editing part in a separate word processor, not least because HTML documents tend to be very economical in terms of disk space. However, word processors have their advantages: more advanced tools such as magnification, proofreading, find-and-replace, faster maneuvering through long documents, as well as fast screen updates. Also, if the end user is a dummy too, he or she may want to do further editing (such as tables, font styles) in the word processor he or she is accustomed to.
In theory, this approach should apply to Greek and Turkish as well, but I haven't tested it yet.
Umberto Eco, the Italian semiologist, once famously compared Macs and PCs to the two main branches of the Christian faith: Catholics and Protestants.
The Mac is Catholic, he wrote in his back-page column of the
Italian news weekly, Espresso, in September 1994. It is cheerful,
friendly, conciliatory, it tells the faithful how they must proceed
step by step to reach — if not the Kingdom of Heaven — the moment
in which their document is printed.
The Windows PC, on the other hand, is
Protestant. It demands difficult personal decisions, imposes a
subtle hermeneutics upon the user, and takes for granted the idea
that not all can reach salvation. To make the system work you need
to interpret the program yourself: A long way from the baroque
community of revelers, the user is closed within the loneliness of
his own inner torment.
The central point of the universal compatibility procedure is the HTML conversion. What Mozilla and Netscape do is write the Cyrillic text using Unicode, with a number for each character, even when not specifically set to write Unicode. Since Unicode is the future of text processing, this encoding is promising for later versions of the operating systems involved.
The texts produced in this way (in ISO-8859-1) take up two to three times as much space as the same texts in ISO-8859-5 or KOI8.
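To make the "number for each character" idea concrete, here is a small sketch. It assumes that the numbers in question are HTML numeric character references (such as `&#1055;` for the Cyrillic letter П), which is the usual way to carry Unicode characters inside an ISO-8859-1 document; the page itself does not name the mechanism, so treat this as an interpretation. The function names are hypothetical.

```python
import html


def to_numeric_references(text: str) -> str:
    """Keep ISO-8859-1 characters as they are and write every other
    character as a decimal reference of the form &#NNNN;."""
    return text.encode("iso-8859-1", errors="xmlcharrefreplace").decode("iso-8859-1")


def from_numeric_references(markup: str) -> str:
    """Turn the references back into real characters, as a browser does."""
    return html.unescape(markup)


sample = "Привет"
encoded = to_numeric_references(sample)
print(encoded)                           # &#1055;&#1088;&#1080;&#1074;&#1077;&#1090;
print(from_numeric_references(encoded))  # Привет
```

Because the encoded text contains nothing but plain Western characters, it survives servers and editors that know nothing about Cyrillic, which would explain why this route is the universally compatible one. The price is that every Cyrillic letter is spelled out with several characters.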
Warning: The resulting documents may be hard to read in Macintosh OS 8.6 and earlier. Internet Explorer 4 and later can display them correctly in OS 8.6; Netscape v4.79 and earlier cannot. Use this procedure only when going from Macintosh OS 9 and later to Windows.
In theory, this approach should apply to Greek and Turkish as well, but I haven't tested it yet.