Erik Thau-Knudsen

Erik's notes on

Converting East European text from Macintosh to Windows PCs

Purpose

This page describes procedures for converting text from the Macintosh platform to Windows when dealing with East European languages, including Greek.

Target group

Language: non-English speakers  (English speakers might benefit, too, though)
Geography: in West Europe, Latin America and Sub-Saharan Africa
Work: working with Cyrillic and other languages of East & South East Europe on a professional level: students, professors, translators
IT knowledge: basic text processing, beginners, computer dummies
Operating systems: Classic Macintosh (i.e., Macintosh Operating Systems 9.2.2 and earlier, not OS X)
Wallet size: empty (the tools needed are freeware or shareware)
This document was first published: January 20, 2004

Other pages on the World Wide Web deal with this and similar issues, but I cannot help giving my own advice. It is a jungle out there.

Rivalling sites: Russification of Macintosh

Find/replace approach

A general approach to text conversion, for which I am much indebted to Mr. Sylvain Jallard, residing in Zagreb (as of June 3, 2002), is the following:

  1. Save the file as text-only (.txt)
  2. Make a simple character-to-character find/replace (I use BBEdit for this task)
  3. Save it for DOS platform

As you may gather, Cyrillic and Greek texts won't convert that easily this way. In the case of Cyrillic (e.g., Russian), the find/replace procedure has to be run about 70 times. Although BBEdit can speed things up (multiple find/replace across several files at a time), the task still seems daunting. BTW, BBEdit has a free cousin, BBEdit Lite.
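The roughly 70 single-character replacements can be collapsed into one lookup table. The following is only a minimal sketch, assuming the Mac text uses Apple's standard Cyrillic encoding (Python's `mac_cyrillic` codec) and the target is Windows-1251 (`cp1251`). The names `build_table` and `mac_to_windows` are made up for this illustration, and characters with no Windows-1251 equivalent are simply replaced by a question mark.

```python
def build_table(source="mac_cyrillic", target="cp1251", fallback=b"?"):
    """Build a 256-entry table mapping each source byte to its target byte."""
    table = bytearray(range(256))  # the ASCII half (0-127) stays unchanged
    for byte in range(128, 256):
        try:
            char = bytes([byte]).decode(source)
            table[byte] = char.encode(target)[0]
        except UnicodeError:
            # No equivalent in the target encoding: use the fallback byte.
            table[byte] = fallback[0]
    return bytes(table)


MAC_TO_WINDOWS = build_table()


def mac_to_windows(data):
    """Convert text-only Mac Cyrillic bytes to Windows-1251 with DOS line ends."""
    converted = data.translate(MAC_TO_WINDOWS)  # all replacements in one pass
    # Classic Mac files end lines with CR; DOS and Windows expect CR+LF.
    unified = converted.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return unified.replace(b"\n", b"\r\n")
```

This mirrors the three steps above: the input is a text-only file read as bytes, the table does the character-to-character replacement, and the line-end rewrite corresponds to saving for the DOS platform.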

Linguist Software offers the utility CrossPlatform Converter for Macintosh. It sells for US$39.95 and can do all sorts of text conversion of .txt and .rtf files of Cyrillic, Greek, Turkish, and even Arabic-based texts written on a Mac. I haven't tried it yet. The Danish company Continental Information offers cross-platform text conversion based on software developed in-house. I have not succeeded in persuading them to market that programme.

Historical remark

Very few other people have undertaken to solve this problem, i.e., to develop a program dedicated to converting non-ASCII texts between the Macintosh and Windows platforms. Windows 95 and especially the spread of the WWW have been too focused on the image part of personal computing (pornography!), leaving the text issues miles behind. Who wants to read these days?

Converting Cyrillic text from Macintosh to Windows environments

The Cyrillic instruction file looks like this. Try Apple's Language Kit yourself. It works, but seldom the way you want it to. It also costs money unless you have installed Macintosh Operating System 9 or higher, which offers it as an optional installation. If you can get your hands on a Cyrillic instruction for your operating system, dump it into your System folder. It can help.

HTML conversion

The simplest and most bulletproof trick for converting Cyrillic text written on the Macintosh OS platform into the same text on Windows is to use the text encoding feature of a Mozilla/Gecko browser.

Uploading to a server: Caution

If you want to upload the text to a server, the results may be more or less satisfactory. They depend on

  1. The server software (platform: Macintosh, Linux, Windows; Apache server software or other)
  2. Your operating system
  3. Your browser

Your server is the most important part. Your internet service provider (ISP) has a server (a computer storing the files of your web site). As long as you keep the data on your own Macintosh or on a medium such as a floppy disk, almost any text encoding is possible, even when you make it available through Web Sharing. However, once you upload your Cyrillic document to a non-Macintosh server, problems arise. Apparently, it depends on the server software and operating system. The chief distinction is between WebDAV and Apache on Linux.

WebDAV is an open standard (published by the IETF, with Microsoft among its contributors) and is used for uploads by, e.g., the web editing application Microsoft FrontPage. Apple also uses the WebDAV interface for its .Mac servers. WebDAV is easy to use, but it is expensive for the internet service provider. Instead, the most popular server configuration (because most of the key software is free) is a Linux server running Apache server software (the most popular of the 21st century). In that case, Netscape is unavoidable: you simply must encode the text in the way described above.

Rule of thumb: If you don't know what technology your server is using, test it by typing your site's URL in your browser's URL field, adding a slash and a name that doesn't match any of your documents, e.g., http://www.my-domain.com/blablabla.html. Usually, this will trigger an error page, and the server configuration is specified at the bottom of it. Otherwise, FTP upload is a good indicator of Apache-on-Linux. Finally, if you pay much less than your friends do for their web hotels, your site may well be on a Linux server.
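The same check can be sketched in a few lines of Python. This is only an illustration under some assumptions: the server must reveal itself in the HTTP `Server` response header (many do, some don't), the function names `server_header` and `describe_server` are invented for this example, and the URL is the placeholder used above. The header names the server software only; it does not tell you whether WebDAV is in use.

```python
import urllib.error
import urllib.request


def server_header(url, timeout=10):
    """Return the Server header sent for url, or None if there is none."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.headers.get("Server")
    except urllib.error.HTTPError as error:
        # An error page for a made-up document still carries the headers.
        return error.headers.get("Server")


def describe_server(header):
    """Give a rough label for a Server header value such as 'Apache/1.3.27 (Unix)'."""
    if not header:
        return "unknown"
    lowered = header.lower()
    if "apache" in lowered:
        return "Apache"
    if "microsoft-iis" in lowered:
        return "Microsoft IIS"
    return "other: " + header
```

For example, `describe_server(server_header("http://www.my-domain.com/blablabla.html"))` would label the server behind the placeholder address, provided the address exists and the server answers.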

I have set up some Cyrillic test files on various servers. You can open them in any browser you like to see the differences.

Apache Server at struer.net | Apache Server at b-One.dk | UNIX server at Atevo.dk

Select your media

Universal compatibility | Restricted compatibility (using WebDAV, attached file in e-mail, portable media [e.g., floppy disk, Zip drive, memory stick, iPod, memory card for digital cameras, etc.]. Not FTP-to-Apache-on-Linux)

Restricted compatibility

Start user requirements:

  1. Any Macintosh computer
  2. Macintosh Operating System 7.0 and above (I can't vouch for System 6)
  3. Cyrillic fonts complying with the Macintosh standard. Examples:
    • Apple Cyrillic system fonts (compulsory): At least one of these:
      1. Russian System Fonts for OS 7: ARSKurier (a Courier clone), Bastion (sans-serif), Latinskij (Times), Prjamoj, Prjamoj Prop (both Prjamojs, also spelled Priamoj, render Geneva), and Sistemnyj (Chicago);
      2. later fonts (introduced in OS 8.x): Chicago CY, Geneva CY, Helvetica CY, Monaco CY, Times CY.
    • Optional: ER Bukinist Macintosh (a Bookman clone), ER Univers Macintosh (a Helvetica clone)
    • C&G fonts (optional): Bodoni Cyrillic FAF, GlasnostDemiboldFAF, GlasnostExtraBoldFAF, GlasnostLightFAF, MurmanskFAF, OdessaScriptFAF, SvobodaFAF, VremyaFAF (a Times clone)
  4. Any text processor (the example below uses Microsoft Word 5.1a). Avoid text processors that make use of Apple's WorldScript (installed optionally with Macintosh OS 9 and above).
  5. Netscape Navigator/Communicator, v.3 Gold or 4.x and above. Make sure that the correct Cyrillic fonts are set in the Cyrillic encoding (in Netscape Communicator: Preferences -> Appearance -> Fonts -> For the encoding -> Cyrillic -> [above mentioned fonts or other ones complying])

End user requirements:

  1. Any PC (including PC emulators, PC PCI cards)
  2. Windows 95 and above. Make sure that Cyrillic text software is installed.
  3. A text processor. Microsoft Word 7 and above works fine. NotePad does not.
  4. A web browser recognising Cyrillic encoding, such as Opera, Netscape Communicator or Microsoft Internet Explorer.

Procedure

On the Macintosh platform:

  1. Write and edit the text and save it. Do not hyphenate it!
  2. Copy the text by hitting the keys Apple-C (or using the Edit menu)
  3. Open a blank page in the HTML Editor module of your Netscape software, e.g. Netscape Composer.
  4. Set the Character set (encoding) of the Netscape page: For Russian, other Eastern Slavic, and Bulgarian: use KOI8 (Kod obmena informacii). For Macedonian and Serbo-Croatian: use Windows-1251.
  5. Paste the copied text into the Netscape document.
  6. Save the Netscape document.
  7. Transfer the Netscape document to a Windows platform (Windows 95 and above), e.g., mail it or load it onto a floppy disk.

On the Windows platform:

  1. Open the Netscape document in your web browser.
  2. Copy the text by Control-C (or using the Edit menu)
  3. Open your Windows text editor (Microsoft Word v7 and above works fine).
  4. Paste the text into a new text document.
  5. Save the document. You are done.

The central point of this procedure is the HTML conversion. You might as well skip the text editing part in a separate word processor, not least because HTML documents tend to be very economical in terms of disk space. However, text processors have their advantages: more advanced tools such as magnification, proofreading, find-and-replace, faster maneuvering through long documents, and fast screen updates. Also, if the end user is a dummy too, he or she may want to do further editing (such as tables and font styles) in the text editor he or she is accustomed to.
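For readers who prefer a script to the copy-and-paste route, the effect of the HTML conversion can be approximated as follows. This is a rough sketch, not a description of what Netscape Composer does internally. It assumes the source is a text-only file in Apple's standard Cyrillic encoding (Python's `mac_cyrillic` codec), and the function name `mac_text_to_html` is invented for the example. The `charset` argument takes the encodings named in the procedure, written the way Python knows them: `koi8-r` or `windows-1251`.

```python
import html


def mac_text_to_html(raw, charset="koi8-r"):
    """Wrap Mac Cyrillic text in a small HTML page saved in the given charset."""
    # Decode the Mac bytes and normalise the classic Mac CR line endings.
    text = raw.decode("mac_cyrillic").replace("\r\n", "\n").replace("\r", "\n")
    paragraphs = [line for line in text.split("\n") if line.strip()]
    body = "\n".join("<p>" + html.escape(line) + "</p>" for line in paragraphs)
    page = (
        "<html><head>\n"
        '<meta http-equiv="Content-Type" '
        'content="text/html; charset=' + charset + '">\n'
        "</head><body>\n" + body + "\n</body></html>\n"
    )
    # Characters missing from the target charset become numeric references.
    return page.encode(charset, errors="xmlcharrefreplace")
```

The declared charset in the `meta` line is what lets the browser on the Windows side pick the right encoding when the file is opened there.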

In theory, this approach should apply to Greek and Turkish as well, but I haven't tested it yet.

I'm still using Mac OS 9

Universal compatibility

Start user requirements:

  1. A Macintosh computer
  2. Apple Language Kit installed (installation instructions). It is delivered with the installation CD-ROM for Mac OS 9 and later. If you are using an earlier system, you will need a hack to install Unicode (my own trick).
  3. Cyrillic fonts complying with the Macintosh standard. Examples:
    • Apple Cyrillic system fonts (compulsory): At least one of these:
      1. Russian System Fonts for OS 7: ARSKurier (a Courier clone), Bastion (sans-serif), Latinskij (Times), Prjamoj, Prjamoj Prop (both Prjamojs, also spelled Priamoj, render Geneva), and Sistemnyj (Chicago);
      2. later fonts (introduced in OS 8.x): Chicago CY, Geneva CY, Helvetica CY, Monaco CY, Times CY.
    • Optional: ER Bukinist Macintosh (a Bookman clone), ER Univers Macintosh (a Helvetica clone)
    • C&G fonts (optional): Bodoni Cyrillic FAF, GlasnostDemiboldFAF, GlasnostExtraBoldFAF, GlasnostLightFAF, MurmanskFAF, OdessaScriptFAF, SvobodaFAF, VremyaFAF (a Times clone)
  4. Netscape v.7 and above (version 6 may work too, but I haven't tested it; versions 3.0-4.8 probably work as well, but I am no longer able to test that) or Mozilla. Version (build) 1.2.1 of Mozilla is the highest one available for Macintosh OS 9.x. Later versions require Macintosh OS X, i.e., they won't work in Classic operating systems. The last stable version for Macintosh OS X is number 1.7.5. Make sure that the correct Cyrillic fonts are set in the Cyrillic encoding (in Mozilla: Preferences -> Appearance -> Fonts -> For the encoding -> Cyrillic -> [Select the fonts for the various typefaces. Mozilla will only display the ones available.])
  5. Your own room in a web hotel. I expect you to know how to upload files to the server.

End user requirements:

  1. Any PC (including PC emulators, PC PCI cards)
  2. Windows 95 and above. Make sure that Cyrillic text software is installed.
  3. A text processor. Microsoft Word 7 and above works fine. NotePad does not.
  4. A web browser recognising Cyrillic encoding, such as Netscape 7, Mozilla or Microsoft Internet Explorer v4 and above.
  5. Access to the internet

Procedure

Umberto Eco on Macs

Umberto Eco, the Italian semiologist, once famously compared Macs and PCs to the two main branches of the Christian faith: Catholics and Protestants.

The Mac is Catholic, he wrote in his back-page column of the Italian news weekly, Espresso, in September 1994. It is cheerful, friendly, conciliatory, it tells the faithful how they must proceed step by step to reach, if not the Kingdom of Heaven, the moment in which their document is printed.

The Windows PC, on the other hand, is Protestant. It demands difficult personal decisions, imposes a subtle hermeneutics upon the user, and takes for granted the idea that not all can reach salvation. To make the system work you need to interpret the program yourself: A long way from the baroque community of revelers, the user is closed within the loneliness of his own inner torment.

On the Macintosh platform:

  1. Open a blank page in the HTML Editor module of your Netscape or Mozilla software, e.g. Menu --> Window --> Composer.
  2. Make sure that the Character set (encoding) of the Composer page is Western (ISO-8859-1), i.e., Menu --> View --> Character set --> Western (ISO-8859-1), or English (US-ASCII), i.e., Menu --> View --> Character set --> English (US-ASCII). The ISO-8859-1 character set is the default in most versions of the Netscape and Mozilla Composer modules. Do not use a Cyrillic encoding or Unicode!
  3. Write the text and make the needed layout.
  4. Save the Netscape or Mozilla document.
  5. Transfer the Netscape or Mozilla document to a Windows platform (Windows 95 and above), e.g., mail it or upload it to your home site.

On the Windows platform:

  1. Open the Netscape document in your web browser.
  2. Copy the text by Control-C (or using the Edit menu)
  3. Open your Windows text editor (Microsoft Word v7 and above works fine).
  4. Paste the text into a new text document.
  5. Save the document. You are done.

Comment

The central point of the universal compatibility procedure is the HTML conversion. Mozilla and Netscape write the Cyrillic text as Unicode, using a number for each character, even when they are not specifically set to write Unicode. Since Unicode is the future of text processing, this encoding is promising for later versions of the operating systems involved.
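To see what "a number for each character" looks like in practice, here is a small sketch. It assumes that the numbers in question are HTML numeric character references (the decimal Unicode code point between `&#` and `;`); Python's `xmlcharrefreplace` error handler produces references of that kind, though it is not claimed to match Composer's output byte for byte. The function name `to_numeric_references` is invented for the example.

```python
import html


def to_numeric_references(text, charset="iso-8859-1"):
    """Encode text, replacing characters outside the charset by &#NNNN; references."""
    return text.encode(charset, errors="xmlcharrefreplace")


encoded = to_numeric_references("Жук")
print(encoded)  # b'&#1046;&#1091;&#1082;'

# Any browser, or html.unescape, turns the references back into the letters.
restored = html.unescape(encoded.decode("iso-8859-1"))
print(restored == "Жук")  # True
```

Because the saved file contains nothing but plain Western characters, it no longer matters which Cyrillic encoding the server or the receiving machine would otherwise assume.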

The texts produced in this way (in ISO-8859-1) consume 2 or 3 times more space than when in ISO-8859-5 or KOI8.

Warning: The resulting document may be hard to read in Macintosh OS 8.6 and earlier. Internet Explorer 4 and later can display such documents correctly in OS 8.6; Netscape v4.79 and earlier cannot. Use this procedure only for going from Macintosh OS 9 and later to Windows.

In theory, this approach should apply to Greek and Turkish as well, but I haven't tested it yet.

All references to profit and non-profit organisations on this page are deliberate and not sponsored.