Archive for the ‘software localization’ Category

Unsophisticated Translator vs. File Format – Who Wins?

November 19th, 2009

Of the dozens of unsung challenges of translation, file formats are among the most infamous.

There is almost always a disconnect between the format in which content creators save their files and the format (or the tools, really) in which the translators want to work. The content is in .pdf, and the translators want to work with MS Word .doc files. The software application works with .xml, and the translators want to work with MS Excel .xls files.

One client, which distributes a platform for creating mobile phone applications, is grappling with this at the moment. It manages resources in a structured file format based on a programming language called Lua, so its translatable text looks like this:

ModRsc {
  name  = "IDS_STRING_1001",
  id    = 1001,
  type  = 1,
  data  = EncStringRscData(0x03, "This is a new app."),
}

They have asked me how translation works, because they’re willing to build a converter into their editing tool that takes resources like “This is a new app” and puts them into a format (.doc, .xls, .txt, etc.) that translators can use.

I applaud this kind of thinking, and spent about an hour on the phone with their development team in Hyderabad discussing it.

Background on Translators

There are two kinds of translators:

  1. Sophisticated – proper localization companies using computer-aided translation (CAT) tools and professional linguists
  2. Unsophisticated – “Here, Bob (or Najiv or Youli or Ramesh). Have your brother-in-law translate this for us.”

Sophisticated translators will use tools that parse XML, HTML, .rc, .properties, etc. files flawlessly, isolating the text for them to translate and hiding the code, tags and other things we don’t want them going near. As long as the convention is reliable – e.g., translate anything in quotation marks – the sophisticated translator can modify the parser to find and extract the text. These translators and tools are appropriate for apps with any number of translatable strings.
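To make the idea of a reliable convention concrete, here is a minimal Python sketch of an extractor for the resource format shown above. It is not how Trados or any other CAT tool works internally. It assumes that every translatable string is the quoted second argument of EncStringRscData(...) and that each block's name line comes before its data line. The names SAMPLE, BLOCK and extract_strings are invented for this illustration. Note that a looser rule like "anything in quotation marks" would also pick up the identifier IDS_STRING_1001, which should not be translated.

```python
import re

# Sample resource in the Lua-based format shown earlier in this post.
SAMPLE = '''
ModRsc {
  name  = "IDS_STRING_1001",
  id    = 1001,
  type  = 1,
  data  = EncStringRscData(0x03, "This is a new app."),
}
'''

# Assumption: the translatable text is the quoted second argument of
# EncStringRscData(...), and the block's name = "..." line precedes it.
BLOCK = re.compile(
    r'name\s*=\s*"(?P<name>[^"]+)".*?'
    r'EncStringRscData\(\s*\w+\s*,\s*"(?P<text>(?:[^"\\]|\\.)*)"\s*\)',
    re.DOTALL,
)


def extract_strings(source):
    """Return (resource name, translatable text) pairs found in source."""
    return [(m.group("name"), m.group("text")) for m in BLOCK.finditer(source)]


if __name__ == "__main__":
    for name, text in extract_strings(SAMPLE):
        print(name, "->", text)
```

Keeping the resource name next to each string gives the translated text a key to travel with, so it can be matched back to the right resource later.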

Unsophisticated translators are limited to tools like MS Word and Excel. Therefore, somebody or something needs to pull out and transform the translatable text into these formats. Then, after translation, somebody or something needs to reverse the transformation and put the text back in. These translators and tools are appropriate for apps with small numbers (<100) of translatable strings.
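The pull-out and put-back steps could look something like the following Python sketch, which exports strings to a CSV sheet (a format Excel can open) and merges a translated sheet back into the resource file. This is only an illustration under stated assumptions: translatable text is the quoted second argument of EncStringRscData(...), neither source nor translation contains double quotes, and the column names and function names (export_csv, import_csv) are invented here.

```python
import csv
import io
import re

SAMPLE = '''ModRsc {
  name  = "IDS_STRING_1001",
  id    = 1001,
  type  = 1,
  data  = EncStringRscData(0x03, "This is a new app."),
}
'''

# Assumption: text is the quoted second argument of EncStringRscData(...)
# and contains no double quotes. "head" and "tail" capture everything
# around the text so it can be written back unchanged.
ENTRY = re.compile(
    r'(?P<head>name\s*=\s*"(?P<name>[^"]+)".*?'
    r'EncStringRscData\(\s*\w+\s*,\s*")'
    r'(?P<text>[^"]*)(?P<tail>")',
    re.DOTALL,
)


def export_csv(source):
    """Write one row per string: resource name, source text, empty target."""
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["name", "source", "target"])
    for m in ENTRY.finditer(source):
        writer.writerow([m.group("name"), m.group("text"), ""])
    return out.getvalue()


def import_csv(source, csv_text):
    """Put translated text back into the resource source, keyed by name."""
    rows = csv.DictReader(io.StringIO(csv_text))
    translations = {r["name"]: r["target"] for r in rows if r["target"]}

    def swap(m):
        # Untranslated rows keep their original text.
        new_text = translations.get(m.group("name"), m.group("text"))
        return m.group("head") + new_text + m.group("tail")

    return ENTRY.sub(swap, source)


if __name__ == "__main__":
    print(export_csv(SAMPLE))
    # Stand-in for the translator's work: the target column filled in.
    done = ("name,source,target\n"
            "IDS_STRING_1001,This is a new app.,Dies ist eine neue App.\n")
    print(import_csv(SAMPLE, done))
```

Keying the merge on the resource name rather than on row order means a translator can sort or filter the sheet without breaking the import. When saving the sheet to disk for Excel, writing it as UTF-8 with a byte order mark (Python's "utf-8-sig" codec) tends to help Excel display non-Latin text correctly.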

Recommendation for Resource Files

I explained that the Lua files would pose no problem for sophisticated translators. Tools like Trados, SDLX and Déjà Vu will easily parse and isolate translatable content in quotation marks.

For unsophisticated translators, one of the engineers suggested creating a plugin for MS Word that would parse all translatable text and allow translators to work in their favorite tool. There would be no need for developing transformations or conversion routines to go from Lua format into .xls/.txt/.doc/etc. The plugin could save the translated version back out in native Lua format.

Everybody wins with this solution.

  • My client’s engineers get off the hook of creating an intricate program to cover multiple output and input formats for resources.
  • The app developer has an easy way of pushing resources out to translators and pulling in the translated result.
  • Sophisticated translators don’t need to change the way they work at all.
  • Unsophisticated translators get a simple format that only requires that they install an MS Word plugin.

What kinds of file-format trade-offs do you have to make?

John White of venTAJA Marketing is a localization project manager and consultant.

photo credit: woodleywonderworks

Localization Testbenches, Part II (Software)

April 6th, 2007

What are you using to test your localized products? If you’re handing them to your domestic QA team and expecting that they’ll intuitively test them with correct language locale settings, you may be in for an unpleasant surprise.

1) Software
This will probably take you the most time to get right, because you need to take more pains to emulate the real-world scenario of your customers. They’ve bought computers running Windows XP/Japanese or Linux/Russian or MacOS/Arabic. The hardware nowadays isn’t different (except for the keyboards), so you don’t need to outfit your lab with machines from all over the planet.

However, if you install your Korean product under US-English Windows XP, you’ll probably be in for lots of corrupted characters on screen. This is because, in the legacy encodings many applications use, characters in Korean (and Japanese and Chinese) take up two bytes, whereas characters in English (and other Western languages) take up only one byte. An English operating system tries to interpret the Korean characters one byte at a time, and the result is usually illegible. Arabic and other languages with their own single-byte code pages get garbled in a similar way when the system assumes a Western one.
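The corruption is easy to reproduce in a few lines of Python. The sketch below assumes the product stores its text in the legacy Korean code page (cp949) and that the US-English system reads it with the Western code page (cp1252). Both choices are assumptions made for illustration, and the function names are invented.

```python
TEXT = "한국어"  # the word for "Korean language", three characters


def encode_korean(text):
    """Encode text the way a legacy Korean application might (assumed cp949)."""
    return text.encode("cp949")


def read_as_western(raw):
    """Decode the bytes one at a time, as a Western code page would."""
    return raw.decode("cp1252", errors="replace")


if __name__ == "__main__":
    raw = encode_korean(TEXT)
    print(len(TEXT), "characters stored as", len(raw), "bytes")
    print("Read with the Western code page:", read_as_western(raw))
    print("Read with the Korean code page: ", raw.decode("cp949"))
```

Each Korean character becomes two bytes, and the Western code page turns every one of those bytes into a separate, unrelated symbol. The bytes on disk are intact. Only the interpretation is wrong, which is why the same file looks fine on a native-language system.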

Modern operating systems include the fonts and locale support for these multi-byte languages, though it usually needs to be enabled. This is a good half-measure for testing your localized products, but it’s still not exactly what your in-country customers will see, so you should consider native-language testbenches, onto which you freshly install the native operating system.
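Whichever kind of testbench you use, it helps to record what locale and encoding a machine is actually running under before a test pass begins. Here is a minimal Python sketch that does so using only the standard library. The function name describe_environment and the choice of fields are assumptions for this example, not part of any particular QA tool.

```python
import locale
import sys


def describe_environment():
    """Collect the settings that decide how text is decoded and displayed."""
    return {
        "platform": sys.platform,
        # Encoding the OS locale prefers for text files and similar data.
        "preferred_encoding": locale.getpreferredencoding(False),
        # Encoding used to convert file names to and from bytes.
        "filesystem_encoding": sys.getfilesystemencoding(),
        # Encoding of the console the tester is looking at, if any.
        "stdout_encoding": getattr(sys.stdout, "encoding", None),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Pasting this output into a bug report makes it much easier to tell a genuine localization defect from a testbench that was never switched to the target locale.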

This can get clunky and hardware-intensive – even if you’re partitioning the disk and dual-booting – so you may also consider virtualization products like VMware and Virtual Disk. You can host dozens of different native-language systems on a single hard drive, and run several of them at a time, if your machine is sufficiently endowed.

Of course, almost any solution will spook your testers, who will consult their job descriptions and inform you that they contain no mention of “putting up with weird languages.” This is not an insurmountable problem, though it is a topic for another post.

Note: Believe it or not, some people think it’s pretty slick to see MacOS in Portuguese or Russian RedHat. They are hypnotized by how similar the interface is, and struck by the differences. A neat show-stopper for your evangelization sessions.