Unsophisticated Translator vs. File Format – Who Wins?
Of the dozens of unsung challenges in translation, file formats are among the most infamous.
There is almost always a disconnect between the format in which content creators save their files and the format (or the tools, really) in which the translators want to work. The content is in .pdf, and the translators want to work with MS Word .doc files. The software application works with .xml, and the translators want to work with MS Excel .xls files.
One client, which distributes a platform for creating mobile phone applications, is grappling with this at the moment. It manages resources in a structured file format based on a programming language called Lua, so its translatable text looks like this:
    ModRsc {
        name = "IDS_STRING_1001",
        id = 1001,
        type = 1,
        data = EncStringRscData(0x03, "This is a new app."),
    }
They have asked me how translation works, because they’re willing to build a converter into their editing tool that takes resources like “This is a new app” and puts them into a format (.doc, .xls, .txt, etc.) that translators can use.
I applaud this kind of thinking, and spent about an hour on the phone with their development team in Hyderabad discussing it.
Background on Translators
There are two kinds of translators:
- Sophisticated – proper localization companies using computer-aided translation (CAT) tools and professional linguists
- Unsophisticated – “Here, Bob (or Najiv or Youli or Ramesh). Have your brother-in-law translate this for us.”
Sophisticated translators will use tools that parse XML, HTML, .rc, .properties, etc. files flawlessly, isolating the text for them to translate and hiding the code, tags and other things we don’t want them going near. As long as the convention is reliable – e.g., translate anything in quotation marks – the sophisticated translator can modify the parser to find and extract the text. These translators and tools are appropriate for apps with any number of translatable strings.
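To make the idea of a reliable convention concrete, here is a minimal Python sketch of the kind of extraction rule a CAT tool's parser might be configured with. It is not how any particular tool works. It assumes the real resource files use plain ASCII quotes and a `0x03`-style first argument, and that the translatable text is always the quoted second argument to `EncStringRscData`. The names `RESOURCE` and `extract_strings` are made up for illustration. Note that a bare "anything in quotation marks" rule would also pick up the resource name, which should not be translated, so the sketch is deliberately narrower.

```python
import re

# Hypothetical pattern: pairs each quoted resource name with the quoted
# second argument of EncStringRscData. It assumes every resource block has
# both; a block without EncStringRscData would confuse the pairing.
RESOURCE = re.compile(
    r'name\s*=\s*"(?P<name>[^"]*)".*?'
    r'EncStringRscData\(\s*[^,]+,\s*"(?P<text>(?:[^"\\]|\\.)*)"\s*\)',
    re.DOTALL,
)


def extract_strings(source: str) -> dict:
    """Map each resource name to its translatable text."""
    return {m.group("name"): m.group("text") for m in RESOURCE.finditer(source)}


SAMPLE = '''
ModRsc {
    name = "IDS_STRING_1001", id = 1001, type = 1,
    data = EncStringRscData(0x03, "This is a new app."),
}
'''

print(extract_strings(SAMPLE))
```

The resource name stays attached to each string as a key, so the translator sees only the text while the tool keeps track of where it came from.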
Unsophisticated translators are limited to tools like MS Word and Excel. Therefore, somebody or something needs to pull out and transform the translatable text into these formats. Then, after translation, somebody or something needs to reverse the transformation and put the text back in. These translators and tools are appropriate for apps with small numbers (fewer than 100) of translatable strings.
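The pull-out-and-put-back round trip might look something like the following Python sketch, which exports strings to a CSV file that opens in Excel and then merges the translated column back in. Everything here is an assumption made for illustration: the `TEXT` pattern, the `export_csv` and `import_csv` helpers, and the three-column layout are hypothetical, and the sketch assumes ASCII quotes in the resource file and a translator who leaves the row order and the first two columns alone.

```python
import csv
import io
import itertools
import re

# Hypothetical pattern: captures the text before, inside, and after the
# quoted second argument of EncStringRscData so it can be swapped in place.
TEXT = re.compile(r'(EncStringRscData\(\s*[^,]+,\s*")((?:[^"\\]|\\.)*)("\s*\))')


def export_csv(source: str) -> str:
    """Write one row per translatable string: index, source text, empty target."""
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["index", "source", "target"])
    for i, match in enumerate(TEXT.finditer(source)):
        writer.writerow([i, match.group(2), ""])
    return out.getvalue()


def import_csv(source: str, translated_csv: str) -> str:
    """Put translated text back in place; rows with no target stay unchanged."""
    rows = csv.DictReader(io.StringIO(translated_csv))
    targets = {int(r["index"]): r["target"] for r in rows if r["target"]}
    position = itertools.count()

    def swap(match):
        # Limitation: a translation containing a double quote would need
        # escaping before it is written back; this sketch does not do that.
        text = targets.get(next(position), match.group(2))
        return match.group(1) + text + match.group(3)

    return TEXT.sub(swap, source)


SAMPLE = 'data = EncStringRscData(0x03, "This is a new app."),'
sheet = export_csv(SAMPLE)
# Simulate the translator filling in the target column.
sheet = sheet.replace(
    "This is a new app.,",
    "This is a new app.,Ceci est une nouvelle application.",
)
print(import_csv(SAMPLE, sheet))
```

Matching rows to strings by position is the fragile part: if the resource file changes while the spreadsheet is out for translation, the indexes no longer line up. Keying rows on the resource name or id would be sturdier.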
Recommendation for Resource Files
I explained that the Lua files would pose no problem for sophisticated translators. Tools like Trados, SDLX and Déjà Vu will easily parse and isolate translatable content in quotation marks.
For unsophisticated translators, one of the engineers suggested creating a plugin for MS Word that would parse all translatable text and allow translators to work in their favorite tool. There would be no need for developing transformations or conversion routines to go from Lua format into .xls/.txt/.doc/etc. The plugin could save the translated version back out in native Lua format.
Everybody wins with this solution.
- My client’s engineers are spared from writing an intricate program to cover multiple output and input formats for resources.
- The app developer has an easy way of pushing resources out to translators and pulling in the translated result.
- Sophisticated translators don’t need to change the way they work at all.
- Unsophisticated translators get a simple format that only requires that they install an MS Word plugin.
What kinds of file-format trade-offs do you have to make?
John White of venTAJA Marketing is a localization project manager and consultant.
photo credit: woodleywonderworks