Archive for the ‘machine translation’ Category

A Day for Google, a Year for All Professional Translators

April 30th, 2012 No comments

“What all the professional human translators in the world produce in a year, our system translates in roughly a single day.”

-Google Translate research scientist Franz Och, in a blog post

Dara Kerr reports this in an article on Google Translate’s 64 languages and 200 million users.

Life could be worse, I say.

Suppose we were all cobbling shoes or manufacturing turkey basters. Google wouldn’t even take notice of us. We’d be leading unexamined lives and wondering why our industry was so quiet and free from disruption.

Meanwhile, Franz is kind enough to throw us a bone:

“Of course, for nuanced or mission-critical translations, nothing beats a human translator, and we believe that as machine translation encourages people to speak their own languages more and carry on more global conversations, translation experts will be more crucial than ever.”

Comically, the consensus at last week’s Localization Unconference was to breeze right by the topic of machine translation. “We’re over it,” several people said, chuckling. Were we just whistling past the graveyard?

That’s a lot of words moving through a machine. How many of them came from you?

Do You Want to Help Me Localize? Then Don’t Help Me.

September 17th, 2009 No comments

"Let me help you localize that."

We’re running a medium-sized project from En-Us (English for US) into Es-Mx (Spanish for Mexico) and Pt-Br (Portuguese for Brazil). It’s a Web portal based on Java, and the strings are in resource bundles or .properties files. The product manager and I are in California, the engineering lead is in the UK, the Java developers are in India, and I don’t know where the translators are. Maybe Greenland. Par for the course these days: the sun never sets on this project.

One of the translators sent me an anguished e-mail and a screenshot the other day during linguistic QA of the Spanish portal:

“Here is a screen grab containing a few Es strings I don’t recognize,” she wrote. “I’m the only translator, so I know all of the text, but I’ve never seen these strings before. They’re not in TM, either. In fact, the En strings aren’t even in TM. Have you been translating strings yourself? Has somebody else? What’s going on??”

(I like my translators intense. The drama keeps things from getting dull.)

It took some digging, but I deduced that someone on the chain of development had been creating new strings – which I knew they were going to do – and using Google Translate to stage them quickly and see whether they would run too long. If they were too long for the button or some other element in the UI, the engineer had more work ahead, but s/he didn’t want to wait for the proper translation, and so made use of free, Web-based translation.

This kind of translation gets many of us in the industry in high dudgeon, and there’s no point in my rehashing the pros and cons. This, however, was an unintended consequence of having machine translation as close as your browser. If the engineer had staged the translation, gauged its length, then removed it, it would never have befuddled the translator.

The problem, then, is that machine translation looks just the same as human translation in the UI. Nobody can tell the difference until a linguist trips over it, double-checks it and raises a QA issue over it.
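For what it’s worth, an engineer can gauge string length without borrowing a machine translation at all. The technique is usually called pseudo-localization: pad the English string and wrap it in markers that nobody could mistake for real Spanish. This is not what the team on this project did. It is a minimal sketch in Python, and the marker strings, the 30% expansion figure and the function names are all my own assumptions. It also handles only simple key=value lines, not the full .properties syntax.

```python
# Hypothetical markers: anything a translator cannot mistake for real text would do.
MARKER_OPEN = "[!! "
MARKER_CLOSE = " !!]"


def pseudo_localize(source, expansion=0.3):
    """Return a visibly fake placeholder roughly `expansion` longer than the source."""
    padding = "~" * max(1, round(len(source) * expansion))
    return f"{MARKER_OPEN}{source}{padding}{MARKER_CLOSE}"


def is_placeholder(text):
    """True if the string is a staged placeholder rather than a real translation."""
    return text.startswith(MARKER_OPEN) and text.endswith(MARKER_CLOSE)


def stage_properties(lines, expansion=0.3):
    """Pseudo-localize the values of simple key=value lines.

    Simplification: ignores ':' separators, escapes and line continuations.
    """
    staged = []
    for line in lines:
        stripped = line.strip()
        # Leave blank lines, comments and anything without '=' untouched.
        if not stripped or stripped.startswith(("#", "!")) or "=" not in stripped:
            staged.append(line)
            continue
        key, value = line.split("=", 1)
        staged.append(f"{key.strip()}={pseudo_localize(value.strip(), expansion)}")
    return staged
```

A string staged this way still shows whether the button overflows, and a check like `is_placeholder` in the build can catch any placeholder that was never replaced before it reaches linguistic QA.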

I told the engineering lead about this via e-mail:

PLEASE PLEASE PLEASE PLEASE PLEASE tell the team not to do their own translating. They may think it’s helpful, but it’s counterproductive. I’d rather just have the En strings alone.

The moral: To the list of things you work out with Engineering at the start of a localization project, add “I promise I won’t write code for you if you promise not to do any translation for me. Deal?”

John White manages localization projects for several technology clients. He fancies himself the localization manager for companies that do not want one.

photo credit: pulihora

Getting your Documentation Ready for Localization

July 10th, 2008 7 comments

Have you had to prepare your documentation for localization yet? My experience is that in almost all companies, writers have far too many other oppressive concerns gnawing at them to think about writing for localization.

A few days ago an industry colleague sent me a message asking, “Do you have experience making recommendations for how documentation can be authored for localization? I am looking to make our doc process more efficient to reduce costs.”

I replied that, given his stature and tenure in the industry, there was not likely anything I could suggest that he hadn’t already considered. Nevertheless, I sent him a list of ideas, in increasing order of difficulty:

  1. Make sure all the writers’ computers are plugged in. (A bit of ironic humor I could not resist.)
  2. Is it easy to get from the authoring tool(s) into TM, and back out into publishable format? This is my current headache with an API reference manual we localize for one client, because moving from source language to the translator tools and back to target format is a colossal chore. If you have similar problems, devote some cycles to the format layer, even if it means writing an interface between your content management system and the translation tool.
  3. There are “authoring memory” tools that can suggest and re-use already-translated source text, so that writers don’t say nearly the same thing multiple times and incur unnecessary TM penalties. Sajan has one, and SDLX contains one as well. I’ve never used either one, but I can imagine that success with the tools would require somebody with the documentation-familiarity of a technical writer and the global consciousness of a localization manager. Like you.
  4. I’ve presented on localization to a variety of audiences, and have consistently found tech writers to be the most interested in it, vastly more so than developers. When you show writers how the TM tools work, tell them how they can save money and re-use content, and let them know that you care about the impact of their work on international products, they will smell the coffee and engage. This takes a bit of evangelism, but it’s worth it if the writers change their own practices.
  5. Convert everything to XML. Although Renato and Don of Common Sense Advisory joke that that will fix any L10n problem, it’s nonetheless a good, long-term direction in which to move. It’s easier to re-use text, and easier to mark text that should/should not be translated. That will save you money.
  6. Start a program of controlled language authoring (dumbing down the sentences, always writing in a structure that machine translation will recognize, etc.). I guess that GM and Caterpillar are poster children for this kind of thing, but it puts the writers (and you, in the bargain) through the change of life, which is why I mention it last.
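
To make the “authoring memory” idea in item 3 a little more concrete, here is a rough sketch of the kind of lookup such a tool might perform: compare the sentence a writer is typing against source sentences that have already been translated, and suggest the closest one. I have no idea how the commercial tools do it internally. This version simply uses Python’s standard difflib, and the function name and the 0.8 threshold are my own assumptions.

```python
from difflib import SequenceMatcher


def suggest_reuse(new_sentence, translated_sources, threshold=0.8):
    """Return (similarity, sentence) for the closest already-translated
    source sentence at or above the threshold, or None if nothing is close."""
    best = None
    for candidate in translated_sources:
        # ratio() is 1.0 for identical strings and falls toward 0.0 as they diverge.
        score = SequenceMatcher(None, new_sentence.lower(), candidate.lower()).ratio()
        if score >= threshold and (best is None or score > best[0]):
            best = (score, candidate)
    return best


memory = [
    "Click Save to store your changes.",
    "Restart the server after installation.",
]
match = suggest_reuse("Click Save to keep your changes.", memory)
if match:
    print(f"Already translated ({match[0]:.0%} similar): {match[1]}")
```

If the writer accepts the suggestion, the sentence comes back from TM as an exact match instead of a fuzzy one, which is where the savings come from.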

What about you? Have you faced this in your organization? How have you made document localization easier for the company, without driving your writers crazy?

If you liked this post, have a look at Getting Writers to Care about Localized Documents.

Machine translation in action

July 20th, 2007 Comments off

Has your boss asked you to use Google or AltaVista or some other flavor of machine translation to lower your translation costs?

Here’s somebody who has put his money where your boss’ mouth is.

Controlled language website attracts visitors from 110 countries

www.muegge.cc, a website dedicated to demonstrating the value of controlled language authoring and machine translation (MT), has attracted visitors from more than 110 countries since its launch in the summer of 2006. One of the unique features of this website is the fact that it uses Google language tools to automatically translate the site’s content into 15 language pairs such as German to English or English to Simplified Chinese. The website was created from the ground up for MT, and all text was written in compliance with the CLOUT rule set, a controlled language designed specifically for MT.

muegge.cc, E-mail: info@muegge.cc, Web: http://www.muegge.cc

How do they do it? By controlling the text that goes into the translation machine. The simpler, more predictable and better structured the text, the more likely the machine is to generate a satisfactory translation. In other words, machine translation would probably work better on a page of Hemingway than on a page of Shakespeare or Faulkner.
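
What does “controlling the text” look like in practice? One common kind of rule caps sentence length. The sketch below flags sentences that run over a word limit. It is only an illustration: the 20-word limit and the function names are my own assumptions, not anything taken from the CLOUT rule set, and the sentence splitter is deliberately naive.

```python
import re

MAX_WORDS = 20  # hypothetical limit, not taken from any published rule set


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def long_sentences(text, max_words=MAX_WORDS):
    """Return (word_count, sentence) for each sentence over the limit."""
    flagged = []
    for sentence in split_sentences(text):
        count = len(sentence.split())
        if count > max_words:
            flagged.append((count, sentence))
    return flagged


draft = "Press Start. Wait for the green light before you open the cover."
for count, sentence in long_sentences(draft, max_words=8):
    print(f"{count} words: {sentence}")
```

A real controlled-language checker would enforce many more rules (vocabulary, voice, punctuation), but even a crude length check shows writers where the machine is most likely to stumble.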

Don’t forget, though: What you save in translation, you’ll spend in whipping your writers into line. It may not look like real dollars, but it’s time.

And time, as they say, is money.