Archive for the ‘localization project’ Category

Localization in the Flat World

February 13th, 2007

You need to understand the localization process – even if you’re in denial about it – because the world is flat (apologies to Thomas L. Friedman) and the sooner you see how the process goes, the sooner you join the ever-flattening world.

Do you see your company in the following scenario?

We’re bringing a prospective new client into the flat world. Up to now, they’ve dealt with translations, in which somebody overseas says, “We need to be able to read this document in our own language.” Recently, though, the folks overseas are saying, “We need to be able to use your product in a way that makes sense to us.” The unspoken rest of the sentence, of course, goes, “…or we’ll find a different one and use it.”

They are on the road to localization. What’s next?

  1. Demystify the process. What’s really involved in creating a localized product? How much will it affect this organization?
  2. Identify and talk to all stakeholders. Inform them of what’s coming and what will be required of them.
  3. Figure out exactly what needs to be localized: software strings, documentation, help, Web pages, installer strings, sample files, etc.
  4. Create a project plan. Much of it will be wrong the first time out, but as long as you know it’s a living document, it will serve its purpose.
  5. Appoint or find a project manager. The localization project needs a champion (some might say a lightning rod), because it won’t all happen on its own.

Localization Train slowing

January 30th, 2007

We’re seeing the localization juggernaut lose some steam.

In the early years, this client localized its flagship software package for developers in China, Japan and Korea (CJK), then added Brazil. It took small reference applications into as many as 10 languages (including Hebrew and Thai) as those markets showed promise. The budget was pretty fat, the localized products were freshened frequently, and the developers were happy to have software and doc in their own language.

I suppose it was to be expected that this would peter out with time, because markets change, business cases wax and wane, and some regions never return the investment.

The new stressor on localization was less easy to anticipate: bulk. Each generation of improvements to the product brings several hundred more pages of documentation. All of this new documentation is, of course, “free” in English, but somebody has to pull out a checkbook to deal with it in other languages, and that checkbook comes out more slowly and with more misgivings these days.

Engineering and Product Management furrow their brows nowadays when I walk in with cost estimates. I’ve adapted to this change in attitude with a few techniques:

  1. The Technical Reference is the fattest target and the source of most of the expansion. It lives in a compiled help file (CHM) that is no longer written by Tech Pubs, but generated by Perl scripts from header files written by the engineers. Our modus localizandi has been to hand off the finished help project, now comprising 3700 HTML files, and have the HTML translated. In an effort to lower cost, I’m attempting a proof-of-concept to localize the header files themselves, then tune the scripts to convert them into localized HTML. This should lower our localization engineering costs considerably.
  2. I agitate for interim localization updates, peeling off documentation deltas every few weeks and handing them off for translation, even if there are no plans to release them yet. This reduces the sticker shock and time-to-market delay that come of getting an estimate on a release only when necessary, which may be a 10- to 18-month interval. Product Management and Engineering, who only think about localization when it’s absolutely unavoidable, find the tsunami of untranslated text depressing.
  3. Although it’s not a very clean way of doing things, I screen from the localization handoff those items that I know have little to be translated. Sometimes I go to the level of resource files, but more often I take documents to which only a few minor changes have been made from one En version to the next, hand off changed text, then place the translations myself. This is not for the faint of heart, nor for those who don’t really know the languages involved, but it can save some money.
  4. I try to keep global plates spinning, in the hope that more people will consider the global dimension of what we do, and the fact that localization is the necessary step for making your product acceptable to people whose use of your product will make you money, if you make it easy for them.
  5. I never impart bad news on Friday.

Localized Binaries – The Plot Thickens

December 7th, 2006

The engineer has demonstrated that it is no longer possible to build just the resource binaries; it is now necessary to build the entire blinking product.

“Why is that?” I ask.

“We’ve improved the makefile,” he replies. The makefile is a script used by the make command to build binaries.

“That doesn’t feel like an improvement to me,” I venture. “Why can’t I just build the two or three resource binaries I need? I don’t need all of the executables and other rot.”

“Yes, well, we’ve improved the makefile.”

“But there was a small, localized makefile that lived in each of the directories of the resource binaries I wanted. What happened to them?”

“We improved the main makefile by rolling all of those lower-level makefiles into it.”

That’s a hint to me that they improved it for the purpose of creating all of the files that go into the installer, but that’s far and away more files than I want. It also means that it’s probably going to take me a half-hour now to build binaries that used to take about six seconds each.

Had they been following good I18n hygiene, they’d have asked themselves (or me, even) whether there were costs associated with consolidating all of the lower-level makefiles and eliminating the possibility of rebuilding except in this huge batch. The costs don’t really affect them that much, though they’ll slow me down somewhat.

It’s an “improvement.”

Translators – The Tech Writer’s Best Friend/Worst Enemy

October 2nd, 2006

“Say, Jean, the translators found some problems with the original English documentation your team wrote. Do you want me to send you the corrections?”

“Sure, John. Do you want me to send you a nice cartridge of mustard gas?”

I never find that writers embrace the feedback from translators. It’s not that the translators go out of their way to find errors; mostly it’s that they don’t understand the term/phrase/sentence/paragraph and cannot therefore translate it. This is the fundamental test of documentation usability – are you conveying your idea to the reader? – and most writers get grumpy when they don’t pass it with flying colors.

It’s also not the case that translators get snooty about the errors they find. In fact it’s I, not the translators, who add a thin veneer of snootiness to the comments I send to the writers.

  • Why are all of these Copy(1) and Copy(2) files in the RoboHelp project? Do we need them? Should we translate them?
  • The FrameMaker files you gave us don’t match the PDF you gave us. Which one is correct? (To their credit, most translators won’t take for granted that the one with the later date stamp is the definitive source.)
  • This training manual is for version 5.3. The last version of the software that we translated was 5.1. What has changed in the software? Do we need to translate the delta first, in order to translate the manual properly?
  • We found 136 pages in the online help file with no content in them. Are they meant to be that way, or did something go wrong in the extraction process?

Writers don’t often like to hear such feedback, but, if it’s implemented, nobody can deny that it makes the books better.

Segmentation and Translation Memory

September 20th, 2006

To get the broken sentences in the new files to find their equivalents (or even just fuzzy matches) in translation memory, we have three options:

  1. Modify the Perl scripts that extract the text from the header files into the HTML, so that the scripts no longer introduce the hard returns.
  2. Massage the HTML files themselves and replace the hard returns with spaces.
  3. Tune the segmentation rules in Trados so that it ignores the hard returns (but only the ones we want it to ignore) and doesn’t consider the segment finished until it gets to a full stop/period.

To go as far upstream as possible, I suppose we should opt for #1 and fix the problem at its source. This seems optimal, unless we subsequently break more things than we repair. Options #2 and #3 are neat hacks and good opportunities to exercise fun tools, but they burn up time and still don’t fix the problem upstream.

Also, I don’t want the tail to wag the dog. The money spent in translating false positives may be less than the time and money spent in fixing the problem.
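For what it's worth, option #2 might look something like the sketch below. It is written in Python rather than in the Perl of the real extraction scripts, it works on plain text rather than on full HTML, and the function name join_wrapped_lines is made up for illustration. The heuristic is an assumption too: a line that does not end in sentence-final punctuation is treated as a fragment and joined to the next one.

```python
# Hypothetical helper, not part of the real toolchain.
SENTENCE_END = (".", "!", "?", ":")


def join_wrapped_lines(text: str) -> str:
    """Rejoin hard-wrapped fragments into whole paragraphs.

    Blank lines are discarded, because in these files they follow every
    line and so carry no information about paragraph boundaries.
    """
    paragraphs = []
    current = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip the extra hard return
        current.append(line)
        # Assumption: sentence-final punctuation at end of line closes a paragraph.
        if line.endswith(SENTENCE_END):
            paragraphs.append(" ".join(current))
            current = []
    if current:  # trailing fragment with no final punctuation
        paragraphs.append(" ".join(current))
    return "\n\n".join(paragraphs)


if __name__ == "__main__":
    broken = (
        "always gets called so as to give a chance to the client to supply\n\n"
        "the correct client certificate based on the DN list.\n"
    )
    print(join_wrapped_lines(broken))
```

A heuristic like this will occasionally glue together lines that were meant to stay apart (headings without punctuation, for instance), which is one more argument for fixing the scripts upstream.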

Moving the Localization Carpet under the Source Text

September 15th, 2006

Here’s the mess I face.

The HTML files are filled with paragraphs formatted like this:

Currently, this function gets called for trust overrides and

client authentication. On client authentication, the supplied

interface contains the server’s Certificate Authorities Distinguished

Names list (see references) and the negotiation handler

always gets called so as to give a chance to the client to supply

the correct client certificate based on the DN list.

At the end of each line are two hard returns. It wasn’t always this way, so each complete sentence is sitting happily in translation memory. Unfortunately, Trados pulls in each of these six 80-character fragments and calls it low- or no-match because it can’t find enough of a concordance. This is a classic case of false positives driving up translation costs.

I’m still exploring options. Meanwhile, there’s no sense in starting the translation work.
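To see why the fragments score so badly, here is a rough sketch using Python's difflib as a stand-in for a fuzzy matcher. Trados uses its own matching algorithm, so the numbers are only illustrative, and the names TM_ENTRY, FRAGMENTS, match_score and rejoin are invented for this example. The fragments are trimmed to a single sentence of the sample above to keep things short.

```python
from difflib import SequenceMatcher

# One whole sentence, as it would sit in translation memory (illustrative).
TM_ENTRY = (
    "On client authentication, the supplied interface contains the "
    "server's Certificate Authorities Distinguished Names list (see references) "
    "and the negotiation handler always gets called so as to give a chance to "
    "the client to supply the correct client certificate based on the DN list."
)

# The same sentence as the new HTML files deliver it: one fragment per line.
FRAGMENTS = [
    "On client authentication, the supplied",
    "interface contains the server's Certificate Authorities Distinguished",
    "Names list (see references) and the negotiation handler",
    "always gets called so as to give a chance to the client to supply",
    "the correct client certificate based on the DN list.",
]


def match_score(segment: str, tm_source: str) -> float:
    """Similarity between 0.0 and 1.0; a stand-in for a TM match percentage."""
    return SequenceMatcher(None, segment, tm_source).ratio()


def rejoin(fragments: list[str]) -> str:
    """Put the fragments back together with single spaces."""
    return " ".join(fragment.strip() for fragment in fragments)


if __name__ == "__main__":
    for fragment in FRAGMENTS:
        print(f"{match_score(fragment, TM_ENTRY):.2f}  {fragment}")
    print(f"{match_score(rejoin(FRAGMENTS), TM_ENTRY):.2f}  (rejoined sentence)")
```

Each fragment is so much shorter than the stored sentence that its score cannot climb far, while the rejoined sentence is an exact match. That gap is the money at stake.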

Who’s in trouble: the Localization Vendor or me?

September 10th, 2006

The localization estimate has come back on the HTML files in the API Reference, and it’s as ghastly high as I’d feared.

The vendor’s project manager does something clever with these thousands of files: she uses SDLX Glue to glue them together into six or seven batches of several hundred files each. That way she avoids carpet-bombing the translator with jillions of files; this also keeps the translator in the translation business and out of the file management business. After translation, the project manager un-glues them using SDLX Glue and hands them off internally for engineering, QA, etc.

The downside to this technique is that the TM analysis comprises only the six or seven files. I can’t see down to the level of granularity I want unless I ask for a special analysis down to the file. They don’t mind doing it for me, but it’s not in their regular workflow and I have to wait for it.

Anyway, the count of unmatched words is preposterously high, and I’m pretty sure it’s due to changes in the scripts that extract the HTML from the header files. Sentences and segments in version 4.0 don’t match those in the last version because of things like double line-feeds and mucked-up HTML tags.

I need to have a deeper look at the original English HTML files and bucket them for handoff. BeyondCompare shows me that the text in some files hasn’t changed at all, and I’ll need to spoon-feed these to the vendor.

Either that or get shot down when I take this estimate up to the third floor for approval…
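The bucketing could be roughed out in a script along the following lines. This is only a sketch under assumptions: the old and new HTML files sit in two flat directories under the same file names, a crude regular expression is good enough to strip tags, and the names visible_text and bucket_files are invented here. It is not how BeyondCompare works; it just compares the text a translator would see, ignoring tags and whitespace.

```python
import re
from pathlib import Path


def visible_text(html: str) -> str:
    """Strip tags crudely and collapse whitespace, so that line-feed and
    markup changes alone do not count as a text change."""
    without_tags = re.sub(r"<[^>]+>", " ", html)
    return " ".join(without_tags.split())


def bucket_files(old_dir, new_dir) -> dict:
    """Sort the new release's HTML files into unchanged / changed / new."""
    old_dir, new_dir = Path(old_dir), Path(new_dir)
    buckets = {"unchanged": [], "changed": [], "new": []}
    for new_file in sorted(new_dir.glob("*.html")):
        old_file = old_dir / new_file.name
        if not old_file.exists():
            buckets["new"].append(new_file.name)
            continue
        old_text = visible_text(old_file.read_text(encoding="utf-8"))
        new_text = visible_text(new_file.read_text(encoding="utf-8"))
        key = "unchanged" if old_text == new_text else "changed"
        buckets[key].append(new_file.name)
    return buckets
```

Files in the "unchanged" bucket are the ones to spoon-feed to the vendor with a note that the existing translation should carry over as-is.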

The Lonely Localization Manager

September 5th, 2006

It’s a bit strange, the way in which I get my work done.

Naturally, I play the role of localization manager and primary contact at developer meetings, and I manage budget and schedule for several projects at once. I’ve become the lightning rod for issues ranging from character sets to encodings to what-do-these-Chinese-characters-say. All in all, most localization issues are pretty well in hand because I’ve been able to manage them in a way that conforms to best practices, with a bit of experimentation thrown in to see how much better we can make things.

I suspect, though, that most localization managers who might read this have teams and staff and reporting structures. I would bet they also spend more time scrapping internally with development managers, QA and upper management, struggling to make localization conspicuous and wildly successful. I just make it work.

As localization manager for a software company in the early 1990s, I went through that. I get a lot more done this way, and I enjoy it more.

There was that exception last year. I had a client that drove me bonkers because their entire corporate culture was geared to nothing more than tolerating localization, and that with a clothespin on their nose. Wish I’d been blogging during that engagement; it was a wild ride.