04 December 2008

Localizing DITA Projects

Have you seen DITA projects land in your inbox yet? The full promise of XML is about to become your next headache.

If you don't know what DITA is, here's the thumbnail from the Open Toolkit's User Guide:
"DITA (Darwin Information Typing Architecture) is an XML-based, end-to-end architecture for authoring, producing, and delivering information (often called content) as discrete, typed topics."
In short, the source content you hand off for localization lives in XML files. If you get to the party soon enough, you can help your own cause by asking the authors to use specific XML tags in their authoring to make it easy for you to find text you need to translate and to ignore text you don't need to translate. The authors will surely fall all over themselves to make you happy with this new technology, so take advantage of it while it's still novel.
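
To make the tagging idea concrete, here is a minimal sketch of how a localization engineer might pull only the translatable text out of a topic. It assumes the authors mark untranslatable spans with the DITA translate="no" attribute; the sample topic and the function name translatable_segments are invented for illustration, and a real translation tool would apply its own filter rules.

```python
import xml.etree.ElementTree as ET

# Invented sample topic; the command inside <codeph> must not be translated.
SAMPLE_TOPIC = """<topic id="backup" xml:lang="en-US">
  <title>Backing up your data</title>
  <body>
    <p>Run <codeph translate="no">backup --all</codeph> before you upgrade.</p>
    <p>The archive is saved in your home folder.</p>
  </body>
</topic>"""


def translatable_segments(element, inherited=True):
    """Yield text for the translator, skipping translate="no" subtrees."""
    flag = element.get("translate")
    # An element without the attribute inherits the setting of its parent.
    translate = inherited if flag is None else (flag != "no")
    if translate and element.text and element.text.strip():
        yield element.text.strip()
    for child in element:
        yield from translatable_segments(child, translate)
        # Tail text follows the child's closing tag and belongs to this element.
        if translate and child.tail and child.tail.strip():
            yield child.tail.strip()


root = ET.fromstring(SAMPLE_TOPIC)
segments = list(translatable_segments(root))
for segment in segments:
    print(segment)
```

The sentence around the command comes out in two pieces here, which is one reason real tools keep inline tags as placeholders inside a single segment rather than splitting on them.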

The problem with XML is that it's ugly and nobody can use it as documentation in that format, so it needs to be transformed into HTML, PDF, CHM, XHTML, or some other gestalt that people will use. The DITA Open Toolkit is an open-source means for performing this transformation, using scripts and languages to shape the content.

Your problem as a localization professional is not in the XML; it's in the transformation.

How do you know that the scripts your writers use for the source language (let's say, English) will work when you have to run them on XML files translated into Korean or Hebrew or Russian? (Well, they will run; the question is whether the result is good or garbage.)

With a kit like the Open Toolkit, things run as advertised when used right out of the box. The open-source project even devotes a chapter of its user guide to "Localizing (translating) your DITA content," and they are kind enough to provide pre-translated text like "Parent Topic," "Previous," "Next," which you can hook with the xml:lang attribute. The tricky part lies in the customization.

One Tech Pubs team engaged a team of script programmers to customize the toolkit. They've introduced strings like "Copyright Statement" and "Enter keyword" and placed a "Last updated" datestamp on every page in the help project. They've also implemented a search function (gulp!) so users can locate content in the help files. There's nothing wrong with this customization work, except that nobody was thinking of other languages while doing it. Now we're sorting out the location of the custom strings, the way to get the toolkit to format dates according to locale, and how to convince the search function that characters can take up more than one byte.
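
The three problems in that list can be sketched in a few lines. This is not the toolkit's mechanism, only an illustration of what "thinking of other languages" means: custom strings live in a table keyed by language instead of in the script, the date pattern depends on the language, and text length is counted in characters rather than bytes. The table names, the function names, the sample translations and the date patterns are all assumptions made for this example and have not been vetted by a translator.

```python
from datetime import date

# Hypothetical string table for one customized label; translations are
# illustrative placeholders only.
CUSTOM_STRINGS = {
    "en": {"last_updated": "Last updated"},
    "ru": {"last_updated": "Последнее обновление"},
    "ko": {"last_updated": "최종 업데이트"},
}

# Hypothetical numeric date patterns per language.
DATE_PATTERNS = {
    "en": "{m:02d}/{d:02d}/{y}",
    "ru": "{d:02d}.{m:02d}.{y}",
    "ko": "{y}. {m}. {d}.",
}


def base_language(lang_tag):
    """Reduce a tag such as 'ru-RU' to its language part, 'ru'."""
    return lang_tag.split("-")[0].lower()


def label(key, lang_tag):
    """Look up a custom string, falling back to English if it is missing."""
    table = CUSTOM_STRINGS.get(base_language(lang_tag), CUSTOM_STRINGS["en"])
    return table.get(key, CUSTOM_STRINGS["en"][key])


def datestamp(day, lang_tag):
    """Build the 'Last updated' line using the language's own date pattern."""
    pattern = DATE_PATTERNS.get(base_language(lang_tag), DATE_PATTERNS["en"])
    formatted = pattern.format(y=day.year, m=day.month, d=day.day)
    return f"{label('last_updated', lang_tag)}: {formatted}"


print(datestamp(date(2008, 12, 4), "en-US"))
print(datestamp(date(2008, 12, 4), "ru-RU"))

# A Korean search term: two characters, but six bytes once encoded as UTF-8.
term = "검색"
print(len(term), len(term.encode("utf-8")))
```

A search function that slices or compares text by byte offsets will cut a term like that one in the middle of a character, which is the kind of failure that never shows up in English testing.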

You will face the same problems. You'll need to internationalize your writers' customizations so that things work properly in your target language.

So when your writers tell you how much easier your life will be now that content is in XML, don't forget to look a bit further down the road at what they're using to transform that XML into something useful. That's where you'll put in the hours.


21 August 2008

Localizing Code Snippets - Part II

Last week I posted on the dilemma of how to localize Code Snippets, the selected pieces of your documentation that you shoehorn into XML files so that Visual Studio can present them in tool-tip-like fashion to the user while s/he is writing code that depends on your documentation.

My goal was to ensure that the process of grabbing these bits of documentation (mostly one-sentence descriptions and usage tips) was internationalized, so that we could run it on translated documentation and save money. This has proved more difficult than anticipated.

Here is the lesson: If you think it's hard to get internal support for internationalizing your company's revenue-generating products, just try to get support for internationalizing the myriad hacks, scripts, macros and shortcuts your developers use to create those products.

In this client's case, it makes more sense to translate the documentation, then re-use that translation memory on all of the Code Snippet files derived from the documentation. It will cost more money (mostly for translation engineering and QA, rather than for new translation) in the short run, but less headache and delay in the long run. Not to mention fewer battles I need to fight.

Discretion is the better part of localization valor.


14 August 2008

Localizing Code Snippets

"Why would I localize code snippets?" you ask. (Go ahead; ask.)

Everybody knows you don't translate snippets of code. Even if you found a translator brave enough to take on something like int IBACKLIGHT_GetBacklightInfo(IBacklight *p, AEEBacklightInfo * pBacklightInfo), the compiler would just laugh and spit out error messages.

However, if you're a developer (say, of Windows applications) working in an integrated development environment (say, Microsoft Visual Studio), you may want to refer very quickly to the correct syntax and description of a feature without searching for it in the reference manual. The Code Snippet enhancement to Visual Studio makes this possible with a small popup box that contains thumbnail documentation on the particular interface the developer wants to use. It's similar in concept and appearance to the "What's This?" contextual help offered by right-clicking on options in many Windows applications.

How does the thumbnail documentation get in there? It's a tortuous path, but the enhancement pulls text from XML-formatted .snippet files. You can fill the .snippet files with the information yourself, or you can populate them from your main documentation source using Perl scripts and XSL transformation. So while you're not really translating code snippets, you're translating Code Snippets.

And therein lies the problem.


One of our clients is implementing Code Snippets, but the Perl scripts and XSL transformations they use to extract the documentation don't support Unicode. I found this out by pseudo-translating some of the source documentation and running the scripts on it. Much of the text didn't survive to the .snippet files, so we're on a quest to find the offending portions of the scripts and suggest internationalization changes.

We've determined that the translated documentation in the Code Snippets will display properly in Visual Studio; the perilous part of the journey is the process of extracting the desired subset of documentation and pouring it into the .snippet files. Don't expect that your developers will automatically enable the code for this; you'll probably have to politely persist to have it done right.
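
For anyone who hasn't tried pseudo-translation, the idea fits in a few lines. The sketch below is in Python rather than the client's Perl, and it is not their script: pseudo_translate, extract and the character mapping are invented for illustration. The extract function merely simulates an extraction step that forces text through a given encoding, so that the difference between a Unicode-safe step and an unsafe one is visible.

```python
# Accented stand-ins for ASCII letters; an arbitrary mapping for this sketch.
ACCENTED = str.maketrans("aeiouAEIOUcnyCNY", "àéîõüÀÉÎÕÜçñýÇÑÝ")


def pseudo_translate(text):
    """Swap in accented letters and wrap the text in non-Latin markers.

    The result stays readable to an English speaker, but any step that
    mangles non-ASCII characters or truncates the string becomes obvious.
    """
    return "[あ" + text.translate(ACCENTED) + "я]"


def extract(text, encoding):
    """Simulate an extraction step that pushes text through one encoding."""
    return text.encode(encoding, errors="replace").decode(encoding)


source = "Returns the backlight info."
pseudo = pseudo_translate(source)

for encoding in ("utf-8", "ascii"):
    result = extract(pseudo, encoding)
    status = "intact" if result == pseudo else "damaged"
    print(encoding, status, result)
```

Running real pseudo-translated files through the real scripts and comparing input to output in the same way tells you which step loses the text, before you have paid for a single translated word.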

Alternatives:
  • Wait until all of your documentation has been translated, then translate the .snippet files. It's more time-consuming and it will cost you more, but working this far downstream may be easier than getting your developers to clean up their scripts.
  • Make your Japanese developers tolerate English documentation in the Code Snippets.
Neither one is really the Jedi way. Work with your developers on this.


17 July 2008

Getting Documentation Ready for Localization - The Audience Speaks

Don't you love it when the audience is listening? Even more when they write back?

Last week's post included a handful of considerations about preparing documentation for localization. An alert reader and industry veteran (who prefers obscurity to the onslaught of Web-fame that this post will undoubtedly unleash) sent me a table of resources she has compiled on the topic over several years' time:



| Title | Publisher | Summary and Notes |
|---|---|---|
| 25 Tactics to “Internationalize” your English | Intercom (STC magazine) | Hints on writing localization-friendly copy. (One example: choose words with one or few meanings.) |
| Authoring and controlled language | TAUS (Translation Automation User Society) | A guide to how and why companies are starting to manage their writing and editing “upstream.” |
| Basic Tips for Loc Writing | Globalvision International | A brief overview from a translator’s perspective on how to simplify the work of the translator. |
| Color Connotations | Lionbridge | Guidelines on how different colors are perceived throughout the world. |
| Localizing Art | Globalvision International | Tips on how to improve graphics localization. |
| Reducing Localization Costs | Globalvision International | Tips on how to write text that is less expensive to localize, for both new copy and updates. |
| Tech Writing for Localization | Client Side News Magazine – Tech Writer supplement | How culture and jargon affect writing and localization. Tips on the purpose and benefits of standards and templates. |
| Terminology Management White Paper | Jonckers | Why consistent terminology is important to localization. |
| Writing for Translation | Multilingual Magazine | Tips on how to simplify text. Information on how DITA impacts localization. |

The contributor comments, "Please note that some of this is proprietary to the publisher and not generally available."

Find what you can and help yourself!


10 July 2008

Getting your Documentation Ready for Localization

Have you had to prepare your documentation for localization yet? My experience is that in almost all companies, writers have far too many other oppressive concerns gnawing at them to think about writing for localization.

A few days ago an industry colleague sent me a message asking, "Do you have experience making recommendations for how documentation can be authored for localization? I am looking to make our doc process more efficient to reduce costs."

I replied that, given his stature and tenure in the industry, there was not likely anything I could suggest that he hadn't already considered. Nevertheless, I sent him a list of ideas, in increasing order of difficulty:
  1. Make sure all the writers' computers are plugged in. (A bit of ironic humor I could not resist.)
  2. Is it easy to get from the authoring tool(s) into TM, and back out into publishable format? This is my current headache with an API reference manual we localize for one client: moving from the source format to the translation tools and back to the target format is a colossal chore. If you have similar problems, devote some cycles at the format-layer, even if it means writing an interface between your content management system and the translation tool.
  3. There are "authoring memory" tools that can suggest and re-use already-translated source text, so that writers don't say nearly the same thing multiple times and incur unnecessary TM penalties. Sajan has one, and SDLX contains one as well. I've never used either one, but I can imagine that success with the tools would require somebody with the documentation-familiarity of a technical writer and the global consciousness of a localization manager. Like you.
  4. I've presented on localization to a variety of audiences, and have consistently found tech writers to be the most interested in it, vastly more so than developers. When you show writers how the TM tools work, tell them how they can save money and re-use content, and let them know that you care about the impact of their work on international products, they will smell the coffee and engage. This takes a bit of evangelism, but it's worth it if the writers change their own practices.
  5. Convert everything to XML. Although Renato and Don of Common Sense Advisory joke that that will fix any L10n problem, it's nonetheless a good, long-term direction in which to move. It's easier to re-use text, and easier to mark text that should/should not be translated. That will save you money.
  6. Start a program of controlled language authoring (dumbing down the sentences, always writing in a structure that machine translation will recognize, etc.). I guess that GM and Caterpillar are poster children for this kind of thing, but it puts the writers (and you, in the bargain) through the change of life, which is why I mention it last.
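
The "authoring memory" idea in item 3 can be approximated in a small way. The sketch below is not how Sajan's or SDL's tools work (I haven't used them); it only assumes that a plain similarity ratio from Python's standard difflib module is enough to flag sentences that say nearly the same thing. The function name, the sample sentences and the 0.85 threshold are all made up for the example.

```python
from difflib import SequenceMatcher
from itertools import combinations


def near_duplicates(sentences, threshold=0.85):
    """Return pairs of sentences that are similar but not identical.

    Identical sentences are left out because they already get full
    matches from translation memory; the near misses are the ones
    that cost money as fuzzy matches.
    """
    pairs = []
    for first, second in combinations(sentences, 2):
        ratio = SequenceMatcher(None, first.lower(), second.lower()).ratio()
        if threshold <= ratio < 1.0:
            pairs.append((first, second, round(ratio, 2)))
    return pairs


SENTENCES = [
    "Click Save to store your changes.",
    "Click Save to store the changes.",
    "Restart the device after the update completes.",
]

flagged = near_duplicates(SENTENCES)
for first, second, ratio in flagged:
    print(f"{ratio}: {first!r} vs {second!r}")
```

Comparing every pair gets slow on a large manual, so treat this as a way to show writers the problem on one chapter, not as a production tool.
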
What about you? Have you faced this in your organization? How have you made document localization easier for the company, without driving your writers crazy?

If you liked this post, have a look at Getting Writers to Care about Localized Documents.


19 October 2007

Whaddya know? They asked me first this time!

Do you spend a lot of your time running to catch up to the train? Have you ever been surprised in the middle of a meeting by project plans that were well underway with no thought given yet to localization? Are you getting used to it?

What if they asked you first (or at least early on) about the project's implications for internationalization and localization? Would you know how to react?

This certainly caught me by surprise a few months ago. A client called me in for consultation. He didn't want me to manage the upcoming localization of his user manuals; he wanted me to review and edit the English versions so that they would be ready to localize.

This client, though small, is enlightened. The company is selling English, French, German, Spanish and Japanese versions of several products, and it has a hand-in-glove relationship with its localization company. It knows where its global bread is buttered.

I jumped at the chance to work with people thinking this far in advance, so I reviewed the manuals and submitted changes, almost all of which were acceptable.

How can you review/edit documentation with an eye to translating it?
  1. Take advantage of redundancy. Ensuring that identical sentences and paragraphs remain identical is a good way to lower per-word translation costs. Turn the text into a bookmark at its first occurrence, then invoke or cross-reference that bookmark at subsequent occurrences.
  2. Ensure that the product matches the documentation. Not all organizations get around to this, believe it or not, and it becomes a bit of value added by the internationalization/localization function.
  3. Standardize terms. Especially in companies without a well developed team of writers, manuals end up with pairs or trios of synonyms that will vex translators and add no information, so take the liberty of eliminating one in favor of the other:
    • determine/specify
    • based on/according to
    • click the button/click on the button/select the button
    • lets you/enables you to/allows you to
  4. Mention errors and inconsistencies that have nothing to do with internationalization. Again, you increase the perceived value of the localization function. Even though the result doesn't affect the localized products, the Localization Department (you) is contributing to a better core product.
  5. Axe a few "dead" words. They add little to the explanation, will probably not survive translation, and inflate wordcount:
    • unique
    • basically
    • popular
    • congratulations
    • very much
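
The term standardization in item 3 lends itself to a quick automated pass before the manual review. Here is a minimal sketch, assuming you have already agreed on one preferred form per group; which form counts as "preferred" below is arbitrary, and TERM_VARIANTS and find_variants are names invented for this example.

```python
import re

# Preferred term -> variants to flag. The groups come from the list above;
# the choice of preferred form is arbitrary for this sketch.
TERM_VARIANTS = {
    "click the button": ["click on the button", "select the button"],
    "lets you": ["enables you to", "allows you to"],
    "based on": ["according to"],
}


def find_variants(text):
    """Return (position, matched text, preferred term) for each variant found."""
    findings = []
    for preferred, variants in TERM_VARIANTS.items():
        for variant in variants:
            pattern = r"\b" + re.escape(variant) + r"\b"
            for match in re.finditer(pattern, text, flags=re.IGNORECASE):
                findings.append((match.start(), match.group(0), preferred))
    return sorted(findings)


sample = (
    "The wizard allows you to select a profile. "
    "Click on the button to continue."
)
findings = find_variants(sample)
for position, found, preferred in findings:
    print(f"{position}: '{found}' -> consider '{preferred}'")
```

The script only flags candidates. A writer still has to make each change by hand, because swapping "allows you to" for "lets you" also changes the grammar of the words that follow.
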
By the way, the review took longer than I'd anticipated, so if you have a similar opportunity, don't bid a flat fee the first time.

Interested in this topic? Have a look at Improved Docs through Localization.


16 January 2007

Improved Docs through Localization

I spent some time on the phone with new clients last week, going through a user guide they plan to have localized. Discussing the usual localization questions (i.e., the ones I figured the translators would ask sooner or later), we began to edge towards the initially depressing realm of Changing Documentation to Suit Localization.

"Don't misunderstand," I repeatedly intoned, "I'm not trying to get you to re-write an already published book just to make localization easier. We're just bringing up small issues in how you can write future books a bit more generically so that you can take exactly what you've published in English and hand it off for localization without customizing it first."

Still, I thought I detected a collective, resigned sigh from them. I've learned by now that it translates to "Writing for translation is really going to be a pain, isn't it?"

They then asked for suggestions about optimizing future documents for localization purposes, in the form of guidelines or style guides. This is good thinking, and I told them so. It amounts to documentation internationalization.

I've read plenty of articles on how to do this (authors include Kit Brown of Comgenesis and Nancy Combe), but I usually find them superficial (leave white space, use numbered callouts, be sure to do the software first...), because the solution doesn't lie in documents, but rather in each organization and in the way that Engineering, Product Management, Tech Pubs and the overseas partners work together.

I told them that they could read up on this for a month, or we could all just go through the process of localizing an already written manual and make our own guidelines. The former won't do any harm, but I think they'll find that the latter will result in more - and more-specific - pointers that will apply to future books.

The important thing is to arrive at gradual changes that the company will tolerate in the next 3/6/12 months, so that their books become more global without the localization-tail wagging the dog.
