29 May 2008

Localizing RoboHelp Files - The Basics

We get a lot of search engine queries like "localize Robohelp file" and "translate help project." I'm pretty sure that most of them come from technical writers who have used RoboHelp to create help projects (Compiled HTML Help format), and who have suddenly received the assignment to get the projects localized.

The short answer
Find a localization company that can demonstrate to your satisfaction that it has done this before, and hand off the entire English version of your project - .hhp, .hhc, .hhk, .htm/.html and, of course, the .chm. Then go back to your regularly scheduled crisis. You should give the final version a quick smoke test before releasing it, for your own edification as well as to see whether anything is conspicuously missing or wrong.

The medium answer
Maybe you don't have the inclination or budget to have this done professionally, and you want to localize the CHM in house. Or perhaps you're the in-country partner of a company whose product needs localizing, and you've convinced yourself that it cannot be that much harder than translating a text file, so why not try it?

You're partially right: it's not impossible. In fact, it's even possible to decompile all of the HTML pages out of the binary CHM and start work from there. But your best bet is to obtain the entire help project mentioned above and then use translation memory software to simplify the process. Once you've finished translating, you'll need to compile the localized CHM using RoboHelp or another help-authoring product (even hhc.exe).
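
The two engineering steps in that paragraph, decompiling and recompiling, can be sketched as a pair of command lines. This is a minimal sketch, not a build script: it assumes hh.exe (the HTML Help viewer on Windows) is used for decompiling via its -decompile option and hhc.exe (from HTML Help Workshop) for compiling, that both are on the PATH, and the function names and file names are made up for illustration.

```python
import subprocess
from pathlib import Path


def decompile_command(chm: Path, out_dir: Path) -> list[str]:
    """Command line that unpacks the pages of a CHM into out_dir."""
    # hh.exe -decompile takes the output folder first, then the CHM.
    return ["hh.exe", "-decompile", str(out_dir), str(chm)]


def compile_command(hhp: Path) -> list[str]:
    """Command line that compiles an HTML Help project file into a CHM."""
    return ["hhc.exe", str(hhp)]


def run(cmd: list[str]) -> int:
    """Run a command and return its exit code without raising on non-zero."""
    return subprocess.run(cmd, check=False).returncode


if __name__ == "__main__":
    # Hypothetical file names; print the commands rather than run them.
    print(decompile_command(Path("help.chm"), Path("extracted")))
    print(compile_command(Path("help_ja.hhp")))
```

The run() helper deliberately does not raise on a non-zero exit code. hhc.exe is widely reported to return a non-zero code even after a successful build, so it is safer to check that the CHM was actually produced than to trust the return value.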

The long answer
This is the medium answer with a bit more detail and several warnings.
  • There may be a way to translate inside the compiled help file, but I wouldn't trust it. Fundamentally, it's necessary to translate all of the HTML pages, then recompile the CHM; thus, it requires translation talent and some light engineering talent. If you lack either one, stop and go back to The Short Answer.
  • hhc.exe is the Microsoft HTML Help compiler. It does not come with Windows; it's part of HTML Help Workshop, which is freely available from Microsoft. The Workshop is not an authoring environment like RoboHelp, but it offers the engineering muscle to create a CHM once you have created all of the HTML content. If you have to localize a CHM without recourse to the original project, you can decompile all of the HTML pages out of the CHM with the Workshop's Decompile command or with hh.exe, the HTML Help viewer that does come with Windows (hh.exe -decompile <output folder> <file.chm>).
  • RoboHelp combines an authoring environment for creating the HTML pages and the hooks to the HTML Help compiler. As such, it is the one-stop shopping solution for creating a CHM. However, it is known to introduce formatting and features that confuse the standard compiler, such that some RoboHelp projects need to be compiled in RoboHelp.
  • RoboHelp was developed by BlueSky Software, which morphed into eHelp, which was acquired by Macromedia, which Adobe bought. Along the way its owners made some decisions about Asian languages that resulted in the need to compile Asian-language projects with the Asian-language version of RoboHelp. This non-international approach was complicated by the fact that not all English versions of RoboHelp were available for Asian languages. Perhaps Adobe has dealt with this by now, but if you're still authoring in early versions, be prepared for your localization vendor to tell you that it needs to use an even earlier Asian-language version.
  • Because the hierarchical table of contents is not HTML, you may find that you need to assign to it a different encoding from that of the HTML pages for everything to show up properly in the localized CHM, especially in double-byte languages.
  • The main value in a CHM lies in the links from one page to another. In a complex project, these links can get quite long. Translators should stay away from them, and the best way to accomplish that is with translation memory software such as Déjà Vu, SDL Trados, across or Wordfast. These tools insulate tags and other untouchable elements from even novice translators.
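
To make the last point concrete, here is a rough sketch of what "insulating tags" means. It is a toy, not how any of the tools named above actually work: it splits a page into tags and text with a naive regular expression (which will trip over comments, scripts and a ">" inside an attribute value) and hands only the text to a translate callback. The function name, the sample page and the tiny glossary are invented for the example.

```python
import re

# Naive tag matcher: anything from "<" to the next ">".
TAG = re.compile(r"(<[^>]+>)")


def translate_html(html: str, translate) -> str:
    """Apply translate() to the text between tags; copy tags verbatim."""
    # With one capturing group, re.split alternates text, tag, text, ...
    parts = TAG.split(html)
    out = []
    for i, part in enumerate(parts):
        if i % 2 == 1 or not part.strip():
            out.append(part)  # tag (odd index) or whitespace: untouched
        else:
            out.append(translate(part))
    return "".join(out)


if __name__ == "__main__":
    glossary = {"See ": "Siehe ", "Setup": "Einrichtung"}
    page = '<p>See <a href="install/setup.htm">Setup</a>.</p>'
    print(translate_html(page, lambda s: glossary.get(s, s)))
```

The link target install/setup.htm passes through unchanged, which is exactly the property you want to preserve when a novice translator is doing the work.
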
We've marveled at how many search engine queries there are about localizing these projects, and we think that RoboHelp and the other authoring environments have done a poor job explaining what's involved.

If you liked this article, have a look at "Localizing RoboHelp projects."


11 May 2007

Localizing RoboHelp projects

Is it time for you to localize your RoboHelp projects? What's involved?

"RoboHelp project" is shorthand for "compiled help system." When this lives on a Windows client computer, it is usually an HTML Help (CHM) file. There are other variations, like WebHelp, which are also compiled HTML but which do not run on the client.

The projects are a set of HTML files, authored in a tool such as--but not limited to--RoboHelp, then compiled into a binary form that allows for indexing, hierarchy and table of contents. Other platforms (Mac OS, Linux, Java) require a different compiler, but the theory is the same.

If you've done localization before, you'll find that RoboHelp projects are relatively easy, compared to a software project. RoboHelp (or whatever your authoring/compilation environment may be) creates a directory structure and file set that is easy to archive and hand off. It includes a main project file, table of contents file and index file. In fact, it's even possible in a pinch to simply hand off the compiled file, and have the localizers decompile it; the files they need will fall into place as a result of the decompilation.

Although you may think of the project as a single entity for localization purposes, each HTML page is a separate component. There may be large numbers of these pages that don't change from one version of your product to the next; nevertheless, you need to hand them off with the project, and you'll likely be charged for a certain amount of "touching" that the localizer's engineers will need to do. You may be able to save them some work and yourself some money by analyzing the project and determining which pages have no translatable changes, but by and large you should consider the costs for touching unchanged pages an unavoidable expense.
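
One cheap way to do that analysis is to compare the HTML pages of the previous release against the current one. The sketch below is only a first pass under a strong assumption: it treats a page as unchanged when its bytes are identical, so a page whose markup changed but whose text did not will still show up as "changed". The function names and directory layout are hypothetical.

```python
import hashlib
from pathlib import Path


def page_hashes(root: Path) -> dict[str, str]:
    """Map each .htm/.html path (relative to root) to a SHA-256 of its bytes."""
    hashes = {}
    for path in root.rglob("*"):
        if path.is_file() and path.suffix.lower() in {".htm", ".html"}:
            rel = path.relative_to(root).as_posix()
            hashes[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return hashes


def classify_pages(old_root: Path, new_root: Path) -> dict[str, list[str]]:
    """Sort pages into unchanged, changed, added and removed between releases."""
    old, new = page_hashes(old_root), page_hashes(new_root)
    return {
        "unchanged": sorted(p for p in new if old.get(p) == new[p]),
        "changed": sorted(p for p in new if p in old and old[p] != new[p]),
        "added": sorted(p for p in new if p not in old),
        "removed": sorted(p for p in old if p not in new),
    }
```

Handing the "unchanged" list to your vendor along with the project does not remove the touching cost, but it gives both sides the same starting point for the estimate.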

The biggest problem with these projects is in-country review. There's no easy way for an in-country reviewer to make changes or post comments in the compiled localized version. We've found that MS Excel is the worst way of doing this (except for all the others), so we've learned to live with it.

In theory, the translators are not mucking about with any tags, so the compiled localized version should work the same as the original. Yeah, right. All the links need to be checked--they do break sometimes--and the index and table of contents should be validated. And, don't forget to try a few searches to make sure they work; your customers surely will, and you want to spare them any unpleasant surprises.
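
Part of the link checking can be automated before anyone opens the compiled file. The following is a minimal sketch that walks the localized HTML pages, collects the href and src targets, and reports local targets that do not exist on disk. It assumes the pages are readable as UTF-8 (pass another encoding if not), it ignores external URLs and same-page anchors, and it does not verify that a named anchor exists inside the target page. The names are ours, not part of any help tool.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import unquote, urlsplit


class _LinkCollector(HTMLParser):
    """Collect <a href> and <img src> values from one page."""

    def __init__(self):
        super().__init__()
        self.targets = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            wanted = (tag == "a" and name == "href") or (tag == "img" and name == "src")
            if wanted and value:
                self.targets.append(value)


def broken_links(root: Path, encoding: str = "utf-8") -> list[tuple[str, str]]:
    """Return (page, target) pairs whose local target file does not exist."""
    broken = []
    pages = sorted(p for p in root.rglob("*.htm*") if p.is_file())
    for page in pages:
        collector = _LinkCollector()
        collector.feed(page.read_text(encoding=encoding, errors="replace"))
        for target in collector.targets:
            parts = urlsplit(target)
            if parts.scheme or parts.netloc or not parts.path:
                continue  # external URL, mailto:, or same-page anchor
            if not (page.parent / unquote(parts.path)).exists():
                broken.append((page.relative_to(root).as_posix(), target))
    return broken
```

A clean report here is not a substitute for clicking through the compiled CHM, since the compiler can still drop or misplace files, but it catches the obvious breakage early.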

Remember:
  • If you've included graphics in your help project, you'll need to obtain the original source files. These are not GIFs or JPEGs; they will be the application files from which the GIFs and JPEGs were generated. You'll need to hand off files from applications like Adobe Illustrator, Flash or even PowerPoint, so that the translators can properly edit the text in them. Engineers often do quick mock-ups in Microsoft Word's WordArt that end up in the final product, and it takes a while to track them down.
  • Encoding can be thorny. Some compilers behave oddly if you try to impose the same encoding on both the HTML pages and the table of contents, especially in Japanese, in our experience.


20 April 2007

Localization Testbenches, Part IV (Online Help)

What are you using to test your localized products? If you're handing them to your domestic QA team and expecting that they'll intuitively test them with correct language locale settings, you may be in for an unpleasant surprise.

3) Help files
Your online documentation also deserves some testing. After its contents (usually HTML pages or XML documents) have been translated - in the correct encoding for the target language - the help project will be compiled, in the same way that software applications are compiled. This compilation step needs to account for the correct language, locale and encoding, and this doesn't happen by itself, no matter how lucky you may feel today.
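
If you compile with the HTML Help compiler, one place where the language is declared is the Language= entry in the [OPTIONS] section of the .hhp project file. Here is a minimal sketch that rewrites that entry. It assumes your project uses that entry in the usual "Language=0x409 English (United States)" form; the function name is ours, the LCID and name in the example (0x411, Japanese) are just one case, and the output uses plain "\n" line endings, which you may need to convert for your tools.

```python
def set_hhp_language(hhp_text: str, lcid: int, name: str) -> str:
    """Return project text with the Language= line in [OPTIONS] replaced or added."""
    new_line = f"Language=0x{lcid:x} {name}"
    lines = hhp_text.splitlines()
    in_options = False
    options_index = None
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("["):
            # Entering a new section; remember where [OPTIONS] starts.
            in_options = stripped.upper() == "[OPTIONS]"
            if in_options:
                options_index = i
        elif in_options and stripped.lower().startswith("language="):
            lines[i] = new_line
            return "\n".join(lines) + "\n"
    if options_index is None:
        raise ValueError("no [OPTIONS] section found")
    # No Language= entry yet: add one right after the section header.
    lines.insert(options_index + 1, new_line)
    return "\n".join(lines) + "\n"
```

Setting this entry does not re-encode anything; the translated pages still have to be saved in the encoding the compiler expects for that language.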

Again, it's important to test the help file in an environment that closely matches your customers' environment. Run your Greek help file on a native Greek operating system. Be sure to test the main window, the contents pane and the index for properly displayed characters. Above all, perform a few searches using native characters in the Find field to ensure that your help file's index was properly created and encoded; if your searches are successful, then your customers' searches will probably be successful as well.

Note: HTML Help under Windows has some idiosyncrasies when it comes to the table of contents (TOC) pane and the main window. Most tools like RoboHelp will properly encode the TOC and main pane content for, say, Japanese, when all of the content resides in the same project. However, if you're building your HTML Help files with your own tools (e.g., Perl scripts and hhc.exe), you may find that encoding sauce for the goose is not encoding sauce for the gander. We've found, for example, that the HTML pages displayed in the main window are happy with UTF-8, whereas the TOC pane won't support UTF-8 but will support Shift-JIS.
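
If you hit that situation in a home-grown build, the workaround amounts to re-encoding the TOC file and leaving the HTML pages alone. A minimal sketch follows, assuming the TOC is a UTF-8 .hhc file and that Python's shift_jis codec matches what your compiler wants (Windows tools sometimes expect the cp932 variant instead, so treat the default as a guess). The function name and file name are hypothetical.

```python
from pathlib import Path


def transcode_file(path: Path, source_encoding: str = "utf-8-sig",
                   target_encoding: str = "shift_jis") -> None:
    """Re-encode one file in place, working on bytes to keep line endings as-is."""
    # utf-8-sig reads UTF-8 with or without a byte-order mark.
    text = path.read_bytes().decode(source_encoding)
    # Strict encoding: raises UnicodeEncodeError if a character has no
    # mapping in the target encoding, instead of silently dropping it.
    path.write_bytes(text.encode(target_encoding))


if __name__ == "__main__":
    # Hypothetical TOC file of a Japanese project.
    transcode_file(Path("help_ja.hhc"))
```

Failing loudly on an unmappable character is deliberate: a TOC entry that cannot be represented in the target encoding is something you want to find at build time, not in front of a customer.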
