16 December 2006

Favorite Localization Tools

Here's a short list of Windows-based tools I use a great deal in managing localization projects:

Beyond Compare - Clients constantly grill me about the differences between the last version of their product and this one, with an eye to the order of magnitude of the localization expense they're in for. Beyond Compare is the best tool I've found for identifying the files that have changed, then comparing the older and newer versions in a specialized viewer. Good technical support as well.

EmEditor - As long as you have the font and OS support installed, you can view multibyte characters in their own applications under English-language Windows, but EmEditor also lets you change the encoding of a text file so that it displays properly or so that you can edit it. My standard text editor is UltraEdit, which has excellent search-and-replace capability, but it's not as deft as EmEditor for multibyte work on an English OS.

SDLX Glue - An obscure utility inside the SDLX suite, this will glue hundreds of HTML files together into one (I don't know the upper limit). Translation vendors like it for work on big sites because it slashes the number of files being slung around. Naturally, it includes an unglue utility as well.

FAR - A technical writer introduced me to this utility, which includes a compiler system for HTML Help and MS Help. It will compile CHM files in any language, so if you have a good HTML authoring tool, you don't need RoboHelp to build your CHMs. (Unfortunately, I've had problems when I've tried to use FAR on projects that were created in RoboHelp, but there are some ways around them.)

Moreover, FAR stands for "Find And Replace", and it is hands down the best front end for regular expressions that I've ever found. The Holy Grail of search-and-replace is ignoring line breaks, and while regex supports that, not many utilities (that I've found) implement it. For instance, in the text

In a white room

with black curtains

at the station

if your goal were to find "room with black curtains at", most utilities would not be able to locate it because of the line breaks. FAR does find it, and even lets you replace it with text that itself contains line breaks. Top-flight technical support also.
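To show what "ignoring line breaks" means in regex terms, here is a minimal Python sketch. It makes no claim about how FAR itself is implemented; it simply assumes that any run of whitespace between words, line breaks included, should count the same as a single space. The function names are invented for the example.

```python
import re


def _pattern_for(phrase: str) -> str:
    """Build a regex in which the words of `phrase` may be separated
    by any run of whitespace, including line breaks."""
    return r"\s+".join(re.escape(word) for word in phrase.split())


def find_across_breaks(phrase: str, text: str):
    """Return the first match of `phrase` in `text`, or None."""
    return re.search(_pattern_for(phrase), text)


def replace_across_breaks(phrase: str, replacement: str, text: str) -> str:
    """Replace every occurrence of `phrase`; `replacement` is inserted
    literally, so it may contain its own line breaks."""
    return re.sub(_pattern_for(phrase), lambda m: replacement, text)


text = "In a white room\nwith black curtains\nat the station"

match = find_across_breaks("room with black curtains at", text)
updated = replace_across_breaks(
    "room with black curtains at",
    "room\nwith red curtains\nat",
    text,
)
```

A plain substring search for "room with black curtains at" fails on this text, because the words are separated by line breaks rather than spaces; the `\s+` between words is what lets the pattern span lines.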

Most of these are shareware, but they're well worth the US$25-$50.

(compiling CHMs, finding and replacing across line breaks)


25 September 2006

Doing the Localization Vendor's Work?

Sometimes I know too much about this process.

Or, maybe I'm just too nice a guy.

To make things easier for the vendor (and cheaper for me) I've resolved to carve the 3200 HTML files in the API Reference CHM into different buckets, depending on whether and how much they require translation vs. engineering. Naturally, the ultimate arbiter is the Trados or SDLX analysis that the vendor will perform, but I've already mentioned my concern about false positives and need write no more on the topic here.

My tool of choice is the extremely capable Beyond Compare which, at US$30, is worth it just to see how well thought-out a software package it is. I compare version 3.9 files against version 4 files, tuning the comparison rules to groom the file buckets as accurately as possible.

The distribution is not perfect, if for no other reason than that its first level of triage is the filename and not the file contents, but it's better than guessing, and it's much better than thousands of false positives.
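The filename-first triage described above can be roughed out in Python with the standard library's filecmp module. This is only a sketch of the idea, not Beyond Compare's comparison logic: it pairs files by name, then does a byte-for-byte comparison, with none of the tunable rules. The bucket names, the function name, and the `*.htm*` pattern are assumptions made for the example.

```python
import filecmp
from pathlib import Path


def bucket_files(old_dir, new_dir, pattern="*.htm*"):
    """Sort the files of two flat directories into buckets.

    Files are paired by filename first; only files present in both
    directories are then compared by content (byte for byte).
    """
    old_names = {p.name for p in Path(old_dir).glob(pattern) if p.is_file()}
    new_names = {p.name for p in Path(new_dir).glob(pattern) if p.is_file()}
    common = sorted(old_names & new_names)

    # shallow=False forces a content comparison rather than trusting
    # file size and modification time alone.
    same, different, errors = filecmp.cmpfiles(
        old_dir, new_dir, common, shallow=False
    )
    return {
        "new": sorted(new_names - old_names),        # only in the new version
        "removed": sorted(old_names - new_names),    # only in the old version
        "unchanged": sorted(same),                   # identical content
        "changed": sorted(different),                # same name, new content
        "unreadable": sorted(errors),                # could not be compared
    }


# Hypothetical usage, assuming one directory per product version:
# buckets = bucket_files("api_ref_3.9", "api_ref_4.0")
# print({name: len(files) for name, files in buckets.items()})
```

Because pairing is by filename, a file that was renamed between versions lands in both "removed" and "new" even if its content is identical, which is the imperfection noted above.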

Once I've gone through the files, I'll have a better idea of how to label the buckets in a way that meets both my needs and those of the vendor.

At least, I think I'm being too nice a guy. Maybe this is just a big pain for the vendor, and they're too polite to inform me of that.
