We live in a global economy with interconnected markets. Still, software companies tend to target the local market first, and with good reason. The local market provides a certain comfort level that the global market does not. Companies understand their own business climate, language, and culture, and the needs of their local customers, best. However, overlooking the global market can limit a company’s exposure to potential customers and its opportunity for increased profits.
Localizing a product (translating it into another language) can be expensive. A new translation of a typical commercial-grade application and its accompanying documents may cost tens of thousands of dollars, and each subsequent translation, as software upgrades are released, may cost several thousand dollars more. International sales can also be more expensive, as they may require traveling abroad or establishing distributor partnerships in foreign markets.
Despite these disadvantages, international sales opportunities can be extremely profitable, and these profits are not reserved for large, global corporations. I co-founded a small software business in the mid-1990s, which, at its largest, had a dozen full-time employees. International customers accounted for over one-third of our sales, despite our product being available only in English and only on physical media that had to be shipped through customs. The increasing availability of high-bandwidth Internet connections and web-based software-as-a-service (SaaS) architectures has flattened many of the barriers to international commerce that small businesses faced just a decade ago. With some attention paid to software internationalization, a small business can overcome remaining challenges such as language and culture.
Internationalization, like system security or scalability, is far simpler to design in than to add later. Even if a company has no current plan to localize its software, it should treat internationalization as a functional product requirement by including it in formal requirements processes, communicating it clearly to the software engineering team, and verifying it in test. Before I discuss best practices, I want to review the process of software localization to provide background for the internationalization strategies listed below.
Software Localization Review
Few software companies have the talent or resources to translate their products themselves. Most use localization vendors who provide translation services on a contract basis. Vendors work from English-language text and provide translated text for a product and its documentation. Localization tools use databases called translation memories (TMs) to automate the translation of commonly used or previously translated phrases. Vendors bill translation services by the word, and machine-translated words cost about one-fifth as much as words that have to be translated by a human linguist. Vendors translate text outside of the context of the customer’s application, so companies will often also contract with the vendor for linguistic validation: a review of the finished product to ensure that the translation makes sense in context.
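The billing arithmetic above can be illustrated with a rough sketch. It assumes that only exact, whole-string matches against the translation memory count as machine-translated (real tools may also apply partial matches), and it expresses cost in relative units rather than currency: 1 unit per matched word and 5 units per human-translated word, following the one-fifth ratio. The function name, the rates, and the sample strings are hypothetical.

```python
# Relative per-word costs (assumption: the 1:5 ratio described in the text).
MEMORY_RATE = 1  # word already covered by the translation memory
HUMAN_RATE = 5   # word that a human linguist must translate


def estimate_cost(strings, memory):
    """Return (matched_words, new_words, cost) for a batch of source strings.

    `memory` is a set of source strings that already have translations.
    Only exact, whole-string matches are counted as memory hits.
    """
    matched_words = 0
    new_words = 0
    for text in strings:
        words = len(text.split())
        if text in memory:
            matched_words += words
        else:
            new_words += words
    cost = matched_words * MEMORY_RATE + new_words * HUMAN_RATE
    return matched_words, new_words, cost


memory = {"Save", "Open file"}
batch = ["Save", "Open file", "Export the report as PDF"]
print(estimate_cost(batch, memory))  # (3, 5, 28)
```

Under these assumptions, the three words already in the memory contribute 3 units and the five new words contribute 25, which shows why reusing existing phrasing across releases keeps later translations cheaper.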
As with other outsourced projects, the cost and quality of the deliverable will depend on the following:
Good organization of materials,
Repeatable and well-documented processes,
Clear communication with the vendor, and
Careful management of the vendor relationship.
A company with a well-internationalized product can avoid expensive delays, quality problems, and rework when the time comes to localize the product.
Best Practices Overview
Content Separate from Presentation
The key to internationalizing software is to separate content from code and from the details of its presentation. Content should be considered anything communicated to the user in English, including menus, labels, messages, other textual content, and documentation. It may also include elements beyond the traditional scope of the software product itself such as marketing or training collateral.
Proper internationalization should enable content to be packaged for delivery to the localization vendor and translated content to be received and integrated with minimum disruption. How this goal is best accomplished will depend on how the content is used: in a compiled application, in print or online documentation, or in a Web site or Web application. Each involves tools for compiling or publishing content, and the choice of tool will dictate the format of the content: properties files (.properties) are used in Java, resource files (.rc) are used for Microsoft Windows native code, Web application servers use XML, and documentation and online help often use DocBook. These formats are similar: all use plain text, name-value pairs, and Unicode character sets.
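As a minimal sketch of what separating content from code can look like, the following Python example keeps user-visible text in per-locale name-value catalogs and has the code refer to strings only by name. The catalog contents, the key names, and the tiny parser are illustrative assumptions; the parser handles only simple "name=value" lines and is not a full implementation of the .properties format.

```python
def parse_pairs(text):
    """Parse simple name=value lines, skipping blank lines and # comments."""
    pairs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, value = line.partition("=")
        pairs[name.strip()] = value.strip()
    return pairs


# Hypothetical catalogs; in a real product these would be separate files
# that can be handed to the localization vendor.
CATALOGS = {
    "en": parse_pairs("""
        # File menu
        menu.file.open=Open
        menu.file.save=Save
    """),
    "de": parse_pairs("""
        menu.file.open=Öffnen
    """),
}

DEFAULT_LOCALE = "en"


def lookup(key, locale):
    """Return the string for `key`, falling back to the default locale.

    If no catalog has the key, return the key itself so the gap is visible.
    """
    for candidate in (locale, DEFAULT_LOCALE):
        text = CATALOGS.get(candidate, {}).get(key)
        if text is not None:
            return text
    return key


print(lookup("menu.file.open", "de"))  # Öffnen
print(lookup("menu.file.save", "de"))  # Save (not yet translated)
```

Falling back to the default locale is one possible design choice; it keeps a partially translated product usable while translations are still in progress.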
As noted earlier, the most important thing a company can do to make a product ready for the global market is to treat internationalization as a functional product requirement: include it in formal requirements processes, communicate it clearly to the software engineering team, and verify it in test. The team’s coding standards should address internationalization best practices; peer reviews and functional tests in Quality Assurance (QA) should reinforce the standards.
Internationalization begins with system architecture and informs the process of design, coding, configuration management, and documentation. System architecture may include operating systems, web servers, databases, programming languages and compilers, and third-party components. Nearly all of these provide internationalization capabilities, but the formats, character sets, and locales they support may vary.
Choose an architecture that will support the locales in the target market, and try to avoid converting between character sets when moving data from one tier of the system to another. Pay particular attention to how the database supports character sets and collation in other languages, as this can differ widely from one database to another. Determine what languages commercial off-the-shelf (COTS) or open-source libraries and components support, and confirm that the license permits localization into other languages, if necessary.
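A short Python sketch shows what can go wrong when two tiers disagree about character sets. The two "tiers" here are hypothetical and are simulated with plain encode and decode calls; the sample word is an arbitrary choice.

```python
original = "café"

# Tier 1 (for example, the application server) writes the text as UTF-8 bytes.
wire_bytes = original.encode("utf-8")

# Tier 2 wrongly assumes Latin-1 and produces garbage characters.
misread = wire_bytes.decode("latin-1")

# Tier 2 using the same encoding as tier 1 recovers the text exactly.
correct = wire_bytes.decode("utf-8")

print(misread)  # cafÃ©
print(correct)  # café

# A tier limited to ASCII cannot represent the text at all.
try:
    original.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
print(ascii_ok)  # False
```

Using one Unicode encoding end to end, where the architecture allows it, avoids both failure modes.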
The programmer’s role in internationalization is straightforward and critical. Programmers are primarily responsible for separating content from presentation in software, and programmers who follow internationalization best practices can save a company from potentially costly localization errors. Programmers should be encouraged to follow best practices and reminded, through peer review or bugs from the QA team, when they fall short of them.
Programmers should know the basics by rote:
Never embed strings in code;
Do not split sentences into multiple strings;
Be sure that code and user interfaces will accommodate the expansion of strings due to translation;
Do not include text in graphical elements; and
Use the appropriate character encoding.
Inattention to these core practices can result in serious problems when software is localized. Embedded strings will not be translated, and rooting them out will introduce build breaks and functional regressions. Split sentences may not translate correctly into a language that does not mirror English sentence structure. Strings in Romance and Germanic languages can be twice as long as equivalent English strings, causing screen layouts to overflow or wrap or strings to be truncated. Graphics containing text will have to be updated manually. Incorrect character encoding will result in garbage characters appearing in place of text.
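The first two rules can be sketched in Python as follows. Each message is stored as one complete sentence with named placeholders, so a translator is free to reorder the words. The message key, the catalog layout, and the German rendering are illustrative assumptions rather than vetted translations, and plural forms are deliberately left out of this sketch.

```python
# Anti-pattern, shown only as a comment: English text embedded in code and a
# sentence assembled from fragments, which fixes the English word order.
#   message = "Copied " + str(count) + " files to " + folder

# Whole sentences with named placeholders, one catalog per locale.
MESSAGES = {
    "en": {"copy.done": "Copied {count} files to {folder}."},
    "de": {"copy.done": "{count} Dateien wurden nach {folder} kopiert."},
}


def render(key, locale, **values):
    """Look up a complete sentence template and fill in its placeholders."""
    template = MESSAGES[locale][key]
    return template.format(**values)


print(render("copy.done", "en", count=3, folder="Backup"))
print(render("copy.done", "de", count=3, folder="Backup"))
```

Because the placeholders are named rather than positional, the German sentence can put the count first and the verb last without any change to the calling code.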
There are other considerations in addition to those listed above, but these are the basics. They are not difficult to implement if they are factored into a software product from the start, and they can be easily verified by code reviews and test processes.
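One way a test process might check for string expansion before any translation exists is to project each English string to twice its length and compare the result with the space available. The sketch below does this with character counts, which are only a crude stand-in for rendered width; the string keys, the per-control character budgets, and the factor of 2 (taken from the "twice as long" figure above) are assumptions.

```python
import math

EXPANSION_FACTOR = 2.0  # assumption: translated text may be twice as long

# Hypothetical UI strings paired with the character budget of their control.
UI_STRINGS = {
    "button.save": ("Save", 12),
    "button.cancel": ("Cancel", 10),
    "label.username": ("User name", 30),
}


def expansion_risks(strings, factor=EXPANSION_FACTOR):
    """Return the keys whose projected translated length exceeds the budget."""
    risks = []
    for key, (text, max_chars) in strings.items():
        projected = math.ceil(len(text) * factor)
        if projected > max_chars:
            risks.append(key)
    return sorted(risks)


print(expansion_risks(UI_STRINGS))  # ['button.cancel']
```

A check like this can run in the regular test suite, so layouts that leave too little room are flagged long before strings are sent to the vendor.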
A Localization Plan
The quality of a localization will directly influence customers’ perceptions, so its success is critical. A written localization plan will help ensure effective communication with the vendor and will also help guide internationalization by the company’s development and test teams. Here are the essential elements of a localization plan:
A list of the files and directories to be sent for localization;
The character encoding and escaping (if required) to be used for translated files;
The naming convention and directory structure to be used for translated files;
Any other information that would be helpful to the translator, such as descriptions of files for context, any strings that should not be translated, or conventions used for substitution variables;
A functional test suite to guide the vendor through the product during linguistic validation; and
Instructions for setting up and running the functional tests.
Test and Verification
Quality Assurance (QA) plays a key role. As with any other software functional requirement, internationalization is never truly “done” until it has been verified in test. The simplest way to verify internationalization, if the code supports it, is to strip out the strings files and run through a functional coverage suite to make sure that no plain-English text appears, only string names. Internationalization testing should be completed whether or not localization is planned, to prevent bugs from entering inadvertently or through new third-party components. The project manager (PM) should schedule internationalization testing after functional testing is complete and, if localization is planned, before the user interface is frozen and string files are sent out for translation.
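The strip-the-strings check can be sketched in Python. It assumes that the lookup function returns the string name when no translation is found and that string names follow a dotted lowercase convention; both are assumptions about the product, as are the screen and the key names used here.

```python
import re


def lookup(key, catalog):
    """Return the catalog text for `key`, or the key itself when it is missing."""
    return catalog.get(key, key)


def render_screen(catalog):
    """Hypothetical screen; the last line is hard-coded on purpose."""
    return [
        lookup("login.title", catalog),
        lookup("login.button.submit", catalog),
        "Forgot your password?",  # bug: text embedded in code
    ]


# Assumed naming convention for string names: dotted lowercase identifiers.
KEY_PATTERN = re.compile(r"[a-z0-9_]+(\.[a-z0-9_]+)+")


def find_hardcoded_text(lines):
    """Return every rendered line that is not a bare string name."""
    return [line for line in lines if not KEY_PATTERN.fullmatch(line)]


# Render with an empty catalog, which simulates stripping the strings files.
leaks = find_hardcoded_text(render_screen(catalog={}))
print(leaks)  # ['Forgot your password?']
```

Anything the check reports is text that would never reach the translator, so each entry can be filed as an internationalization bug.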
If the product release is to be localized, the PM should include a user interface freeze as a milestone in the project plan following functional testing and should lock the strings files in source control while the files are out. By adding this step, the PM can manage schedule and costs by limiting updates sent to the vendor.
After translations come back from the vendor, the product team builds the localized product. QA runs the same test suite against the localized build to verify that it remains functional and that the product has been translated. Finally, the localized build and the functional test suite are sent back to the vendor for linguistic validation.
The PM should request and retain a copy of the vendor’s TM, which will typically be in translation memory exchange (TMX) format. The TMX file should be regarded as an insurance policy in case anything happens to the vendor’s copy or the company wants to change vendors.
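Because TMX is an XML format, a retained copy can be inspected with ordinary tools. The following sketch reads a small hand-written sample with Python's standard library and builds a source-to-target dictionary. The sample content and the header attribute values are made up for illustration, and a real vendor file may contain inline markup and metadata that this sketch ignores.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hand-written sample in the general shape of a TMX file (illustrative only).
SAMPLE_TMX = """<tmx version="1.4">
  <header srclang="en" datatype="plaintext" segtype="sentence"
          adminlang="en" creationtool="example" creationtoolversion="1.0"
          o-tmf="example"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>Save</seg></tuv>
      <tuv xml:lang="fr"><seg>Enregistrer</seg></tuv>
    </tu>
    <tu>
      <tuv xml:lang="en"><seg>Cancel</seg></tuv>
      <tuv xml:lang="fr"><seg>Annuler</seg></tuv>
    </tu>
  </body>
</tmx>"""


def load_memory(tmx_text, source_lang, target_lang):
    """Map each source-language segment to its target-language segment."""
    memory = {}
    root = ET.fromstring(tmx_text)
    for unit in root.iter("tu"):
        segments = {}
        for variant in unit.findall("tuv"):
            seg = variant.find("seg")
            if seg is not None:
                segments[variant.get(XML_LANG)] = "".join(seg.itertext())
        if source_lang in segments and target_lang in segments:
            memory[segments[source_lang]] = segments[target_lang]
    return memory


memory = load_memory(SAMPLE_TMX, "en", "fr")
print(memory["Save"])  # Enregistrer
```

A quick check like this confirms that the retained file is readable and contains the expected language pairs before it is ever needed.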
The global market offers unique opportunities, and these opportunities are no longer limited to Fortune 500 companies. If a company wants its products to reach an international audience, the company needs to be prepared. Internationalization can be a challenging prospect; however, adopting internationalization best practices in the early stages of product development is simpler and costs less than trying to retrofit internationalization into a product that was not designed for it.