A developer guide to software localization

Software has become as multilingual and multicultural as the world itself. Localizing your code can keep your company from a potentially disastrous cultural blunder.

Tom Nolle, Andover Intel

Published: 12 Aug 2020

Business is global, and application users are global. So, software must be global as well. Rarely does one software implementation fit every geographic environment. Countries typically have their own languages, customs, systems of measurement and cultural expectations -- all factors that application developers need to consider when creating apps with global reach.

Developers need to understand the basics of software localization to make an app adapt to different conditions. In this tip, we'll review the basic strategies used to localize apps, the challenges of localization and the most important pitfalls to avoid.

Place localization rules in the front end

So, how do you produce software that can be used everywhere, particularly when its components may run in different places, and not necessarily where the users are located?

Good software localization strategies have to start with internationalizing the design of your applications. Developers need to create a framework where information presentation can be specialized to the language and culture of the user, without having a version of the software for every possible combination of the two.

Distributed systems, particularly those with reusable components, create problems in this phase of localization. Each component might be hosted in a different location and run on different rules. If the application uses shared or reusable components, it's critical to ensure that, when users change, the formatting rules do as well. Treat the user identity as a dynamic part of an application's state.

However, it's difficult to pass localization rules as a dialog state through large, complex groups of processes. The best method is to implement those rules close to the end user -- ideally, at the GUI. This enables the front-end piece of an application to turn those linguistic and cultural rules into code that consistently enforces them.
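One way to read this advice is that back-end components return a locale-neutral message code plus raw values, and only the front end turns them into text for the current user. The following is a minimal sketch under that assumption. The names `BackendResult`, `FRONT_END_CATALOG` and `render`, the message code and the two sample locales are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class BackendResult:
    """Locale-neutral output from a back-end component (hypothetical shape)."""
    message_code: str
    params: dict = field(default_factory=dict)


# Hypothetical front-end catalog: locale -> message code -> template.
FRONT_END_CATALOG = {
    "en-US": {"ORDER_SHIPPED": "Your order {order_id} has shipped."},
    "de-DE": {"ORDER_SHIPPED": "Ihre Bestellung {order_id} wurde versandt."},
}


def render(result: BackendResult, user_locale: str) -> str:
    """Resolve the message at the front end, using the current user's locale."""
    template = FRONT_END_CATALOG[user_locale][result.message_code]
    return template.format(**result.params)


# The same back-end result serves two users with different locales.
result = BackendResult("ORDER_SHIPPED", {"order_id": "A-1001"})
print(render(result, "en-US"))
print(render(result, "de-DE"))
```

Because the locale is passed in at render time rather than stored in the back-end result, a shared component never needs to know which user it is serving.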

Create a dedicated localization database

As far as specific localization strategies are concerned, the first is to store all messages displayed to a user, such as textual information and graphics, in a dedicated database. Never incorporate text, graphics or their respective blocks of code directly into applications as local data tables, as they will become buried and difficult to assess. This database approach keeps localized text and graphics in a single place, where they can be reviewed and updated easily.
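As a rough illustration of the database approach, the sketch below keeps every user-facing string in one table keyed by message code and locale. It uses Python's built-in `sqlite3` module with an in-memory database. The table layout, the sample rows and the fallback to a default locale are assumptions made for this example, not something the approach prescribes.

```python
import sqlite3

# In-memory database for the example; a real deployment would use a shared store.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages ("
    " code TEXT NOT NULL,"
    " locale TEXT NOT NULL,"
    " body TEXT NOT NULL,"
    " PRIMARY KEY (code, locale))"
)
conn.executemany(
    "INSERT INTO messages (code, locale, body) VALUES (?, ?, ?)",
    [
        ("GREETING", "en", "Welcome back"),
        ("GREETING", "fr", "Bon retour"),
        ("FAREWELL", "en", "Goodbye"),
    ],
)


def lookup(code: str, locale: str, fallback: str = "en") -> str:
    """Return the text for a code in the given locale, else in the fallback locale."""
    for candidate in (locale, fallback):
        row = conn.execute(
            "SELECT body FROM messages WHERE code = ? AND locale = ?",
            (code, candidate),
        ).fetchone()
        if row is not None:
            return row[0]
    raise KeyError(f"no text stored for message code {code!r}")


print(lookup("GREETING", "fr"))   # French text exists
print(lookup("FAREWELL", "fr"))   # no French row, so the English fallback is used
```

With all text in one table, a reviewer can query every string for a locale without reading application code.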

Visual localization

Localizing visual information isn't the same as simply managing an array of multilingual terms. Things like prices, discounts and tax rates must also be appropriately localized in the application. It's smart to carry pricing and other country-specific commercial data as part of the state information used for a particular localized user dialog. Keeping the two kinds of data apart reduces the risk of errors that arise when the application struggles to differentiate between semantic localization data and commercial localization data.
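The separation of semantic and commercial data could look something like the following sketch, in which the dialog state holds the user's locale and the commercial context as two distinct fields. The class names, the country placeholder and the tax rate are hypothetical. The rate is an illustrative value, not taken from any real tax table.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class CommercialContext:
    """Country-specific commercial data (hypothetical fields)."""
    country: str
    currency: str
    tax_rate: Decimal


@dataclass(frozen=True)
class DialogState:
    """State carried with one user dialog."""
    user_locale: str             # drives wording and formatting
    commerce: CommercialContext  # drives prices and tax


def price_with_tax(net: Decimal, state: DialogState) -> Decimal:
    """Apply the tax rate from the commercial context, never from the locale."""
    gross = net * (Decimal(1) + state.commerce.tax_rate)
    return gross.quantize(Decimal("0.01"))


# An English-speaking user buying in a market that uses euros.
state = DialogState("en-GB", CommercialContext("XX", "EUR", Decimal("0.20")))
print(price_with_tax(Decimal("100.00"), state))
```

Since language and market are independent fields, changing the display language does not alter the price calculation.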

Through the database approach, applications will generate outputs that trigger the front end to convert the data into text and graphics. Never use fixed strings or nonchanging values -- also called literals -- for text and graphics because it will become nearly impossible to convert them down the line. Also, be sure to consistently use UTF-8 character encoding throughout both database and GUI web app processes. This will ensure that characters in all languages are represented appropriately. Be particularly careful when it comes to accent marks and other per-character qualifiers because an omission can easily change the meaning of words in many languages.
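To illustrate why accent marks deserve care, the sketch below shows that the same accented word can be stored as two different code point sequences, and that both survive a UTF-8 round trip. Normalizing to NFC with Python's `unicodedata` module is one common precaution. It is an addition for this example and not something the approach above requires.

```python
import unicodedata

composed = "caf\u00e9"      # "é" stored as a single code point
decomposed = "cafe\u0301"   # "e" followed by a combining acute accent

# The two strings look identical on screen but do not compare equal.
print(composed == decomposed)   # False


def canonical(text: str) -> str:
    """Normalize to NFC so equivalent accented forms compare equal."""
    return unicodedata.normalize("NFC", text)


# Round trip through UTF-8 bytes, as happens between database and GUI.
stored = canonical(decomposed).encode("utf-8")
restored = stored.decode("utf-8")

print(restored == composed)     # True
print(len(stored))              # 5 bytes: "é" takes two bytes in UTF-8
```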

The maintenance of the code-to-text databases is critical for localization. When a new condition is added to an application, such as one that requires a specific text or graphic output, a new message code must be added. Developers must then convert this new code into a readable form -- a process called decoding -- for each combination of linguistic and cultural rules that applies to a dialog. This enables language and cultural experts to easily review the localization rules, which should happen every time the codes change.
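One possible way to support that review step is to expand every message code into its text for each supported locale and hand the result to the reviewers. The sketch below assumes a hypothetical in-memory catalog and a `<MISSING>` marker for codes that have not been translated yet.

```python
# Hypothetical catalog: message code -> locale -> text.
CATALOG = {
    "CART_EMPTY": {"en": "Your cart is empty.", "ja": "カートは空です。"},
    "CHECKOUT": {"en": "Check out"},
}
SUPPORTED_LOCALES = ["en", "ja"]


def review_sheet(catalog: dict, locales: list) -> list:
    """Build one tab-separated line per code and locale for human review."""
    lines = []
    for code in sorted(catalog):
        for locale in locales:
            # Flag gaps instead of skipping them so reviewers can see them.
            text = catalog[code].get(locale, "<MISSING>")
            lines.append(f"{code}\t{locale}\t{text}")
    return lines


for line in review_sheet(CATALOG, SUPPORTED_LOCALES):
    print(line)
```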

Set up these types of reviews on a regular schedule, as cultural norms can change quickly. Over time, certain phrases and even emojis can suddenly become outdated -- or, worse, offensive.

Manage localization code and data carefully

When a user-facing message consists of words and variables, such as "There are [number] items in stock," carefully manage how the string is created. Some programming languages let developers concatenate text directly with data and code. However, the results may be difficult to localize, and this practice can encourage the use of text literals. In the example above, the message should instead be represented as a text code plus a separate numeric value, so that the text and the number can each be localized on their own.
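A minimal sketch of that idea follows: the message is a code that maps to a per-locale template, and the number is formatted separately before it is placed into the template. The templates, the `STOCK_COUNT` code and the separator table are hypothetical and cover only two sample locales.

```python
# Hypothetical per-locale templates and digit-grouping separators.
TEMPLATES = {
    "en-US": {"STOCK_COUNT": "There are {count} items in stock."},
    "de-DE": {"STOCK_COUNT": "Es sind {count} Artikel auf Lager."},
}
GROUP_SEPARATOR = {"en-US": ",", "de-DE": "."}


def format_number(value: int, locale: str) -> str:
    """Group digits in threes using the locale's separator."""
    return f"{value:,}".replace(",", GROUP_SEPARATOR[locale])


def stock_message(count: int, locale: str) -> str:
    """Localize the number first, then place it into the localized template."""
    template = TEMPLATES[locale]["STOCK_COUNT"]
    return template.format(count=format_number(count, locale))


print(stock_message(12500, "en-US"))
print(stock_message(12500, "de-DE"))
```

This sketch ignores plural forms, so a count of 1 would still read "items". A fuller catalog would need a separate template per plural category for each language.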

Next, where possible, replace free-text inputs with choices in a drop-down interface based on the code-to-meaning concept noted above. That reduces the challenge of validating inputs for multiple languages. It also simplifies the structure of the GUI, since the drop-down method should automatically set the screen size of the input and output items. This resolves the problem that arises when the length of words and sentences varies between the supported languages.
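The drop-down idea can be sketched as a set of stable option codes, each with a label per language. The GUI displays the label, but the application receives and validates only the code. The option set and function names below are hypothetical.

```python
# Hypothetical option set: stable codes mapped to per-locale labels.
SHIPPING_OPTIONS = {
    "STANDARD": {"en": "Standard", "es": "Estándar"},
    "EXPRESS": {"en": "Express", "es": "Exprés"},
}


def dropdown_choices(locale: str) -> list:
    """Return (code, label) pairs for the GUI to display."""
    return [(code, labels[locale]) for code, labels in SHIPPING_OPTIONS.items()]


def validate(selected_code: str) -> bool:
    """Validation checks the code, so it is identical for every language."""
    return selected_code in SHIPPING_OPTIONS


print(dropdown_choices("es"))
print(validate("EXPRESS"))
print(validate("Exprés"))   # a label is not a valid submission, only a code is
```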

That raises another challenge: editing input information. Basic field editing has to be pushed toward the UI as well, often combined with the process of displaying information. Dates and currency values are a particular challenge because their formatting varies with local practice. In some cases, it may be useful to handle the display and editing of dates, times and currency in one general-purpose component, driven by a set of policies tied to the language/culture code and to the specific user dialog.
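A policy-driven component of that kind might look like the sketch below, where one small policy record per locale drives both display and parsing. The `FormatPolicy` fields and the two sample policies are assumptions made for illustration. Tying a currency symbol to a locale is a simplification, and no currency conversion takes place.

```python
from dataclasses import dataclass
from datetime import date, datetime
from decimal import Decimal


@dataclass(frozen=True)
class FormatPolicy:
    """Hypothetical per-locale display and editing policy."""
    date_pattern: str       # strftime/strptime pattern
    decimal_mark: str
    group_mark: str
    currency_template: str  # position of the symbol relative to the amount


POLICIES = {
    "en-US": FormatPolicy("%m/%d/%Y", ".", ",", "${amount}"),
    "de-DE": FormatPolicy("%d.%m.%Y", ",", ".", "{amount} €"),
}


def show_date(value: date, locale: str) -> str:
    """Display a date using the locale's pattern."""
    return value.strftime(POLICIES[locale].date_pattern)


def parse_date(text: str, locale: str) -> date:
    """Interpret user input with the same pattern used for display."""
    return datetime.strptime(text, POLICIES[locale].date_pattern).date()


def show_money(value: Decimal, locale: str) -> str:
    """Format an amount with the locale's separators and symbol position."""
    policy = POLICIES[locale]
    whole, frac = f"{value:,.2f}".split(".")
    amount = whole.replace(",", policy.group_mark) + policy.decimal_mark + frac
    return policy.currency_template.format(amount=amount)


published = date(2020, 8, 12)
print(show_date(published, "en-US"), show_date(published, "de-DE"))
print(show_money(Decimal("1234.5"), "en-US"), show_money(Decimal("1234.5"), "de-DE"))
print(parse_date("12.08.2020", "de-DE"))
```

Using one pattern for both display and parsing means a value shown to the user can be typed back in the same form.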

Thoroughly test localized apps

All of the aforementioned localization techniques call for copious testing. Luckily, concentrating localization procedures at the front end, using code-to-result databases and removing localization from deeply embedded components significantly narrows the scope of changes you need to test.

These techniques make it possible to test localization through automated test generation, as long as the generated tests supply every possible dialog code, including its linguistic and cultural specifics, as input.
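As a rough sketch of automated test generation, the code below builds one test case for every combination of message code and locale, then checks that the text exists and uses the same placeholders as a reference locale. The catalog, the choice of English as the reference and the function names are hypothetical. A real suite would also cover formatting policies and layout.

```python
import itertools
import string

# Hypothetical catalog: locale -> message code -> template.
CATALOG = {
    "en": {
        "STOCK_COUNT": "There are {count} items in stock.",
        "GREETING": "Hello, {name}!",
    },
    "fr": {
        "STOCK_COUNT": "Il y a {count} articles en stock.",
        "GREETING": "Bonjour, {name} !",
    },
}


def placeholders(template: str) -> set:
    """Collect the named fields used in a format template."""
    return {name for _, name, _, _ in string.Formatter().parse(template) if name}


def generate_cases(catalog: dict) -> list:
    """Produce every (code, locale) pair, including codes missing from a locale."""
    codes = sorted({code for messages in catalog.values() for code in messages})
    return list(itertools.product(codes, sorted(catalog)))


def check_case(catalog: dict, code: str, locale: str, reference: str = "en"):
    """Return a failure message, or None if the case passes."""
    template = catalog[locale].get(code)
    if template is None:
        return f"{locale}: missing text for {code}"
    expected = placeholders(catalog[reference].get(code, ""))
    if placeholders(template) != expected:
        return f"{locale}: placeholders differ for {code}"
    return None


failures = [
    message
    for code, locale in generate_cases(CATALOG)
    if (message := check_case(CATALOG, code, locale)) is not None
]
print(failures)
```

Because the cases are generated from the catalog itself, adding a new code or locale extends the test set without any change to the test code.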
