Localization in Unreal

Introduction to Unreal L10N and I18N

Unreal Engine provides built-in localization and internationalization systems.

Internationalization is based on ICU and provides locale-aware functions for working with numbers, percentages, dates, and so on. (link to manual)

Text localization revolves around the FText data type (link to manual). Anything you localize (really, anything you display to the player) should be an FText. You can also mark FTexts as culture-invariant (that is, non-translatable), and those won’t be gathered for localization. All other FTexts are gathered by Unreal, stored in manifests and archives, and exported to PO files for translation.

Localizable text in Unreal

You can store text directly in widgets and as separate FText instances in code. Alternatively, you can store the text in string tables, either binary uasset files or CSV files, and only reference string table entries in widgets and code. Each approach has pros and cons that are outside the scope of this intro.

Most of the localization-related features of the Unreal Editor are available via the Localization Dashboard. There, you can set up one or more localization targets with one or more cultures (locales) and rules telling Unreal Editor to gather text from specific directories with source files, assets, and metadata files. You can also gather, export, import, and compile text and dialogue scripts. (link to manual)

Unreal Editor has a built-in translation editor, but it’s rudimentary at best. The recommended workflow is to gather and export the text into PO files, translate them, then import them back into the project and compile the text so that the game can use it at runtime.

Unreal PO files

You can read the PO format specs here: (link to PO format). In short, strings are identified by a pair of msgid and msgctxt fields: msgid contains the original text, msgstr contains the translation, and msgctxt contains the context. In Unreal’s case, msgctxt always contains the namespace and key of a string (namespace,key). PO entries also support comments; Unreal exports source references (where the text is stored: an asset, a source file, a metadata file, or a string table) and any metadata from string tables as comments. Unreal does not use PO’s pluralization capabilities but supports its own pluralization syntax inside the strings themselves. More on that below.
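To make the msgctxt convention concrete, here is a minimal Python sketch that reads one PO entry and splits its msgctxt into namespace and key. The sample entry, its source reference, and the parse_entry helper are all hypothetical, and the sketch assumes single-line fields and a namespace without commas; a real PO parser handles multi-line strings and escaping.

```python
import re

# Hypothetical entry, shaped like the fields described above.
SAMPLE_ENTRY = '''\
#: /Game/UI/MainMenu
msgctxt "MainMenu,StartButton"
msgid "Start game"
msgstr "Iniciar juego"
'''

# Matches single-line msgctxt/msgid/msgstr fields only (an assumption).
FIELD_RE = re.compile(r'^(msgctxt|msgid|msgstr) "(.*)"$', re.MULTILINE)


def parse_entry(entry_text):
    """Return namespace, key, source, and translation for one PO entry."""
    fields = dict(FIELD_RE.findall(entry_text))
    # Assumes the namespace itself contains no comma.
    namespace, _, key = fields["msgctxt"].partition(",")
    return {
        "namespace": namespace,
        "key": key,
        "source": fields["msgid"],
        "translation": fields["msgstr"],
    }


entry = parse_entry(SAMPLE_ENTRY)
print(entry["namespace"], entry["key"])
```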

PO files are exported to Content/Localization/{target}/{locale}/{target}.po paths, where {target} is your localization target name and {locale} is the culture code. For example, Spanish translations for the Game target would be exported to Content/Localization/Game/es/Game.po, while Spanish (Spain) would go to Content/Localization/Game/es-ES/Game.po. That is also where you should put translated POs for Unreal Editor to import them back into the project.
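The path pattern can be expressed as a small helper. This is only a sketch; the po_path function name is made up for illustration, and the paths are relative to the project root.

```python
from pathlib import PurePosixPath


def po_path(target, culture):
    """Build the project-relative PO path for a target and culture code."""
    return PurePosixPath("Content/Localization") / target / culture / f"{target}.po"


print(po_path("Game", "es"))
print(po_path("Game", "es-ES"))
```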

Warning: Unreal lets you have “translations” in your native culture, which is useful for proofreading. These translated native culture strings (the msgstr fields of the native culture PO file) become the source for all other cultures. So never use the native culture PO as a source file for other cultures: its msgid fields will contain the wrong source, and Unreal won’t import translations based on it. Instead, use the PO of any other locale, as those contain the correct source in their msgid fields.

(This should be fixed with Unreal PO support.) Warning: Unreal doesn’t use #| msgid previous-source-string comments to mark string edits. Because of this, if the source text of an entry changes, CAT tools see it not as an edit but as the old string being deleted and a new one added. That makes you lose translations, history, and comments. There’s a workaround: after you fix a typo in Unreal Editor but before you upload the new source files to Crowdin, fix the same typo in both the source and key fields of that string on Crowdin. That way, you keep the translations along with the translation history and comments, and the source and key stay in sync between Crowdin and Unreal.

(This should be fixed with Unreal PO support.) Warning: Unreal uses its own pluralization syntax, which is not compatible with ICU. That makes any ICU helpers in CAT tools useless. (link to manual)

Setting things up on Crowdin

When you set up integration between Unreal and Crowdin, whether manually, via the CLI tool, or via the API, it’s best to do the following:

  1. Set up export rules for the source file you upload as /{target}/%locale%/{target}.po where you replace {target} with your target name (identical to file name). E.g., if you manually upload your Game.po, you’d set the export path to /Game/%locale%/Game.po.
  2. Set up locale mapping in project settings so that your files are exported into the same locale subfolders as those used in Unreal. E.g., if you have fr in your Unreal project but fr-FR on Crowdin, set up a mapping for French (France) on Crowdin to use fr as its locale.

This will result in a zip file with translated files that has the same folder structure as your Unreal project’s Content/Localization directory, which makes it much easier to copy things back into Unreal.
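Because the folder structures match, copying translations back can be a plain tree copy. The following Python sketch assumes the zip file has already been extracted to a folder; the copy_translations function and both folder arguments are hypothetical names for illustration, not part of any Crowdin or Unreal tooling.

```python
import shutil
from pathlib import Path


def copy_translations(extracted_root, project_root):
    """Copy every .po file into Content/Localization, keeping relative paths."""
    extracted_root = Path(extracted_root)
    destination_root = Path(project_root) / "Content" / "Localization"
    copied = []
    for po_file in sorted(extracted_root.rglob("*.po")):
        relative = po_file.relative_to(extracted_root)
        destination = destination_root / relative
        # Create {target}/{locale} folders if the project lacks them.
        destination.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(po_file, destination)
        copied.append(relative.as_posix())
    return copied
```

Calling copy_translations("export", "MyProject") would place export/Game/fr/Game.po at MyProject/Content/Localization/Game/fr/Game.po, ready for import in Unreal Editor.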

Unreal variables and formatting tags

By default, Unreal PO files can contain the following variables and formatting tags:

  1. Variables: {level number} or {hero name string}. These are replaced with values at runtime and can be of any data type: number, string, etc. It’s a good idea to give these variables descriptive names and maybe specify the data type.
  2. Rich text formatting tags: <>text</>, with the style name inside the opening tag. Style names are defined in style tables. These tags cannot be nested (e.g., <x><y>x+y</>x</> will not work as intended). It’s a good idea to name the styles descriptively and follow a consistent naming system.
  3. Rich text images in empty tags: <img id=""/>. These will be replaced with inline images. It’s a good idea to give those images descriptive names.
  4. Plurals: {x} {x}|plural(one=cat, other=cats) produces 1 cat or 10 cats depending on the value of x. This only works if x is replaced with a number (and not a string that contains a number, for example). The plural construction itself does not print out the number. You can use the variables inside the plural strings (e.g., {x}|plural(one=one cat,other={x} cats) will work just fine for English). If your plural string contains a comma, it should be quoted (e.g., {x}|plural(one="one big, big cat",other={x} cats)).

    This feature is based on ICU, and it supports all the same cultures, but the syntax is not the same as for ICU message formatting.

    Hint: translations can contain plural constructions even if source text does not have them, as long as x is a number and the string goes through the format node or function.

  5. Gender: {g}|gender(masculine,feminine,neuter) produces either masculine, feminine, or neuter depending on the value of the g variable. It should be of the ETextGender type, which can be Masculine, Feminine, or Neuter.
  6. Hangul post-positions: {arg}|hpp(은,는). This allows you to automatically insert proper glyphs in Korean depending on whether the arg ends in a consonant or vowel.
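To illustrate how the plural construction from the list above evaluates, here is a rough Python emulation. It is not Unreal’s implementation: it only knows the English one/other rule, only handles the plural construction (not gender or hpp), and all function names are invented for this sketch.

```python
import re

# {name}|plural(key=value, ...) where name may contain spaces.
PLURAL_RE = re.compile(r"\{([^{}]+)\}\|plural\(([^)]*)\)")
# One key=value pair; the value is either quoted or runs to the next comma.
ARG_RE = re.compile(r'\s*(\w+)\s*=\s*(?:"([^"]*)"|([^,]*))\s*(?:,|$)')


def english_category(number):
    """English cardinal rule only: 1 is 'one', everything else is 'other'."""
    return "one" if number == 1 else "other"


def parse_forms(arg_text):
    """Turn 'one=cat, other=cats' into {'one': 'cat', 'other': 'cats'}."""
    forms = {}
    for match in ARG_RE.finditer(arg_text):
        quoted, bare = match.group(2), match.group(3)
        forms[match.group(1)] = quoted if quoted is not None else bare.strip()
    return forms


def format_text(pattern, args):
    """Resolve plural constructions first, then plain {variables}."""
    def replace_plural(match):
        value = args[match.group(1)]
        # Mirrors the rule above: the argument must be a number, not a string.
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError("plural argument must be a number")
        return parse_forms(match.group(2))[english_category(value)]

    text = PLURAL_RE.sub(replace_plural, pattern)
    return re.sub(r"\{([^{}]+)\}", lambda m: str(args[m.group(1)]), text)


print(format_text("{x} {x}|plural(one=cat, other=cats)", {"x": 1}))
print(format_text("{x} {x}|plural(one=cat, other=cats)", {"x": 10}))
```

The two calls print 1 cat and 10 cats. Passing {"x": "10"} raises TypeError, matching the rule that x must be a number rather than a string containing one.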