Text capitalization in Unreal is culture-aware and based on ICU, so it works well overall. In Turkish, you can be sure that i becomes İ, while I is reserved as the capital form of ı.

Auto-capitalization isn’t a good thing to do

That said, auto-capitalization is not best practice. In Turkish, for example, English names must be capitalized according to English rules, and there is no way to tell these names apart from other words automatically.

Auto-capitalization and broken tags in Turkish

There’s also a technical problem with auto caps: rich text tags, or any other technical part of the text that is processed after being capitalized. These get capitalized according to Turkish rules too, which results in broken tags.

Examples:

  1. <img id="some_picture_id" /> becomes <İMG İD="SOME_PİCTURE_İD" /> and naturally, Unreal doesn’t recognize this as a tag and just displays it as is.

  2. Even if your tag names are all-caps but you inject the image names in lowercase, those will be broken sometimes: <IMG ID="{image_name}" /> becomes <IMG ID="right_stick" /> after you inject the name, which becomes <IMG ID="RİGHT_STİCK" />, and even though Unreal recognizes the tag it can’t find the image because the name’s now broken.
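To make the failure concrete, here is a minimal standalone C++ sketch. It does not use Unreal or ICU: TurkishUpperSketch is a hypothetical stand-in that only imitates the one Turkish rule that matters here (i becomes İ) on ASCII input stored as UTF-8.

```cpp
#include <string>

// Hypothetical stand-in for a Turkish culture-aware upper-casing step.
// Handles ASCII input only and applies the Turkish rule i -> İ (U+0130).
std::string TurkishUpperSketch(const std::string& Utf8Text)
{
    std::string Result;
    for (unsigned char Ch : Utf8Text)
    {
        if (Ch == 'i')
        {
            // İ is encoded as the two bytes C4 B0 in UTF-8.
            Result += "\xC4\xB0";
        }
        else if (Ch >= 'a' && Ch <= 'z')
        {
            // Every other ASCII letter is upper-cased the usual way.
            Result += static_cast<char>(Ch - 'a' + 'A');
        }
        else
        {
            // Quotes, spaces, underscores, angle brackets stay as they are.
            Result += static_cast<char>(Ch);
        }
    }
    return Result;
}
```

Passing a tag such as <img id="right_stick" /> through this function yields a string that no longer contains the ASCII sequences IMG or ID, which is why the tag parser can't match it.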

Solution

The solution is simple: you need to capitalize the tags and names yourself, according to the English rules:

  1. If you inject the tags or image names via a variable: Use all-caps for whatever you inject. E.g., if you have something like {key_image} to jump and you inject a rich text image tag instead of the variable, inject it in all caps. If you have the tags and/or image names in lowercase, use FString::ToUpper to capitalize them first. This function is not culture-aware, which is good for tags and names.
  2. If these tags are part of the strings: Ask your Turkish translators to make them all caps when they translate. E.g., <img id="id" /> should become <IMG ID="ID" /> in the Turkish translation. Better yet, make this conversion automatic during translation import or pre-import text processing, to remove the burden from the translators and leave no room for human error.
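The first option can be sketched in standalone C++ as follows. In an Unreal project you would call FString::ToUpper on the injected value instead; ToUpperAscii below is only an illustrative stand-in for a non-culture-aware conversion, and MakeImageTag is a hypothetical helper name, not an engine function.

```cpp
#include <string>

// ASCII-only upper-casing: a-z become A-Z, every other byte is left untouched.
// No locale is consulted, so 'i' always becomes 'I'.
std::string ToUpperAscii(std::string Text)
{
    for (char& Ch : Text)
    {
        if (Ch >= 'a' && Ch <= 'z')
        {
            Ch = static_cast<char>(Ch - 'a' + 'A');
        }
    }
    return Text;
}

// Hypothetical helper: builds an all-caps image tag that is safe to inject
// into text that will later be capitalized with Turkish rules.
std::string MakeImageTag(const std::string& ImageName)
{
    return "<IMG ID=\"" + ToUpperAscii(ImageName) + "\" />";
}
```

Because the injected tag contains no lowercase letters, a later culture-aware capitalization pass has nothing left to change in it.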

Overall, I still prefer to keep the tags and IDs in lowercase for the sake of readability, and to handle them with non-culture-aware capitalization via FString::ToUpper and by processing translated text before import.
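A pre-import pass along these lines could look roughly like the standalone C++ sketch below. UppercaseTags is a hypothetical name, and the sketch assumes that < and > appear in the translated string only as tag delimiters.

```cpp
#include <string>

// Upper-cases ASCII letters inside <...> spans only.
// Text outside tags, including non-ASCII UTF-8 bytes, is left untouched.
// Assumption: '<' and '>' are used only as tag delimiters in the input.
std::string UppercaseTags(std::string Text)
{
    bool bInsideTag = false;
    for (char& Ch : Text)
    {
        if (Ch == '<')
        {
            bInsideTag = true;
        }
        else if (Ch == '>')
        {
            bInsideTag = false;
        }
        else if (bInsideTag && Ch >= 'a' && Ch <= 'z')
        {
            Ch = static_cast<char>(Ch - 'a' + 'A');
        }
    }
    return Text;
}
```

One caveat with this simple approach: a placeholder such as {image_name} inside a tag would be upper-cased as well, so a real import step would need to skip anything in braces or upper-case the argument names to match.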

A word of warning: FString::ToUpper and FString::ToLower are super fast but dumb by design, so don’t use them for player-facing text or logic. They won’t handle it correctly, and this causes another issue as well. Have a read: /blog/fstring-and-non-ascii-chars/.