Sort Unreal POs to make them less random
Strings in Unreal Engine PO files sometimes come in an essentially random order, which is a disaster for translation. There is no 100% fix for this, but sorting the file by source references can help quite a lot.
To do this, you need to do the following:
- Install Python
- Install polib (hit Win + R, type cmd, hit Enter, then type pip install polib in the command-line window and hit Enter)
- Copy and save the following code as sort-po.py, replacing the file name with the path to your file:
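import polib

# Replace with the path to your PO file
file_path = "c:/paste/path/to/your/po"

pofile = polib.pofile(
    file_path,
    wrapwidth=0,           # do not re-wrap long lines on save
    encoding='utf-8-sig'   # UTF-8 with a byte order mark (BOM)
)
pofile.sort()  # order entries by their source references
pofile.save()  # overwrite the original file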
The pofile.sort() call is what does the trick.
A nicer script with drag-and-drop functionality
See the sort-po.py in the examples repo on GitHub to get a more advanced script that allows you to drag and drop one or more PO files onto it, creates backups, gives you some info on what’s going on, and is overall easier to use.
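To give a rough idea of what such a script involves, here is a minimal sketch. It is written for illustration and is not the script from the repo. It assumes that files dropped onto the script arrive as command-line arguments, which is how drag and drop usually works on Windows when Python is registered to open .py files. The helper names backup_path_for and sort_po_file are hypothetical, and the .bak suffix is just an assumed convention.

```python
import shutil
import sys
from pathlib import Path

try:
    import polib
except ImportError:
    # Lets main() print a readable hint instead of dying with a traceback
    polib = None


def backup_path_for(path):
    """Return the backup location: same folder, '.bak' appended (assumed convention)."""
    return path.with_name(path.name + ".bak")


def sort_po_file(path):
    """Back up one PO file, sort it in place, and return the number of entries."""
    shutil.copy2(path, backup_path_for(path))
    pofile = polib.pofile(str(path), wrapwidth=0, encoding="utf-8-sig")
    pofile.sort()
    pofile.save()
    return len(pofile)


def main(argv):
    """Sort every PO file named in argv; return 0 on success, 1 otherwise."""
    if polib is None:
        print("polib is not installed, run: pip install polib")
        return 1
    if not argv:
        print("Drag one or more PO files onto this script.")
        return 1
    for arg in argv:
        path = Path(arg)
        if not path.is_file():
            print(f"Skipped (not a file): {path}")
            continue
        count = sort_po_file(path)
        print(f"Sorted {count} entries in {path.name}, backup: {backup_path_for(path).name}")
    return 0


if __name__ == "__main__":
    main(sys.argv[1:])
```

A script meant for real drag-and-drop use would probably also pause at the end, so the console window stays open long enough to read the messages.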
How to tell if you’re looking at an Unreal PO?
- Its name is Game.po
- It has # Copyright Epic Games, Inc. in it.
- Some or even all keys in msgctxt look like ,001749B0473C86E14F557CB319B8F27C
- Entries look like this (Key: is always present, SourceLocation: could be optional, path in source reference #: starts with /Game/):
#. Key: 001749B0473C86E14F557CB319B8F27C
#. SourceLocation: /Game/FactoryGame/Recipes/OilRefinery/Recipe_ResidualPlastic.Default__Recipe_ResidualPlastic_C.mDisplayName
#: /Game/FactoryGame/Recipes/OilRefinery/Recipe_ResidualPlastic.Default__Recipe_ResidualPlastic_C.mDisplayName
msgctxt ",001749B0473C86E14F557CB319B8F27C"
msgid "Residual Plastic"
msgstr "Residual Plastic"
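The signs listed above can also be checked automatically. The following is only a heuristic sketch: the function names are hypothetical, the file-name check is left out, and the rule of requiring at least two signs is an assumption made here, not something Unreal defines.

```python
import re

# msgctxt of the form "<namespace>,<32 hex digits>", as in the sample entry
GUID_CTXT = re.compile(r'^msgctxt "[^"]*,[0-9A-F]{32}"\r?$', re.MULTILINE)


def unreal_po_signs(text):
    """Report which of the typical Unreal PO signs appear in the file text."""
    lines = "\n" + text  # so a sign on the very first line is found as well
    return {
        "epic_copyright": "# Copyright Epic Games, Inc." in text,
        "key_comments": "\n#. Key: " in lines,
        "game_references": "\n#: /Game/" in lines,
        "guid_msgctxt": bool(GUID_CTXT.search(text)),
    }


def looks_like_unreal_po(text):
    """Heuristic (assumed threshold): at least two signs must be present."""
    return sum(unreal_po_signs(text).values()) >= 2


sample = (
    "#. Key: 001749B0473C86E14F557CB319B8F27C\n"
    "#: /Game/FactoryGame/Recipes/OilRefinery/Recipe_ResidualPlastic"
    ".Default__Recipe_ResidualPlastic_C.mDisplayName\n"
    'msgctxt ",001749B0473C86E14F557CB319B8F27C"\n'
    'msgid "Residual Plastic"\n'
    'msgstr "Residual Plastic"\n'
)
print(unreal_po_signs(sample))
print(looks_like_unreal_po(sample))
```

To check a real file, read it with open(path, encoding="utf-8-sig").read() and pass the result to looks_like_unreal_po.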
Why is it randomized at all?
The reason it’s essentially randomized is that Unreal sorts it by key. By default, Unreal creates GUID keys for new strings, and a lot of developers just leave them as is, not knowing how bad that is for localization down the line. And if you sort the file by what are essentially hashes, the result is virtually random.
Why does sorting by reference make it better?
The other way to sort the PO is by source references, and that can help a lot with Unreal PO files because:
- Project structure is usually in better shape even if localization is neglected: folders, paths, and asset names have meaning
- Strings that belong to the same asset will be grouped together
- Strings that are part of the widget UI will be sorted better as well
- Strings that belong to the same graph or function will be grouped together
- Assets that are related usually live in the same folder, and all strings in them will be next to each other as well
A word of warning: it can’t help you if the project is organized badly or uses bad practices. E.g., if the game you’re working on has huge blueprint graphs, you’re out of luck: strings within such a graph will be grouped together in the sorted file, but inside that group they will remain randomized nevertheless.
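The difference between the two orderings can be shown with a toy example that needs no PO file at all. The keys and paths below are made up for illustration (real keys are 32-digit GUIDs), and folder_switches is a hypothetical helper that counts how often neighbouring entries come from different folders.

```python
# (key, source reference) pairs; all values are invented for this sketch
entries = [
    ("0A41", "/Game/UI/MainMenu.Title"),
    ("1C07", "/Game/Recipes/Recipe_Plastic.mDisplayName"),
    ("5E92", "/Game/UI/MainMenu.PlayButton"),
    ("7B30", "/Game/Recipes/Recipe_Plastic.mDescription"),
    ("9D58", "/Game/UI/MainMenu.QuitButton"),
    ("C2F6", "/Game/Recipes/Recipe_Rubber.mDisplayName"),
]


def folder_switches(rows):
    """Count how often neighbouring entries come from different folders."""
    folders = [ref.rsplit("/", 1)[0] for _key, ref in rows]
    return sum(a != b for a, b in zip(folders, folders[1:]))


# Sorting by key: related strings end up scattered across the file
by_key = sorted(entries, key=lambda row: row[0])

# Sorting by reference: the key is only used to break ties
by_reference = sorted(entries, key=lambda row: (row[1], row[0]))

print("folder switches when sorted by key:      ", folder_switches(by_key))
print("folder switches when sorted by reference:", folder_switches(by_reference))
for key, ref in by_reference:
    print(key, ref)
```

In this sketch, entries that share exactly the same reference still fall back to key order, which is the situation the warning above describes.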