Speak My Language: i18n and l10n in Django

A user of your app emails you to say they would love it if the greeting screen were in Spanish.

def greeting():
  print("Hello!")

You think to yourself, “No problem! I’ll just use a dictionary!”

dictionary = {
  "Hello!":"Hola!",
}

def greeting():
  print(dictionary.get("Hello!"))

Now, after running greeting(), you get “Hola!”. Perfect! You’re done localizing it into Spanish.

Afterwards, Mr. Foobar requests that you localize it into French as well.

Except you can’t, because each dictionary key maps to exactly one value. Adding a second “Hello!” key for “Bonjour!” would simply overwrite “Hola!”.

What you could do is add a separate dictionary for every language, but that runs into another problem:

It’s difficult to edit, and your translators are lost and confused because they aren’t programmers.
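To make that concrete, here is a rough sketch of the one-dictionary-per-language approach. The names (spanish, french, translations, and the language parameter) are made up for illustration.

```python
# Hypothetical per-language dictionaries, one for each target language.
spanish = {"Hello!": "Hola!"}
french = {"Hello!": "Bonjour!"}

# Ties a language code to the dictionary that holds its phrases.
translations = {"es": spanish, "fr": french}

def greeting(language):
    # Fall back to the English text when the language or phrase is missing.
    return translations.get(language, {}).get("Hello!", "Hello!")

print(greeting("es"))
print(greeting("fr"))
print(greeting("de"))
```

It works, but every new phrase means touching every dictionary, and all of them live in the source code, which is exactly where translators don't want to be.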

You could do it with a set of parallel arrays, and have something like:

english = ["Hello!"]
spanish = ["Hola!"]
french = ["Bonjour!"]

But what would happen if you didn’t have one word, but a thousand words in ten different languages? Ten parallel arrays with a thousand words in each (ignoring the fact that one word in English might be two words in Spanish)? And what if the localization isn’t complete? Would you insert blank strings into the array just to make the indexes line up?
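A small sketch of how quickly the indexes drift apart. The extra words and the translate helper are made up for illustration, and one translation has deliberately been left out.

```python
english = ["Hello!", "Goodbye!", "Thanks!"]
# "Goodbye!" was never translated, so everything after it shifts up by one.
spanish = ["Hola!", "Gracias!"]

def translate(word, target):
    # Find the word's position in the English array and reuse that index.
    index = english.index(word)
    return target[index]

# Looks up "Goodbye!" but lands on the translation of "Thanks!".
print(translate("Goodbye!", spanish))

try:
    translate("Thanks!", spanish)
except IndexError:
    # The last word now points past the end of the Spanish array.
    print("No translation at that index")
```

One missing entry silently corrupts every lookup after it, and nothing in the data tells you where the gap is.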

There are simply too many issues with parallel arrays, and the amount of vertical space you would need for a set of parallel arrays that large would be bad enough to summon Cthulhu from the depths of R’lyeh.

Most importantly, no sane translator is going to waste time counting the indexes to make sure they’re adding the word into the right index.

We need a solution that is easy to understand for translators, easy to maintain, and doesn’t require writing lots of code.

So what’s the solution proposed by Django?

Django’s Solution

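Django uses ugettext to do i18n, or internationalization. Internationalization is the act of making all the strings in your code translatable into a different language. (The name ugettext dates from Python 2, where it returned unicode strings. On Python 3 it is just an alias for gettext, and it was deprecated in Django 3.0 and removed in Django 4.0, so on current versions you import gettext instead.)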

Commonly, you will see ugettext imported like this:

from django.utils.translation import ugettext as _

For reasons that may forever be unknown to me, ugettext is almost always aliased to an underscore by convention.

Conventions aside, let’s create a dummy view in Django to show how ugettext works.

from django.shortcuts import render
from django.http import HttpResponse
from django.utils.translation import ugettext as _

def testView(request):
  output = _('Hello!')
  return HttpResponse(output)

To use ugettext, you simply call ugettext("STRING_HERE"), or _("STRING_HERE"), since the underscore is an alias for ugettext.

Next, you will need to create a folder called “locale” in the base directory of your project. Then, edit your settings.py and add:

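import os  # settings.py needs this for os.path.join
# Must be a list or tuple; without the trailing comma this is just a string.
LOCALE_PATHS = (os.path.join(BASE_DIR, "locale"),)
# Gives you BASE_DIR/locale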

This tells Django to look for translation files inside the “locale” folder.

Now comes the fun part. Run django-admin makemessages -l es from your project directory (this needs the GNU gettext tools installed). This will create a .po file called “django.po” in BASE_DIR/locale/es/LC_MESSAGES, where “es” is español, or Spanish.

The .po extension stands for Portable Object, and it is used to hold all of the phrases in the original language and the translated language.

If you open “django.po”, you will see, below the file header, an entry like the following:

#, python-format
msgid "Hello!"
msgstr ""

Now, edit the msgstr so that it reads as follows:

#, python-format
msgid "Hello!"
msgstr "Hola!"

We are adding a new translation for the Spanish version of “Hello!”.
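To show how little structure a .po entry really has, here is a minimal sketch that reads msgid/msgstr pairs into a plain dictionary. It is only an illustration, not how Django or gettext actually read the file: parse_po and PO_TEXT are made-up names, the “Goodbye!” entry is hypothetical, and multi-line strings, escapes, plurals and contexts are all ignored.

```python
import re

# A tiny .po snippet; the second entry is hypothetical and still untranslated.
PO_TEXT = '''
#, python-format
msgid "Hello!"
msgstr "Hola!"

msgid "Goodbye!"
msgstr ""
'''

def parse_po(text):
    """Collect single-line msgid/msgstr pairs into a dict."""
    catalog = {}
    msgid = None
    for line in text.splitlines():
        match = re.match(r'^(msgid|msgstr) "(.*)"$', line.strip())
        if not match:
            continue  # comments, flags and blank lines are skipped
        keyword, value = match.groups()
        if keyword == "msgid":
            msgid = value
        elif msgid is not None:
            # A msgstr completes the entry started by the previous msgid.
            catalog[msgid] = value
            msgid = None
    return catalog

print(parse_po(PO_TEXT))
```

The translator only ever touches the text between the quotes on the msgstr line, which is why the format is so hard to break.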

Now, we need to compile our .po file to a .mo file, which Django can use to process our translations.

Run django-admin compilemessages -l es

Once the .mo file is created, Django will be able to search for the input word or phrase, and output the translated word or phrase. After this step is done, the localization for Spanish is complete!
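Conceptually, the lookup behaves like the sketch below: a string with no translation comes back unchanged, so a half-finished localization still renders. The catalog dictionary and the translate helper are assumptions made for illustration. The real lookup happens inside gettext, not in a Python dict you write yourself.

```python
import gettext

# Hypothetical catalog, roughly what a compiled django.po would contain.
catalog = {"Hello!": "Hola!", "Goodbye!": ""}

def translate(msgid, catalog):
    # An empty or missing msgstr falls back to the original string.
    return catalog.get(msgid) or msgid

print(translate("Hello!", catalog))
print(translate("Goodbye!", catalog))

# The standard library's no-op catalog shows the same fallback behaviour.
print(gettext.NullTranslations().gettext("Hello!"))
```

This fallback is what answers the earlier "what if the localization isn't complete?" question: no blank placeholders are needed.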

But Why Isn’t My Text Translated?

If you ran the code, you may have noticed that the text didn’t translate. And you’re correct.

Why should it?

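Localization is intended to serve different languages to different users. Django tells Spanish-speaking and English-speaking users apart by the language preference your browser sends with each request (the Accept-Language header), provided django.middleware.locale.LocaleMiddleware is enabled in your MIDDLEWARE setting. Without that middleware, every request simply uses LANGUAGE_CODE. Django will only translate the page if the preferred language is “es”. If you live in the United States, yours is most likely “en” or “en-us”.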
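Per-request language selection is handled by Django's LocaleMiddleware, which a freshly generated project does not enable. A settings.py fragment along these lines is usually needed. The middleware list is shortened here and the LANGUAGES entries are only an example, so treat it as a sketch and keep whatever else your project already lists.

```python
# settings.py (fragment)
MIDDLEWARE = [
    "django.contrib.sessions.middleware.SessionMiddleware",
    # LocaleMiddleware goes after SessionMiddleware and before CommonMiddleware.
    "django.middleware.locale.LocaleMiddleware",
    "django.middleware.common.CommonMiddleware",
    # ... the rest of your project's middleware ...
]

USE_I18N = True          # turn on Django's translation machinery
LANGUAGE_CODE = "en-us"  # used when no better match is found for a request
LANGUAGES = [
    ("en", "English"),
    ("es", "Spanish"),
]
```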

If you want to test your code, change your browser’s preferred language to Spanish (“es”), and the translation will appear.
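Under the hood, the browser advertises its languages in the Accept-Language request header. The sketch below is a simplified stand-in for that matching step, not Django's actual code (the real logic lives in django.utils.translation.get_language_from_request). The pick_language function and its parameters are made up for illustration.

```python
def pick_language(accept_language, supported, default="en"):
    """Pick the best supported language from an Accept-Language header."""
    choices = []
    for part in accept_language.split(","):
        piece = part.strip()
        if not piece:
            continue
        tag, sep, params = piece.partition(";")
        quality = 1.0  # a tag without an explicit q value counts as 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        choices.append((quality, tag.strip().lower()))
    # Highest quality first; the sort is stable, so header order breaks ties.
    choices.sort(key=lambda choice: choice[0], reverse=True)
    for quality, tag in choices:
        if quality <= 0:
            continue
        base = tag.split("-")[0]  # "es-es" also matches plain "es"
        for candidate in (tag, base):
            if candidate in supported:
                return candidate
    return default

print(pick_language("es-ES,es;q=0.9,en;q=0.8", {"en", "es"}))
print(pick_language("en-US,en;q=0.9", {"en", "es"}))
```

A browser set to Spanish sends something like the first header, which is why switching the browser's language is enough to see the translated greeting.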

Conclusion

Django’s solution is essentially a gigantic dictionary split into multiple files, with each language being represented by a .po file. A huge benefit of this is that the code is partially decoupled from the translation. The .po file is generated from the code, but translators are unlikely to crash the website by editing the .po file, as they only have to insert their translation into the msgstr part.

As a result, the programmers are happy, as the code is separate from the translations, and the translators are happy, as the translations are separate from the code. It’s easy for both parties to access, meaning translators don’t have to fiddle with the code, and programmers don’t have to fiddle with long bundles of translated phrases.

Django’s solution is a perfect win-win for everyone.