Parsing gettext `.po` files with Python

Babel includes a .po file parser written in Python:

http://babel.edgewall.org/
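As a rough sketch of what that looks like, assuming Babel is installed and using its `read_po` function from `babel.messages.pofile`. The catalog text below is made up for illustration; with a real file you would pass a file object opened in binary mode instead of the `BytesIO` buffer:

```python
import io

from babel.messages.pofile import read_po

# A tiny, made-up catalog used only for illustration.
PO_TEXT = """\
msgid ""
msgstr ""
"Content-Type: text/plain; charset=UTF-8\\n"

msgid "Hello"
msgstr "Bonjour"

msgid "Goodbye"
msgstr ""
"""

# read_po takes a file-like object and returns a Catalog.
catalog = read_po(io.BytesIO(PO_TEXT.encode("utf-8")))

# Iterating a Catalog yields Message objects; the header entry has an
# empty id, so it is skipped here.
translations = {message.id: message.string for message in catalog if message.id}

for msgid, msgstr in translations.items():
    print(repr(msgid), "->", repr(msgstr))
```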

The built-in gettext module works only with binary .mo files.


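In most cases you don't need to parse .po files yourself. Developers give translators a .pot template file; the translators rename it to xx_XX.po and translate the strings. Then you, as the developer, only have to "compile" the result to .mo files using msgfmt from GNU's gettext tools (or msgfmt.py, the pure-Python equivalent shipped in the Tools/i18n directory of the Python source; pygettext, by contrast, is the tool that extracts strings into the .pot template).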
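If you want to drive that compile step from Python, one possible sketch is to shell out to `msgfmt` from GNU gettext. This assumes `msgfmt` is on your PATH; the helper names and the catalog path are hypothetical:

```python
import subprocess
from pathlib import Path


def msgfmt_command(po_path):
    """Build the msgfmt invocation that compiles po_path to a sibling .mo file."""
    po_path = Path(po_path)
    mo_path = po_path.with_suffix(".mo")
    return ["msgfmt", "-o", str(mo_path), str(po_path)]


def compile_catalog(po_path):
    """Run msgfmt; raises CalledProcessError if the compilation fails."""
    subprocess.run(msgfmt_command(po_path), check=True)


# Show the command that would be run for a hypothetical French catalog.
print(msgfmt_command("locale/fr/LC_MESSAGES/myapp.po"))
```

Calling `compile_catalog(...)` on a real .po file then produces the .mo file next to it.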

But if you want or need to parse the .po files yourself instead of compiling them, I strongly suggest using polib, a well-known Python library for handling .po files. It is used by several large-scale projects, such as Mercurial and Ubuntu's Launchpad translation engine:

PyPI package home: http://pypi.python.org/pypi/polib/

Code repository: https://github.com/izimobil/polib

(Original repository was hosted at Bitbucket, which no longer supports Mercurial: https://bitbucket.org/izi/polib/wiki/Home)

Documentation: http://polib.readthedocs.org

The importable module is a single file, released under the MIT license, so you can easily incorporate it into your code like this:

import polib

# Load the catalog and print each source string with its translation
po = polib.pofile('path/to/catalog.po')
for entry in po:
    print(entry.msgid, entry.msgstr)

It can't be easier than that ;)
