How do you change the character encoding of a postgres database?

To change the encoding of your database:

  1. Dump your database
  2. Drop your database,
  3. Create new database with the different encoding
  4. Reload your data.

Make sure the client encoding is set correctly during all this.

Source: http://archives.postgresql.org/pgsql-novice/2006-03/msg00210.php


First off, Daniel's answer is the correct, safe option.

For the specific case of changing from SQL_ASCII to something else, you can cheat and simply poke the pg_database catalogue to reassign the database encoding. This assumes you've already stored any non-ASCII characters in the expected encoding (or that you simply haven't used any non-ASCII characters).

Then you can do:

update pg_database set encoding = pg_char_to_encoding('UTF8') where datname = 'thedb'

This will not change the collation of the database, just how the encoded bytes are converted into characters (so now length('£123') will return 4 instead of 5). If the database uses 'C' collation, there should be no change to ordering for ASCII strings. You'll likely need to rebuild any indices containing non-ASCII characters though.

Caveat emptor. Dumping and reloading provides a way to check your database content is actually in the encoding you expect, and this doesn't. And if it turns out you did have some wrongly-encoded data in the database, rescuing is going to be difficult. So if you possibly can, dump and reinitialise.