How do I migrate a PostgreSQL database to a SQL Server one?

I have found a faster and easier way to accomplish this.

First, copy your table (or query) to a tab-delimited file like so:

COPY (SELECT siteid, searchdist, listtype, list, sitename, county, street, 
   city, state, zip, georesult, elevation, lat, lng, wkt, unlocated_bool, 
   id, status, standard_status, date_opened_or_reported, date_closed, 
   notes, list_type_description FROM mlocal) TO 'c:\SQLAzureImportFiles\data_script_mlocal.tsv' NULL E''

Next, create your table in SQL Server; this process will not handle any schema for you. The schema must match your exported TSV file in field order and data types.
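
A minimal sketch of what that table might look like (abridged to a few of the exported columns; the data types here are assumptions, not the real schema):

CREATE TABLE mlocal (
    siteid INT NULL,
    sitename VARCHAR(200) NULL,
    city VARCHAR(100) NULL,
    lat FLOAT NULL,
    lng FLOAT NULL,
    wkt VARCHAR(MAX) NULL,          -- geometry exported as WKT (see below)
    unlocated_bool VARCHAR(1) NULL  -- PostgreSQL boolean arrives as 't'/'f' (see below)
);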

Finally, run SQL Server's bcp utility to bring in the TSV file like so (the -c flag uses character mode, whose default tab field terminator matches the export):

bcp MyDb.dbo.mlocal in "\\NEWDBSERVER\SQLAzureImportFiles\data_script_mlocal.tsv" -S tcp:YourDBServer.database.windows.net -U YourUserName -P YourPassword -c

A couple of things I encountered are worth noting. PostgreSQL and SQL Server handle boolean fields differently. Your SQL Server schema needs to have its boolean fields set to varchar(1), and the resulting data will be 'f', 't', or null. You will then have to convert each such field to a bit by doing something like:

ALTER TABLE mlocal ADD unlocated bit;
UPDATE mlocal SET unlocated=1 WHERE unlocated_bool='t';
UPDATE mlocal SET unlocated=0 WHERE unlocated_bool='f';
ALTER TABLE mlocal DROP COLUMN unlocated_bool;

Another thing to note is that the geography/geometry fields are very different between the two platforms. Export the geometry fields as WKT using ST_AsText(geo) and convert appropriately on the SQL Server end.
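
A hedged sketch of the SQL Server side, assuming the WKT ended up in a varchar column named wkt, the data uses SRID 4326, and you want SQL Server's geography type:

ALTER TABLE mlocal ADD geo geography;

UPDATE mlocal SET geo = geography::STGeomFromText(wkt, 4326)
WHERE wkt IS NOT NULL;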

There may be more incompatibilities needing tweaks like this.

EDIT: While this technique does technically work, I am trying to transfer several million records from 100+ tables to SQL Azure, and it turns out that bcp to SQL Azure is pretty flaky. I keep getting intermittent "Unable to open BCP host data-file" errors, the server intermittently times out, and for some reason some records simply do not get transferred, with no indication of errors or problems. So this technique is not stable for transferring large amounts of data to Azure SQL.


You should be able to find some useful information in the accepted answer on this Server Fault page: https://serverfault.com/questions/65407/best-tool-to-migrate-a-postgresql-database-to-ms-sql-2005.

If you can get the schema converted without the data, you may be able to shorten the steps for the data by using this command:

pg_dump --data-only --column-inserts your_db_name > data_load_script.sql

This load will be quite slow, but the --column-inserts option generates the most generic INSERT statements possible for each row of data and should be compatible.
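
For illustration, the generated script is just one fully column-qualified INSERT per row, along these lines (the table and values here are hypothetical):

INSERT INTO mlocal (siteid, sitename, city) VALUES (42, 'Example Site', 'Springfield');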

EDIT: Suggestions on converting the schema follow:

I would start by dumping the schema, but removing anything that has to do with ownership or permissions. This should be enough:

pg_dump --schema-only --no-owner --no-privileges your_db_name > schema_create_script.sql

Edit this file to add the line BEGIN TRANSACTION; to the beginning and ROLLBACK TRANSACTION; to the end. Now you can load it and run it in a query window in SQL Server. If you get any errors, make sure you go to the bottom of the file, highlight the ROLLBACK statement and run it (by hitting F5 while the statement is highlighted).
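
In other words, the edited file ends up shaped like this:

BEGIN TRANSACTION;

-- ...the pg_dump schema output, edited as you resolve errors, goes here...

ROLLBACK TRANSACTION;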

Basically, you have to resolve each error until the script runs through cleanly. Then you can change the ROLLBACK TRANSACTION to COMMIT TRANSACTION and run one final time.

Unfortunately, I cannot help with which errors you may see as I have never gone from PostgreSQL to SQL Server, only the other way around. Some things that I would expect to be an issue, however (obviously, NOT an exhaustive list):

  • PostgreSQL does auto-increment fields by linking a NOT NULL INTEGER field to a SEQUENCE using a DEFAULT. In SQL Server, the closest equivalent is an IDENTITY column, but they're not exactly the same thing, so if your original schema is full of sequence-backed "id" fields, you may be in for some trouble. SQL Server only gained CREATE SEQUENCE in the 2012 release, so on earlier versions you will have to remove the sequences and switch to IDENTITY (see the sketch after this list).
  • Database functions / Stored Procedures do not translate between RDBMS platforms. You'll need to remove any CREATE FUNCTION statements and translate the algorithms manually.
  • Be careful about the encoding of the data file. I'm a Linux person, so I have no idea how to verify encoding in Windows, but you need to make sure that what SQL Server expects matches the file you are importing from PostgreSQL. pg_dump has an --encoding= option that will let you set a specific encoding. I seem to recall that Windows tends to use two-byte UTF-16 encoding for Unicode, whereas PostgreSQL uses UTF-8. I had some issues going from SQL Server to PostgreSQL due to UTF-16 output, so it would be worth researching.
  • The PostgreSQL data type TEXT is simply a VARCHAR without a max length. In SQL Server, TEXT is... complicated (and deprecated). Each field in your original schema that is declared as TEXT will need to be reviewed for an appropriate SQL Server data type (VARCHAR(MAX) is the usual modern replacement).
  • SQL Server has separate data types for Unicode data (the N-prefixed types such as NVARCHAR). I'm not familiar enough with it to make suggestions. I'm just pointing out that it may be an issue.
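
To make the auto-increment, TEXT, and Unicode points concrete, here is a minimal, hypothetical sketch of a translated table (the names and sizes are illustrative, not from any real schema):

CREATE TABLE example_table (
    id INT IDENTITY(1,1) NOT NULL PRIMARY KEY, -- replaces an INTEGER column defaulted from a SEQUENCE
    notes VARCHAR(MAX) NULL,                   -- replaces PostgreSQL TEXT
    sitename NVARCHAR(200) NULL                -- N-prefixed types hold Unicode data
);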