How to migrate a PostgreSQL database into a SQL Server one?
I have found a faster and easier way to accomplish this.
First, copy your table (or query result) to a tab-delimited file, like so:
COPY (SELECT siteid, searchdist, listtype, list, sitename, county, street,
city, state, zip, georesult, elevation, lat, lng, wkt, unlocated_bool,
id, status, standard_status, date_opened_or_reported, date_closed,
notes, list_type_description FROM mlocal) TO 'c:\SQLAzureImportFiles\data_script_mlocal.tsv' NULL E''
Next, create your table in SQL Server; this process will not handle any schema conversion for you. The table must match your exported tsv file in column order and data types.
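For illustration, a matching table definition might look something like this (the data types here are guesses from the column names and must be adjusted to your actual data; only the column order is essential):
CREATE TABLE dbo.mlocal (
    siteid int,
    searchdist int,
    listtype varchar(50),
    list varchar(255),
    sitename varchar(255),
    county varchar(100),
    street varchar(255),
    city varchar(100),
    state varchar(50),
    zip varchar(10),
    georesult varchar(100),
    elevation float,
    lat float,
    lng float,
    wkt varchar(max),           -- WKT text; converted to a spatial type later
    unlocated_bool varchar(1),  -- 't', 'f' or NULL; converted to bit below
    id int,
    status varchar(50),
    standard_status varchar(50),
    date_opened_or_reported datetime,
    date_closed datetime,
    notes varchar(max),
    list_type_description varchar(255)
);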
Finally, run SQL Server's bcp utility to bring in the tsv file, like so:
bcp MyDb.dbo.mlocal in "\\NEWDBSERVER\SQLAzureImportFiles\data_script_mlocal.tsv" -S tcp:YourDBServer.database.windows.net -U YourUserName -P YourPassword -c
A couple of things I encountered are worth noting. Postgres and SQL Server handle boolean fields differently: your SQL Server schema needs its boolean fields set to varchar(1), and the imported data will be 't', 'f', or null. You will then have to convert each such field to a bit by doing something like:
ALTER TABLE mlocal ADD unlocated bit;
UPDATE mlocal SET unlocated=1 WHERE unlocated_bool='t';
UPDATE mlocal SET unlocated=0 WHERE unlocated_bool='f';
ALTER TABLE mlocal DROP COLUMN unlocated_bool;
Another thing to note is that the geography/geometry fields are handled very differently between the two platforms. Export the geometry fields as WKT using ST_AsText(geo) and convert them appropriately on the SQL Server end, as in the sketch below.
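For example, on the SQL Server side something like this should work (a sketch assuming the WKT text landed in the varchar column wkt from the import above, and assuming SRID 4326; adjust to your data):
ALTER TABLE mlocal ADD geo geometry;
UPDATE mlocal SET geo = geometry::STGeomFromText(wkt, 4326) WHERE wkt IS NOT NULL;
ALTER TABLE mlocal DROP COLUMN wkt;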
There may be more incompatibilities needing tweaks like this.
EDIT: While this technique does technically work, I am trying to transfer several million records from 100+ tables to SQL Azure, and it turns out bcp to SQL Azure is pretty flaky. I keep getting intermittent "Unable to open BCP host data-file" errors, the server intermittently times out, and for some reason some records simply do not get transferred, with no indication of any error or problem. So this technique is not stable for transferring large amounts of data to Azure SQL.
You should be able to find some useful information in the accepted answer on this Server Fault page: https://serverfault.com/questions/65407/best-tool-to-migrate-a-postgresql-database-to-ms-sql-2005.
If you can get the schema converted without the data, you may be able to shorten the steps for the data by using this command:
pg_dump --data-only --column-inserts your_db_name > data_load_script.sql
This load will be quite slow, but the --column-inserts option generates the most generic INSERT statements possible for each row of data and should be compatible.
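For illustration only (the values are invented, and the column list is abbreviated for readability; the real output names every column of the table), each row comes out as a standalone INSERT:
INSERT INTO mlocal (siteid, sitename, lat, lng, unlocated_bool) VALUES (1, 'Example Site', 40.69, -73.99, 'f');
Note that the dump also begins with some PostgreSQL-specific SET statements, which you will likely need to strip before running the script against SQL Server.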
EDIT: Suggestions on converting the schema follow:
I would start by dumping the schema, but removing anything that has to do with ownership or permissions. This should be enough:
pg_dump --schema-only --no-owner --no-privileges your_db_name > schema_create_script.sql
Edit this file to add the line BEGIN TRANSACTION; to the beginning and ROLLBACK TRANSACTION; to the end. Now you can load it and run it in a query window in SQL Server. If you get any errors, make sure you go to the bottom of the file, highlight the ROLLBACK statement and run it (by hitting F5 while the statement is highlighted).
Basically, you have to resolve each error until the script runs through cleanly. Then you can change the ROLLBACK TRANSACTION to COMMIT TRANSACTION and run one final time.
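In other words, the edited script ends up shaped like this:
BEGIN TRANSACTION;
-- ...all of the statements generated by pg_dump, fixed up one error at a time...
ROLLBACK TRANSACTION; -- change to COMMIT TRANSACTION once the script runs cleanly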
Unfortunately, I cannot help with which errors you may see as I have never gone from PostgreSQL to SQL Server, only the other way around. Some things that I would expect to be an issue, however (obviously, NOT an exhaustive list):
- PostgreSQL does auto-increment fields by linking a NOT NULL INTEGER field to a SEQUENCE using a DEFAULT. In SQL Server, this is an IDENTITY column, but they're not exactly the same thing. I'm not sure if they are equivalent, but if your original schema is full of "id" fields, you may be in for some trouble. I don't know if SQL Server has CREATE SEQUENCE, so you may have to remove those (see the sketch after this list).
- Database functions / stored procedures do not translate between RDBMS platforms. You'll need to remove any CREATE FUNCTION statements and translate the algorithms manually.
- Be careful about the encoding of the data file. I'm a Linux person, so I have no idea how to verify encoding in Windows, but you need to make sure that what SQL Server expects matches the file you are importing from PostgreSQL. pg_dump has an --encoding= option that will let you set a specific encoding. I seem to recall that Windows tends to use two-byte UTF-16 encoding for Unicode, where PostgreSQL uses UTF-8. I had some issues going from SQL Server to PostgreSQL due to UTF-16 output, so it would be worth researching.
- The PostgreSQL datatype TEXT is simply a VARCHAR without a max length. In SQL Server, TEXT is... complicated (and deprecated). Each field in your original schema that is declared as TEXT will need to be reviewed for an appropriate SQL Server data type.
- SQL Server has extra data types for UNICODE data. I'm not familiar enough with it to make suggestions; I'm just pointing out that it may be an issue.
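To make the first, fourth, and fifth points concrete, here is a hypothetical before-and-after for a single table (the names are invented; verify the IDENTITY behavior and the Unicode requirements against your actual schema and data):
-- PostgreSQL, roughly as pg_dump emits it:
CREATE TABLE example (
    id integer NOT NULL DEFAULT nextval('example_id_seq'::regclass),
    notes text
);
-- One possible SQL Server translation:
CREATE TABLE example (
    id int IDENTITY(1,1) NOT NULL,
    notes varchar(max) -- or nvarchar(max) if the column holds Unicode text
);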