Get WAL files from AWS RDS PostgreSQL instance

Update: I've posted about this to the AWS forums - please go chime in and ask for it there.


At time of writing, Amazon RDS does not support physical replication outside RDS. You can GRANT users the REPLICATION right using an rds_superuser login, but you can't configure replication entries for outside IPs in pg_hba.conf.

Furthermore, when you create a DB parameter group in RDS, some key parameters are shown but locked, e.g. archive_command, which is locked to /etc/rds/dbbin/pgscripts/rds_wal_archive %p. AWS RDS for PostgreSQL does not appear to expose these WALs for external access (say, via S3) as it would need to if you were to use WAL-shipping replication for external PITR.

So at this point, if you want wal-shipping, don't use RDS. It's a canned easy-to-use database, but easy-to-use often means that it's also limited, and that's certainly the case here. As Joe Love points out in the comments, it provides WAL shipping and PITR within RDS, but you can't get access to the WAL to that from outside RDS.

So you need to use RDS's own backup facilities - dumps, snapshots and its own WAL-based PITR.


Even if RDS did let you make replication connections (for pg_basebackup or streaming replication) and allowed you to access archived WAL, you might not be able to actually consume that WAL. RDS runs a patched PostgreSQL, though nobody knows how heavily patched or whether it significantly alters the on-disk format. It also runs on the architecture selected by Amazon, which is probably x64 Linux, but not easily determined. Since PostgreSQL's on disk format and replication are architecture dependent, you could only replicate to hosts with the same architecture as that used by Amazon RDS, and only if your PostgreSQL build was compatible with theirs.

Among other things this means that you don't have any easy way to migrate away from RDS. You'd have to stop all writes to the database for long enough to take a pg_dump, restore it, and get the new DB running. The usual tricks with replication and failover, with rsync, etc, won't work because you don't have direct access to the DB host.

Even if RDS ran an unpatched PostgreSQL Amazon probably wouldn't want to permit you to do WAL streaming into RDS or import into RDS using pg_basebackup for security reasons. PostgreSQL treats the data directory as trusted content, and if you've crafted any clever 'LANGUAGE c' functions that hook internal functionality or done anything else tricky you might be able to exploit the server to get greater access than you're supposed to have. So Amazon aren't going to permit inbound WAL anytime soon.

They could support outbound WAL sending, but the above issues with format compatibility, freedom to make changes, etc still apply.


Instead you should use a tool like Londiste or Bucardo.


This is now possible using Logical Replication: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_PostgreSQL.html#PostgreSQL.Concepts.General.FeatureSupport.LogicalReplication