Local development and staging with Amazon Redshift

Amazon Redshift was specifically created to run on AWS infrastructure. It is not available as a download. (Interestingly, Amazon DynamoDB does have a downloadable version for development purposes.)

The cheapest option might be to shutdown your Dev & Test instances each night and on weekends. Take a snapshot before deleting the cluster, then create a cluster the next morning based on the snapshot. This can be automated via the AWS Command-Line Interface (CLI), making it easy to schedule with cron or Scheduled Tasks.

You could also have a snapshot of Test data and restore that snapshot each morning, which means the test database doesn't fill-up with test cases.

Another cost saving might be to reduce the number of nodes for the non-production systems. Queries will run slower and the total amount of storage will be reduced, but it could be more cost-effective. Or even use a "Dense Storage" 2TB node instead of several "Dense Compute" SSD instances -- they will provide more storage on less nodes.


In addition to John Rotenstiens which lays out how to reduce costs if you have decided to run a second cluster for staging, there are some other options, for when your use case is non mission critical.

As Redshift is a fork of postgres 8, you can use the Amazon-provided postgresql 8.4 JDBC or ODBC drivers, and point them to a locally running postgres 8 instance. This works well during development, since what works here will usually work on your production system (there are some exceptions).

The other option is to have a separate table on your Redshift cluster to run non-production activities. This might good for you test suite and "final testing" development.

Then you can stage your deploy into production and monitor the staging environment for issues before the full deploy.


Another cost-cutting solution is to treat each database as an environment in a single cluster. Databases cost nothing, and you're allowed 60 of them in a cluster

We've tried the Postgres-as-emulator solution, and it's been kind of OK, but

  • The performance characteristics are radically different
  • It's easy to use Postgres features that are not in Redshift (or vice versa)
  • It's a pain to maintain a schema that has optional parts (indexes for one, sort keys for the other, for example).

We've backed away from that for the moment, though as we get larger we may have to use a hybrid solution where acceptance testing & staging are databases in Redshift, while developers go back to using Postgres.