Amazon RDS for MySQL vs installing MySQL on an Amazon EC2 instance
I'm a big AWS fan in general... but RDS, not so much.
@RolandoMySQLDBA has pointed out are some pretty good points against it.
The only advantage I see in RDS compared to MySQL on EC2 are the ability to do point and click snapshots, clones, and point-in-time recovery, but these are not nearly sufficient to make up for the loss of control and flexibility and they most certainly don't justify the price being higher. RDS is sexy in some ways, but you can't ultimately trust what you can't ultimately fix, because you can't get to all the moving parts.
I don't like not having the SUPER
privilege. I don't like not being able to tail the error log. I don't like not being able to run "top" or "iostat" on my database server to see how the cores and drives are enjoying the load. I don't like not having access to the federated storage engine. I don't like the thought of paying for a hot standyby (multi-AZ) backup master machine that I can't even leverage as a read replica. Sure, there are perfectly reasonable explanations why all of these constraints have to be in place for MySQL to be successfully packaged and sold as RDBMS-in-a-box. The only thing is, RDBMS-in-a-box "solves" a whole series of problems I don't have... and gets in my way, causing new problems.
But the absolute deal-breaker for me with RDS is the complete lack of access to the binary logs and replication. Binlogs, especially row-based, are a fantastic recovery tool for minor disasters, but they are of no help to you if you can't access them. Want to configure an on-premise server at your office as a read-replica of your production database in RDS? Something to take local backups from, do reporting, have on hand for disaster recovery should something unthinkable happen to your data that lives in RDS? That's an idea that is simultaneously obvious and brilliant.
Oops, sorry, direct access to replication is not available. Sure, you can pay for read replicas... but only as other RDS instances. Not a value proposition in my book.
Update: One Significant Change in RDS for MySQL 5.6
I still prefer running my own server (even in EC2) as opposed to running RDS for a number of reasons, including the lack of support for User-Defined Functions, the inability to use the Federated Storage Engine, and the inability to have the one extra connection available for emergency access... however...
Amazon has made a significant change in MySQL 5.6 for RDS that eliminates one of my major objections -- perhaps my largest objection: the binary logs are now accessible and you can run a non-RDS instance as a slave, or connect other utilities to the server that read the binlog stream.
http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/MySQL.Procedural.Exporting.NonRDSRepl.html
Officially, the documentation indicates that they are exposing this so that you can set up a slave for the purpose of doing a live migration -- you synchronize the foreign future master server from the existing RDS instance using mysqldump
, then connect it to RDS as a slave to get a live feed of updates via the replication stream until your application is migrated to the new master -- but unofficially, you can do this on an ongoing basis as long as you don't expect them to support you... which, to me, seems reasonable.
This was confirmed in a recent Webinar, in a conversation that begins at around 56:45:
"You can keep it in a replicated state indefinitely...
"...as long as you take the responsibility to maintain the replication..."
"We are not preventing you from doing ongoing replication if that's what you want."
This new capability was enough for me to drop my blanket objection to using RDS in our public-facing website-backing MySQL instances, where we don't use FEDERATED
or some of the other things as much or at all.
So I'm still not in favor of it, but I'm no longer against it, since having a live stream of the binary logs puts me ultimately back in control of the data in real time and the responsibility for ensuring that no transactions are lost in a catastrophic outage is back with me, because I, as the DBA, am back in control -- which is exactly how I want it. Having a third party vendor to point fingers at, or file a lawsuit against, or whatever, doesn't get your lost data back if it disappears down a black hole inside a black box.
Management seems to like the "idea" of RDS and doesn't object to the cost difference, so we are now launching all new web sites with RDS behind them.
The point and click point-in-time recovery, I admit, is a nice feature in RDS... it doesn't alter or disrupt your existing machine -- instead, it fires up an entirely new instance, using the backup that was closest-in-time to the selected point in time, and then applies the necessary binlogs to bring that new machine forward to the point in time that you've specified.
Related to this, but in the other direction, it's also possible, now, to use a similar strategy to migrate a live MySQL database into RDS... you can connect an RDS master (presumably, typically, this would be a newly-deployed instance) as a slave of an existing system so that the RDS instance has the live version of the data at the time you migrate into it. Unlike access to the RDS binlogs for outward replication, which only works in 5.6, the inward replication is supported in RDS beginning with 5.5.33 and 5.6.13.
If scaling out DB Servers is not your cup of tea, then Amazon RDS is OK to use because all bells and whistles come with it. Those who simply want moderate HA, backups, and scaling out benefit a great deal.
On the flip side, if you want to scale up hardware, that is out of the question for RDS. What if you want to scale up MySQL's capabilities? Unfortunately, that is out of the question for many aspects one would want.
For example, did you know that two fields are capped across all seven(7) RDS server models?
- max_connections
- innodb_buffer_pool_size
Note the following chart on these two options:
MODEL max_connections innodb_buffer_pool_size
--------- --------------- -----------------------
t1.micro 34 326107136 ( 311M)
m1-small 125 1179648000 ( 1125M, 1.097G)
m1-large 623 5882511360 ( 5610M, 5.479G)
m1-xlarge 1263 11922309120 (11370M, 11.103G)
m2-xlarge 1441 13605273600 (12975M, 12.671G)
m2-2xlarge 2900 27367833600 (26100M, 25.488G)
m2-4xlarge 5816 54892953600 (52350M, 51.123G)
You are not given SUPER privilege and there is no direct access to my.cnf
. In light of this, in order to change my.cnf
options for startup, you must first create a MySQL-based DB Parameter Option List and use the RDS CLI (Command Line Interface)
to change the desired Options. Then, you must do this to import the new options:
- Create a Custom DB Parameter Group (call it
MySettings
) - Download RDS CLI and setup a config file with your AWS Credentials
- Execute the following :
./rds-modify-db-parameter-group MySettings --parameters "name=whateveroption,value=whatevervalue,method=immediate"
- Modify using DB Parameter Option List
MySettings
- Restart the MySQL RDS Instance
Using an API for updating single variable and doing a compulsory restart of the RDS instance to implement the change? That's quite a painful process to tweek any one option.
If you want to scale up MySQL, please use EC2. Then, you can tweek my.cnf
to your liking like you have always done and have been used to.