NoSQL - MongoDB vs CouchDB
If you are coming from the MySQL world, MongoDB is going to "feel" a lot more natural to you because of its query-like language support.
I think that is what makes it so friendly for a lot of people.
CouchDB is fantastic if you want to utilize the really great master-master replication support with a multi-node setup, possibly in different data centers or something like that.
MongoDB's replication (replica sets) is a master-slave-slave-slave-* setup, you can only write to the master in a replica set and read from any of them.
For a standard site configuration, that is fine. It maps to MySQL usage really well.
But if you are trying to create a global service like a CDN that needs to keep all global nodes synced even though read/write to all of them, something like the replication in CouchDB is going to be a huge boon to you.
While MongoDB has a query-like language that you can use and feels very intuitive, CouchDB takes a "map-reduce" approach and this concepts of views. It feels odd at first, but as you get the hang of it, it really starts feeling intuitive.
Here is a quick overview so it makes some sense:
- CouchDB stores all your data in a b-tree
- You cannot "query" it dynamically with something like "SELECT * FROM user WHERE..."
- Instead, you define discrete "views" of your data... "here is a view of all my users", "here is a view of all users older than 10" "here is a view of all users older than 30" and so on.
- These views are defined using map-reduce approach and are defined as JavaScript functions.
- When you define a view, the DB starts feeding all the documents of the DB you assigned the view to, through it and recording the results of your functions as the "index" on that data.
- There are some basic queries you can do on the views like asking for a specific key (ID) or range of IDs regardless of what your map/reduce function does.
- Read through these slides, it's the best clarification of map/reduce in Couch I've seen.
So both of these sources use JSON documents, but CouchDB follows this more "every server is a master and can sync with the world" approach which is fantastic if you need it, while MongoDB is really the MySQL of the NoSQL world.
So if that sounds more like what you need/want, go for that.
Little differences like Mongo's binary protocol vs the RESTful interface of CouchDB are all minor details.
If you want raw speed and to hell with data safety, you can make Mongo run faster than CouchDB as you can tell it to operate out of memory and not commit things to disk except for sparse intervals.
You can do the same with Couch, but it's HTTP-based communication protocol is going to be 2-4x slower than raw binary communication with Mongo in this "speed over everything!" scenario.
Keep in mind that raw crazy insane speed is useless if a server crash or disk failure corrupts and toasts your DB into oblivion, so that data point isn't as amazing as it might seem (unless you are doing real-time trading systems on Wall Street, in which case look at Redis).
Hope that all helps!
See following links
- CouchDB Vs MongoDB
- MongoDB or CouchDB - fit for production?
- DB-Engines - Comparison CouchDB vs. MongoDB
Update: I found great comparison of NoSQL databases.
MongoDB (3.2)
- Written in: C++
- Main point: JSON document store
- License: AGPL (Drivers: Apache)
- Protocol: Custom, binary (BSON)
- Master/slave replication (auto failover with replica sets)
- Sharding built-in
- Queries are javascript expressions
- Run arbitrary javascript functions server-side
- Has geospatial indexing and queries
- Multiple storage engines with different performance characteristics
- Performance over features
- Document validation
- Journaling
- Powerful aggregation framework
- On 32bit systems, limited to ~2.5Gb
- Text search integrated
- GridFS to store big data + metadata (not actually an FS)
- Data center aware
Best used: If you need dynamic queries. If you prefer to define indexes, not map/reduce functions. If you need good performance on a big DB. If you wanted CouchDB, but your data changes too much, filling up disks.
For example: For most things that you would do with MySQL or PostgreSQL, but having predefined columns really holds you back.
CouchDB (1.2)
- Written in: Erlang
- Main point: DB consistency, ease of use
- License: Apache
- Protocol: HTTP/REST
- Bi-directional (!) replication,
- continuous or ad-hoc,
- with conflict detection,
- thus, master-master replication. (!)
- MVCC - write operations do not block reads
- Previous versions of documents are available
- Crash-only (reliable) design
- Needs compacting from time to time
- Views: embedded map/reduce
- Formatting views: lists & shows
- Server-side document validation possible
- Authentication possible
- Real-time updates via '_changes' (!)
- Attachment handling
Best used: For accumulating, occasionally changing data, on which pre-defined queries are to be run. Places where versioning is important.
For example: CRM, CMS systems. Master-master replication is an especially interesting feature, allowing easy multi-site deployments.