What are the differences between NoSQL and a traditional RDBMS?

NoSQL stands for "Not only SQL" and usually means that the database is not a relational database, the kind that has been dominant for the last few decades.

The main reason NoSQL has become so popular in the last few years is that a relational database becomes much harder to use once it outgrows a single server. In other words, relational databases don't scale out very well in a distributed system. All of the big sites that you mentioned (Google, Yahoo, Facebook and Amazon; I don't know much about Digg) have lots of data and store it in distributed systems for several reasons. It could be that the data doesn't fit on one server, or that there are requirements for high availability.

CAP Theorem

The properties of a distributed system can be described by the CAP Theorem. Of the following three properties, you can have at most two:

Consistency
Availability
Partition tolerance (tolerance to network partitioning)

Amazon Dynamo uses eventual consistency to come close to getting all three properties. The paper "Dynamo: Amazon’s Highly Available Key-value Store" is worth reading when learning about NoSQL databases and distributed systems. Amazon Dynamo has the A and P properties.

Google takes a different approach with Bigtable, which favors consistency over availability and is usually classified as having the C and P properties.
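To make the idea of eventual consistency more concrete, here is a minimal sketch in Python. It is not how Dynamo is implemented: the Dynamo paper describes vector clocks and quorums, while this example assumes a much simpler "highest version number wins" rule. The names Replica and sync are hypothetical and exist only for illustration.

```python
class Replica:
    """One copy of the data; maps key -> (version, value)."""

    def __init__(self, name):
        self.name = name
        self.data = {}

    def write(self, key, value, version):
        # Keep the incoming value only if it is newer than what we hold.
        current = self.data.get(key)
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def read(self, key):
        entry = self.data.get(key)
        return entry[1] if entry else None


def sync(a, b):
    """Exchange entries in both directions so each replica keeps the newest version."""
    for key in set(a.data) | set(b.data):
        for src, dst in ((a, b), (b, a)):
            if key in src.data:
                version, value = src.data[key]
                dst.write(key, value, version)


r1, r2 = Replica("r1"), Replica("r2")
r1.write("cart", ["book"], version=1)
r2.write("cart", ["book", "pen"], version=2)  # replicas diverge during a partition

stale = r1.read("cart")  # r1 still answers, but with the older value
sync(r1, r2)             # once the partition heals, the replicas converge
fresh = r1.read("cart")
```

Both replicas stay available while they disagree, and a read may return an older value until the replicas have synchronized. That is the trade an A and P system makes.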

Other NoSQL databases

As I wrote at the beginning, there are many other kinds of NoSQL databases, designed for different requirements, e.g. graph databases like Neo4j, document databases like CouchDB and multi-model / object databases like OrientDB.

Finally, I would like to say that relational databases will remain popular. They are very flexible and maintainable, but they are not always the best choice.

NoSQL is a very broad term and is typically taken to mean "Not Only SQL." The term is falling out of favor in the non-RDBMS community.

You'll find that NoSQL databases have few characteristics in common. They can be roughly divided into a few categories:

key/value stores
Bigtable inspired databases (based on the Google Bigtable paper)
Dynamo inspired databases
distributed databases
document databases
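As a rough illustration of how some of these categories differ, the sketch below lays out the same record in three shapes using plain Python dictionaries. It is not the API of any real product. The keys, field names and the helpers latest and find_by_field are made up for the example, and the Bigtable-style layout only mirrors the (row, column, timestamp) addressing described in the Bigtable paper.

```python
# Key/value store: the value is an opaque blob; the database cannot look inside it.
key_value = {"user:1": b'{"name": "Ann", "city": "Oslo"}'}

# Document store: the value is a structured document whose fields can be queried.
documents = {"user:1": {"name": "Ann", "city": "Oslo"}}

# Bigtable-style store: each cell is addressed by (row key, column, timestamp).
bigtable_style = {
    ("user:1", "profile:name", 1): "Ann",
    ("user:1", "profile:city", 1): "Oslo",
    ("user:1", "profile:city", 2): "Bergen",  # newer version of the same cell
}


def latest(cells, row, column):
    """Return the value with the highest timestamp for one cell, or None."""
    versions = [(ts, value) for (r, c, ts), value in cells.items()
                if r == row and c == column]
    return max(versions)[1] if versions else None


def find_by_field(docs, field, value):
    """Return the sorted keys of all documents whose field equals value."""
    return sorted(key for key, doc in docs.items() if doc.get(field) == value)


current_city = latest(bigtable_style, "user:1", "profile:city")
oslo_users = find_by_field(documents, "city", "Oslo")
```

The key/value form can only be fetched by key, the document form can be searched by field, and the Bigtable-style form keeps several timestamped versions of each cell.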

This is a huge question, but it's fairly well answered in this Survey of Distributed Databases.

For a short answer:

NoSQL databases may dispense with various portions of ACID in order to achieve certain other benefits: partition tolerance, performance, load distribution, or linear scaling with the addition of new hardware.

As for when to use them, that depends entirely on the needs of your application.

NoSQL is a kind of database that doesn't have a fixed schema like a traditional RDBMS does. With NoSQL databases the schema is defined by the developer at run time. Developers don't write normal SQL statements against the database, but instead use an API to get the data that they need. NoSQL databases can usually scale across different physical servers easily, without the application needing to know which server the data it is looking for is on.
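The points above (no fixed schema, API access instead of SQL, and not needing to know which server holds the data) can be sketched with a toy in-memory store. This is an illustrative assumption, not the client API of any real NoSQL product: the Cluster class and its put and get methods are hypothetical, and the simple hash-modulo routing stands in for the more robust schemes real systems use, such as consistent hashing.

```python
import hashlib


class Cluster:
    """Toy key/value cluster: each key is routed to a node by hashing it."""

    def __init__(self, node_count):
        # Each "server" is just a dictionary in this sketch.
        self.nodes = [{} for _ in range(node_count)]

    def _node_for(self, key):
        # The same key always hashes to the same node, so callers never pick a server.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return self.nodes[int(digest, 16) % len(self.nodes)]

    def put(self, key, document):
        self._node_for(key)[key] = document

    def get(self, key):
        return self._node_for(key).get(key)


db = Cluster(node_count=3)

# Two records with different fields: no table definition or migration is needed.
db.put("user:1", {"name": "Ann", "email": "ann@example.com"})
db.put("user:2", {"name": "Bob", "tags": ["admin"]})

ann = db.get("user:1")
bob = db.get("user:2")
```

The caller only ever uses put and get; which node stores a given key is decided inside the store.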

However, there are some trade-offs for all this flexibility: NoSQL databases lack many features compared to RDBMSs like SQL Server, Oracle, DB2, MySQL, etc. There's no Service Broker, transaction logging, ETL packages, etc.

NoSQL isn't something new. Non-relational data storage has actually been around for 50-60 years, in the form of the record-oriented files that COBOL programs worked with. Same exact idea, just a different group came up with it.
