What does "Document-oriented" vs. Key-Value mean when talking about MongoDB vs Cassandra?

A key-value store provides the simplest possible data model and is exactly what the name suggests: it's a storage system that stores values indexed by a key. You can only query by key, and the values are opaque: the store doesn't know anything about them. This allows very fast read and write operations (a simple disk access), and I see this model as a kind of non-volatile cache (i.e. well suited if you need fast access by key to long-lived data).
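To make the "opaque value" point concrete, here is a minimal in-memory sketch. The `KeyValueStore` class and the `post:1` key are made up for illustration and are not the API of any real product; the point is only that the store sees bytes and the application has to decode them.

```python
import json
from typing import Optional


class KeyValueStore:
    """Toy in-memory key-value store (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self._data = {}  # key -> opaque bytes

    def put(self, key: str, value: bytes) -> None:
        # The store never inspects the value; it just keeps the bytes.
        self._data[key] = value

    def get(self, key: str) -> Optional[bytes]:
        # Lookup by key is the only query the store supports.
        return self._data.get(key)


store = KeyValueStore()
post = {"title": "Hello", "tags": ["nosql", "intro"]}

# The application serializes the value before storing it...
store.put("post:1", json.dumps(post).encode("utf-8"))

# ...and has to deserialize it after reading it back.
raw = store.get("post:1")
decoded = json.loads(raw.decode("utf-8"))
```

Note that there is no way to ask this store for "all posts tagged nosql" short of reading every value and decoding it yourself.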

A document-oriented database extends the previous model: values are stored in a structured format (a document, hence the name) that the database can understand. For example, a document could be a blog post together with its comments and tags, stored in a denormalized way. Since the data is transparent to the store, it can do more work (like indexing fields of the document) and you're not limited to querying by key. As I hinted, such databases allow you to fetch an entire page's data with a single query and are well suited for content-oriented applications (which is why big sites like Facebook or Amazon like them).
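As a rough sketch of what "the store understands the document" buys you, the toy class below keeps documents as dicts and can build a secondary index on a field. `DocumentStore`, `create_index` and `find` are names I made up for this example; real document databases expose different APIs and handle updates, nested fields and persistence, which this sketch ignores.

```python
from collections import defaultdict


class DocumentStore:
    """Toy in-memory document store (hypothetical, for illustration only)."""

    def __init__(self):
        self._docs = {}     # doc_id -> document (a dict)
        self._indexes = {}  # field name -> {value -> set of doc_ids}

    @staticmethod
    def _values(doc, field):
        # Treat a list field (e.g. tags) as several indexable values.
        value = doc.get(field)
        if value is None:
            return []
        return value if isinstance(value, list) else [value]

    def create_index(self, field):
        # Build the index from the documents already stored.
        index = defaultdict(set)
        for doc_id, doc in self._docs.items():
            for value in self._values(doc, field):
                index[value].add(doc_id)
        self._indexes[field] = index

    def insert(self, doc_id, doc):
        # Sketch only: re-inserting an existing id would leave stale index entries.
        self._docs[doc_id] = doc
        for field, index in self._indexes.items():
            for value in self._values(doc, field):
                index[value].add(doc_id)

    def get(self, doc_id):
        return self._docs.get(doc_id)

    def find(self, field, value):
        if field in self._indexes:
            ids = self._indexes[field].get(value, set())
            return [self._docs[i] for i in sorted(ids)]
        # No index on this field: fall back to scanning every document.
        return [d for d in self._docs.values() if value in self._values(d, field)]


store = DocumentStore()
store.create_index("tags")

# A blog post stored denormalized: comments and tags live inside the document.
store.insert("post:1", {
    "title": "Hello NoSQL",
    "tags": ["nosql", "intro"],
    "comments": [{"author": "alice", "text": "Nice"}],
})
store.insert("post:2", {"title": "Scaling", "tags": ["nosql", "scaling"], "comments": []})

nosql_posts = store.find("tags", "nosql")
scaling_titles = [d["title"] for d in store.find("tags", "scaling")]
```

A single `get("post:1")` returns the post with its comments and tags in one go, and `find` works on a field of the value rather than on the key, which is exactly what a plain key-value store cannot do.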

Other kinds of NoSQL databases include column-oriented stores, graph databases and even object databases. But this goes beyond the question.

See also

  • Comparing Document Databases to Key-Value Stores
  • Analysis of the NoSQL Landscape
  • Thinking about NoSQL databases (classification and use cases)

Well, I've been investigating NoSQL myself for the past month or so. I think it could generally be stated something like this:

  • KV stores don't know anything about the value content actually stored for a key.
  • Document-based stores let you define secondary indexes within the value content, as the db knows the document structure (e.g. tags of a blog post).
  • NoSQL solutions each have specific features which should be taken into consideration, such as
    • Special datatypes in a KV store (e.g. lists with left/right push/pop like in Redis)
    • easy scale up/down of the cluster, as Riak says it has (I haven't tried it ... yet)
    • pluggable data store as in Voldemort
    • built-in web configuration and web app support like in CouchDB / couchapp

A document-oriented database, or document store, is designed for storing, retrieving, and managing document-oriented information, also known as semi-structured data. Document-oriented databases are inherently a subclass of key-value stores. The difference lies in the way the data is processed: in a key-value store the data is considered to be inherently opaque to the database, whereas a document-oriented system relies on the internal structure of the document in order to extract metadata that the database engine uses for further optimization.

As for the difference between MongoDB and Cassandra: MongoDB acts much like a relational database. Its data model consists of a database at the top level, then collections, which are like tables in MySQL (for example), and then documents, which are contained within a collection, like rows in MySQL. Each document has fields and values, similar to columns and values in MySQL. Fields can be simple key / value pairs, e.g. { 'name': 'David Mytton' }, but they can also contain other documents, e.g. { 'name': { 'first': 'David', 'last': 'Mytton' } }. In Cassandra, the basic unit is the “column”, which is really just a single key and value, e.g. { 'key': 'name', 'value': 'David Mytton' }. There’s also a timestamp field, which is used internally for replication and consistency. The value can be a single value but can also contain another “column”. These columns then exist within column families, which order data based on a specific value in the columns, referenced by a key.

At the top level there is a keyspace, which is similar to a MongoDB database.
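The two layouts described above can be sketched side by side with plain Python structures. This only mirrors the nesting as described in this answer; it is not the on-disk format or the driver API of either database, and names such as `blog`, `users`, `user:1` and the `Column` class are made up for the example.

```python
import time
from dataclasses import dataclass

# MongoDB-style nesting: database -> collection -> documents.
mongo_like = {
    "blog": {          # database (hypothetical name)
        "users": [     # collection, comparable to a table
            # a document whose field contains another document
            {"name": {"first": "David", "last": "Mytton"}},
        ]
    }
}


@dataclass
class Column:
    """A single key/value pair plus the timestamp mentioned above."""
    key: str
    value: object
    timestamp: float


# Cassandra-style nesting: keyspace -> column family -> row key -> columns.
cassandra_like = {
    "blog": {              # keyspace, comparable to a MongoDB database
        "users": {         # column family
            "user:1": [    # row key (hypothetical) referencing its columns
                Column("name", "David Mytton", time.time()),
            ]
        }
    }
}

first_name = mongo_like["blog"]["users"][0]["name"]["first"]
column = cassandra_like["blog"]["users"]["user:1"][0]
```

In the first structure the nested `name` document is reached by walking into the document itself, while in the second each piece of data is its own column looked up under a row key.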