Database with "Open Schema" - Good or Bad Idea?

I noted that they did not mention anything about the ease or difficulty in creating reports against that data. When used in a narrow set of circumstances, EAVs can be beneficial. As a central part of most systems it will become a nightmare when you hit reporting. The problem with EAVs is that most of the benefit is at the outset of the project and most of the pain is later in the analysis and reporting especially due to the severe lack of data integrity. "Not having to worry about foreign keys" to me sounds like a nightmare of orphaned rows. Add in the use of surrogate keys for everything and you have a tangled morass which generally ends in a complete rewrite


Most of the really big web sites end up using some sort of incredibly simple on the database side of things. This has the advantage that it's fast and scalable. It has the disadvantage that all the relationships that you'd get the database to enforce automatically (via triggers and such) you need to enforce yourself in your client code instead. Maintaining consistency is a pain in the neck, and there's almost always at least some chance that your data will be inconsistent, at least for short periods of time.

For a social networking site, it's a worthwhile compromise. Data that's mostly right most of the time is adequate (e.g., who really cares if the number of up-votes you receive for an item is really 20 milliseconds out of date when it's sent), and keeping costs reasonable while scaling to support a gazillion users matters a lot.


This is a data model known as EAV for entity-attribute-value. It has its uses. A prime example is patient test data which is naturally sparse since there are hundreds of thousands of tests which might be run, but typically only a handful are present for a patient. A table with hundreds of thousands of columns is silly, but a table with EAV makes good sense.