HBase column families: how many?

According to the Apache HBase wiki, HBase runs into performance problems with more than two or three column families.


There is a practical limit to the number of column families in HBase. Each column family has its own MemStore (a write cache that holds new data before it is written to HFiles), and when one MemStore fills up, all of them are flushed.

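The more column families you add, the more MemStores are created and the more frequently they are flushed. Each flush also writes out the nearly empty MemStores as small HFiles, which means extra I/O and compaction work, so performance degrades.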
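To make the flush behaviour above concrete, here is a small toy model in plain Java. It is not HBase code: the class name, the method, and the "units" of data are all made up for illustration, and it only assumes the rule stated above, namely that one full MemStore causes every MemStore to flush.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of "one MemStore fills, all flush". Hypothetical, not HBase code. */
class FlushSketch {

    /**
     * Simulates writes to several column families and returns the size of
     * every file written by the flushes.
     *
     * @param unitsPerRound data written to each family in one round (one entry per family)
     * @param capacity      size at which a single MemStore counts as full
     * @param rounds        number of write rounds to simulate
     */
    static List<Integer> flushedFileSizes(int[] unitsPerRound, int capacity, int rounds) {
        int[] memstore = new int[unitsPerRound.length];
        List<Integer> files = new ArrayList<>();
        for (int r = 0; r < rounds; r++) {
            boolean anyFull = false;
            for (int f = 0; f < memstore.length; f++) {
                memstore[f] += unitsPerRound[f];
                if (memstore[f] >= capacity) {
                    anyFull = true;
                }
            }
            if (anyFull) {
                // One full MemStore forces every family to flush its contents.
                for (int f = 0; f < memstore.length; f++) {
                    if (memstore[f] > 0) {
                        files.add(memstore[f]);
                    }
                    memstore[f] = 0;
                }
            }
        }
        return files;
    }

    public static void main(String[] args) {
        // One busy family on its own, then the same family next to two quiet ones.
        System.out.println(flushedFileSizes(new int[] {9}, 90, 20));
        System.out.println(flushedFileSizes(new int[] {9, 1, 1}, 90, 20));
    }
}
```

In this model the busy family produces the same files either way, but every flush also drags the two quiet families along and writes a small file for each of them. Those small files are the ones that later have to be compacted.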


The idea behind column families is great; unfortunately, the current HBase implementation does not handle a lot of column families well. As a rule, try to stick with one and add a second only if you have radically different access patterns. See also the HBase manual.

What you can do instead is keep each of your different "families" as a set of columns that share a prefix, all inside a single column family. HBase is sparse, so it won't take more space, and you can still get just one "family" by using a ColumnPrefixFilter on your scans if you need to.
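As a rough illustration of the prefix idea, the sketch below models one row as a sorted map of column qualifiers and pulls out a single logical "family" by prefix. It is plain Java with no HBase dependency; the class name, the qualifiers (addr_, pref_), and the values are invented for the example. In a real client you would get the same effect by setting an org.apache.hadoop.hbase.filter.ColumnPrefixFilter on a Scan with setFilter.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Toy model of prefix-based "families" in one column family. Hypothetical, not HBase code. */
class PrefixColumns {

    /**
     * Returns the columns of a row whose qualifier starts with the given prefix.
     * The row is kept sorted by qualifier, so all matching columns are contiguous.
     */
    static NavigableMap<String, String> withPrefix(NavigableMap<String, String> row, String prefix) {
        NavigableMap<String, String> result = new TreeMap<>();
        // Start at the first qualifier >= prefix and stop at the first non-match.
        for (Map.Entry<String, String> e : row.tailMap(prefix, true).entrySet()) {
            if (!e.getKey().startsWith(prefix)) {
                break;
            }
            result.put(e.getKey(), e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        // One row, one real column family, two logical groups told apart by prefix.
        NavigableMap<String, String> row = new TreeMap<>();
        row.put("addr_city", "Oslo");
        row.put("addr_zip", "0150");
        row.put("pref_lang", "en");

        System.out.println(withPrefix(row, "addr_"));
    }
}
```

Because the qualifiers are stored in sorted order, the columns of one prefix sit next to each other, so selecting them is a short contiguous read rather than a search over the whole row.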
