Which DBMS is good for super-fast reads and a simple data structure?
If you are not that concerned with relational logic, want really fast reads, and are willing to work with an RDBMS, I would (somewhat prejudicially) venture to say MySQL. Why ???
The MyISAM storage engine has an option that allows the physical structure of the table to be altered for better performance. What is that option ? The ALTER TABLE option ROW_FORMAT.
For example, the book MySQL Database Design and Tuning recommends using ROW_FORMAT=FIXED (pages 72-73). This internally converts all VARCHAR columns to CHAR. It makes the MyISAM table larger, but SELECTs against it run much faster. I can personally attest to this: I once had a table that was 1.9GB. I changed the format with ALTER TABLE tblname ROW_FORMAT=FIXED, and the table ended up at 3.7GB. SELECTs against it were 20-25% faster without improving or changing anything else.
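As a sketch of that conversion (tblname stands in for your own table name), you can run the ALTER and then confirm the change took effect:

```sql
-- Convert dynamic-length rows (VARCHARs become CHARs) to fixed-length rows
ALTER TABLE tblname ROW_FORMAT=FIXED;

-- Verify: the Row_format column should now report 'Fixed'
SHOW TABLE STATUS LIKE 'tblname';
```

The trade-off is deliberate: fixed-length rows let MyISAM compute a row's file offset arithmetically instead of chasing pointers, which is where the read speedup comes from.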
What if you already have a MyISAM table that is populated with data ? You could get metrics for recommended column definitions based on the data present in the MyISAM table. What query presents those metrics ?
SELECT * FROM tblname PROCEDURE ANALYSE();
PROCEDURE ANALYSE() will not display the data itself. It reads the value of every column and recommends column definitions. For example, if you have a type column whose values are 1-4, it would suggest using an ENUM of those 4 values. You could then choose to use TINYINT or CHAR(1) instead, since they take the same amount of space (1 byte).
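For finer control, PROCEDURE ANALYSE() also accepts two optional arguments, max_elements and max_memory, which cap how many distinct values (and how much memory per column) it will consider before giving up on suggesting an ENUM. A quick sketch (note that PROCEDURE ANALYSE() was removed in MySQL 8.0, so this only applies to 5.x):

```sql
-- Only suggest an ENUM if a column has at most 16 distinct values,
-- and spend at most 256 bytes of memory per column collecting them
SELECT * FROM tblname PROCEDURE ANALYSE(16, 256);
```

Lowering max_elements keeps ANALYSE from recommending sprawling ENUMs on columns that merely have low-ish cardinality.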
Here is something else to consider: Since you were thinking about using a NoSQL DB, have you ever thought of using MyISAM in a NoSQL manner ? This is quite possible. Page 175 of the same book I mentioned suggests using HANDLER structures to read a table without the relational baggage. In fact, page 175 gives this example:
CREATE TABLE customer_mileage_details
(
customer_id INT NOT NULL,
ff_number CHAR(10) NOT NULL,
transaction_date DATE NOT NULL,
mileage SMALLINT NOT NULL,
INDEX (customer_id),
INDEX (ff_number, transaction_date)
) ENGINE = MYISAM;
This table contains millions of rows. Suppose that you need to create a data analysis application that has the following requirements:
- It needs to retrieve blocks of information as quickly as possible.
- Based on user input or other factors, it will likely "jump around" in the table.
- It is not concerned with concurrency or other data integrity issues.
- Cross-application table locking is not required.
These commands allow quick-and-dirty reads from the table:
HANDLER customer_mileage_details OPEN;
HANDLER customer_mileage_details READ ff_number FIRST WHERE ff_number=('aaetm-4441');
HANDLER customer_mileage_details READ NEXT LIMIT 10;
HANDLER customer_mileage_details CLOSE;
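A variant worth knowing (my own sketch, following the HANDLER syntax in the MySQL manual rather than the book): READ index_name = (value) seeks directly to a key position instead of filtering rows with WHERE, and subsequent READ index_name NEXT calls walk forward from that position. Here ff_number is the default name MySQL gives the index on (ff_number, transaction_date):

```sql
-- Open a direct cursor on the table
HANDLER customer_mileage_details OPEN;

-- Seek straight to the key value via the ff_number index,
-- then scan forward in index order from that position
HANDLER customer_mileage_details READ ff_number = ('aaetm-4441');
HANDLER customer_mileage_details READ ff_number NEXT LIMIT 10;

-- Release the cursor
HANDLER customer_mileage_details CLOSE;
```

Because HANDLER bypasses the optimizer and most locking, this is about as close to a raw key-value lookup as you can get from a MySQL table.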
I hope this gives you food for thought. Please look into it.
CAVEAT
What is very ironic about me writing this particular post is that I wrote an earlier post about HANDLER being used in Percona Server binaries, in which I suggested that using it was out-of-date. Since that older post, I never thought I would write something in support of HANDLER structures. I now stand corrected.
The first thing that comes to mind is a particular RDBMS that's familiar to me. I recognize, however, that it may not be the best for this application.
So, my advice is to go with a database that is familiar to you. If you're familiar with Redis or MongoDB, then go with one of those. If you're more familiar with SQLite, then choose that.
On a database of this size, it's all going to be pretty quick. Even databases that are more disk-heavy will use some sort of caching so that disk speed isn't too much of a concern.