MySQL 5.0 indexes - Unique vs Non Unique

UNIQUE and PRIMARY KEY are constraints, not indexes. Though most databases implement these constraints by using an index. The additional overhead of the constraint in addition to the index is insignificant, especially when you count the cost of tracking down and correcting unintentional duplicates when (not if) they occur.

Indexes are usually more effective if there you have a high selectivity. This is the ratio of number of distinct values to the total number of rows.

For example, in a column for Social Security Number, you may have 1 million rows with 1 million distinct values. So the selectivity is 1000000/1000000 = 1.0 (although there are rare historical exceptions, SSN's are intended to be unique).

But another column in that table, "gender" may only have two distinct values over 1 million rows. 2/1000000 = very low selectivity.

An index with a UNIQUE or PRIMARY KEY constraint is guaranteed to have a selectivity of 1.0, so it will always be as effective as an index can be.

You asked about the difference between a primary key and a unique constraint. Chiefly, it's that you can have only one primary key constraint per table (even if that constraint's definition includes multiple columns), whereas you can have multiple unique constraints. A column with a unique constraint may permit NULLs, whereas columns in primary key constraints must not permit NULLs. Otherwise, primary key and unique are very similar in their implementation and their use.

You asked in a comment about whether to use MyISAM or InnoDB. In MySQL, they use the term storage engine. There are bunch of subtle differences between these two storage engines, but the chief ones are:

InnoDB supports transactions, so you can choose to roll back or commit changes. MyISAM is effectively always autocommit.
InnoDB enforces foreign key constraints. MyISAM doesn't enforce or even store foreign key constraints.

If these features are things you need in your application, then you should use InnoDB.

To respond to your comment, it's not that simple. InnoDB is actually faster than MyISAM in quite a few cases, so it depends on what your application's mix of selects, updates, concurrent queries, indexes, buffer configuration, etc.

See http://www.mysqlperformanceblog.com/2007/01/08/innodb-vs-myisam-vs-falcon-benchmarks-part-1/ for a very thorough performance comparison of the storage engines. InnoDB wins over MyISAM frequently enough that it's clearly not possible to say one is faster than the other.

As with most performance-related questions, the only way to answer it for your application is to test both configurations using your application and a representative sample of data, and measure the results.

On a non-unique index that just happens to be unique and a unique index? I'm not sure, but I'd guess not a lot. The optimiser should examine the cardinality of the index and use that (it will always be the number of rows, for a unique index).

As far as a primary key is concerned, probably quite a lot, but it depends which engine you use.

The InnoDB engine (which is used by many people) always clusters rows on the primary key. This means that the PK is essentially combined with the actual row data. If you're doing a lot of lookups by PK (or indeed, range scans etc), this is a Good Thing, because it means that it won't need to fetch as many blocks from the disc.

A non-PK unique index will never be clustered in InnoDB.

On the other hand, some other engines (MyISAM in particular) don't cluster the PK, so the primary key is just like a normal unique index.

MySQL 5.0 indexes - Unique vs Non Unique

Tags:

Mysql

Indexing

Related

Recent Posts