Index performance for CHAR vs VARCHAR (Postgres)
`CHAR` and `VARCHAR` are implemented exactly the same in Postgres (and Oracle). There is no difference in speed when using those data types.
However, there is one difference that can make a difference in performance: a `char` column is always padded to the defined length. So if you define a column as `char(100)` and one as `varchar(100)` but only store 10 characters in each, the `char(100)` column uses 100 characters for each value (the 10 characters you stored, plus 90 spaces), whereas the `varchar` column only stores 10 characters.
Comparing 100 characters with 100 characters is going to be slower than comparing 10 characters with 10 characters, although I doubt you can actually measure this difference in a SQL query.
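You can see the padding directly in Postgres. A sketch (any column names here are illustrative, not from the original answer): `octet_length` reports the stored bytes including the pad spaces, while `length` on a `char` value ignores trailing blanks.

```sql
-- char(10) pads 'abc' with 7 trailing spaces; varchar(10) does not
SELECT octet_length('abc'::char(10));     -- 10 bytes stored
SELECT octet_length('abc'::varchar(10));  -- 3 bytes stored
SELECT length('abc'::char(10));           -- 3: trailing blanks are not counted for char
```

Note the last query: `length()` on a `char` value disregards the padding, which matters for the `CHECK` constraint trick shown later in this answer.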
If you declare both with a length of 10 characters and always store exactly 10 characters in them, then there is absolutely no difference whatsoever (this is true for both Oracle and Postgres).
So the only difference is the padding that is done for the `char` data type.
> Also keep in mind that there's often a very big difference between a CHAR and a VARCHAR when doing index comparisons
The above quote is only true if (and only if) the `char` column is defined too wide (i.e. you are wasting space due to padding). If the length of the `char` column is always used completely (so no padding occurs), then the above quote is wrong (at least for Postgres and Oracle).
From my point of view, the `char` data type does not really have any real-world use. Just use `varchar` (or `text` in Postgres) and forget that `char` exists.
I agree with everything said by a_horse_with_no_name, and I generally agree with Erwin's comment advice:
> No, char is inferior (and outdated). text and varchar perform (almost) the same.
Metadata
With one minor exception, the only time I use `char(n)` is when I want the metadata to say this MUST have x characters. Though I know that `char(n)` only complains if the input is over the limit, I'll frequently protect against underruns with a `CHECK` constraint. For example,
```sql
CREATE TABLE foo (
  x char(10) CHECK ( length(x) = 10 )
);
-- fails: length() ignores the pad space, so this 9-character value violates the CHECK
INSERT INTO foo VALUES (repeat('x', 9));
```
I do this for a few reasons,
- `char(n)` is sometimes inferred by schema-loaders as being a fixed-width column. This may make a difference in a language that is optimized for fixed-width strings.
- It establishes a convention that makes sense and is easily enforced. I can write a schema-loader in a language to generate code from this convention.
Need an example of where I may do this?

- Two-letter state abbreviations, though because this list can be enumerated, I'll typically do it with an `ENUM`.
- Vehicle Identification Numbers
- Model Numbers (of fixed size)
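As a sketch of the VIN case (the table and column names here are hypothetical, not from the original answer; modern VINs are always 17 characters):

```sql
CREATE TABLE vehicles (
  -- char(17) documents the fixed width; the CHECK rejects anything shorter
  vin char(17) NOT NULL CHECK ( length(vin) = 17 )
);
```

The type declaration catches over-length input, and the `CHECK` catches under-length input, so together they enforce exactly 17 characters.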
On errors
Note that some people may be uncomfortable with the incongruity of error messages on both sides of the limit, but it doesn't bother me:
```
test=# INSERT INTO foo VALUES (repeat('x', 9));
ERROR:  new row for relation "foo" violates check constraint "foo_x_check"
DETAIL:  Failing row contains (xxxxxxxxx ).
test=# INSERT INTO foo VALUES (repeat('x', 11));
ERROR:  value too long for type character(10)
```
Contrast with varchar
Moreover, I think the above suggestion fits really well with a convention of almost always using `text`. You ask about `varchar(n)` too. I never use that. At least, I can't remember the last time I used `varchar(n)`.
- If a spec has a static-width field that I trust, I use `char(n)`.
- Otherwise, I use `text`, which is effectively `varchar` (no limit).
If I found a spec that had variable-length text keys that were meaningful and that I trusted to have a constant max length, I would use `varchar(n)` too. However, I can't think of anything that fits those criteria.
Additional notes
`char` here is not to be confused with `"char"` (with the quotes), which is a one-byte type and has solid performance and space-saving benefits.
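To make the distinction concrete, a quick sketch: `"char"` stores a single byte, while `char(1)` is a variable-length (`bpchar`) value with per-value overhead.

```sql
-- "char" is a fixed one-byte internal type; char(1) carries varlena overhead
SELECT pg_column_size('x'::"char");    -- 1 byte
SELECT pg_column_size('x'::char(1));   -- larger: header overhead plus the character
```

Postgres uses `"char"` internally in the system catalogs; it only holds a single byte, so it is not suitable for general text.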
Related Q & A:
- Would index lookup be noticeably faster with char vs varchar when all values are 36 chars
- What is the overhead for varchar(n)?