In what cases are BLOB and TEXT stored in-line on InnoDB?
Keep in mind that there are 4 "row formats". A main difference between then has to do with how wide columns are handled.
The reference points to an Answer written early in 2010, a few months before DYNAMIC
and COMPRESSED
were introduced in the "InnoDB plugin".
So, I claim that that other Q&A is out of date! That is the 754 Upvotes are no longer valid.
Here are some 'current' manual references:
(emphasis mine)
From https://dev.mysql.com/doc/refman/5.6/en/innodb-row-format-overview.html :
Columns such as BLOB and VARCHAR that are too long to fit on a B-tree page are stored on separately allocated disk pages called overflow pages.
From https://dev.mysql.com/doc/refman/5.6/en/innodb-row-format-dynamic.html :
When a table is created with ROW_FORMAT=DYNAMIC or ROW_FORMAT=COMPRESSED, InnoDB can store long variable-length column values (for VARCHAR, VARBINARY, and BLOB and TEXT types) fully off-page, with the clustered index record containing only a 20-byte pointer to the overflow page.
[DYNAMIC and COMPRESSED] When the row is too long, InnoDB chooses the longest columns for off-page storage until the clustered index record fits on the B-tree page. TEXT and BLOB columns that are less than or equal to 40 bytes are always stored in-line.
The DYNAMIC format is based on the idea that if a portion of a long data value is stored off-page, it is usually most efficient to store all of the value off-page. With DYNAMIC format, shorter columns are likely to remain in the B-tree node, minimizing the number of overflow pages needed for any given row.
The COMPRESSED row format uses similar internal details for off-page storage as the DYNAMIC row format
From https://dev.mysql.com/doc/refman/5.6/en/innodb-physical-record.html (even a big CHAR
may be treated like TEXT
):
InnoDB [all formats?] encodes fixed-length fields greater than or equal to 768 bytes in length as variable-length fields, which can be stored off-page. For example, a CHAR(255) column can exceed 768 bytes if the maximum byte length of the character set is greater than 3, as it is with utf8mb4.
ROW_FORMAT=DYNAMIC and ROW_FORMAT=COMPRESSED handle CHAR storage in the same way as ROW_FORMAT=COMPACT.
(The links are for 5.6, but I think the text applies at least to newer versions.)
I see nothing specific that says that VARCHAR(...)
and *TEXT
are handled differently.
For InnoDB the rule goes like this:
- A record can not be fragmented.
- At least two records must fit in a page.
- If a record size is less than ~7k, all field values are stored in-page.
- If the record size is more than ~7k the first 768 (
COMPACT
format) or 20 bytes (DYNAMIC
format) are stored in-page. The remaining part is stored off-page.
How to verify that.
CREATE TABLE `t1` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` text,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
mysql> insert into t1(name) select repeat('a', 10);
Query OK, 1 row affected (0.02 sec)
Records: 1 Duplicates: 0 Warnings: 0
mysql> insert into t1(name) select repeat('a', 8000);
Query OK, 1 row affected (0.00 sec)
Records: 1 Duplicates: 0 Warnings: 0
Then parse the t1.ibd
with stream_parser.
stream_parser -f t1.ibd
Check content of the PRIMARY index page:
# hexdump -C pages-t1.ibd/FIL_PAGE_INDEX/0000000000000046.page
00000000 97 e5 5e 36 00 00 00 03 ff ff ff ff ff ff ff ff |..^6............|
00000010 00 00 00 00 00 27 b9 44 45 bf 00 00 00 00 00 00 |.....'.DE.......|
00000020 00 00 00 00 00 18 00 02 1f f3 80 04 00 00 00 00 |................|
00000030 00 a2 00 02 00 01 00 02 00 00 00 00 00 00 00 00 |................|
00000040 00 00 00 00 00 00 00 00 00 2e 00 00 00 18 00 00 |................|
00000050 00 02 00 f2 00 00 00 18 00 00 00 02 00 32 01 00 |.............2..|
00000060 02 00 1c 69 6e 66 69 6d 75 6d 00 03 00 0b 00 00 |...infimum......|
00000070 73 75 70 72 65 6d 75 6d 0a 00 00 00 10 00 23 80 |supremum......#.|
00000080 00 00 01 00 00 00 00 05 10 ae 00 00 01 22 01 10 |............."..|
00000090 61 61 61 61 61 61 61 61 61 61 40 9f 00 00 00 18 |aaaaaaaaaa@.....| <- this is the first record
000000a0 ff ce 80 00 00 02 00 00 00 00 05 11 af 00 00 01 |................|
000000b0 23 01 10 61 61 61 61 61 61 61 61 61 61 61 61 61 |#..aaaaaaaaaaaaa| <- this is the beginning of the second record
000000c0 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 |aaaaaaaaaaaaaaaa|
* <- The star means identical strings are omitted, -v will show full output.
00001ff0 61 61 61 00 00 00 00 00 00 00 00 00 00 00 00 00 |aaa.............|
00002000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00003ff0 00 00 00 00 00 70 00 63 97 e5 5e 36 00 27 b9 44 |.....p.c..^6.'.D|
00004000 97 e5 5e 36 00 00 00 03 ff ff ff ff ff ff ff ff |..^6............|
00004010 00 00 00 00 00 27 b9 44 45 bf 00 00 00 00 00 00 |.....'.DE.......|
00004020 00 00 00 00 00 18 00 02 1f f3 80 04 00 00 00 00 |................|
00004030 00 a2 00 02 00 01 00 02 00 00 00 00 00 00 00 00 |................|
00004040 00 00 00 00 00 00 00 00 00 2e 00 00 00 18 00 00 |................|
00004050 00 02 00 f2 00 00 00 18 00 00 00 02 00 32 01 00 |.............2..|
00004060 02 00 1c 69 6e 66 69 6d 75 6d 00 03 00 0b 00 00 |...infimum......|
00004070 73 75 70 72 65 6d 75 6d 0a 00 00 00 10 00 23 80 |supremum......#.|
00004080 00 00 01 00 00 00 00 05 10 ae 00 00 01 22 01 10 |............."..|
00004090 61 61 61 61 61 61 61 61 61 61 40 9f 00 00 00 18 |aaaaaaaaaa@.....|
000040a0 ff ce 80 00 00 02 00 00 00 00 05 11 af 00 00 01 |................|
000040b0 23 01 10 61 61 61 61 61 61 61 61 61 61 61 61 61 |#..aaaaaaaaaaaaa|
000040c0 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 |aaaaaaaaaaaaaaaa|
*
00005ff0 61 61 61 00 00 00 00 00 00 00 00 00 00 00 00 00 |aaa.............|
00006000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00007ff0 00 00 00 00 00 70 00 63 97 e5 5e 36 00 27 b9 44 |.....p.c..^6.'.D|
00008000
This is for COMPACT
format. For DYNAMIC
(except COMPRESSED
) the procedure is same.