Why use innodb_file_per_table?

I don't think it's a matter of performance but of management.

With a separate file per table, you can, for example, store different databases on different storage devices.
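
For instance, here is a minimal sketch of placing one table's file on a different device (MySQL 5.6+ syntax; it assumes innodb_file_per_table is enabled, and the table name and mount point are made up for illustration):

    -- Places big_log.ibd under the given directory instead of the
    -- default datadir, leaving an .isl pointer file in the datadir:
    CREATE TABLE big_log (
        id      BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
        payload TEXT
    ) ENGINE=InnoDB
      DATA DIRECTORY = '/mnt/bulk_storage/mysql';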

You can handle very large databases on file systems that can't handle big files (or at least postpone the problem until a single table reaches the file-size limit).

You avoid uncontrolled tablespace growth: if you drop a big table, its space goes back to the file system instead of staying trapped in an ever-growing ibdata file.

One aspect that may have some effect on performance is the fragmentation of table data and indexes, which file-per-table confines to each individual table. But that needs testing to be confirmed.


Why use innodb_file_per_table?

Because it is easier to manage individual tables, since management can be done at the file level. This means that even if the server is down, you can still copy data by copying the table files, whereas using a shared tablespace means either copying everything (which can be unnecessarily massive) or finding some way to get the server running so you can extract the data (you really don't want to manually extract the data with a hex editor).

Someone warned that you cannot simply copy and paste .ibd files from one server to another. This may be true, but it should not apply to backups on the same server (I am using the term backup here in the traditional sense of making a copy; i.e., not drastically changing the whole thing). Moreover, ibdata1 is automatically recreated on startup (as seen in the delete ibdata1 step of most “converting to file-per-table” guides). As such, you do not need to copy ibdata1 in addition to your .ibd files (and their corresponding .frm, etc. files).
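
For what it's worth, MySQL 5.6 added transportable tablespaces, which make copying an .ibd file between servers a supported operation; a rough sketch (database and table names are placeholders):

    -- On the source server: flush changes and lock the table so its
    -- .ibd (plus a generated .cfg metadata file) can be copied safely:
    FLUSH TABLES mydb.mytable FOR EXPORT;
    -- ... copy mytable.ibd and mytable.cfg out of the datadir ...
    UNLOCK TABLES;

    -- On the destination server: create an identical table, then swap
    -- the copied file in:
    ALTER TABLE mydb.mytable DISCARD TABLESPACE;
    -- ... copy the .ibd and .cfg files into the destination datadir ...
    ALTER TABLE mydb.mytable IMPORT TABLESPACE;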

If you are trying to recover a lost table, it should be sufficient to copy its .ibd and .frm files, as well as information_schema (which is much smaller than ibdata1). That way, you can put them in a dummy server and extract your table without having to copy the whole, massive thing.

However, the claim of better performance is questionable. … With innodb_file_per_table, more disk I/O operations are needed; and this is significant in complicated JOINs and FOREIGN KEY constraints.

Not surprisingly, the performance will depend entirely on the specific database(s) in use. One person will have (even vastly) different results from another.

It is true that there will be more disk I/O operations with file-per-table, but only slightly more. Think about how the system works.

  • For a monolithic database:

    1. Server is started
    2. ibdata1 is opened
    3. Header and meta-data are read
    4. Structures and meta-data are cached in memory
    5. Queries happen
      1. Server accesses the disk and reads the data from the already opened ibdata1
      2. Server may cache the data in memory
  • For a per-table database:

    1. Server is started
    2. ibdata1 is opened
    3. Header and meta-data are read
    4. Each individual .ibd file is opened
    5. Header and meta-data are read from each .ibd file
    6. Structures and meta-data are cached in memory
    7. Queries happen
      1. Server accesses the disk and reads the data from the already opened .ibd file
      2. Server may cache the data in memory

You will notice that when the server is running, you cannot move the data files because the server has open handles to them. This is because when it starts up, it opens them and leaves them open. It does not open and close them for each individual query.
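
You can see this from the server itself; for instance (a quick check; exact variable availability depends on the MySQL version):

    -- Upper bound on how many files InnoDB will keep open at once:
    SHOW GLOBAL VARIABLES LIKE 'innodb_open_files';
    -- How many files InnoDB currently holds open:
    SHOW GLOBAL STATUS LIKE 'Innodb_num_open_files';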

As such, there are only a few more I/O operations at the beginning, when the server starts up, not while it is running. Further, while each individual .ibd file has its own separate overhead (file signatures, structures, etc.), that overhead is cached in memory and not re-read for each query. Moreover, the same structures are read even with a shared tablespace, so there is barely any (if any at all) more memory required.

Does innodb_file_per_table improve MySQL's performance?

Actually, if anything, the performance may in fact be worse.

When using a shared tablespace, read and write operations can sometimes/often be combined so that the server reads a swath of data from multiple tables in one go from ibdata.

However, if the data is spread out amongst multiple files, then it has to perform a separate I/O operation for each one individually.

Of course, this is again entirely dependent on the database in question; the real-world performance impact would depend on size, query frequency, and internal fragmentation of the shared tablespace. Some people may notice a large difference while others may not see any impact at all.

The tablespace is shared in a single ibdata file; how can dedicated tablespaces for separate tables save disk space?

It does not. If anything, it increases disk usage somewhat.

I don't have a 60GB database to test with, but my “paltry” personal database, which contains my WordPress installation and a few small tables for personal use and development testing, weighed in at ~30MB while using a shared tablespace. After converting it to file-per-table, it bloated to ~85MB. Even after dropping everything and re-importing, it was still >60MB.

This increase is due to two factors:

  • The absolute minimum size for ibdata1 is—for some reason—10MB, even if you have nothing but information_schema stored in it.

  • With a shared tablespace, only ibdata1 has overhead like file signatures, meta-data, etc., but with per-table, each individual .ibd file has all of that. This means that the total (even with a hypothetical <10MB ibdata1) would be somewhat larger by at least:

    GetTotalSizeofOverhead() * GetNumTables()
    

Obviously these are not going to be huge increases (unless you are using a host that limits your database size, or storing them on a flash drive, etc.), but they are increases nonetheless. And while switching (every) table to file-per-table lets you shrink ibdata1 down to 10MB, the overall total will invariably be larger than it was.
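
If you want to check the numbers on your own server, something like this works on MySQL 5.7+, where information_schema.FILES is populated for InnoDB (on older versions, look at the file sizes in the datadir instead):

    -- On-disk size of each InnoDB tablespace file (ibdata1 and every .ibd):
    SELECT FILE_NAME,
           ROUND(TOTAL_EXTENTS * EXTENT_SIZE / 1024 / 1024, 1) AS size_mb
    FROM information_schema.FILES
    WHERE ENGINE = 'InnoDB'
    ORDER BY size_mb DESC;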


This is my reason for ALWAYS using innodb_file_per_table:

Without file-per-table, the ibdata file never compresses, shrinks, or lessens in size, ever. Not when you delete a row, drop a table, or drop a database. 2GB of data can become a 20GB file in no time if you have an active queuing system.

Let's say you want to make a backup of your current 1GB table before an alter, then drop it afterward. You're stuck with a GB of now unused space in your ibdata. Bummer.
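
With file-per-table, both problems go away; a sketch (the table names are hypothetical, and it assumes innodb_file_per_table was on when the tables were created):

    CREATE TABLE orders_backup LIKE orders;
    INSERT INTO orders_backup SELECT * FROM orders;
    -- ... run the risky ALTER on `orders` ...
    DROP TABLE orders_backup;   -- orders_backup.ibd is deleted; the space
                                -- goes straight back to the file system

    -- And after mass deletes, a rebuild shrinks a bloated per-table file:
    OPTIMIZE TABLE queue;       -- InnoDB implements this as a table rebuild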

There are probably endless examples of situations where temporary measures inflate the single data file, but suffice it to say that, in my opinion, there is never a reason NOT to use innodb_file_per_table.

Also, here's a good post to read: http://code.openark.org/blog/mysql/reasons-to-use-innodb_file_per_table