ZFS pool slow sequential read
I managed to get speeds very close to the numbers I was expecting. I was looking for 400MB/sec and managed 392MB/sec, so I'd call the problem solved. With the later addition of a cache device I managed 458MB/sec reads (cached, I believe).
1. At first this was achieved simply by increasing the ZFS dataset recordsize value to 1M:
zfs set recordsize=1M pool2/test
I believe this change just results in less disk activity, and therefore more efficient handling of large sequential reads and writes. Exactly what I was asking for.
Results after the change (the benchmark commands are sketched below):
- bonnie++ = 226MB/sec write, 392MB/sec read
- dd = 260MB/sec write, 392MB/sec read
- 2 processes in parallel = 227MB/sec write, 396MB/sec read
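For reference, here is a minimal sketch of the dd part of that benchmark; the mount point, file name and size are illustrative rather than my exact commands, and keep in mind /dev/zero is a poor data source if compression is enabled on the dataset:

# Sequential write in 1M chunks to match recordsize=1M; ~63GB so the
# ARC cannot hold the whole file (same ballpark as the bonnie++ run)
dd if=/dev/zero of=/pool2/test/bench.dat bs=1M count=64000 conv=fdatasync

# Sequential read back in 1M chunks
dd if=/pool2/test/bench.dat of=/dev/null bs=1M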
2. I managed even better results when I added a cache device (a 120GB SSD). The write is a tad slower; I'm not sure why.
Version 1.97 ------Sequential Output------ --Sequential Input- --Random-
Concurrency 1 -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP /sec %CP
igor 63G 208325 48 129343 28 458513 35 326.8 16
The trick with the cache device was to set l2arc_noprefetch=0 in /etc/modprobe.d/zfs.conf. That setting allows ZFS to cache streaming/sequential data. Only do this if your cache device is faster than your array, as mine is.
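For completeness, here is roughly what that setup looks like; the disk path is a placeholder, not my actual device:

# Add the SSD as an L2ARC cache device to the pool
zpool add pool2 cache /dev/disk/by-id/ata-EXAMPLE-SSD

# Persist the module option across reboots
echo "options zfs l2arc_noprefetch=0" >> /etc/modprobe.d/zfs.conf

# Apply it immediately without reloading the module
echo 0 > /sys/module/zfs/parameters/l2arc_noprefetch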
After benefiting from the recordsize change on my dataset, I thought there might be a similar way to deal with poor zvol performance.
I came across several people mentioning that they obtained good performance using volblocksize=64k, so I tried it. No luck.
zfs create -b 64k -V 120G pool/volume
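One thing to note if you try this: volblocksize can only be set when the zvol is created, so to see what an existing volume is using you have to query it:

zfs get volblocksize pool/volume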
But then I read that ext4 (the filesystem I was testing with) supports RAID-related options like stride and stripe-width, which I had never used before. So I used this site to calculate the settings needed, https://busybox.net/~aldot/mkfs_stride.html, and formatted the zvol again:
mkfs.ext3 -b 4096 -E stride=16,stripe-width=32 /dev/zvol/pool/volume
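For what it's worth, the arithmetic behind those two values is simple, treating the 64k volblocksize as the RAID chunk size; the two-disk multiplier is an assumption about the layout, so adjust it for yours:

# stride       = chunk size / ext4 block size  = 64k / 4k = 16
# stripe-width = stride * number of data disks = 16 * 2   = 32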
I ran bonnie++ to do a simple benchmark and the results were excellent. Unfortunately I don't have the numbers with me, but as I recall they were at least 5-6x faster for writes. I'll update this answer if I benchmark again.
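If you want to reproduce the comparison, a run along these lines should do; the mount point is just an example, not my actual path:

# Mount the freshly formatted zvol and point bonnie++ at it
mount /dev/zvol/pool/volume /mnt/zvoltest
bonnie++ -d /mnt/zvoltest -u root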