How to fix intermittent "No space left on device" errors during mv when the device has plenty of space?
A bug in the implementation of the ext4 feature dir_index, which you are using on your destination filesystem.
Solution: recreate the filesystem without dir_index, or disable the feature using tune2fs (some caution required; see the related link Novell SuSE 10/11: Disable H-Tree Indexing on an ext3 Filesystem, which, although it relates to ext3, may call for similar caution).
(get a really good backup made of the filesystem)
(unmount the filesystem)
tune2fs -O ^dir_index /dev/foo
e2fsck -fDvy /dev/foo
(mount the filesystem)
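If you want to check first whether dir_index is actually enabled, the feature list in the superblock will show it (assuming /dev/foo is your device, as above):
tune2fs -l /dev/foo | grep 'Filesystem features'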
- ext4: Mysterious “No space left on device”-errors
ext4 has a feature called dir_index enabled by default, which is quite susceptible to hash-collisions.
......
ext4 has the possibility to hash the filenames of its contents. This enhances performance, but has a “small” problem: ext4 does not grow its hashtable, when it starts to fill up. Instead it returns -ENOSPC or “no space left on device”.
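As a rough illustration (my own sketch, not from the linked article): filling one directory with a very large number of files is the kind of workload that can run into this. Whether you actually hit the error depends on hash collisions, so the loop below may well finish cleanly; the mount point and file count are made up:
# millions of entries in a single directory; the htree index, not disk space, is what may run out
mkdir /mnt/foo/stress
for i in $(seq 1 2000000); do : > /mnt/foo/stress/file_$i; done
df -h /mnt/foo; df -i /mnt/foo   # blocks and inodes can still show plenty free when the error appears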
Suggestions for better-than-ext4 choices for storing masses of small files:
If you're using the filesystem as an object store, you might want to look at using a filesystem that specializes in that, possibly to the detriment of other characteristics. A quick Google search found Ceph, which appears to be open source, and can be mounted as a POSIX filesystem, but also accessed with other APIs. I don't know if it's worth using on a single host, without taking advantage of replication.
Another object-storage system is OpenStack's Swift. Its design docs say it stores each object as a separate file, with metadata in xattrs. Here's an article about it. Their deployment guide says they found XFS gave the best performance for object storage. So even though the workload isn't what XFS is best at, it was apparently better than the competitors when RackSpace was testing things. Possibly Swift favours XFS because XFS has good / fast support for extended attributes. It might be that ext3/ext4 would do ok on single disks as an object-store backend if extra metadata wasn't needed (or if it was kept inside the binary file).
Swift does the replication / load-balancing for you, and suggests that you give it filesystems made on raw disks, not RAID. It points out that its workload is essentially worst-case for RAID5 (which makes sense if we're talking about a workload with writes of small files: XFS typically doesn't quite pack them head-to-tail, so you don't get full-stripe writes, and RAID5 has to do some reads to update the parity stripe). Swift docs also talk about using 100 partitions per drive. I assume that's a Swift term, and isn't talking about making 100 different XFS filesystems on each SATA disk.
Running a separate XFS for every disk is actually a huge difference. Instead of one gigantic free-inode map, each disk will have a separate XFS with separate free-lists. Also, it avoids the RAID5 penalty for small writes.
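As a rough sketch of that per-disk layout (device name and mount point are placeholders; Swift's deployment guide has the authoritative steps):
# one XFS filesystem per raw disk, each mounted separately, instead of one big RAID5 array
mkfs.xfs /dev/sdb
mkdir -p /srv/node/sdb
mount -o noatime /dev/sdb /srv/node/sdb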
If you already have your software built to use a filesystem directly as an object store, rather than going through something like Swift to handle the replication / load-balancing, then you can at least avoid having all your files in a single directory. (I didn't see Swift docs say how they lay out their files into multiple directories, but I'm certain they do.)
With almost any normal filesystem, it will help to use a structure like
1234/5678 # nested medium-size directories
instead of
./12345678 # one giant directory
Probably about 10k entries is reasonable, so taking a well-distributed 4 characters of your object names and using them as directories is an easy solution. It doesn't have to be very well balanced. The odd 100k directory probably won't be a noticeable issue, and neither will some empty directories.
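A minimal shell sketch of that layout, assuming you take the first 4 characters of the object name as the bucket directory (the paths and the store function are hypothetical, just to show the idea):
store() {
    name=$1                          # e.g. "12345678"
    dir=/data/objects/${name:0:4}    # "1234" -> bucket directory
    mkdir -p "$dir"
    cat > "$dir/${name:4}"           # stored as 1234/5678 rather than ./12345678
}
# usage: cat some_object | store 12345678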
XFS is not ideal for huge masses of small files. It does what it can, but it's more optimized for streaming writes of larger files. It's very good overall for general use, though. It doesn't return ENOSPC on collisions in its directory indexing (AFAIK), and can handle having one directory with millions of entries. (But it's still better to use at least a one-level tree.)
Dave Chinner had some comments on XFS performance with huge numbers of inodes allocated, leading to slow-ish touch performance. Finding a free inode to allocate starts taking more CPU time, as the free inode bitmap gets fragmented. Note that this is not an issue of one-big-directory vs. multiple directories, but rather an issue of many used inodes over the whole filesystem. Splitting your files into multiple directories helps with some problems, like the one that ext4 choked on in the OP, but not the whole-disk problem of keeping track of free space. Swift's separate-filesystem-per-disk helps with this, compared to one giant XFS on a RAID5.
I don't know if btrfs is good at this, but I think it may be. I think Facebook employs its lead developer for a reason. :P Some benchmarks I've seen, of stuff like untarring a Linux kernel source, show btrfs does well.
I know reiserfs was optimized for this case, but it's barely, if at all, maintained anymore. I really can't recommend going with reiser4. It might be interesting to experiment, though. But it's by far the least future-proof choice. I've also seen reports of performance degradation on aged reiserFS, and there's no good defrag tool. (Google "filesystem millions of small files", and look at some of the existing StackExchange answers.)
I'm probably missing something, so final recommendation: ask about this on serverfault! If I had to pick something right now, I'd say give BTRFS a try, but make sure you have backups. (Especially if you use BTRFS's built-in multiple-disk redundancy, instead of running it on top of RAID. The performance advantages could be big, since small files are bad news for RAID5, unless it's a read-mostly workload.)
For this issue, below is what I did to fix it (you may need sudo access for the steps below):
The used space of inodes was at 100%, which can be checked using the command below:
df -i /
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/xvda1 524288 524288 0 100% /
- Need to free up inodes, so we need to find which folders contain a large number of files (each file uses an inode) using the commands below:
Check whether this is an inode problem with:
df -ih
Try to find root folders with large inodes count:
for i in /*; do echo $i; find $i |wc -l; done
Try to find specific folders:
for i in /src/*; do echo $i; find $i |wc -l; done
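If your coreutils is new enough (du --inodes appeared in 8.22), you can also get a sorted per-directory inode count in one go, as an alternative to the loops above:
du --inodes -xd1 / 2>/dev/null | sort -n | tail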
- Now we have narrowed it down to the folder with a large number of files in it. Run the commands below one after the other to avoid errors (in my case the actual folder was /var/spool/clientmqueue):
find /var/spool/clientmqueue/ -type f -mtime +1050 -exec rm -f {} +
find /var/spool/clientmqueue/ -type f -mtime +350 -exec rm -f {} +
find /var/spool/clientmqueue/ -type f -mtime +150 -exec rm -f {} +
find /var/spool/clientmqueue/ -type f -mtime +50 -exec rm -f {} +
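After the cleanup, re-check that inode usage has actually dropped:
df -i /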