I've just "mv"ed a 49GB directory to a bad file path. Is it possible to restore the original state of the files?
When moving files between filesystems, mv doesn't delete a file before it has finished copying it, and it processes files sequentially. (I initially said it copies then deletes each file in turn, but that's not guaranteed; at least GNU mv copies then deletes each command-line argument in turn, and POSIX specifies this behaviour.) So you should have at most one incomplete file in the target directory, and the original will still be in the source directory.
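As a rough illustration of the per-argument behaviour described above, a cross-filesystem move can be pictured as the loop below. This is only a sketch, not how mv is actually implemented: the function name move_across is made up, it assumes a cp that supports -a (GNU and BSD do), and unlike the real mv it does not clean up a partial copy.

```shell
#!/bin/sh
# Hypothetical helper: mimic "copy, then delete" one argument at a time.
# Usage: move_across DEST SRC...
move_across() {
    dest=$1
    shift
    for src in "$@"; do
        # The source is removed only if the copy of this argument succeeded.
        if cp -a -- "$src" "$dest"/; then
            rm -rf -- "$src"
        else
            echo "copy of $src failed; source left in place" >&2
            return 1
        fi
    done
}
```

If the loop is interrupted partway through an argument, that argument's source is untouched, which is why the originals survive.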
To move things back, add the -i flag so mv asks before overwriting anything:
sudo mv -i ~/my_data_on_60GB_partition/* /media/admin/my_data/
(assuming you don't have any hidden files to restore from ~/my_data_on_60GB_partition/), or better yet (given that, as you discovered, you could have many files waiting to be deleted), add the -n flag so mv doesn't overwrite anything and doesn't ask you about it either:
sudo mv -n ~/my_data_on_60GB_partition/* /media/admin/my_data/
You could also add the -v flag to see what's being done.
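To see what -n does before running it on real data, a throwaway experiment along these lines may help. The directory and file names are made up for the illustration, and it assumes an mv that supports -n (GNU and BSD do).

```shell
#!/bin/sh
# Scratch directories standing in for the real new and original locations.
work=$(mktemp -d)
mkdir "$work/new" "$work/orig"
echo "intact original" > "$work/orig/report.txt"
echo "partial copy"    > "$work/new/report.txt"
echo "only in new"     > "$work/new/extra.txt"

# -n: never overwrite an existing file. The exit status when a file is
# skipped differs between mv versions, so it is ignored here.
mv -n "$work"/new/* "$work/orig/" || true

cat "$work/orig/report.txt"   # the original content is untouched
ls "$work/new"                # report.txt was skipped, so it stays behind
```

Anything left behind in the new directory afterwards is a file that already existed at the destination and can be inspected before deleting it.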
With any POSIX-compliant mv, the original directory structure should still be intact, so alternatively you could check that, and simply delete /media/admin/my_data... (In the general case though, I think the mv -n variant is the safe approach: it handles all forms of mv, including e.g. mv /media/admin/my_data/* my_data_on_60GB_partition/)
You'll probably need to restore some permissions; you can do that en masse using chown and chmod, or restore them from backups using getfacl and setfacl (thanks to Sato Katsura for the reminder).
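If no ACL backup exists, one blunt way to reset modes en masse is with find and chmod, as sketched below. The helper name reset_modes and the modes 755 and 644 are assumptions for the illustration; use whatever the data had originally.

```shell
#!/bin/sh
# Hypothetical helper: give directories mode 755 and regular files mode 644
# everywhere under the given tree. The modes are an assumption.
reset_modes() {
    find "$1" -type d -exec chmod 755 {} +
    find "$1" -type f -exec chmod 644 {} +
}
```

It would be called as reset_modes /media/admin/my_data (with sudo-level privileges if the files are owned by root). Ownership can be reset in a similar way with chown -R, for example sudo chown -R someuser:somegroup /media/admin/my_data, where someuser and somegroup are placeholders.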
After getting Stephen Kitt's answer and discussing this command as a potential solution:
sudo mv -i ~/my_data_on_60GB_partition/* /media/admin/my_data/
I decided to hold off on running it until I got my head around what was happening. This answer describes what I found out and what I ended up doing.
I'm using GNU mv, which copies files to the target and deletes the original only if the copy operation succeeds.
However, I wanted to confirm whether mv performs this sequence one file at a time. If that were true, the original folder contents would have been cleanly sliced into two parts: one part moved to the destination, the other still left behind at the source. There might also be one file that was interrupted during the copy, which would be common to the two directories and would likely be malformed.
To discover files that were common between the two directories, I ran:
~% sudo diff -r --report-identical-files my_data_on_60GB_partition/. /media/admin/mydata/. | grep identical | wc -l
14237
This result suggested there were 14,237 instances of the same files in both the source and target directories. I confirmed this by checking the files manually: yes, there were many of the same files in both directories. This suggests that mv deletes the source files only after it has copied great swathes of files. A quick lookup in the info page for the mv command showed:
It [mv] first uses some of the same code that's used by cp -a to copy the requested directories and files, then (assuming the copy succeeded) it removes the originals. If the copy fails, then the part that was copied to the destination partition is removed.
I didn't run the command, but I suspect that if I had run
sudo mv -i ~/my_data_on_60GB_partition/* /media/admin/my_data/
the -i prompt before overwrite would likely have triggered more than 14,000 times.
So then, to find out the total number of files in the newly created directory:
~% sudo find my_data_on_60GB_partition/ -type f -a -print | wc -l
14238
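One caveat with piping find into wc -l is that a filename containing a newline gets counted more than once. A variant that avoids this prints one dot per file and counts bytes instead. This is a sketch that assumes GNU find (for -printf); the name count_files is made up.

```shell
#!/bin/sh
# Hypothetical helper: count regular files under a directory by emitting
# one byte per file, so odd characters in filenames cannot skew the result.
count_files() {
    find "$1" -type f -printf '.' | wc -c
}
```

For example, count_files my_data_on_60GB_partition/ (run with sudo-level privileges if some directories are not readable).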
So if there were 14,238 regular files in total in the new directory and 14,237 of them had identical originals back in the source, there was only one file in the new directory without a corresponding identical file in the source. To find out which file that was, I ran rsync back in the direction of the source:
~% sudo rsync -av --dry-run my_data_on_60GB_partition/ /media/admin/my_data
sending incremental file list
./
Education_learning_reference/
Education_learning_reference/Business_Education/
Education_learning_reference/Business_Education/Business_education_media_files/
Education_learning_reference/Business_Education/Business_education_media_files/Jeff Hoffman - videos/
Education_learning_reference/Business_Education/Business_education_media_files/Jeff Hoffman - videos/Jeff and David F interview/
Education_learning_reference/Business_Education/Business_education_media_files/Jeff Hoffman - videos/Jeff and David F interview/018 business plans-identifying main KPIs.flv
sent 494,548 bytes received 1,881 bytes 330,952.67 bytes/sec
total size is 1,900,548,824 speedup is 3,828.44 (DRY RUN)
A quick check confirmed that this was the malformed file: it existed on both the source and the destination, with the destination file at 64MB and the original at 100MB. This file and its directory hierarchy were still owned by root and had not yet had the original permissions restored.
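The same question ("which files in the new directory have no identical original?") can also be answered without rsync by comparing files byte for byte with cmp. The following is a minimal sketch: the function name list_unmatched is made up, and it assumes no filename contains a newline.

```shell
#!/bin/sh
# Hypothetical helper: list files under NEW that have no byte-identical
# counterpart at the same relative path under ORIG.
# Usage: list_unmatched NEW ORIG
list_unmatched() {
    new=$1
    orig=$2
    ( cd "$new" && find . -type f -print ) | while IFS= read -r rel; do
        # cmp -s prints nothing and returns non-zero if the files differ
        # or if the counterpart is missing.
        cmp -s "$new/$rel" "$orig/$rel" 2>/dev/null || printf '%s\n' "$rel"
    done
}
```

Here it would be run as list_unmatched ~/my_data_on_60GB_partition /media/admin/my_data. An empty result, or a list containing only the interrupted file, is the signal that the new directory holds nothing that is missing from the source.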
So in summary:
- all the files which mv never reached are still back in their original locations (obviously)
- all the files which mv did copy completely still had their original copies in the source directory
- the file which was only partially copied still had the original back in the source directory
In other words all the original files were still intact and the solution in this case was to simply delete the new directory!
I just thought I'd comment that some people may be tempted to toss 'xargs' into the mix to run things in parallel. That gives me the willies and I really like the rsync solution above.
As to the filesystem stuff about moving and copying and when exactly the original gets deleted, the VFS and the underlying filesystem(s) coordinate to guarantee per-file atomicity before getting to that delete step. So even if it gets interrupted before the target file is fully written, all of the locking in the VFS is real strict and protects against stuff like random data interleaving even in parallel cases. (I worked on Linux VFS and NFS4 stuff)
Adding 'xargs' to the mix would probably have made the double-sanity-checking step a headache, with multiple files in mid-transit. I wish I'd had more system-level scripting. Good reminders for me!
Loved the question, good for cobwebs, and makes me love rsync again. Cheers!