Files.walk(), calculate total size
No, this exception cannot be avoided.
The exception occurs inside the lazy fetch of Files.walk(), which is why you do not see it early and why there is no way to circumvent it. Consider the following code:
long size = Files.walk(Paths.get("C://"))
.peek(System.out::println)
.mapToLong(this::count)
.sum();
On my system this prints:
C:\
C:\$Recycle.Bin
Exception in thread "main" java.io.UncheckedIOException: java.nio.file.AccessDeniedException: C:\$Recycle.Bin\S-1-5-18
Because the exception is thrown on the (main) thread at the third entry, all further execution on that thread stops.
I believe this is a design failure. As it stands, Files.walk is practically unusable, because you can never guarantee that walking a directory will be free of errors.
One important point to notice is that the stack trace includes the sum() and reduce() operations. This is because the paths are loaded lazily: at the point of reduce(), the bulk of the stream machinery gets called (visible in the stack trace), and only then is the next path fetched, at which point the UncheckedIOException occurs.
It could possibly be circumvented by letting every walking operation execute on its own thread, but that is not something you would want to do anyway.
Also, checking beforehand whether a file is accessible guarantees nothing (though it may be useful to some extent), because you cannot be sure that it is still readable even 1 ms later.
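To make the lazy failure concrete, here is a minimal sketch. The class name WalkSizeAttempt and the helper sizeOf are illustrative and not part of the original code. It assumes you are willing to give up on the whole walk when any entry fails: the try block has to wrap the terminal operation, because that is where the UncheckedIOException surfaces, and once it is thrown the walk cannot be resumed.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.OptionalLong;
import java.util.stream.Stream;

class WalkSizeAttempt {

    // Returns the total size of all regular files under root,
    // or an empty OptionalLong if the walk failed anywhere.
    static OptionalLong totalSize(Path root) {
        // Files.walk holds open directories, so close the stream when done.
        try (Stream<Path> paths = Files.walk(root)) {
            long sum = paths
                    .filter(Files::isRegularFile)
                    .mapToLong(WalkSizeAttempt::sizeOf)
                    .sum(); // the lazy walk runs here, so failures surface here
            return OptionalLong.of(sum);
        } catch (IOException | UncheckedIOException e) {
            // The walk is aborted at the first failing entry; there is no way
            // to skip that entry and continue from this point.
            return OptionalLong.empty();
        }
    }

    // Files.size throws a checked IOException, so wrap it for use in a stream.
    private static long sizeOf(Path file) {
        try {
            return Files.size(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling WalkSizeAttempt.totalSize(Paths.get("C:\\")) on a system like the one above would be expected to return an empty result rather than a partial sum, which is exactly the limitation described here.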
Future extension
I believe it can still be fixed, though I do not know exactly how FileVisitOptions work.
Currently there is a FileVisitOption.FOLLOW_LINKS; if it operates on a per-file basis, then I suspect a FileVisitOption.IGNORE_ON_IOEXCEPTION could also be added. However, we cannot currently inject that functionality ourselves.
I found that using Guava's Files class solved the issue for me:
Iterable<File> files = Files.fileTreeTraverser().breadthFirstTraversal(dir);
long size = toStream( files ).mapToLong( File::length ).sum();
Where toStream is my static utility function to convert an Iterable to a Stream. It is just this:
StreamSupport.stream(iterable.spliterator(), false);
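For completeness, the toStream helper could be written out roughly as below. This is only a sketch: the class name IterableStreams and the totalLength method are made up for illustration, and the Iterable<File> is assumed to come from whatever traversal you use (Guava in the answer above, a plain list here).

```java
import java.io.File;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class IterableStreams {

    // Wraps any Iterable in a sequential (non-parallel) Stream.
    static <T> Stream<T> toStream(Iterable<T> iterable) {
        return StreamSupport.stream(iterable.spliterator(), false);
    }

    // Sums File.length() over the given files.
    // File.length() returns 0 for a file that does not exist,
    // so missing entries do not throw here.
    static long totalLength(Iterable<File> files) {
        return toStream(files).mapToLong(File::length).sum();
    }
}
```

Note that File.length() reports problems by returning 0 instead of throwing, which is part of why this approach does not stop on inaccessible entries.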
The short answer is: you can't.
The exception comes from FileTreeWalker.visit.
To be precise, it fails while trying to open the directory with newDirectoryStream (this code is out of your control):
// file is a directory, attempt to open it
DirectoryStream<Path> stream = null;
try {
    stream = Files.newDirectoryStream(entry);
} catch (IOException ioe) {
    return new Event(EventType.ENTRY, entry, ioe); // ==> Culprit <==
} catch (SecurityException se) {
    if (ignoreSecurityException)
        return null;
    throw se;
}
Maybe you should submit a bug.
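If you need the catch to be under your own control, one option is to do the recursion yourself with Files.newDirectoryStream. The following is a rough sketch, not the JDK's code: the class name ManualWalk is hypothetical, symbolic links are deliberately not followed, and unreadable directories or files are silently skipped, which may or may not be what you want.

```java
import java.io.IOException;
import java.nio.file.DirectoryIteratorException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;

class ManualWalk {

    // Recursively sums the sizes of regular files under dir,
    // skipping anything that cannot be opened or read.
    static long totalSize(Path dir) {
        long sum = 0;
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                if (Files.isDirectory(entry, LinkOption.NOFOLLOW_LINKS)) {
                    sum += totalSize(entry);
                } else if (Files.isRegularFile(entry, LinkOption.NOFOLLOW_LINKS)) {
                    sum += sizeOrZero(entry);
                }
            }
        } catch (IOException | DirectoryIteratorException e) {
            // This is the failure that Files.walk turns into an
            // UncheckedIOException; here we skip the directory instead.
        }
        return sum;
    }

    // Counts an unreadable file as 0 bytes instead of aborting.
    private static long sizeOrZero(Path file) {
        try {
            return Files.size(file);
        } catch (IOException e) {
            return 0L;
        }
    }
}
```

The recursion depth equals the directory depth, so this is fine for ordinary trees but is not tuned for pathological ones.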
An update from 2017, for those who keep arriving here.
Use Files.walk() only when you are certain of the file system's behaviour and really do want to stop on the first error. In general, Files.walk is not useful in standalone apps. I make this mistake often, perhaps out of laziness, and I realize it the moment a small job, such as walking 1 million files, takes more than a few seconds.
I recommend walkFileTree. Start by implementing the FileVisitor interface; here I only want to count files. Bad class name, I know.
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.FileVisitor;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

class Recurse implements FileVisitor<Path> {
    private long filesCount;

    @Override
    public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) throws IOException {
        return FileVisitResult.CONTINUE;
    }

    @Override
    public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
        // This is where I need my logic
        filesCount++;
        return FileVisitResult.CONTINUE;
    }

    @Override
    public FileVisitResult visitFileFailed(Path file, IOException exc) throws IOException {
        // Important: returning CONTINUE here skips entries that cannot be read. Test this behaviour.
        return FileVisitResult.CONTINUE;
    }

    @Override
    public FileVisitResult postVisitDirectory(Path dir, IOException exc) throws IOException {
        return FileVisitResult.CONTINUE;
    }

    public long getFilesCount() {
        return filesCount;
    }
}
Then use your class like this:
Recurse r = new Recurse();
Files.walkFileTree(Paths.get("G:"), r);
System.out.println("Total files: " + r.getFilesCount());
I am sure you know how to modify your own implementation of the FileVisitor<Path> interface to do other things, such as summing file sizes, starting from the example I posted. Refer to the docs for the other methods in this interface.
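Since the original question was about total size, here is one way the visitor might be adapted. Treat it as a sketch under a few assumptions: the names SizeVisitor, measure, getTotalBytes and getFailures are mine, it extends SimpleFileVisitor only to avoid overriding preVisitDirectory, and it counts failures so you can tell that the total may be incomplete.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

class SizeVisitor extends SimpleFileVisitor<Path> {
    private long totalBytes;
    private long failures;

    @Override
    public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
        // The attributes are already read by the walker, so no extra I/O here.
        if (attrs.isRegularFile()) {
            totalBytes += attrs.size();
        }
        return FileVisitResult.CONTINUE;
    }

    @Override
    public FileVisitResult visitFileFailed(Path file, IOException exc) {
        // Called for entries that cannot be read, e.g. access denied.
        failures++;
        return FileVisitResult.CONTINUE;
    }

    @Override
    public FileVisitResult postVisitDirectory(Path dir, IOException exc) {
        // exc is non-null if iterating the directory ended early with an error.
        if (exc != null) {
            failures++;
        }
        return FileVisitResult.CONTINUE;
    }

    long getTotalBytes() {
        return totalBytes;
    }

    long getFailures() {
        return failures;
    }

    // Convenience wrapper: walks root and returns the populated visitor.
    static SizeVisitor measure(Path root) throws IOException {
        SizeVisitor visitor = new SizeVisitor();
        Files.walkFileTree(root, visitor);
        return visitor;
    }
}
```

Usage would look like SizeVisitor v = SizeVisitor.measure(Paths.get("G:")); followed by reading v.getTotalBytes() and, if you care about completeness, v.getFailures().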
Speed:
- Files.walk : 20+ minutes and failing with exception
- Files.walkFileTree: 5.6 seconds, done with perfect answer.
Edit: As with everything, use tests to confirm the behaviour. Handle exceptions: they still occur, except for the ones we choose to ignore as above.