HDFS file watcher
Hadoop 2.6 introduced DFSInotifyEventInputStream, which you can use for this. You can get an instance of it from HdfsAdmin and then call .take() (blocks until events are available) or .poll() (returns null right away if nothing is pending) to read the events. Event types include create, append and unlink (i.e. delete), which should cover what you're looking for.
Here's a basic example. Make sure you run it as the hdfs user, as the admin interface requires HDFS superuser privileges.
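    import java.io.IOException;
    import java.net.URI;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hdfs.DFSInotifyEventInputStream;
    import org.apache.hadoop.hdfs.client.HdfsAdmin;
    import org.apache.hadoop.hdfs.inotify.Event;
    import org.apache.hadoop.hdfs.inotify.Event.CreateEvent;
    import org.apache.hadoop.hdfs.inotify.EventBatch;
    import org.apache.hadoop.hdfs.inotify.MissingEventsException;

    public class HdfsINotifyExample {
        public static void main( String[] args ) throws IOException, InterruptedException, MissingEventsException
        {
            // args[0] is the NameNode URI (for example hdfs://namenode:8020).
            HdfsAdmin admin = new HdfsAdmin( URI.create( args[0] ), new Configuration() );
            DFSInotifyEventInputStream eventStream = admin.getInotifyEventStream();
            while( true ) {
                // Blocks until the next batch arrives. As far as I know, EventBatch
                // needs Hadoop 2.7+; in 2.6.0 take() returned a single Event.
                EventBatch events = eventStream.take();
                for( Event event : events.getEvents() ) {
                    System.out.println( "event type = " + event.getEventType() );
                    switch( event.getEventType() ) {
                        case CREATE:
                            CreateEvent createEvent = (CreateEvent) event;
                            System.out.println( "  path = " + createEvent.getPath() );
                            break;
                        default:
                            break;
                    }
                }
            }
        }
    }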
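One thing to keep in mind: as far as I know, the inotify stream reports events for the whole namespace rather than for a single directory, so if you only care about one directory you have to filter the paths yourself. Below is a minimal sketch of such a filter. WatchedPathFilter and the /data/incoming path are names I made up for illustration; they are not part of the Hadoop API.

```java
// Hypothetical helper (not part of Hadoop): decides whether an event path
// lies inside the directory you want to watch.
final class WatchedPathFilter {
    private final String dir;

    WatchedPathFilter(String watchedDir) {
        // Drop a trailing slash so "/data/incoming" and "/data/incoming/"
        // behave the same; the root "/" is kept as-is.
        this.dir = (watchedDir.length() > 1 && watchedDir.endsWith("/"))
                ? watchedDir.substring(0, watchedDir.length() - 1)
                : watchedDir;
    }

    boolean matches(String eventPath) {
        if (eventPath == null) {
            return false;
        }
        if (dir.equals("/")) {
            return eventPath.startsWith("/");
        }
        // Compare against "dir/" so that /data/incoming2 does not match
        // a watch on /data/incoming.
        return eventPath.equals(dir) || eventPath.startsWith(dir + "/");
    }
}
```

In the CREATE case of the example you would then wrap the println in something like if( filter.matches( createEvent.getPath() ) ), with filter created once as new WatchedPathFilter( "/data/incoming" ).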
Here's a blog post that covers it in more detail:
http://johnjianfang.blogspot.com/2015/03/hdfs-6634-inotify-in-hdfs.html?m=1