How to find out which line separator BufferedReader#readLine() used to split the line?
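To stay consistent with BufferedReader, you can use the following method, which returns the first line separator it finds in the file. It recognizes \n, \r and \r\n, the three terminators readLine() splits on, and also \n\r, which readLine() itself would treat as two separate line breaks: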
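import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

// Returns the first line separator in the file, or null if there is none.
// Reads raw bytes, so it assumes an ASCII-compatible encoding (e.g. UTF-8).
public static String retrieveLineSeparator(File file) throws IOException {
    // try-with-resources closes the stream even if reading fails
    try (InputStream in = new BufferedInputStream(new FileInputStream(file))) {
        int current;
        // read() returns -1 at end of file; available() is not a reliable EOF test
        while ((current = in.read()) != -1) {
            if (current == '\n' || current == '\r') {
                int next = in.read();
                // A different terminator character right after the first one
                // makes a two-character separator ("\r\n" or "\n\r").
                if (next != current && (next == '\n' || next == '\r')) {
                    return "" + (char) current + (char) next;
                }
                return String.valueOf((char) current);
            }
        }
    }
    return null;
}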
After reading the Java docs (I confess to being a pythonista), it seems that there isn't a clean way to determine the line-ending convention used in a specific file.
The best thing I can recommend is to use BufferedReader.read() and iterate over every character in the file. Something like this:
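// Requires: import java.io.BufferedReader; import java.io.FileReader;
String filename = ...
try (BufferedReader br = new BufferedReader(new FileReader(filename))) {
    StringBuilder l = new StringBuilder();
    int c; // read() returns an int: the character, or -1 at end of stream
    while ((c = br.read()) != -1) {
        if (c != '\n' && c != '\r') {
            l.append((char) c); // ordinary character: keep building the line
            continue;
        }
        String endl = "\n";
        if (c == '\r') {
            // Peek one character ahead to tell "\r\n" from a lone "\r".
            br.mark(1);
            if (br.read() == '\n') {
                endl = "\r\n"; // do extra stuff since you know that you've got a \r\n
            } else {
                br.reset(); // lone "\r": un-read the peeked character
                endl = "\r";
            }
        }
        // l now holds the endl-free line and endl holds its line ending
        // do stuff, not sure what you want with the endl encoding
        ...
        l.setLength(0);
    }
    // any characters left in l form a final line with no line ending
}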
BufferedReader.readLine() does not provide any means of determining what the line break was. If you need to know, you'll need to read the characters in yourself and find the line breaks yourself.
You may be interested in the internal LineBuffer class from Guava (as well as the public LineReader class it's used in). LineBuffer provides a callback method void handleLine(String line, String end), where end is the line break characters. You could probably base something to do what you want on that. An API might look something like public Line readLine(), where Line is an object that contains both the line text and the line end.
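
To make that last idea concrete, here is a rough sketch of what such an API could look like using only the JDK. The names Line and LineEndReader are made up for illustration and are not part of Guava or the JDK; the only library class relied on is java.io.PushbackReader, which is used to put back the character read after a lone \r. The splitting rules are meant to mirror those documented for BufferedReader.readLine() (\n, \r, or \r\n).

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.Reader;
import java.io.StringReader;

public class LineEndingDemo {

    /** Hypothetical value type: a line of text plus the terminator that ended it. */
    static final class Line {
        final String text;
        final String end; // "\n", "\r", "\r\n", or "" for an unterminated last line

        Line(String text, String end) {
            this.text = text;
            this.end = end;
        }
    }

    /** Hypothetical reader that reports the line ending along with each line. */
    static final class LineEndReader implements AutoCloseable {
        private final PushbackReader in;

        LineEndReader(Reader reader) {
            // A one-character pushback buffer is enough to look past a '\r'.
            this.in = new PushbackReader(reader, 1);
        }

        /** Returns the next line, or null once the input is exhausted. */
        Line readLine() throws IOException {
            StringBuilder text = new StringBuilder();
            int c;
            while ((c = in.read()) != -1) {
                if (c == '\n') {
                    return new Line(text.toString(), "\n");
                }
                if (c == '\r') {
                    int next = in.read();
                    if (next == '\n') {
                        return new Line(text.toString(), "\r\n");
                    }
                    if (next != -1) {
                        in.unread(next); // lone '\r': push the extra character back
                    }
                    return new Line(text.toString(), "\r");
                }
                text.append((char) c);
            }
            // End of input: return a trailing unterminated line, if there is one.
            return text.length() > 0 ? new Line(text.toString(), "") : null;
        }

        @Override
        public void close() throws IOException {
            in.close();
        }
    }

    public static void main(String[] args) throws IOException {
        try (LineEndReader reader = new LineEndReader(new StringReader("a\r\nb\nc\rd"))) {
            Line line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line.text + " -> " + line.end.length() + " terminator char(s)");
            }
        }
    }
}
```

Wrapping a FileReader (or an InputStreamReader with an explicit charset) instead of the StringReader gives the same behavior for files. Because the reader works on characters rather than bytes, it should also cope with encodings such as UTF-16, where a byte-level scan would not.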