read file backwards (last line first)

It goes like this:

  1. Seek to one byte before the end of the file using fseek. There's no guarantee that the last line will have an EOL so the last byte doesn't really matter.
  2. Read one byte using fgetc.
  3. If that byte is an EOL, it is only the terminator of the last line, not a line of its own; note where it is and keep scanning.
  4. Use fseek again to go backwards two bytes and check that byte with fgetc.
  5. Repeat the above until you find an EOL. When you have an EOL, the file pointer will be at the beginning of the next (from the end) line.
  6. ...
  7. Profit.

Basically you have to keep doing (4) and (5) while keeping track of where you were when you found the beginning of a line so that you can seek back there before starting your scan for the beginning of the next line.
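A rough sketch of that loop, in case it helps. The names reverse_lines and emit_line are made up for illustration, and it assumes the stream is seekable, opened in binary mode ("rb"), and uses plain "\n" line endings:

```c
#include <stdio.h>

/* Copy bytes [start, end) of `in` to `out`, followed by a newline. */
static int emit_line(FILE *in, long start, long end, FILE *out)
{
    if (fseek(in, start, SEEK_SET) != 0)
        return -1;
    for (long i = start; i < end; i++) {
        int c = fgetc(in);
        if (c == EOF)
            return -1;
        fputc(c, out);
    }
    fputc('\n', out);
    return 0;
}

/* Write the lines of `in` to `out`, last line first.
   Assumption: `in` is seekable and was opened in binary mode. */
int reverse_lines(FILE *in, FILE *out)
{
    if (fseek(in, 0, SEEK_END) != 0)
        return -1;
    long end = ftell(in);   /* one past the last byte of the current line */
    if (end < 0)
        return -1;
    if (end == 0)
        return 0;           /* empty file: nothing to emit */

    /* A single trailing newline only terminates the last line. */
    if (fseek(in, end - 1, SEEK_SET) != 0)
        return -1;
    if (fgetc(in) == '\n')
        end--;

    long pos = end;
    while (pos > 0) {
        /* Step back one byte and look at it. */
        if (fseek(in, pos - 1, SEEK_SET) != 0)
            return -1;
        int c = fgetc(in);
        if (c == EOF)
            return -1;
        if (c == '\n') {
            /* The line we were scanning starts right after this EOL. */
            if (emit_line(in, pos, end, out) != 0)
                return -1;
            end = pos - 1;
        }
        pos--;
    }
    return emit_line(in, 0, end, out);   /* first line of the file */
}
```

Seeking for every single byte is slow; reading a block at a time and scanning it in memory would be the obvious refinement, but the bookkeeping stays the same.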

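As long as you open your file in text mode you shouldn't have to worry about multibyte EOLs on Windows (thanks for the reminder Mr. Lutz). One caveat: the C standard only guarantees fseek on a text stream for an offset of zero or a value previously returned by ftell, so stepping back byte by byte is only strictly portable in binary mode, where you have to skip the "\r" of a "\r\n" pair yourself.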

If you happen to be given a non-seekable input (such as a pipe), then you're out of luck unless you want to dump your input to a temporary file first.
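Dumping to a temporary file could look something like this. spool_to_tmpfile is a hypothetical helper name; tmpfile() is standard C and gives you a seekable binary stream that is removed automatically when closed:

```c
#include <stdio.h>

/* Copy everything from `in` (e.g. stdin fed by a pipe) into an anonymous
   temporary file and return it, positioned at the start.
   Returns NULL on failure. */
FILE *spool_to_tmpfile(FILE *in)
{
    FILE *tmp = tmpfile();
    if (tmp == NULL)
        return NULL;

    char buf[4096];
    size_t n;
    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        if (fwrite(buf, 1, n, tmp) != n) {
            fclose(tmp);
            return NULL;
        }
    }
    if (ferror(in)) {
        fclose(tmp);
        return NULL;
    }
    rewind(tmp);
    return tmp;
}
```

The returned stream can then be scanned backwards exactly like a regular file.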

So you can do it but it is rather ugly.

You could do pretty much the same thing using mmap and a pointer if you have mmap available and the "file" you're working with is mappable. The technique would be pretty much the same: start at the end and go backwards to find the end of the previous line.
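For the mmap route, a minimal sketch might look like the following. It assumes a POSIX system and "\n" line endings; reverse_lines_mem and reverse_lines_mmap are illustrative names, not anything standard:

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Write the lines in data[0..len) to `out`, last line first. */
static void reverse_lines_mem(const char *data, size_t len, FILE *out)
{
    if (len == 0)
        return;
    size_t end = len;
    if (data[end - 1] == '\n')
        end--;                      /* ignore the final line terminator */
    for (size_t pos = end; pos > 0; pos--) {
        if (data[pos - 1] == '\n') {
            /* The line we just walked over is data[pos..end). */
            fwrite(data + pos, 1, end - pos, out);
            fputc('\n', out);
            end = pos - 1;
        }
    }
    fwrite(data, 1, end, out);      /* first line of the file */
    fputc('\n', out);
}

/* Map the file at `path` read-only and emit its lines in reverse. */
int reverse_lines_mmap(const char *path, FILE *out)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    struct stat st;
    if (fstat(fd, &st) != 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {          /* mmap rejects a zero-length mapping */
        close(fd);
        return 0;
    }
    size_t len = (size_t)st.st_size;
    void *map = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                      /* the mapping stays valid after close */
    if (map == MAP_FAILED)
        return -1;
    reverse_lines_mem(map, len, out);
    munmap(map, len);
    return 0;
}
```

No seeking is involved here; the backwards walk is just pointer arithmetic over the mapped bytes.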


Re: "I am the one creating this file. So, can I create in a way its in the reverse order? Is that possible?"

You'll run into the same sorts of problems but they'll be worse. Files in C are inherently sequential lists of bytes that start at the beginning and go to the end; you're trying to work against this fundamental property and going against the fundamentals is never fun.

Do you really need your data in a plain text file all the way through, or only as the final output? You could store the data in an indexed binary file (possibly even an SQLite database); then you'd only have to worry about keeping (or windowing) the index in memory, and that's unlikely to be a problem (if it is, use a "real" database). When you have all your lines, just reverse the index and away you go.
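To make the index idea concrete, here is one way it might look with a plain data file and an in-memory array of offsets. Everything here (struct line_index, index_append, index_dump_reversed) is invented for the example, and it assumes the data stream was opened for update in binary mode (e.g. "wb+") and that records contain no embedded newlines:

```c
#include <stdio.h>
#include <stdlib.h>

struct line_index {
    long *offsets;    /* start offset of each record in the data file */
    size_t count;
    size_t capacity;
};

/* Append `line` (without a newline) to `data` and remember where it starts. */
int index_append(struct line_index *idx, FILE *data, const char *line)
{
    if (idx->count == idx->capacity) {
        size_t cap = idx->capacity ? idx->capacity * 2 : 64;
        long *p = realloc(idx->offsets, cap * sizeof *p);
        if (p == NULL)
            return -1;
        idx->offsets = p;
        idx->capacity = cap;
    }
    if (fseek(data, 0, SEEK_END) != 0)
        return -1;
    long off = ftell(data);
    if (off < 0)
        return -1;
    if (fputs(line, data) == EOF || fputc('\n', data) == EOF)
        return -1;
    idx->offsets[idx->count++] = off;
    return 0;
}

/* Write the records to `out`, newest first, by walking the index backwards. */
int index_dump_reversed(const struct line_index *idx, FILE *data, FILE *out)
{
    for (size_t i = idx->count; i > 0; i--) {
        if (fseek(data, idx->offsets[i - 1], SEEK_SET) != 0)
            return -1;
        int c;
        while ((c = fgetc(data)) != EOF && c != '\n')
            fputc(c, out);
        fputc('\n', out);
    }
    return 0;
}
```

Start with a zero-initialized struct line_index and free its offsets array when done. The data file is still written strictly front to back; only the index is read in reverse.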


In pseudocode:

open input file
while (fgets () != NULL)
{
   push line to stack
}
open output file
while (stack not empty)
{
   pop stack
   write popped line to file
}

The above is efficient: there is no seek (a slow operation) and the file is read sequentially. There are, however, two pitfalls to the above.

The first is the fgets call. The buffer supplied to fgets may not be big enough to hold a whole line from the input, in which case you can do one of the following: read again and concatenate; push a partial line and add logic to the second half to fix up partial lines; or wrap the line into a linked list and only push the linked list when a newline/EOF is encountered.

The second pitfall arises when the file is too big for the stack to fit in available RAM, in which case you'll need to write the stack structure to a temporary file whenever it reaches some threshold memory usage.
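For what it's worth, the pseudocode could be fleshed out along these lines, using the "read again and concatenate" option for the first pitfall. The function names are illustrative, and the sketch assumes the whole input fits in memory, so it does nothing about the second pitfall:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Read one whole line of any length by calling fgets repeatedly and
   concatenating the pieces. Returns a malloc'd string (newline kept if
   present), or NULL at end of file or on allocation failure. */
static char *read_full_line(FILE *in)
{
    char chunk[256];
    char *line = NULL;
    size_t len = 0;

    while (fgets(chunk, sizeof chunk, in) != NULL) {
        size_t n = strlen(chunk);
        char *p = realloc(line, len + n + 1);
        if (p == NULL) {
            free(line);
            return NULL;
        }
        line = p;
        memcpy(line + len, chunk, n + 1);   /* copies the terminating NUL too */
        len += n;
        if (len > 0 && line[len - 1] == '\n')
            break;                          /* reached the end of the line */
    }
    return line;
}

/* Push every line of `in` on a stack, then pop them all to `out`. */
int reverse_lines_stack(FILE *in, FILE *out)
{
    char **stack = NULL;
    size_t count = 0, capacity = 0;
    char *line;
    int rc = 0;

    while ((line = read_full_line(in)) != NULL) {
        if (count == capacity) {
            size_t cap = capacity ? capacity * 2 : 64;
            char **p = realloc(stack, cap * sizeof *p);
            if (p == NULL) {
                free(line);
                rc = -1;
                break;
            }
            stack = p;
            capacity = cap;
        }
        stack[count++] = line;
    }
    while (count > 0) {
        line = stack[--count];
        if (rc == 0) {
            size_t n = strlen(line);
            fputs(line, out);
            if (n == 0 || line[n - 1] != '\n')
                fputc('\n', out);   /* last input line lacked a newline */
        }
        free(line);
    }
    free(stack);
    return rc;
}
```

Because the input is only ever read forwards, this also works on pipes and other non-seekable streams.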
