Binary search in a sorted text file

(This is not a complete answer to your question, just a starting point.)

I used sgrep (sorted grep) in a similar situation.

Unfortunately, in its current state it has no byte-offset output, but I think that could easily be added.


I'm not aware of any standard tool that does this. However, you can write your own. For example, the following Ruby script should do the job:

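# Arguments: FILE KEY. FILE must be sorted.
file, key = ARGV.shift, ARGV.shift
# min starts at -1 so that the second line is a search candidate too.
min, max = -1, File.size(file)

File.open(file) do |f|
  # The loop only inspects lines that follow a seek point,
  # so the first line has to be tested separately.
  first = f.gets
  if first.nil? or first>=key
    p 1
    next
  end
  # Invariant: the first line starting after offset max is >= key (or missing),
  # the first line starting after offset min is < key.
  while max-min>1 do
    middle = (max+min)/2
    f.seek middle
    f.readline                  # skip the rest of the line we landed in
    if f.eof? or f.readline>=key
      max = middle
    else
      min = middle
    end
  end
  f.seek max
  f.gets                        # gets, not readline: no EOFError at end of file
  p f.pos+1                     # 1-based offset; file size + 1 if no line >= key
end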

It's a bit tricky because after the seek you are usually in the middle of some line and therefore need to do one readline to get to the beginning of the following line, which you can read and compare to your key.
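
The seek-then-skip step described above can also be isolated in a small helper, which makes the search easier to test. The following is only a sketch under a few assumptions: the names `next_line_start` and `lower_bound_offset` are made up for illustration, lines are compared without their trailing newline, and the result is a 0-based byte offset (the file size if every line is smaller than the key).

```ruby
# Hypothetical helper: byte offset of the first line that starts strictly
# after +offset+ (the file size if there is none). Leaves f positioned there.
def next_line_start(f, offset)
  f.seek(offset)
  f.gets            # discard the (possibly partial) line we landed in
  f.pos
end

# Hypothetical helper: 0-based byte offset of the first line >= key in a
# sorted file, or the file size if every line is smaller than key.
def lower_bound_offset(path, key)
  File.open(path) do |f|
    # The search below never inspects the first line, so test it here.
    first = f.gets
    return 0 if first.nil? || first.chomp >= key

    # Invariant: the line after offset lo is < key (-1 stands for the first
    # line), the line after offset hi is >= key or does not exist.
    lo, hi = -1, File.size(path)
    while hi - lo > 1
      mid = (lo + hi) / 2
      next_line_start(f, mid)
      line = f.gets
      if line.nil? || line.chomp >= key
        hi = mid
      else
        lo = mid
      end
    end
    next_line_start(f, hi)
  end
end
```

Note that this variant returns the offset itself rather than printing `f.pos+1`, so add one if you need a 1-based position.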