Faster than binary search for ordered list

One possibility is to treat it like finding the roots of a function. Basically, finding:

a[i] <= i <= a[i + 1]

Is equivalent to:

a[i] - i <= 0 <= a[i + 1] - i

Then you could try something like Newton's method and so on. These kinds of algorithms frequently converge faster than a binary search when they work, but I don't know of one that is guaranteed to converge for all input.

http://en.wikipedia.org/wiki/Root-finding_algorithm


You can do better than O(log n) if the values are integers, in which case the best worst-case running time you can achieve, in terms of n, is O(sqrt(log n)). Otherwise, there is no way to beat O(log n) unless there are patterns in the input sequence. There are two approaches used to beat O(log n) in the case of integers.

First, you can use y-fast trees which work by storing in a hash table all prefixes for which you are storing at least one integer with that prefix. This enables you to perform a binary search to find the length of the longest matching prefix. This enables you to find the successor of an element for which you are searching in time O(log w) where w is the number of bits in a word. There are some details to work though to make this work and use only linear space, but they aren't too bad (see the link below).

Second, you can use fusion trees, which use bit tricks to enable you to perform w^O(1) comparisons in just a constant number of instructions, yielding a running time of O(log n / log w).

The optimum tradeoff between these two data structures occurs when log w = sqrt(log n), giving a running time of O(sqrt(log n)).

For details on the above, see lectures 12 and 13 of Erik Demaine's course: http://courses.csail.mit.edu/6.851/spring07/lec.html


What about the following algo? it is called Exponential Search and is one of the variations of binary search. http://en.m.wikipedia.org/wiki/Exponential_search

Searching for element k in sorted array A of size n. Lookup A[2^i] for i=0, 1, 2,... until you go beyond k's position in A. then do a binary search on the part of the array left (smaller) than i.

int exponential_search(int A[], int key)
{
  // lower and upper bound for binary search
  int lower_bound = 0;
  int upper_bound = 1;

  // calculate lower and upper bound
  while (A[upper_bound] < key) {
    lower_bound = upper_bound;
   upper_bound = upper_bound * 2;
  }
  return binary_search(A, key, lower_bound, upper_bound);
}

This algo will run on O(log idx) where idx is the index of k in A. (both stpes are in log idx). In the worst case, the algo is in O(log idx), if k is amongst the largest elements of A or bigger than any element of A. The multiplicative constant is larger than for binary search but the algo would run faster for very large arrays and when looking for data that's towards the beginning of the array.

I'D like to have some idea of the minimal size n where this algo becomes preferable to binary search, but I don't know.