How to reduce the InterpolatingFunction building overhead?

You can use binary search with Compile. I failed inlining (Compile was complaining endlessly about types mismatch), so I included a binary search directly into Compile-d function. The code for binary search itself corresponds to the bsearchMin function from this answer.

Clear[linterp];
linterp =
   Compile[{{lst, _Real, 2}, {pt, _Real}},
     Module[{pos  = -1 , x = lst[[All, 1]], y = lst[[All, 2]], n0 = 1, 
          n1 = Length[lst], m = 0},
      While[n0 <= n1, m = Floor[(n0 + n1)/2];
        If[x[[m]] == pt,
          While[x[[m]] == pt  && m < Length[lst], m++];
          pos = If[m == Length[lst], m, m - 1];
          Break[];
        ];
        If[x[[m]] < pt, n0 = m + 1, n1 = m - 1]
      ];
      If[pos == -1, pos = If[x[[m]] < pt, m, m - 1]];
      Which[
        pos == 0,
           y[[1]],
        pos == Length[x],
           y[[-1]],
        True,
        y[[pos]] + (y[[pos + 1]] - y[[pos]])/(x[[pos + 1]] - 
              x[[pos]])*(pt - x[[pos]])
      ]],
      CompilationTarget -> "C"];

This is about 20 times faster, on my benchamrks:

AbsoluteTiming[
   Table[Interpolation[list,InterpolationOrder->1][q],{q,0.0006,0.4,0.00001}];
]

{1.453,Null}

AbsoluteTiming[
   Table[linterp[list,q],{q,0.0006,0.4,0.00001}];
]

{0.063,Null}


Combinatorica functions are often not well optimized, so there may very well be a faster binary search algorithm. If that can be found, this might be effective:

Needs["Combinatorica`"]

f[{{a_, b_}, {c_, d_}}][x_] := b + (d - b)/(c - a) (x - a)

list[[Floor@{#, # + 1}]] & @ BinarySearch[list[[All, 1]], 0.33]

f[%][0.33]
8.86436

Check:

Interpolation[list, InterpolationOrder -> 1][0.33]
8.86436

Here is a linear interpolation routine that uses binary search with a few refinements (in particular, the binary search is skipped in the case of equispaced abscissas), as well as a stabilized version of the linear interpolation formula:

lerp = Compile[{{dat, _Real, 2}, {x, _Real}},
  Module[{n = Length[dat], k = 1, l, m, r, xa, ya},
         {xa, ya} = Transpose[dat];
         l = Min[Max[2, 1 + Quotient[x - First[xa],
                                     (Last[xa] - First[xa])/(n - 1)]], n - 1];

         If[xa[[l]] <= x,
            r = l + 1;
            While[r < n && xa[[r]] <= x,
                  l = r; k *= 2; r = Min[l + k, n]],
            {l, r} = {l - 1, l};
            While[1 < l && x < xa[[l]],
                  r = l; k *= 2; l = Max[1, r - k]]];

         While[r - l > 1,
               m = Quotient[l + r, 2];
               If[x < xa[[m]], r = m, l = m]];

         ({xa[[r]] - x, x - xa[[l]]}/(xa[[r]] - xa[[l]])).ya[[{l, r}]]],
         RuntimeOptions -> "Speed"]

Even without the compilation to C, the method is quite fast on my box:

AbsoluteTiming[Table[Interpolation[data, InterpolationOrder -> 1][q],
                     {q, 0.0006, 0.4, 0.00001}];][[1]]
   15.206078

AbsoluteTiming[Table[lerp[data, q], {q, 0.0006, 0.4, 0.00001}];][[1]]
   0.693506