Bradley adaptive thresholding algorithm
You cannot create the integral image with PIL the way that you are doing it because the image that you are packing data into cannot accept values over 255. The values in the integral image get very large because they are the sums of the pixels above and to the left (see page 3 of your white paper, below).
They will grow much much larger than 255, so you need 32 bits per pixel to store them.
You can test this by creating a PIL image in "L" mode and then setting a pixel to 1000000 or some large number. Then when you read back the value, it will return 255.
>>> from PIL import Image
>>> img = Image.new('L', (100,100))
>>> img.putpixel((0,0), 100000)
>>> print(list(img.getdata())[0])
255
EDIT: After reading the PIL documentation, you may be able to use PIL if you create your integral image in "I" mode instead of "L" mode. This should provide 32 bits per pixel.
For that reason I recommend Numpy instead of PIL.
Below is a rewrite of your threshold function using Numpy instead of PIL, and I get the correct/expected result. Notice that I create my integral image using a uint32 array. I used the exact same C example on Github that you used for your translation:
import numpy as np
def adaptive_thresh(input_img):
h, w = input_img.shape
S = w/8
s2 = S/2
T = 15.0
#integral img
int_img = np.zeros_like(input_img, dtype=np.uint32)
for col in range(w):
for row in range(h):
int_img[row,col] = input_img[0:row,0:col].sum()
#output img
out_img = np.zeros_like(input_img)
for col in range(w):
for row in range(h):
#SxS region
y0 = max(row-s2, 0)
y1 = min(row+s2, h-1)
x0 = max(col-s2, 0)
x1 = min(col+s2, w-1)
count = (y1-y0)*(x1-x0)
sum_ = int_img[y1, x1]-int_img[y0, x1]-int_img[y1, x0]+int_img[y0, x0]
if input_img[row, col]*count < sum_*(100.-T)/100.:
out_img[row,col] = 0
else:
out_img[row,col] = 255
return out_img