Crop raster in memory with python GDAL bindings
With GDAL 2.2.0+ use the VSIMEM filesystem (http://www.gdal.org/gdal_virtual_file_systems.html). It allows you to treat blocks of memory as a file within the virtual file system. This version uses the projWin parameter of gdal_translate to clip from a bounding box.
gdal.Translate('/vsimem/clip.tif', 'path/to/input.tif', projWin=[ulx, uly, lrx, lry])
You can then open the output using the standard approach:
ds = gdal.Open('/vsimem/clip.tif')
Note that Translate was used over Warp because Warp can get buggy when specifying a gdal dataset object as the input. Warp is also more traditionally used for clipping along vector features, Translate is simple the better option for clipping from a bounding box.
Using the VSIMEM filesystem is a simpler option than creating your own memory dataset to store the output because it is directly taking the output of Translate and writing it into memory. On the other hand, any discrepancies between your output clip and your instantiated memory dataset will result in weird interactions.
As you have not need to keep the cropped raster you do not need 'gdal.Warp' neither. You can use 'ReadAsArray' GDAL method, by taking in account point upper left and point bottom right of bounding box, for calculating indexes column, row for each point and each raster. It will extract adequate raster portion for calculations in each particular case .
In following code, I used only one raster but it's easy to generalize for a complete raster directory. It calculates mean for its computed raster portion as part of your "perform come calculations with..." requirement.
from osgeo import gdal
import numpy as np
dataset = gdal.Open("/home/zeito/pyqgis_data/aleatorio.tif")
band = dataset.GetRasterBand(1)
geotransform = dataset.GetGeoTransform()
xinit = geotransform[0]
yinit = geotransform[3]
xsize = geotransform[1]
ysize = geotransform[5]
#p1 = point upper left of bounding box
#p2 = point bottom right of bounding box
p1 = (355374.4285, 4472950.6531) #(6, 5)
p2 = (356048.0437, 4472512.1678) #(12, 14)
row1 = int((p1[1] - yinit)/ysize)
col1 = int((p1[0] - xinit)/xsize)
row2 = int((p2[1] - yinit)/ysize)
col2 = int((p2[0] - xinit)/xsize)
data = band.ReadAsArray(col1, row1, col2 - col1 + 1, row2 - row1 + 1)
#perform come calculations with ...
mean = np.mean(data)
print mean
Situation can be observed at following image for my example:
After running it at Python Console was printed 52.8714285714 value; as it was expected.