Is there a python equivalent of gdaltindex?
I dont't know, but it's easy to do with osgeo.gdal, GeoPandas and shapely box ( shapely.geometry.box(minx, miny, maxx, maxy, ccw=True)
= bounding box)¶
Original raster files
import os
StartDir = "/Shared/scan_ign/68"
for dir, subdir, files in os.walk(StartDir):
for fname in files:
if fname.endswith(".tif"):
print(fname,)
68_1.tif
68_2.tif
68_3.tif
68_4.tif
68_5.tif
68_6.tif
68_7.tif
68_8.tif
With gdalindex
gdaltindex index.shp *.tif
Control
import geopandas as gpd
index = gpd.read_file('index.shp')
print index.head()
location geometry
0 68_1.tif POLYGON ((226018.4754020752 58174.93482261647,...
1 68_2.tif POLYGON ((234019.0396324735 58174.8250049785, ...
2 68_3.tif POLYGON ((242020.0156871924 58175.55617328714,...
3 68_4.tif POLYGON ((250020.4746237129 58176.34366682862,...
4 68_5.tif POLYGON ((226019.5264973617 48175.2918232924, ...
With osgeo.gdal
from osgeo import gdal
import geopandas as gpd
from shapely.geometry import box
# compute the bounding box of a gdal raster file
def bounds_raster(path):
raster = gdal.Open(path)
ulx, xres, xskew, uly, yskew, yres = raster.GetGeoTransform()
lrx = ulx + (raster.RasterXSize * xres)
lry = uly + (raster.RasterYSize * yres)
return box(lrx,lry,ulx,uly)
# creation of the index file
df = gpd.GeoDataFrame(columns=['location','geometry'])
# iterate through multiple tif files in a folder
for dir, subdir, files in os.walk(StartDir):
for fname in files:
if fname.endswith(".tif"):
df = df.append({'location':fname, 'geometry': bounds( os.path.join(dir+"/", fname))},ignore_index=True)
print df.head()
location geometry
0 68_1.tif POLYGON ((226018.4754020752 48173.94230128427,...
1 68_2.tif POLYGON ((234019.0396324735 48173.84503544428,...
2 68_3.tif POLYGON ((242020.0156871924 48174.31088528392,...
3 68_4.tif POLYGON ((250020.4746237129 48174.72574336932,...
4 68_5.tif POLYGON ((226019.5264973617 38173.0677178388, ...
# save resulting shapefile
df.to_file("tile-index.shp")
As Spacedman points out, you can also use rasterio
import rasterio
df = gpd.GeoDataFrame(columns=['location','geometry'])
for dir, subdir, files in os.walk(StartDir):
for fname in files:
if fname.endswith(".tif"):
bounds =rasterio.open(os.path.join(dir+"/", fname)).bounds
df = df.append({'location':fname, 'geometry': box(bounds[0], bounds[1], bounds[2], bounds[3])},ignore_index=True)
df.to_file("tile-index2.shp")
As I understand it, gdaltindex
returns a feature for each raster input, as a rectangular polygon of the bounds of each raster. I don't know of a ready plug-in solution for doing gdaltindex
in pure python (as opposed to shelling out to run gdaltindex
) and I'll assume you've searched for it. The parts to build a solution are available though.
You can use rasterio
to read rasters and get the bounds:
>>> r = rasterio.open("mwi_lc_1990.tif")
>>> r.bounds
BoundingBox(left=454965.0, bottom=8094361.0, right=823965.0, top=8974795.0)
>>> r.bounds.left
454965.0
And then you can use the fiona
package to create shapefiles.