GDAL and Python: How to get coordinates for all cells having a specific value?

In GDAL you can import the raster as a numpy array.

from osgeo import gdal
import numpy as np

r = gdal.Open("path/to/raster")
band = r.GetRasterBand(1) #bands start at one
a = band.ReadAsArray().astype(np.float)

Then using numpy it is a simple matter to get the indexes of an array matching a boolan expression:

(y_index, x_index) = np.nonzero(a > threshold)
#To demonstate this compare a.shape to band.XSize and band.YSize

From the raster geotransform we can get information such as the upper left x and y coordinates and the cell sizes.

(upper_left_x, x_size, x_rotation, upper_left_y, y_rotation, y_size) = r.GetGeoTransform()

The upper left cell corresponds to a[0, 0]. The Y size will always be negative, so using the x and y indices you can calculate the coordinates of each cell based on the indexes.

x_coords = x_index * x_size + upper_left_x + (x_size / 2) #add half the cell size
y_coords = y_index * y_size + upper_left_y + (y_size / 2) #to centre the point

From here it's a simple enough matter to create a shapefile using OGR. For some sample code see this question for how to generate a new dataset with point information.


Why not use the Sexante toolbox in QGIS? It's kind of like the Model Builder for ArcGIS. That way, you can chain operations and treat it as one operation. You can automate the deletion of intermediate files and removal of unwanted records if I'm not mistaken.