Looping through all raster cell values using GDAL via Python?
You can read the raster as an array using NumPy:
from osgeo import gdal
import numpy as np

# Open the raster dataset
src_ds = gdal.Open("INPUT.tif")

print("[ RASTER BAND COUNT ]:", src_ds.RasterCount)
# GDAL band indices start at 1, not 0
for band in range(1, src_ds.RasterCount + 1):
    print("[ GETTING BAND ]:", band)
    srcband = src_ds.GetRasterBand(band)
    # Arguments are (approx_ok, force)
    stats = srcband.GetStatistics(True, True)
    print("[ STATS ] = Minimum=%.3f, Maximum=%.3f, Mean=%.3f, StdDev=%.3f" % (
        stats[0], stats[1], stats[2], stats[3]))

# Read band 1 as a 2D NumPy array (rows x columns)
rast_array = np.array(src_ds.GetRasterBand(1).ReadAsArray())
print(rast_array)
Using a sample raster, the above code will return:
[ RASTER BAND COUNT ]: 1
[ GETTING BAND ]: 1
[ STATS ] = Minimum=1683.000, Maximum=1900.000, Mean=1820.854, StdDev=59.329
[[1900 1900 1898 1895 1892 1887 1879 1871 1863 1852 1845 1837 1824 1802
1743 1725 1713 1705 1699 1693 1687 1683]
[1897 1896 1894 1892 1890 1884 1877 1869 1862 1854 1847 1838 1820 1800
1745 1729 1719 1712 1706 1701 1696 1695]
[1892 1891 1890 1888 1885 1881 1875 1868 1861 1855 1849 1837 1817 1794
1747 1732 1725 1720 1714 1710 1707 1706]
[1887 1885 1884 1882 1880 1878 1873 1867 1860 1855 1849 1833 1815 1789
1749 1738 1732 1728 1723 1720 1718 1715]
[1882 1880 1878 1876 1875 1873 1871 1866 1861 1855 1849 1832 1817 1795
1756 1744 1740 1737 1733 1730 1728 1725]
[1880 1877 1874 1873 1870 1868 1867 1865 1860 1855 1850 1841 1834 1817
1795 1769 1749 1746 1743 1740 1736 1731]
[1880 1876 1873 1870 1869 1866 1863 1862 1859 1856 1852 1847 1843 1841
1824 1812 1802 1775 1758 1747 1740 1733]
[1879 1876 1873 1870 1869 1866 1863 1860 1858 1855 1852 1850 1847 1843
1831 1819 1803 1782 1763 1747 1738 1730]
[1879 1877 1874 1872 1869 1866 1864 1861 1858 1855 1852 1850 1850 1848
1836 1816 1794 1775 1754 1744 1736 1728]
[1880 1877 1875 1872 1869 1867 1864 1862 1858 1854 1850 1848 1850 1850
1840 1806 1786 1767 1749 1742 1734 1726]
[1881 1879 1876 1873 1870 1866 1864 1861 1857 1851 1843 1840 1841 1850
1827 1797 1782 1769 1752 1742 1733 1723]
[1882 1879 1876 1873 1870 1867 1864 1861 1855 1848 1839 1835 1833 1836
1810 1794 1783 1771 1758 1747 1737 1729]
[1882 1880 1876 1873 1869 1866 1862 1858 1854 1849 1838 1833 1826 1814
1800 1792 1782 1773 1762 1752 1742 1733]
[1881 1878 1874 1870 1867 1863 1860 1856 1853 1849 1840 1835 1821 1813
1798 1790 1783 1774 1766 1757 1748 1738]]
(If you want to print each value separately, the code is easy to adapt: loop over the rows and columns of rast_array.)
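A minimal sketch of visiting every cell one at a time. It assumes the band has already been read into a 2D NumPy array; here a small stand-in array (the top-left corner of the sample output above) replaces the `ReadAsArray()` call so the snippet runs without a raster file. The helper name `iter_cells` is made up for illustration.

```python
import numpy as np

# Stand-in for src_ds.GetRasterBand(1).ReadAsArray():
# the top-left 2x3 corner of the sample output above
rast_array = np.array([[1900, 1900, 1898],
                       [1897, 1896, 1894]])


def iter_cells(array):
    """Yield (row, col, value) for every cell of a 2D array, row by row."""
    for (row, col), value in np.ndenumerate(array):
        # .item() converts the NumPy scalar to a plain Python number
        yield row, col, value.item()


cells = list(iter_cells(rast_array))
for row, col, value in cells:
    print("row=%d col=%d value=%d" % (row, col, value))
```

For large rasters a Python-level loop like this is slow; whole-array operations (for example `rast_array.max()` or `rast_array[rast_array > 1890]`) are usually preferable when the task allows it.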
The problem was resolved in the question "GDAL does not ignore NoData value":
from osgeo import gdal

f = gdal.Open("a.tif")
bands = f.RasterCount
print(bands)  # prints 3 for this file

# Statistics as reported by GDAL (band indices start at 1)
for j in range(bands):
    band = f.GetRasterBand(j + 1)
    stats = band.GetStatistics(True, True)
    print("[ STATS ] = Minimum=%.3f, Maximum=%.3f, Mean=%.3f, StdDev=%.3f" % (stats[0], stats[1], stats[2], stats[3]))
[ STATS ] = Minimum=17.000, Maximum=255.000, Mean=220.586, StdDev=39.705
[ STATS ] = Minimum=64.000, Maximum=255.000, Mean=214.975, StdDev=36.926
[ STATS ] = Minimum=45.000, Maximum=255.000, Mean=179.029, StdDev=68.234
But if you use band.ReadAsArray() (which returns a NumPy array), the results differ:
# Statistics computed by NumPy on the raw cell values
for j in range(bands):
    band = f.GetRasterBand(j + 1)
    data = band.ReadAsArray()
    print("[ Numpy ] = Minimum=%.3f, Maximum=%.3f, Mean=%.3f, StdDev=%.3f" % (data.min(), data.max(), data.mean(), data.std()))
[ Numpy ] = Minimum=0.000, Maximum=255.000, Mean=220.477, StdDev=42.584
[ Numpy ] = Minimum=31.000, Maximum=255.000, Mean=214.955, StdDev=39.558
[ Numpy ] = Minimum=0.000, Maximum=255.000, Mean=178.856, StdDev=69.535
Why? The explanation comes from "GDAL does not ignore NoData value":
GetStatistics will reuse previously computed statistics if they exist (i.e., computed before you set the NoData value). You can use stats = band.ComputeStatistics(0) instead of GetStatistics to force the statistics to be recomputed.
# ComputeStatistics(0) recomputes exact statistics instead of reusing cached ones
for j in range(bands):
    band = f.GetRasterBand(j + 1)
    stats = band.ComputeStatistics(0)
    print("[ STATS ] = Minimum=%.3f, Maximum=%.3f, Mean=%.3f, StdDev=%.3f" % (stats[0], stats[1], stats[2], stats[3]))
[ STATS ] = Minimum=0.000, Maximum=255.000, Mean=220.477, StdDev=42.584
[ STATS ] = Minimum=31.000, Maximum=255.000, Mean=214.955, StdDev=39.558
[ STATS ] = Minimum=0.000, Maximum=255.000, Mean=178.856, StdDev=69.535
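If the goal is the opposite, that is, to make the NumPy statistics skip NoData cells, one option is a masked array. The sketch below is an illustration under assumptions, not the fix from the linked question: the helper name `masked_stats` is made up, the small array stands in for `band.ReadAsArray()`, and the NoData value of 0 is assumed (on a real band it could be obtained from `band.GetNoDataValue()`, which returns None when no value is set).

```python
import numpy as np


def masked_stats(data, nodata):
    """Return (min, max, mean, std) of data, ignoring cells equal to nodata."""
    valid = np.ma.masked_equal(data, nodata)
    return (float(valid.min()), float(valid.max()),
            float(valid.mean()), float(valid.std()))


# Stand-in for band.ReadAsArray(); 0 is the assumed NoData value
data = np.array([[0, 17, 255],
                 [0, 64, 128]])

raw_min = int(data.min())             # includes the NoData cells
stats = masked_stats(data, nodata=0)  # excludes the NoData cells
print("raw minimum:", raw_min)
print("[ Masked ] = Minimum=%.3f, Maximum=%.3f, Mean=%.3f, StdDev=%.3f" % stats)
```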
Or you could just convert it to an ESRI ASCII raster and achieve effectively the same result in much less time.
Here's an example of an ASCII raster from the documentation:
ncols 480
nrows 450
xllcorner 378923
yllcorner 4072345
cellsize 30
nodata_value -32768
43 2 45 7 3 56 2 5 23 65 34 6 32 54 57 34 2 2 54 6
35 45 65 34 2 6 78 4 2 6 89 3 2 7 45 23 5 8 4 1 62 ...
GDAL can do the conversion very quickly; you can then read the file or do whatever else is required. There is nothing wrong with the other answer. I only suggest this because I find NumPy very slow for cell-by-cell operations.
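As a rough sketch of the reading step: once the raster has been converted (for example with `gdal_translate -of AAIGrid INPUT.tif OUTPUT.asc`, where the output name is only a placeholder), the file can be parsed with plain Python. The function name `read_ascii_grid` is made up for illustration, and the parser assumes a header of "keyword value" lines followed by whitespace-separated numbers, as in the example above. An in-memory string stands in for the file so the snippet runs on its own.

```python
import io


def read_ascii_grid(stream):
    """Parse an ESRI ASCII grid; return (header dict, list of rows)."""
    header = {}
    tokens = []
    for line in stream:
        parts = line.split()
        if not parts:
            continue
        # Header lines look like "ncols 480"; data lines start with a number
        if len(parts) == 2 and parts[0][0].isalpha():
            header[parts[0].lower()] = float(parts[1])
        else:
            tokens.extend(parts)
    ncols, nrows = int(header["ncols"]), int(header["nrows"])
    values = [float(t) for t in tokens]
    # Cut the flat value list into rows of ncols cells each
    rows = [values[r * ncols:(r + 1) * ncols] for r in range(nrows)]
    return header, rows


# Tiny stand-in for a converted file; use open("OUTPUT.asc") for a real one
sample = """ncols 3
nrows 2
xllcorner 378923
yllcorner 4072345
cellsize 30
nodata_value -32768
43 2 45
7 -32768 56
"""

header, rows = read_ascii_grid(io.StringIO(sample))
nodata = header.get("nodata_value")

# Visit every cell, skipping NoData
valid = [v for row in rows for v in row if v != nodata]
print(len(valid), sum(valid))
```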