Calculating block statistics from land cover raster using R?
You can use the aggregate() function in the raster package to do this. For example, to produce a raster at half the resolution of the original, showing the proportion of cells covered with land class 1:
aggregate(r, fact = 2, fun = function(vals, na.rm) {
  # proportion of cells within each 2x2 block that equal class 1
  sum(vals == 1, na.rm = na.rm) / length(vals)
})
To do this for all land classes:
cov_pct <- lapply(unique(r), function(land_class) {
  aggregate(r, fact = 2, fun = function(vals, na.rm) {
    # proportion of cells within each block that equal the current class
    sum(vals == land_class, na.rm = na.rm) / length(vals)
  })
})
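This gives you a list of RasterLayer objects, one per land class. As a small follow-up sketch (assuming the cov_pct list from above), you can combine them into a single stack for easier inspection and plotting:

cov_stack <- stack(cov_pct)                     # one layer per land class
names(cov_stack) <- paste0("class_", unique(r)) # label layers by class value
plot(cov_stack)                                 # visual check of per-class coverage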
I agree that raster::aggregate() works well in most cases, especially when a user-defined function is involved, as in the answer given by @dbaston. However, you might want to have a look at the velox package as an alternative approach that comes in particularly handy when dealing with larger rasters.

The only caveat here is that VeloxRaster_aggregate() accepts only "standard" aggregate functions (i.e. c("sum", "mean", "min", "max", "median")), so any work steps beyond that level need to be carried out beforehand; in the veloxAgg chunk below, each class is first reclassified to a 0/1 mask and then aggregated with "mean". Nevertheless, the resulting speed gain surely justifies the need for a little more code. In the example below, I increased the ncell() of your input raster by a factor of 100 and compared runtimes. As seen from the benchmark results, velox outperforms the raster::aggregate() based approach by a factor of roughly 13, on average, when fed with the larger raster image.
library(raster)
r = raster(ncols = 1e3, nrows = 1e3
           , xmn = 100, xmx = 1100, ymn = 0, ymx = 1000
           , crs = "+proj=lcc +lat_1=35 +lat_2=65 +lat_0=52 +lon_0=10 +x_0=4000000 +y_0=2800000 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs")
x = 0:5
set.seed(123) # make sampling reproducible
r[] <- sample(x, 1e6, replace = TRUE)
### benchmark ----
library(microbenchmark)
library(velox)

microbenchmark(
  ## raster approach
  rasterAgg = {
    lst1 = lapply(unique(r), function(land_class) {
      aggregate(r, fact = 2, fun = function(vals, na.rm) {
        sum(vals == land_class, na.rm = na.rm) / length(vals)
      })
    })
  }
  ## velox approach
  , veloxAgg = {
    vls = r[]
    lst2 = lapply(unique(r), function(i) {
      # set the current land cover class to 1, everything else to 0
      ids = (vls == i)
      vls[ids] = 1; vls[!ids] = 0
      rst = setValues(r, vls)
      # create velox raster and aggregate
      vlx = velox(rst)
      vlx$aggregate(factor = 2, aggtype = "mean")
      vlx$as.RasterLayer()
    })
  }
  ## repeat every chunk n times
  , times = 10)
And here are the benchmark results:
Unit: milliseconds
      expr       min        lq      mean    median        uq       max neval cld
 rasterAgg 5844.2713 5950.7560 6140.8879 6186.1065 6354.8678 6370.6968    10   b
  veloxAgg  319.2344  336.5698  472.3171  375.2693  622.2361  744.8573    10  a
You can also plot() the single output layers in lst1 and lst2, or subtract them from each other, in order to verify that the two approaches create identical results.
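For instance, a quick sanity check could look like this (a sketch, assuming lst1 and lst2 are still available from the benchmark run above):

# maximum absolute difference per class; should be (numerically close to) zero
sapply(seq_along(lst1), function(i) cellStats(abs(lst1[[i]] - lst2[[i]]), "max"))

# side-by-side visual comparison for the first land class
par(mfrow = c(1, 2))
plot(lst1[[1]], main = "raster::aggregate()")
plot(lst2[[1]], main = "velox")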
As a reference: using your original raster with 1e4 pixels as input, VeloxRaster_aggregate() performs only about 4 times faster (at least on my machine), which leads me to the conclusion that the raster based approach is absolutely suitable for smaller images. However, velox is definitely worth a deeper look when dealing with larger datasets that still fit well into memory.