How to interpret GRASS v.kernel results?

The results estimate points per unit area. As a check, multiply each density value by the area of a cell and add these products over the grid: the total should equal the number of original points (or the sum of their values, if the points carry weights). (The two totals usually differ slightly for two reasons: boundary effects and numerical imprecision. Boundary effects occur because the kernel can spread data past the edge of the map, and that part is not recovered from the density grid. But the differences ought to be small.)
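The check above can be sketched in plain NumPy. This is not GRASS code and does not reproduce v.kernel's exact kernel or radius handling; the function name `gaussian_density`, the 100 x 100 grid, the point coordinates, and `sigma` are all assumptions chosen for illustration.

```python
import numpy as np

def gaussian_density(points, weights, xs, ys, sigma):
    """Sum one 2D Gaussian kernel (each integrating to 1) per point,
    evaluated at the cell centres given by xs (columns) and ys (rows)."""
    gx, gy = np.meshgrid(xs, ys)
    density = np.zeros_like(gx, dtype=float)
    for (px, py), w in zip(points, weights):
        d2 = (gx - px) ** 2 + (gy - py) ** 2
        density += w * np.exp(-d2 / (2 * sigma ** 2)) / (2 * np.pi * sigma ** 2)
    return density

cell = 1.0                        # cell size in map units (assumed)
xs = np.arange(0.5, 100, cell)    # cell-centre x coordinates
ys = np.arange(0.5, 100, cell)    # cell-centre y coordinates
sigma = 5.0                       # kernel bandwidth (assumed)

# Three points well inside the map, and the same set with one point moved to the edge.
interior = np.array([[40.0, 50.0], [55.0, 45.0], [60.0, 60.0]])
edge = np.array([[0.0, 50.0], [55.0, 45.0], [60.0, 60.0]])
ones = np.ones(3)

# Density times cell area, summed over the grid.
total_interior = gaussian_density(interior, ones, xs, ys, sigma).sum() * cell ** 2
total_edge = gaussian_density(edge, ones, xs, ys, sigma).sum() * cell ** 2

print(total_interior, total_edge)
```

With these made-up inputs the first total should come out very close to 3, while the second is noticeably lower because about half of the edge point's kernel falls outside the map.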

One image I have used in classes asks students to imagine the kernel as a bucket of sand: you upend the bucket at a point and let the sand slump. The slumping barely occurs for small bandwidths but is extensive for large bandwidths (maybe the sand is wetter ;-). Regardless, the same amount of sand is always left, no matter how much slumping occurs. Now go dump one bucket at the location of each point (or, more generally, if there is a positive value x associated with each data point, first put an amount of sand in the bucket proportional to x and then dump it). The sand slumps. It piles up in areas where there are lots of buckets. The density grid gives you the height of the piled sand at the center of each grid cell. Multiplying this height by a cell's area estimates the volume of sand occupying that cell. Summing these cell volumes over any region (such as a Census block) estimates the total volume of sand in that region, which represents the total amount of quantity x you think is in the region.
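Here is a minimal sketch of the sand-bucket picture with weighted points, again in plain NumPy rather than GRASS. The helper `sand_height`, the two points and their x values, the bandwidth, and the square "block" are hypothetical; a real Census block would be an arbitrary polygon rasterized to a mask.

```python
import numpy as np

def sand_height(points, amounts, gx, gy, sigma):
    """Height of the sand at each cell centre: one Gaussian pile per point,
    scaled so that each pile's volume equals the amount in its bucket."""
    height = np.zeros_like(gx, dtype=float)
    for (px, py), amount in zip(points, amounts):
        d2 = (gx - px) ** 2 + (gy - py) ** 2
        height += amount * np.exp(-d2 / (2 * sigma ** 2)) / (2 * np.pi * sigma ** 2)
    return height

cell = 1.0
centres = np.arange(0.5, 100, cell)
gx, gy = np.meshgrid(centres, centres)

points = [(30.0, 30.0), (70.0, 70.0)]   # bucket locations (made up)
amounts = [10.0, 4.0]                   # value x attached to each point (made up)
sigma = 3.0                             # how far the sand slumps (assumed)

height = sand_height(points, amounts, gx, gy, sigma)
volume = height * cell ** 2             # sand volume in each cell

# A square region standing in for a Census block around the first point.
block = (gx >= 20) & (gx < 40) & (gy >= 20) & (gy < 40)

whole_map_total = volume.sum()
block_total = volume[block].sum()
print(whole_map_total, block_total)
```

The whole-map total should be close to 14 (all the sand), and the block total should be a little under 10, since nearly all of the first bucket and almost none of the second ends up inside the block.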


Have you seen the Geospatial Analysis web book? It has a quite detailed section on point density, which covers Gaussian functions. More generally, I find it a very useful resource.


Here's a grossly oversimplified way to think about it:

Imagine a dartboard with several rings radiating out from the center. At each location in the result, a score is computed by putting the dartboard over the location and seeing where the vector points are on the dartboard. From that the score is tallied and the raster is made.

There are many variables to how this is computed:

-- the size of the dartboard (the kernel)

-- the shape of the dartboard (2D isotropic, or 'the same in every direction in x/y', i.e. a flat circle)

-- the way the dartboard assigns points (Gaussian implies a 'normal' distribution, i.e. higher scores as the point gets closer to the center, in a bell curve shape)
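The dartboard idea can be sketched as a scoring function for a single raster location. This is an illustration only, not how v.kernel is implemented: the function `dartboard_score`, the choice of sigma as one third of the radius, and the unnormalized scores are all assumptions.

```python
import math

def dartboard_score(location, points, radius, kernel="gaussian"):
    """Score one raster location from the vector points that land on a
    dartboard of the given radius centred on that location."""
    sigma = radius / 3.0          # assumed link between radius and bell width
    score = 0.0
    for px, py in points:
        d = math.hypot(px - location[0], py - location[1])
        if d > radius:
            continue              # the point misses the dartboard entirely
        if kernel == "uniform":
            score += 1.0          # every hit counts the same
        else:
            # Gaussian scoring: hits nearer the centre count more
            score += math.exp(-d ** 2 / (2 * sigma ** 2))
    return score

points = [(1.0, 0.0), (5.0, 0.0), (20.0, 0.0)]   # made-up vector points
uniform = dartboard_score((0.0, 0.0), points, radius=10.0, kernel="uniform")
gaussian = dartboard_score((0.0, 0.0), points, radius=10.0)
print(uniform, gaussian)
```

Repeating this for every cell centre produces the raster. With the uniform kernel the score is simply a count of points on the board, whereas the Gaussian kernel lets the nearby point dominate.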

The advantage is that the result is much smoother, without large (discontinuous) jumps, because each location takes in information over a wider and more consistent radius. It is also less affected by differences in the size and shape of the areas used.

Think about using nearest neighbors on counties: on the East Coast they are much smaller than in the Midwest, but the number of neighbors is similar and is largely determined by the geometry of the boundaries. Which is more dense? With a kernel radius of 50 miles you'd get a much different answer, one that describes their relative density much more accurately.
