Image segmentation by pixel value

Update: If you want 4-neighborhood, you can use MorphologicalComponents to do most of the work, which is fast and easy to implement (that was my original attempt, see below). But I don't think this can be adapted for 8-neighborhood. For 8-neighborhood, I would implement the standard 2-pass connected component labeling algorithm (this might be what MorphologicalComponents does internally). The idea is:

  • in the first pass, scan pixels from top to bottom
  • if there's a connected neighbor pixel left above, assign that label to the current pixel
  • if there are more than two different labels, store the label pair for later
  • if there's no connected neighbor, assign a new label

So, the pipeline looks something like this:

enter image description here

So the first pass assigns the labels 4 and 5 to the pixels at the right, and stores the information that 4 and 2 should be in the same component, and that 5 and 1 should be in the same component. The second pass then only requires a table lookup for each pixel.

The first pass does most of the work, and returns a "preliminary" label matrix, and a set of label pairs that should go into the same component.

Sadly, there's no efficient compilable set-data structure, so I'm using a global Association variable (connectedIndices) which means a few MainEvaluate calls in the compiled function:

isConnected = With[{t = threshold}, Compile[{x, y}, Abs[x - y] <= t]];

firstPass = With[{connectedFn = isConnected},
  Compile[{{pixels, _Real, 2}},
   Module[{x, y, componentIndex, neighborOffsets, offset, index, 
     newIndex, componentCount = 0, w, h, parent, child, relabel},
    {h, w} = Dimensions[pixels];
    componentIndex = ConstantArray[0, {h, w}];
    neighborOffsets = {{-1, -1}, {-1, 0}, {-1, 1}, {0, -1}};
    (* scan every pixel *)
    connectedIndices = <||>;
    Do[
     (
      index = 0;
      Do[
       (* find connected neighbors above and to the left *)
       If[y + offset[[1]] >= 1 && x + offset[[2]] >= 1 &&
         y + offset[[1]] <= h && x + offset[[2]] <= w &&
         connectedFn[pixels[[y + offset[[1]], x + offset[[2]]]], 
          pixels[[y, x]]],
        newIndex = componentIndex[[y + offset[[1]], x + offset[[2]]]];
        If[index != 0 && index != newIndex,
         (* more than one label found in neighborhood: 
         newIndex and index really are the same component - 
         save information for second pass *)
         AssociateTo[connectedIndices, MinMax[{index, newIndex}] -> 1];
         ];
        index = newIndex;
        ], {offset, neighborOffsets}];
      If[index == 0,
       (* no label found in neighborhood: new component *)
       index = ++componentCount];
      componentIndex[[y, x]] = index;
      ),
     {y, 1, Length[pixels]},
     {x, 1, Length[pixels[[1]]]}
     ];
    Return[componentIndex]
    ], {{relabelConnectedComponents[__], _Integer, 1}}, 
   CompilationOptions -> {"InlineCompiledFunctions" -> True}, 
   CompilationTarget -> "C"]]

The second pass is straightforward: We can use graph functions to get a relabeling-lookup table from the connected label pairs:

Clear[relabelConnectedComponents]
relabelConnectedComponents[connectedIndices_, componentCount_] := 
 Module[{comp, relabel, count},
  comp = SortBy[
    ConnectedComponents[
     Graph[Range[componentCount], 
      UndirectedEdge @@@ connectedIndices]], Min];
  relabel = ConstantArray[0, componentCount];
  Do[
   Do[relabel[[oldIndex]] = index, {oldIndex, comp[[index]]}],
   {index, Length[comp]}];
  relabel]

(note that this graph only contains one node per component not per pixel, so this is much much faster than using ConnectedComponents on the pixels directly)

And apply that lookup table:

Clear[secondPass]
secondPass[componentIndex_] := Module[{relabel},
   (* second pass - relabel connected components *)
   If[Length[Keys@connectedIndices] == 0, componentIndex,
    relabel = 
     relabelConnectedComponents[Keys[connectedIndices], 
      Max[componentIndex]];
    relabel[[#]] & /@ componentIndex
    ]];

Usage: secondPass@firstPass[data]

This seems to be about 5-6 times slower than the built-in MorphologicalComponents, which means it's about 50% slower than the method below, which works on upsample data. But it should work for any neighborhood.


(Original answer, using MorphologicalComponents)

First, two small utility functions: upsample duplicates every value in a list, showGrid displays a matrix of numbers:

upsample = Riffle[#, #] &;    

showGrid[d_, opt___] := 
 Grid[d /. n_?NumberQ :> Item[n, Background -> ColorData[95][n]], opt,
   ItemSize -> {1.5, 2}, 
  Dividers -> {{{{True, False}}, -1 -> True}, {{{True, False}}, -1 -> 
      True}}]

You could upsample the data (duplicate every row and column):

dataUpsampled = upsample /@ upsample[data];
showGrid[dataUpsampled]

enter image description here

Then you take the differences between rows, columns and across corners, and compare the differences to the threshold:

padAndThreshold = 
  PadRight[UnitStep[Abs[#] - threshold - 1], 
    Dimensions[dataUpsampled]] &;
dx = padAndThreshold[Differences /@ dataUpsampled];
dy = padAndThreshold[Differences@dataUpsampled];
dxy = padAndThreshold[
   dataUpsampled[[2 ;;, 2 ;;]] - dataUpsampled[[;; -2, ;; -2]]];
dyx = padAndThreshold[
   dataUpsampled[[2 ;;, ;; -2]] - dataUpsampled[[;; -2, 2 ;;]]];

and combine them:

boundaries = UnitStep[-(dx + dy + dxy + dyx)];
showGrid[boundaries]

enter image description here

now you can use MorphologicalComponents on this array:

showGrid[MorphologicalComponents[boundaries]]

enter image description here

To get the original sized result, simply remove the even rows and columns:

comp = MorphologicalComponents[boundaries][[;; ;; 2, ;; ;; 2]];
showGrid[comp, Dividers -> All]

enter image description here


Almost,for your example three:

threshold = 2;
coor = Catenate[Array[{##} &, Dimensions[data]]];
rule = Dispatch[Thread[coor -> Flatten[data]]];
Rotate[g = EdgeDelete[NearestNeighborGraph[coor,DistanceFunction -> 
     ChessboardDistance], _?(Abs[Subtract @@ (# /. rule)] > threshold &)], -Pi/2]

MatrixForm[SparseArray[Catenate[Tuples /@ MapIndexed[Rule, ConnectedComponents[g]]]]]

$\left( \begin{array}{ccccc} 4 & 1 & 1 & 1 & 3 \\ 2 & 2 & 2 & 1 & 1 \\ 2 & 2 & 1 & 1 & 1 \\ 2 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 \\ \end{array} \right)$


Here is a solution based on yode's code, but it uses depth-first search which runs in linear time with the size of data.

N0 = First@Dimensions[data];
coor = Catenate[
   CoordinateBoundingBoxArray[{{1, 1}, Dimensions[data]}]];
g = NearestNeighborGraph[coor, DistanceFunction -> ChessboardDistance];
vls = VertexList[g];
els = {};

isConnected = (If[
     Abs[data[[Sequence @@ #[[1]]]] - data[[Sequence @@ #[[2]]]]] <= 
      threshold, AppendTo[els, UndirectedEdge[#[[1]], #[[2]]]]]; 
    Pause[0.1]; &);

Dynamic[connect = 
     Graph[vls, els, VertexLabels -> "Name", 
      VertexCoordinates -> (coor /. {i_, j_} -> {j, N0 - i})]]    

DepthFirstScan[g, {"FrontierEdge" -> isConnected, 
  "BackEdge" -> isConnected, "ForwardEdge" -> isConnected, 
  "CrossEdge" -> isConnected}]

This shows the action of the process

enter image description here