C#: Avoid infinite recursion when traversing object graph

If the loops can be generalised (you can have any number of elements making up the loop), you can keep track of objects you've seen already in a HashSet and stop if the object is already in the set when you visit it. Or add a flag to the objects which you set when you visit it (but you then have to go back & unset all the flags when you're done, and the graph can only be traversed by a single thread at a time).

Alternatively, if the loops will only be back to the parent, you can keep a reference to the parent and not loop on properties that refer back to it.

For simplicity, if you know the parent reference will have a certain name, you could just not loop on that property :)


What a coincidence; this is the topic of my blog this coming Monday. See it for more details. Until then, here's some code to give you an idea of how to do this:

static IEnumerable<T> Traversal<T>(
    T item,
    Func<T, IEnumerable<T>> children)
{
    var seen = new HashSet<T>();
    var stack = new Stack<T>();
    seen.Add(item);
    stack.Push(item); 
    yield return item;
    while(stack.Count > 0)
    {
        T current = stack.Pop();
        foreach(T newItem in children(current))
        {
            if (!seen.Contains(newItem))
            {
                seen.Add(newItem);
                stack.Push(newItem);
                yield return newItem;
            }
        }
    } 
}

The method takes two things: an item, and a relation that produces the set of everything that is adjacent to the item. It produces a depth-first traversal of the transitive and reflexive closure of the adjacency relation on the item. Let the number of items in the graph be n, and the maximum depth be 1 <= d <= n, assuming the branching factor is not bounded. This algorithm uses an explicit stack rather than recursion because (1) recursion in this case turns what should be an O(n) algorithm into O(nd), which is then something between O(n) and O(n^2), and (2) excessive recursion can blow the stack if the d is more than a few hundred nodes.

Note that the peak memory usage of this algorithm is of course O(n + d) = O(n).

So, for example:

foreach(Node node in Traversal(myGraph.Root, n => n.Children))
  Console.WriteLine(node.Name);

Make sense?

Tags:

C#

Recursion