Breaking cycles in a digraph with the condition of preserving connectivity for certain nodes

The problem as stated is NP-Hard. Not sure if it is in NP either. In order to verify NP-hardness of the problem, consider graphs such that every blue node has an incoming edge from an orange node. For such graphs, what we need is that the graph after removing edges continues to be strongly connected. We also assume that maximum number of cycles need to be removed.

Now, in order to break as many cycles as possible with a minimum of removed edges, assume that the maximum number of cycles that can be removed for a graph G while continuing to be strongly connected be removable(G) = k. This is a well-defined quantity for any graph G. Thus we need a graph G' that is a subgraph of G with number of cycles being cycles(G)-k. Now maximizing k is equivalent to minimizing the number of cycles that survive in G'. This is what makes the problem hard.

Consider the Hamiltonian Cycle problem that is known to be NP-hard. Assume we have a program breakCycles(G) that computes a graph G' as a subgraph of G with maximum number of cycles removed (with minimal number of edges removed) or cycles(G') = cycles(G) - k. Then, it is straightforward to see that the Hamiltonian cycle problem can also be solved using breakCycles(G) by just providing input graph G to breakCycles to obtain the graph G' and return true iff G' is a simple cycle involving all vertices (of G).


Update : In order to obtain a practical solution, let's look at obtaining a graph with minimal cycles, that is a subgraph of the blue nodes such that removing any edge will result in loss of connectivity for those nodes that have an orange node incident to it.

Solving the above problem is much easier and should work well in practice

def getRemovableEdges(G, edgeLst, initNodes):
    import networkx as nx
    removableEdgeLst = []
    for (u,v) in edgeLst:
        G.remove_edge(u, v)
        f = nx.floyd_warshall(G)
        addEdge = True
        for s in initNodes:
            if 'inf' in list(map(str, f[s].values())):
                G.add_edge(u,v)
                addEdge = False
                break
        if addEdge:
            removableEdgeLst.append((u,v))
    return removableEdgeLst

To try it on the example provided, we need to first initialize the graph

DG = nx.DiGraph()
DG.add_nodes_from(range(1,8))
DG.add_edges_from([(1,2), (2,3), (3,4), (3,5), (4,5), (5,1), (5,4), (5,7), (6,4), (7,6)]);

With our graph initialized above, we execute the function as below...


In [5]: %time eL = getRemovableEdges(DG, list(DG.edges()), [2, 5])                                                                                                                                     
CPU times: user 791 µs, sys: 141 µs, total: 932 µs
Wall time: 936 µs

In [6]: DG.remove_edges_from(eL);                                                                                                                                                                      

In [7]: plt.subplot(121) 
   ...: nx.draw(DG, with_labels=True, font_weight='bold');                                                                                                                                             
   ...: plt.show();                                                                                                                                                                                    

We get the graph as below,

Final Graph after removing cycles


This is not a direct answer to your problem, just some insights I got while thinking about your question.

You are currently looking into your problem in a bottom-up approach, where you start with the original graph and you start removing edges until you find a good solution. The problem like you are solving it has a really high worst case complexity, since you are using combinatorics.

With this approach, you could implement a greedy edge removal solution, where you grab all the edges belonging to simple cycles and remove them until there's no connection between the orange nodes.

You could also try to find the Minimum Spanning Strong Subdigraph (MSSS) - but it's still NP-Hard. Assuming all blue nodes had at least one orange node connected to them, this would be the optimal solution since it reduces the cycles as much as possible. Any edge added to the solution of the resulting graph would actually create a new cycle, since we are talking about strongly connected components. This solution would be an upper bound of your problem in most cases, but if you have a high proportion of blue nodes connected to orange nodes, compared to all blue ones, probably isn't that far off.

However, another way to looking into this problem would be with a top-down approach, where you start with an empty graph, and start adding edges until you have all orange nodes connected with blue ones. This approach would relax your requirement of removing the minimum amount of edges, since implicitly this types of solutions will add the smallest amount of edges as possible.

An approach to this problem with this mindset would be to find the minimum spanning arborescence with multiple root nodes. There isn't a solution for finding an arborescence with multiple root nodes, but again you could implement a greedy approach for this:

  1. Set all edges weights to 1, except the edge between orange and blue nodes which are set to 0
  2. Find the minimum spanning arborescence for a single orange node
  3. Set the weight of all edges belonging to this arborescence to 0
  4. Repeat 2 and 3 for the remaining orange nodes

You can find the minimum spanning arborescence with Edmonds algorithm, but there are better ones out there.

Hope that these notes helped somehow! The problem you are dealing with is a rather difficult one indeed.