DFS(G) input: G = (V,E) in adjacency list representation output: Vertices labelled by connected components connected_component = 0 for all v in V, set visited(v) = False for all v in V if not visited(v) then connected_component++ explore(v) return connected_component_number (defined in explore)
Explore(z) input: vertex z output: side effect, it defines the connected component_number on some vertices and marks them as visited. connected_component_number(z) = connected_component visited(z) = True for all (z, w) in E if not visited(w) Explore(w)
As we run through every vertex once and then every edge twice we have .
First for undirected graphs we just keep track of the previous vertex and find a spanningsub-forest for the graph. We can use this to find paths between vertices by going back to the root vertices of the trees.
DFS(G) input: G = (V,E) in adjacency list representation output: Vertices labelled by connected components connected_component = 0 for all v in V, set visited(v) = False, previous(v) = Null for all v in V if not visited(v) then connected_component++ explore(v) return previous
Explore(z) input: vertex z connected_component_number(z) = connected_component visited(z) = True for all (z, w) in E if not visited(w) previous(w) = z Explore(w)
To do this we are going to use a DFS algorithm like above but we are going to track pre/postorder numbers.
DFS(G) input: G = (V,E) in adjacency list representation output: Vertices labelled by connected components clock = 1 for all v in V, set visited(v) = False for all v in V if not visited(v) then explore(v) return post (defined in Explore)
Explore(z) input: vertex z pre(z) = clock, clock ++ visited(z) = True for all (z, w) in E if not visited(w) Explore(w) post(z) = clock, clock++
Example
Suppose we have the following graph and let be the root node. Suppose we explore edges alphabetically and lets run the algorithm above on it.
As we are using DFS we explore far first and then slowly come back. Which gives us the following DFS tree with the pre/post numbers.
Suppose we have a DAG from the lemma above we can run a DFS algorithm starting at the root of and the post ordering will provide a topological sorting of the vertices of .
Given a linear ordering we know the minimal vertex is a source and the maximal vertex is a sink. This gives us another algorithm to find a topologically sorting:
Find a sink vertex, label it and then delete it.
Repeat (1) until the graph is empty
Strongly connected components
Recall the definitions
Strongly connected (directed graphs)
Strongly connected
Let be a directed graph. We say two vertices are strongly connected if there exists paths to and to .
Let be a directed graph. Suppose has strongly connected components . We can define the strongly connected component graph as a directed graph with vertices and edges
If we run a DFS algorithm starting at using an alphabetical ordering on the vertices then has the lowest post order number but is in the source strongly connected component.
A vertex with the highest post order number lies in a source SCC
The edge set for is
therefore the edge set for is
Which by the definition of reverse could be rephrased as
This is exactly the definition of the edge set for giving the required statement.
Therefore if we can find a source vertex in then we have found a sink vertex in . This is exactly what our algorithm will depend on.
Algorithm for finding the strongly connected components
SCC(G): Input: directed graph G = (V,E) in adjacency list Output: labeling of V by strongly connected component 1. Construct G^R 2. Run DFS on G^R 3. Order V by decreasing post order number. 4. Run directed DFS on G using the vertex ordering from before and label the connected components we reach.
Additional property
In addition to labelling the connected components the order we have labelled them is in reverse topological ordering. This is because we always start at the next sink vertex after we have labelled a component.