-
Notifications
You must be signed in to change notification settings - Fork 114
Chapter 7: Graphs: Only Connect!
The following are the code listings for Chapter 7. [book.py]
- Listing 7-1. A program that builds the graph in Figure 7-2
- Listing 7-2. Linked list implementation of Stack data type
- Listing 7-3. Depth First Search of graph from designated source node, src
- Listing 7-4. Recovering actual path from node_from[]
- Listing 7-5. Breadth First Search of graph from designated source node
- Listing 7-6. A Guided Search using Manhattan distance to control search
- Listing 7-7. Code fragment showing performance based on the number of edges
- Listing 7-8. Recursive implementation of Depth First Search on a directed graph
- Listing 7-9. Detecting cycles in a directed graph using Depth First Search
- Listing 7-10. Topological sort over the directed graph
- Listing 7-11. Structure of an indexed min priority queue
- Listing 7-12. Decreasing priority for a value in IndexedMinPQ
- Listing 7-13. Removing highest-priority value in IndexedMinPQ
- Listing 7-14. Dijkstra’s algorithm to solve single-source shortest path problem
- Listing 7-15. Recovering actual path from edge_to[]
- Listing 7-16. Bellman–Ford algorithm implementation
- Listing 7-17. Floyd–Warshall Algorithm
- Listing 7-18. Code to recover the shortest path as computed by Floyd–Warshall
import networkx as nx
G = nx.Graph() ❶
G.add_node('A2') ❷
G.add_nodes_from(['A3', 'A4', 'A5']) ❸
G.add_edge('A2', 'A3') ❹
G.add_edges_from([('A3', 'A4'), ('A4', 'A5')]) ❺
for i in range(2, 6):
G.add_edge('B{}'.format(i), 'C{}'.format(i)) ❻
if 2 < i < 5:
G.add_edge('B{}'.format(i), 'B{}'.format(i+1))
if i < 5:
G.add_edge('C{}'.format(i), 'C{}'.format(i+1))
>>> print(G.number_of_nodes(), 'nodes.') ❼
>>> print(G.number_of_edges(), 'edges.')
>>> print('adjacent nodes to C3:', list(G['C3'])) ❽
>>> print('edges adjacent to C3:', list(G.edges('C3'))) ❾
12 nodes.
12 edges.
adjacent nodes to C3: ['C2', 'B3', 'C4']
edges adjacent to C3: [('C3', 'C2'), ('C3', 'B3'), ('C3', 'C4')]❶ nx.Graph() constructs a new undirected graph.
❷ A node can be any hashable Python object except None. Strings are a good choice.
❸ Add multiple nodes from a list using add_nodes_from().
❹ Add an edge between two nodes, u and v, with add_edge(u, v).
❺ Add multiple edges from a list using add_edges_from().
❻ If an edge is added to a graph before its nodes are, the corresponding nodes are automatically added to the graph.
❼ A graph can report its number of nodes and edges.
❽ Find the nodes adjacent to v by using the G[v] lookup capability.
❾ Find the edges adjacent to v by using the G.edges(v) function.
class Stack:
def __init__(self):
self.top = None ❶
def is_empty(self):
return self.top is None ❷
def push(self, val):
self.top = Node(val, self.top) ❸
def pop(self):
if self.is_empty(): ❹
raise RuntimeError('Stack is empty')
val = self.top.value ❺
self.top = self.top.next ❻
return val❶ Initially, top is None, reflecting an empty Stack.
❷ A Stack is empty if top is None.
❸ Ensures new Node is the first one in linked list, with existing linked list becoming the rest.
❹ An empty Stack causes a RuntimeError.
❺ Extract the newest value from top of stack to be returned.
❻ Reset Stack so the next Node is now on top (if None, then Stack becomes empty).
def dfs_search(G, src): ❶
marked = {} ❷
node_from = {} ❸
stack = Stack()
marked[src] = True ❹
stack.push(src)
while not stack.is_empty(): ❺
v = stack.pop()
for w in G[v]:
if not w in marked:
node_from[w] = v ❻
marked[w] = True ❼
stack.push(w)
return node_from ❽❶ Conduct a Depth First Search over graph, G, starting from source node, src.
❷ The marked dictionary records nodes that have already been visited.
❸ Record how search got to each node: node_from[w] is the prior node working backward to src.
❹ Mark and place src node into Stack to start the search. The top node in the Stack represents the next node to explore.
❺ If the Depth First Search has not yet completed, v is the next node to explore.
❻ For each unmarked node, w, adjacent to v, remember that to get to w, the search came from v.
❼ Push w onto the top of the stack and mark it so it won't be visited again.
❽ Return the structure of the search that records for each node, v, the prior node from a search initiated at src.
def path_to(node_from, src, target): ❶
if not target in node_from:
raise ValueError('Unreachable') ❼
path = []
v = target ❷
while v != src:
path.append(v) ❸
v = node_from[v] ❹
path.append(src) ❺
path.reverse() ❻
return path❶ node_from structure is needed to recover path from src to any target.
❷ To recover the path, set v to target node.
❸ As long as v is not src, append v to path, a backward list of nodes found on path from src to target.
❹ Continue backward by setting v to the prior node recorded by node_from[v].
❺ Once src is encountered, the while loop terminates, so src must be appended to complete the backward path.
❻ Return the reverse of path to produce the proper ordering from src to target.
❼ If node_from[] doesn't contain target, then it is not reachable from src.
def bfs_search(G, src): ❶
marked = {} ❷
node_from = {} ❸
q = Queue()
marked[src] = True ❹
q.enqueue(src)
while not q.is_empty(): ❺
v = q.dequeue()
for w in G[v]:
if not w in marked:
node_from[w] = v ❻
marked[w] = True ❼
q.enqueue(w)
return node_from ❽❶ Conduct a Breadth First Search over graph, G, starting from source node, src.
❷ The marked dictionary records nodes that have already been visited.
❸ Record how search got to each node: node_from[w] is the prior node working backward to src.
❹ Mark and place src node into Queue to start the search. The first node in the Queue represents the next node to explore.
❺ If the Breadth First Search has not yet completed, v is the next node to explore.
❻ For each unmarked node, w, adjacent to v, remember that to get to w, the search came from v.
❼ Place w as last node to explore at the end of the queue, and mark it so it isn't visited multiple times.
❽ Return the structure of the search that records for each node, v, the prior node from a search initiated at src.
def guided_search(G, src, target): ❶
from ch04.heap import PQ
marked = {} ❷
node_from = {} ❸
pq = PQ(G.number_of_nodes()) ❹
marked[src] = True
pq.enqueue(src, -distance_to(src, target)) ❺
while not pq.is_empty(): ❻
v = pq.dequeue()
for w in G.neighbors(v):
if not w in marked:
node_from[w] = v ❼
marked[w] = True
pq.enqueue(w, -distance_to(w, target)) ❽
return node_from ❾
def distance_to(from_cell, to_cell):
return abs(from_cell[0] - to_cell[0]) + abs(from_cell[1] - to_cell[1])❶ Conduct a Guided Search over graph, G, starting from source node, src, knowing the target node to locate.
❷ The marked dictionary records nodes that have already been visited.
❸ Record how search got to each node: node_from[w] is the prior node working backward to src.
❹ Using the heap-based priority queue, you must pre-allocate sufficient space to include, potentially, all nodes in the graph.
❺ Mark and place src node into max priority queue to start the search using as its priority the negative of its distance to target.
❻ If the Guided Search has not yet completed, the node closest to target is the next node to explore.
❼ For each unmarked node, w, adjacent to v, remember that to get to w, the search came from v.
❽ Place w into its appropriate location in the priority queue, using the negative of the Manhattan distance as priority, and mark it so it isn't visited multiple times.
❾ Return the structure of the search that records for each node, v, the prior node from a search initiated at src.
while not stack.is_empty():
v = stack.pop()
for w in G[v]:
if not w in marked:
marked[w] = True
stack.push(w)
...def dfs_search(G, src): ❶
marked = {} ❷
node_from = {} ❸
def dfs(v): ❹
marked[v] = True ❺
for w in G[v]:
if not w in marked:
node_from[w] = v ❻
dfs(w) ❼
dfs(src) ❽
return node_from ❾❶ Conduct a Depth First Search over graph G starting from source node, src.
❷ The marked dictionary records nodes that have already been visited.
❸ Record how dfs() found each node: node_from[w] is the prior node working backward to src.
❹ Recursive method to continue search from an unmarked node, v.
❺ Be sure to mark that v has been visited.
❻ For each unmarked node, w, adjacent to v, remember that to get to w, the search came from v.
❼ In the recursive case, continue search in direction of unmarked node, w. When recursive call ends, continue with for loop over w.
❽ Invoke the initial recursive call on source node, src.
❾ Return the structure of the search that records for each node, v, the prior node from a search initiated at src.
def has_cycle(DG):
marked = {}
in_stack = {}
def dfs(v): ❶
in_stack[v] = True ❷
marked[v] = True ❸
for w in DG[v]:
if not w in marked:
if dfs(w): ❹
return True
else:
if w in in_stack and in_stack[w]: ❺
return True
in_stack[v] = False ❻
return False
for v in DG.nodes(): ❼
if not v in marked:
if dfs(v): ❽
return True
return False❶ Conduct a Depth First Search over graph, DG, starting from v.
❷ in_stack records nodes that are in the recursive call stack. Mark that v is now part of a recursive call.
❸ The marked dictionary records nodes that have already been visited.
❹ For each unmarked node, w, adjacent to v, initiate recursive dfs() on w, and if True is returned, a cycle is detected, so it returns True as well.
❺ If a node, w, is marked as visited, it could still be in our call stack--if it is, then a cycle has been detected.
❻ Equally important, when the dfs() recursive call ends, set in_stack[v] to False since v is no longer on the call stack.
❼ Investigate each unmarked node in the directed graph.
❽ If invoking dfs(v) on a node, v, detects a cycle, return True immediately.
def topological_sort(DG):
marked = {}
postorder = [] ❶
def dfs(v): ❷
marked[v] = True ❸
for w in DG[v]:
if not w in marked:
dfs(w) ❹
postorder.append(v) ❺
for v in DG.nodes():
if not v in marked: ❻
dfs(v)
return reversed(postorder) ❼❶ Use a list to store (in reverse order) a linear ordering of nodes to be processed.
❷ Conduct a Depth First Search over DG starting from v.
❸ The marked dictionary records nodes that have already been visited.
❹ For each unmarked node, w, adjacent to v, recursively explore dfs(w).
❺ When dfs(v) gets to this key step, it has fully explored all nodes that (recursively) depend on v, so append v to postorder.
❻ Ensure that all unmarked nodes are visited. Note that each time dfs(v) is invoked, a different subset of graph DG is explored.
❼ Because the list holds the linear ordering in reverse order, return its reverse.
class IndexedMinPQ:
def less(self, i, j): ❶
return self.priorities[i] > self.priorities[j]
def swap(self, i, j):
self.values[i],self.values[j] = self.values[j],self.values[i] ❷
self.priorities[i],self.priorities[j] = self.priorities[j],self.priorities[i]
self.location[self.values[i]] = i ❸
self.location[self.values[j]] = j
def __init__(self, size):
self.N = 0
self.size = size
self.values = [None] * (size+1) ❹
self.priorities = [None] * (size+1)
self.location = {} ❺
def __contains__(self, v): ❻
return v in self.location
def enqueue(self, v, p):
self.N += 1
self.values[self.N], self.priorities[self.N] = v, p ❼
self.location[v] = self.N ❽
self.swim(self.N)❶ Because this is a min priority queue, item i is of lower priority than item j if its priority is a larger numeric value.
❷ swap() switches the values and priorities for items i and j.
❸ swap() updates the respective locations for items i and j.
❹ values stores the value of the nth item; priorities stores the priorities of the nth item.
❺ location is a dictionary that returns the index position into values and priorities for each value that is enqueued.
❻ Unlike a traditional priority queue, an indexed min priority queue can inspect location to determine in amortized O(1) time whether a value is being stored in the priority queue.
❼ To enqueue a (v, p) entry, place v in values[N] and p in priorities[N], which is next available bucket.
❽ enqueue() must also associate this new index location with v before invoking swim() to guarantee the heap-ordered property.
def decrease_priority(self, v, lower_priority):
idx = self.location[v] ❶
if lower_priority >= self.priorities[idx]: ❷
raise RuntimeError('...')
self.priorities[idx] = lower_priority ❸
self.swim(idx) ❹❶ Find the location, idx, in the heap where v is found.
❷ If lower_priority is actually not lower than the existing priority in priorities[idx], raise a RuntimeError.
❸ Change the priority for value v to be the lower priority.
❹ Reestablish heap-ordered property if necessary by swimming this value up.
def dequeue(self):
min_value = self.values[1] ❶
self.values[1] = self.values[self.N] ❷
self.priorities[1] = self.priorities[self.N]
self.location[self.values[1]] = 1
self.values[self.N] = self.priorities[self.N] = None ❸
self.location.pop(min_value) ❹
self.N -= 1 ❺
self.sink(1)
return min_value ❻❶ Remember min_value, the value with highest priority.
❷ Move the item at location N to the top-level location 1 and ensure that location records the new index position for this value.
❸ Remove all trace of the former min_value being removed.
❹ Remove min_value entry from location dictionary.
❺ Reduce number of entries before invoking sink(1) to reestablish heap-ordered property.
❻ Return the value associated with entry of highest priority (which is the one with smallest magnitude).
def dijkstra_sp(G, src):
N = G.number_of_nodes()
inf = float('inf') ❶
dist_to = {v:inf for v in G.nodes()}
dist_to[src] = 0
impq = IndexedMinPQ(N) ❷
impq.enqueue(src, dist_to[src])
for v in G.nodes():
if v != src:
impq.enqueue(v, inf)
def relax(e):
n, v, weight = e[0], e[1], e[2][WEIGHT] ❺
if dist_to[n] + weight < dist_to[v]: ❻
dist_to[v] = dist_to[n] + weight ❼
edge_to[v] = e ❽
impq.decrease_priority(v, dist_to[v]) ❾
edge_to = {} ❸
while not impq.is_empty():
n = impq.dequeue() ❹
for e in G.edges(n, data=True):
relax(e)
return (dist_to, edge_to)❶ Initialize dist_to dictionary to infinity for all nodes except src, which is 0.
❷ Enqueue all N nodes into impq to prepare for while loop.
❸ edge_to[v] records the edge terminating at v found during the search.
❹ Find node, n, that has shortest computed path from src. Explore its edges (n, v, weight) to see if a new shortest path to v has been found. networkx requires data = True to retrieve edge weights.
❺ Extract n, v, and weight from the edge (n, v).
❻ If distance to n added to edge weight to v is smaller than best path so far to v, then a shorter path has been found.
❼ Update the shortest known distance to v.
❽ Record the edge (n, v) that brought Dijkstra's algorithm to v along the new shortest path.
❾ Most importantly, reduce priority in impq to new shortest distance so while loop will be able to retrieve node with shortest computed path.
def edges_path_to(edge_to, src, target): ❶
if not target in edge_to:
raise ValueError('Unreachable') ❼
path = []
v = target ❷
while v != src:
path.append(v) ❸
v = edge_to[v][0] ❹
path.append(src) ❺
path.reverse() ❻
return path❶ edge_to[] structure is needed to recover path from src to any target.
❷ To recover the full path, start at target.
❸ As long as v is not src, append v to path, a backward list of nodes found on path from src to target.
❹ Set v to become u, the prior node in the edge (u, v) from edge_to[v].
❺ Once src is encountered, the while loop terminates, so src must be appended to complete the backward path.
❻ Reverse the list so all nodes appear in proper order from src to target.
❼ If target is not in edge_to[], it is not reachable from src.
def bellman_ford(G, src):
inf = float('inf')
dist_to = {v:inf for v in G.nodes()} ❶
dist_to[src] = 0
edge_to = {} ❷
def relax(e):
u, v, weight = e[0], e[1], e[2][WEIGHT]
if dist_to[u] + weight < dist_to[v]: ❺
dist_to[v] = dist_to[u] + weight ❻
edge_to[v] = e ❼
return True ❽
return False
for i in range(G.number_of_nodes()): ❸
for e in G.edges(data=True): ❹
if relax(e):
if i == G.number_of_nodes()–1: ❾
raise RuntimeError('Negative Cycle exists in graph.')
return (dist_to, edge_to)❶ Initialize dist_to dictionary to infinity for all nodes except src, which is 0.
❷ edge_to[v] records the edge terminating at v found during the search.
❸ Make N passes over the graph.
❹ For every edge e = (u,v) in the graph, use the same relax() concept as Dijkstra's algorithm; see if e improves on an existing shortest path from src to v by going through u.
❺ If distance to u added to edge weight to v is smaller than best path so far to v, then a shorter path has been found.
❻ Update the shortest known distance to v.
❼ Record the edge (u, v) that brought algorithm to v along the new shortest path.
❽ If relax() returns True, then a new shortest path was found to v.
❾ Bellman–Ford makes N passes over all E edges. If in the final pass, an edge, e, is found that still reduces the shortest distance from src to some v, there must be a negative cycle in the graph.
def floyd_warshall(G):
inf = float('inf')
dist_to = {} ❶
node_from = {}
for u in G.nodes():
dist_to[u] = {v:inf for v in G.nodes()} ❷
node_from[u] = {v:None for v in G.nodes()} ❸
dist_to[u][u] = 0 ❹
for e in G.edges(u, data=True): ❺
v = e[1]
dist_to[u][v] = e[2][WEIGHT]
node_from[u][v] = u ❻
for k in G.nodes():
for u in G.nodes():
for v in G.nodes():
new_len = dist_to[u][k] + dist_to[k][v] ❼
if new_len < dist_to[u][v]:
dist_to[u][v] = new_len ❽
node_from[u][v] = node_from[k][v]
return (dist_to, node_from) ❾❶ dist_to and node_from will be two-dimensional structures. Each is a dictionary containing sub-dictionaries dist_to[u] and node_from[u].
❷ For each u row, initialize dist_to[u][v] to be infinity to reflect that each node, v, is initially unreachable.
❸ For each u row, initialize node_from[u][v] to be None to reflect that there may not even be a path from u to v.
❹ Make sure to set dist[u][u] = 0 to reflect that the distance from u to itself is 0.
❺ For each edge e = (u,v) from u, set dist_to[u][v] = weight of e to reflect that the shortest distance from u to v is the edge weight for e.
❻ Record that u is the last node on the shortest path from u to v. In fact, it is the only node on the path, which contains just the edge (u, v).
❼ Select three nodes—k, u, and v—and compute new_len, the total path length from u to k added to the path length from k to v.
❽ If new_len is smaller than the computed length of the shortest path from u to v, set dist_to[u][v] to new_len to record the shortest distance, and record that the last node on the shortest path from u to v is the last node on the shortest path from k to v.
❾ Return computed dist_to[][] and node_from[][] structures so actual shortest paths can be computed for any two nodes.
def all_pairs_path_to(node_from, src, target): ❶
if node_from[src][target] is None:
raise ValueError('Unreachable') ❼
path = []
v = target ❷
while v != src:
path.append(v) ❸
v = node_from[src][v] ❹
path.append(src) ❺
path.reverse() ❻
return path❶ node_from[][] structure is needed to recover path from src to any target.
❷ To recover the full path, start at target.
❸ As long as v is not src, append v to path, a backward list of nodes found on path from src to target.
❹ Set v to become the prior node in the search as recorded by node_from[src][v].
❺ Once src is encountered, the while loop terminates, so src must be appended to complete the backward path.
❻ Reverse the list so all nodes appear in proper order from src to target.
❼ If node_from[target] is None, target is not reachable from src.
All content drawn from Learning Algorithms: A Programmer’s Guide to Writing Better Code (c) 2021