Algorithm Design 2
1.3 dfs application
1 | function dfs(u){ |
THM parenthesis
on dfs tree , if \(u\) is one ancestor of \(v\) , then \([v.d,v.f]\subset[u.d,u.f]\)
on dfs tree , otherwise , then \([u.d,u.f]\cap [v.d,v.f]=\varnothing\)
THM white-path
on dfs tree , at the time when \(u\) is discovered ,
\(u\) is one ancestor of \(v\) \(\Leftrightarrow\) \(\exists\) white path from \(u\) to \(v\)
即: \(v\) 是 \(u\) 的后代\(\Leftrightarrow\)在 \(u\) 刚访问到的时候一定存在一条完全没有被访问过的路径 \(u\to v\) .
- Proof : use parenthesis THM
\(\Rightarrow\) trivial
\(\Leftarrow\) proof by contradiction
Consider a white path \(x_1=u,x_2,\cdots,x_m=v\) , and \(x_k\) is the last vertex that is a descendent of \(u\) (including \(u\) itself) .
We need to prove that \(x_{k+1}\) is also a descendent of \(u\) , leading to contradiction .
Therefore , \(u.d<x_k.d<x_k.f<u.f\) ,
Case 1 : \(x_k.d<x_{k+1}.d<x_{k+1}.f<x_k.f\) -> \(x_{k+1}\) is also a descendent of \(u\) .
Case 2 : \(x_k.d<x_k.f<x_{k+1}.d<x_{k+1}.f\) : Impossible .
Strongly Connected Components (SCC)
View : any directed graph can be viewed as a DAG of SCC
Find SCC ? Kosaraju's Algorithm
Algorithm
1
2
3
4dfs1(G);// compute the finishing time u.f
G_=reverse_edge(G);
dfs2(G_); // In main loop , consider vertices in decreasing order of u.f
Claim : Each dfs-tree in dfs2(G_) is a SCCProof
Intuition : find and "delete" sink components in dfs2 , like a-topological order in \(G\_\) ( i.e. topological order in \(G\) )
Lemma : If we start DFS at a node in a sink component , we will visit precisely all vertices in this component .
Trivial .
Lemma (key) : node with largest \(u.f\) belongs to a start component in \(G\) (i.e. a sink component in \(G\_\))
Only need to prove the following lemma .
Lemma ( for proving key Lemma ) : \(C,D\) are two SCC , \(D\) is reachable from \(C\) ,
Then for \(v\in C\) , which is the firstly visited vertex in \(C\) , then \(\forall u\in D,v.f>u.f\)
Proof :
Case 1 : \(v\) is also the firstly visited vertex in \(C\cup D\) , then by white-path THM , all nodes in \(C\cup D\) are descendent of \(v\) , so \(\forall u\in C\cup D\backslash \{v\} , v.f>u.f\) .
Case 2 : \(\exists y\in D\) , \(y\) is the firstly visited vertex in \(C\cup D\) , so \(v\) is not a descendent of \(y\) since \(C\) is not reachable from \(D\) .
Therefore , by parenthesis THM , \(y.d\le u.d<u.f\le y.f<v.d<v.f\) for all \(u\in D\) .
Chapter 2 : Greedy Algorithm
2.1 Interval Scheduling
Description
Input : \(n\) jobs \(s_i,f_i\)
Goal : maximize the #jobs , s.t. at most one job at a time .
Another view
Connect all jobs pairs \([s_i,f_i],[s_j,f_j]\) if \([s_i,f_i]\cap [s_j,f_j]\neq \varnothing\) .
Goal \(\Leftrightarrow\) maximum independent set .
In general graph : NP-hard for general graph .
Algorithm
Repeat : Select the available jobs that finishes first .
Proof of optimality :
Method : [Exchange Argument] : Compare our solution and the optimal solution .
SOL : \(i_1,i_2,\cdots ,i_m\) , OPT : \(j_1,j_2,\cdots,j_k\) .
- Claim : If \(f_{i_1}\le f_{j_1}\) , then \(f_{i_r}\le f_{j_r}\) for all \(r\ge 1\) .
Proof : [Induction]
If the claim is true for \(r-1\) , suppose the claim is not true for \(r\) , i.e. \(f_{i_r}>f_{j_r}\) . Since \(f_{i_{r-1}}\le f_{j_{r-1}}\) , and \(s_{j_r}>f_{j_{r-1}}\) , then \(f_{i_{r-1}}<s_{j_r}\) , so \(j_r\) is also available , but has earlier finish time , contradict .
- If \(m<k\) , then \(f_{i_m}<f_{j_m}\) . We can use \(j_{m+1},\cdots,j_k\) after \(i_m\) .
2.2. Scheduling to minimize lateness
Description
Input : \(n\) jobs , each job \(i\) has ddl \(d_i\) and length \(t_i\) . Def lateness : \(l_i:=\max\{0,f_i-d_i\}\)
Goal : find a schedule of all \(n\) jobs , and minimize the maximal lateness .
Equal formularization
Goal : find a permutation \(\{p_i\}\in S_n\) , then \(f_i=\sum_{j=1}^i t_{p_j}\) , \(l_i:=\max\{0,f_i-d_{p_i}\}\) . Minimize \(\max\{l_i\}\) .
\(\Leftrightarrow\) \(l_i:=f_i-d_{p_i}\) , Minimize \(\max\{0,\max\{l_i\}\}\) .
Algorithm
Schedule the job in increasing order of \(d_i\) .
Proof of optimality
Def (Inversion) : Consider a schedule \(A'\) , \((i,j)\) is an inversion if \(i\) is scheduled before \(j\) but \(d_i>d_j\) .
If OPT$$ SOL , there must be an inversion , then there must be an adjacent inversion . Suppose \((i,i+1)\) is an inversion , then \(d_{p_i}>d_{p_{i+1}}\) .
Let \(f=\sum_{j=1}^{i-1}t_{p_j}\) , so \(f_i=f+t_{p_i}\) , \(f_{i+1}=f+t_{p_i}+t_{p_{i+1}}\) , so \(l_i=f+t_{p_i}-d_{p_i}\) , \(l_{i+1}=f+t_{p_i}+t_{p_{i+1}}-d_{p_{i+1}}\) .
If swap \((i,i+1)\) , then \(f_{i+1}'=f+t_{p_{i+1}}\) , \(f_{i}'=f+t_{p_{i+1}}+t_{p_i}\) , so \(l_{i+1}'=f+t_{p_{i+1}}-d_{p_{i+1}}\) , \(l_i'=f+t_{p_{i+1}}+t_{p_i}-d_{p_i}\) .
Therefore , obviously , \(l_{i+1}'<l_{i+1}\) . Since \(d_{p_i}>d_{p_{i+1}}\) , then \(l_i'<l_i\) . Therefore , swap can lead to better solution , so OPT is not optimal .
2.3. Shortest Path (without \(w<0\))
Description
Input : weighted graph \(G=(V,E)\) , start vertex \(s\) .
Output : \(d(u)\) for all \(u\in V\) , where \(d(u)=\min_{p\text{ is a path }s\to u}\{\sum_{e\in p}l(e)\}\) .
Algorithm [Dijkstra 1959]
1
2
3
4
5
6
7
8
9
10
11// Init
for(u in V){
d[u]=+inf;
}
d[s]=0; S.insert(s);
// Main Algorithm
while(S != V){
v=argmin(d[u]+l(e) |v: v in V\S , e=(u,v) , u in S );
d[v]=d[u]+l(e);
S.insert(v);
}Proof
Prove by induction on \(S\) . \(\forall v\in S , d(v)=\min dist(s,v)\) ,
Suppose we grow \(S\) by adding \(v\) , suppose the proposition does not hold . Then \(d(u)+l_e\) is not the shortest distance to \(v\) . Let \(p\) be the shortest path from \(s\) to \(v\) .
Let \(w\) be the last vertex that is in \(S\) , then by induction , all vertices on \(p\) from \(s\) to \(w\) are all in \(S\) . Let \(p'=path(s,u)+e\) .
\(p': s\to w\to u\to v\) , \(p:s\to w\to x\to v\) . (\(x\notin S\)) . Since \(d(u)+l_e\) is minimal , \(d(u)+l_e\le d(w)+l_{w,x}\) , but \(dist(p)=d(w)+l_{w,x}+dist(x,v)\) , where \(dist(x,v)\ge 0\) , and \(dist(p)<dist(p')=d(u)+l_e\) , so \(d(w)+l_{w,x}\le dist(p)<d(u)+l_e\) , contradict .