Algorithm Design 2

1.3 DFS Applications

function dfs(u){
    ++time;
    discover[u] = time;        // u.d : discovery time
    col[u] = grey;
    for(v in Neighbour(u)){
        if(col[v] == white){   // only white (unvisited) neighbours are recursed into
            dfs(v);
        }
    }
    col[u] = black;
    ++time;
    finish[u] = time;          // u.f : finishing time
}
  1. THM parenthesis

    In the DFS tree : if \(u\) is an ancestor of \(v\) , then \([v.d,v.f]\subset[u.d,u.f]\)

    In the DFS tree : otherwise (neither of \(u,v\) is an ancestor of the other) , \([u.d,u.f]\cap [v.d,v.f]=\varnothing\)
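
    As an illustration , a minimal C++ rendering of the DFS above together with the ancestry test given by the parenthesis THM ; the containers adj , disc , fin , col and the name is_ancestor are choices of this sketch (size them to \(|V|\) and initialize col to WHITE before calling dfs on every still-white vertex) .

    #include <vector>
    using namespace std;

    enum Color { WHITE, GREY, BLACK };

    int timer_ = 0;
    vector<vector<int>> adj;            // adjacency lists (assumed graph representation)
    vector<int> disc, fin, col;         // u.d , u.f , colour of each vertex

    void dfs(int u) {
        disc[u] = ++timer_;             // u.d : discovery time
        col[u] = GREY;
        for (int v : adj[u])
            if (col[v] == WHITE)        // only white neighbours become tree children
                dfs(v);
        col[u] = BLACK;
        fin[u] = ++timer_;              // u.f : finishing time
    }

    // Parenthesis THM as a test : u is an ancestor of v in the DFS forest
    // (counting u as an ancestor of itself) iff [v.d, v.f] is nested inside [u.d, u.f].
    bool is_ancestor(int u, int v) {
        return disc[u] <= disc[v] && fin[v] <= fin[u];
    }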

  2. THM white-path

    In the DFS tree : at the time when \(u\) is discovered ,

    \(u\) is an ancestor of \(v\) \(\Leftrightarrow\) \(\exists\) a white path from \(u\) to \(v\)

    That is : \(v\) is a descendant of \(u\) \(\Leftrightarrow\) at the moment \(u\) is first visited , there exists a path \(u\to v\) none of whose vertices has been visited yet .

    • Proof : use parenthesis THM

    \(\Rightarrow\) trivial

    \(\Leftarrow\) proof by contradiction

    Consider a white path \(x_1=u,x_2,\cdots,x_m=v\) , and let \(x_k\) be the last vertex on this path that is a descendant of \(u\) (counting \(u\) itself as a descendant) .

    Since \(v\) is assumed not to be a descendant of \(u\) , \(k<m\) ; we show that \(x_{k+1}\) is also a descendant of \(u\) , contradicting the choice of \(x_k\) .

    Since \(x_k\) is a descendant of \(u\) , \(u.d\le x_k.d<x_k.f\le u.f\) . Moreover \(x_{k+1}\) is white at time \(u.d\) , so \(u.d<x_{k+1}.d\) ; and the edge \(x_k\to x_{k+1}\) is examined before \(x_k\) finishes (at that moment \(x_{k+1}\) is either already discovered or becomes a child of \(x_k\)) , so \(x_{k+1}.d<x_k.f\le u.f\) .

    Case 1 : \(u.d<x_{k+1}.d<x_{k+1}.f<u.f\) -> \(x_{k+1}\) is also a descendant of \(u\) , as claimed .

    Case 2 : \(u.d<x_{k+1}.d<u.f<x_{k+1}.f\) : impossible by the parenthesis THM , since the two intervals would partially overlap .

  3. Strongly Connected Components (SCC)

    1. View : any directed graph can be viewed as a DAG whose nodes are its SCCs (contract each SCC to a single node)

    2. How to find the SCCs ? Kosaraju's Algorithm

      1. Algorithm

        dfs1(G);                // compute the finishing time u.f of every vertex
        G_ = reverse_edge(G);   // G_ : G with every edge reversed
        dfs2(G_);               // in the main loop , consider vertices in decreasing order of u.f
        Claim : Each dfs-tree of dfs2(G_) is exactly one SCC . (A C++ sketch is given after the proof below.)
      2. Proof

        Intuition : dfs2 finds and "deletes" sink components of \(G\_\) one at a time , i.e. it processes the condensation in reverse topological order of \(G\_\) ( = topological order of \(G\) )

        1. Lemma : If we start DFS at a node in a sink component , we will visit precisely all vertices in this component .

          Trivial .

        2. Lemma (key) : the node with the largest \(u.f\) belongs to a start (source) component of \(G\) (i.e. a sink component of \(G\_\))

          Only need to prove the following lemma .

        3. Lemma ( for proving the key Lemma ) : let \(C,D\) be two distinct SCCs such that \(D\) is reachable from \(C\) .

          Let \(v\in C\) be the first vertex of \(C\) to be visited . Then \(\forall u\in D , v.f>u.f\)

          Proof :

          Case 1 : \(v\) is also the first visited vertex in \(C\cup D\) . Then , by the white-path THM , all nodes of \(C\cup D\) are descendants of \(v\) (every node of \(C\cup D\) is reachable from \(v\) along a still-white path) , so \(\forall u\in (C\cup D)\backslash \{v\} , v.f>u.f\) .

          Case 2 : some \(y\in D\) is the first visited vertex in \(C\cup D\) . Then \(v\) is not a descendant of \(y\) , since \(C\) is not reachable from \(D\) (otherwise \(C\) and \(D\) would form one SCC) .

          Therefore , by the white-path and parenthesis THMs , \(y.d\le u.d<u.f\le y.f<v.d<v.f\) for all \(u\in D\) , so \(v.f>u.f\) .
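
        Putting the two passes together , a minimal C++ sketch of Kosaraju's algorithm for an adjacency-list graph on vertices \(0,\cdots,n-1\) ; the names kosaraju , order , comp are choices of this sketch , not fixed by the notes .

        #include <vector>
        using namespace std;

        // dfs1 on G : push each vertex when it finishes , i.e. in increasing order of u.f .
        void dfs1(int u, const vector<vector<int>>& g, vector<int>& seen, vector<int>& order) {
            seen[u] = 1;
            for (int v : g[u]) if (!seen[v]) dfs1(v, g, seen, order);
            order.push_back(u);
        }

        // dfs2 on G_ : label everything still unlabelled and reachable from u with component id c .
        void dfs2(int u, const vector<vector<int>>& gr, vector<int>& comp, int c) {
            comp[u] = c;
            for (int v : gr[u]) if (comp[v] == -1) dfs2(v, gr, comp, c);
        }

        // Returns comp[u] = id of the SCC containing u ; each dfs2-tree is exactly one SCC (the claim above).
        vector<int> kosaraju(const vector<vector<int>>& g) {
            int n = g.size();
            vector<vector<int>> gr(n);              // G_ = G with every edge reversed
            for (int u = 0; u < n; ++u)
                for (int v : g[u]) gr[v].push_back(u);
            vector<int> seen(n, 0), order;
            for (int u = 0; u < n; ++u) if (!seen[u]) dfs1(u, g, seen, order);
            vector<int> comp(n, -1);
            int c = 0;
            for (int i = n - 1; i >= 0; --i)        // decreasing order of u.f from dfs1
                if (comp[order[i]] == -1) dfs2(order[i], gr, comp, c++);
            return comp;
        }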

Chapter 2 : Greedy Algorithm

2.1 Interval Scheduling

  1. Description

    Input : \(n\) jobs , where job \(i\) has start time \(s_i\) and finish time \(f_i\) .

    Goal : maximize the #jobs selected , s.t. at most one job runs at any time .

  2. Another view

    Build a conflict graph on the jobs : connect jobs \(i\) and \(j\) whenever \([s_i,f_i]\cap [s_j,f_j]\neq \varnothing\) .

    Goal \(\Leftrightarrow\) maximum independent set .

    Maximum independent set is NP-hard on general graphs .

  3. Algorithm

    Repeat : among the jobs still available (compatible with those already selected) , select the one that finishes first . (See the C++ sketch at the end of this subsection.)

  4. Proof of optimality :

    Method : [Exchange Argument] : Compare our solution and the optimal solution .

    SOL : \(i_1,i_2,\cdots ,i_m\) , OPT : \(j_1,j_2,\cdots,j_k\) .

    1. Claim : If \(f_{i_1}\le f_{j_1}\) , then \(f_{i_r}\le f_{j_r}\) for every \(r\ge 1\) for which both \(i_r\) and \(j_r\) exist .

    Proof : [Induction on \(r\)]

    Suppose the claim holds for \(r-1\) but fails for \(r\) , i.e. \(f_{i_r}>f_{j_r}\) . Since \(f_{i_{r-1}}\le f_{j_{r-1}}\) and \(s_{j_r}>f_{j_{r-1}}\) (OPT is feasible) , we get \(f_{i_{r-1}}<s_{j_r}\) , so \(j_r\) was still available when the greedy algorithm chose \(i_r\) , but it has an earlier finish time than \(i_r\) , contradicting the greedy choice .

    2. If \(m<k\) : by the claim \(f_{i_m}\le f_{j_m}<s_{j_{m+1}}\) , so \(j_{m+1},\cdots,j_k\) are still available after \(i_m\) and the greedy algorithm would not have stopped , a contradiction . Hence \(m=k\) and SOL is optimal .
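
    A minimal C++ sketch of the greedy rule above (sort by finishing time , keep every job compatible with the last one kept) ; the struct Job and the name interval_scheduling are illustrative , and the comparison >= below treats a job starting exactly when the previous one finishes as compatible (use > if touching endpoints count as overlapping) .

    #include <vector>
    #include <algorithm>
    #include <climits>
    using namespace std;

    struct Job { int s, f; };               // start time s_i , finish time f_i

    // Returns the indices of a maximum set of pairwise compatible jobs.
    vector<int> interval_scheduling(const vector<Job>& jobs) {
        vector<int> id(jobs.size());
        for (int i = 0; i < (int)id.size(); ++i) id[i] = i;
        // Consider jobs in increasing order of finishing time.
        sort(id.begin(), id.end(), [&](int a, int b) { return jobs[a].f < jobs[b].f; });
        vector<int> chosen;
        int last_finish = INT_MIN;          // finish time of the last selected job
        for (int i : id)
            if (jobs[i].s >= last_finish) { // job i is still available
                chosen.push_back(i);
                last_finish = jobs[i].f;
            }
        return chosen;
    }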

2.2. Scheduling to minimize lateness

  1. Description

    Input : \(n\) jobs ; job \(i\) has deadline \(d_i\) and length \(t_i\) . If job \(i\) finishes at time \(f_i\) , define its lateness \(l_i:=\max\{0,f_i-d_i\}\) .

    Goal : find a schedule of all \(n\) jobs that minimizes the maximum lateness \(\max_i l_i\) .

  2. Equivalent formulation

    Goal : find a permutation \((p_1,\cdots,p_n)\in S_n\) ; then \(f_i=\sum_{j=1}^i t_{p_j}\) and \(l_i:=\max\{0,f_i-d_{p_i}\}\) . Minimize \(\max_i\{l_i\}\) .

    \(\Leftrightarrow\) redefine \(l_i:=f_i-d_{p_i}\) and minimize \(\max\{0,\max_i\{l_i\}\}\) .

  3. Algorithm

    Schedule the jobs in increasing order of \(d_i\) (Earliest Deadline First) . (See the C++ sketch at the end of this subsection.)

  4. Proof of optimality

    1. Def (Inversion) : Consider a schedule \(A'\) , \((i,j)\) is an inversion if \(i\) is scheduled before \(j\) but \(d_i>d_j\) .

    2. If OPT \(\neq\) SOL , there must be an inversion , and then there must be an adjacent inversion . Suppose positions \((i,i+1)\) form an inversion , so \(d_{p_i}>d_{p_{i+1}}\) .

      Let \(f=\sum_{j=1}^{i-1}t_{p_j}\) , so \(f_i=f+t_{p_i}\) , \(f_{i+1}=f+t_{p_i}+t_{p_{i+1}}\) , so \(l_i=f+t_{p_i}-d_{p_i}\) , \(l_{i+1}=f+t_{p_i}+t_{p_{i+1}}-d_{p_{i+1}}\) .

      If swap \((i,i+1)\) , then \(f_{i+1}'=f+t_{p_{i+1}}\) , \(f_{i}'=f+t_{p_{i+1}}+t_{p_i}\) , so \(l_{i+1}'=f+t_{p_{i+1}}-d_{p_{i+1}}\) , \(l_i'=f+t_{p_{i+1}}+t_{p_i}-d_{p_i}\) .

      Therefore \(l_{i+1}'<l_{i+1}\) , and since \(d_{p_i}>d_{p_{i+1}}\) , also \(l_i'<l_{i+1}\) . Hence \(\max\{l_i',l_{i+1}'\}\le l_{i+1}\le \max\{l_i,l_{i+1}\}\) , while all other jobs are unaffected : the swap does not increase the maximum lateness and removes one adjacent inversion . Repeatedly swapping adjacent inversions removes all inversions without increasing the maximum lateness , so the inversion-free schedule SOL is also optimal .
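
    A minimal C++ sketch of the Earliest-Deadline-First rule above , returning the maximum lateness it achieves ; the struct Job and the name min_max_lateness are illustrative .

    #include <vector>
    #include <algorithm>
    using namespace std;

    struct Job { long long t, d; };         // length t_i and deadline d_i

    // Schedule the jobs back to back in increasing order of deadline and return the max lateness.
    long long min_max_lateness(vector<Job> jobs) {
        sort(jobs.begin(), jobs.end(), [](const Job& a, const Job& b) { return a.d < b.d; });
        long long finish = 0, max_late = 0; // current f_i and maximum lateness so far
        for (const Job& j : jobs) {
            finish += j.t;                  // f_i = sum of the lengths scheduled so far
            max_late = max(max_late, finish - j.d);
        }
        return max_late;                    // starting from 0 builds in the max{0, .}
    }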

2.3. Shortest Path (no negative edge weights)

  1. Description

    Input : weighted graph \(G=(V,E)\) , start vertex \(s\) .

    Output : \(d(u)\) for all \(u\in V\) , where \(d(u)=\min_{p\text{ is a path }s\to u}\{\sum_{e\in p}l(e)\}\) .

  2. Algorithm [Dijkstra 1959]

    // Init
    for(u in V){
        d[u] = +inf;
    }
    d[s] = 0; S.insert(s);
    // Main loop : repeatedly add the closest vertex reachable from S in one step
    while(S != V){
        pick u in S , v in V\S , e=(u,v) minimizing d[u]+l(e);
        d[v] = d[u] + l(e);
        S.insert(v);
    }
    // (A C++ sketch using a priority queue is given after the proof below.)
  3. Proof

    Prove by induction on \(|S|\) that the invariant holds : \(\forall v\in S , d(v)=dist(s,v)\) , the length of a shortest \(s\to v\) path .

    Suppose we grow \(S\) by adding \(v\) via the edge \(e=(u,v)\) with \(u\in S\) , and suppose the invariant breaks , i.e. \(d(u)+l_e>dist(s,v)\) . Let \(p\) be a shortest path from \(s\) to \(v\) .

    Let \(w\) be the last vertex on \(p\) that is in \(S\) , and let \(x\notin S\) be the vertex following it on \(p\) (it exists since \(v\notin S\)) . Let \(p'=path(s,u)+e\) be the path the algorithm uses to reach \(v\) .

    \(p': s\to\cdots\to u\to v\) , \(p: s\to\cdots\to w\to x\to\cdots\to v\) (\(x\notin S\)) . Since the algorithm picks the edge minimizing \(d(u)+l_e\) , and \((w,x)\) is also a candidate , \(d(u)+l_e\le d(w)+l_{w,x}\) . On the other hand , \(length(p)\ge d(w)+l_{w,x}+dist(x,v)\ge d(w)+l_{w,x}\) , since by induction the prefix of \(p\) up to \(w\) has length at least \(dist(s,w)=d(w)\) and \(dist(x,v)\ge 0\) (no negative weights) . By assumption \(length(p)=dist(s,v)<d(u)+l_e=length(p')\) , so \(d(w)+l_{w,x}\le length(p)<d(u)+l_e\) , a contradiction .
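
    The pseudocode above grows \(S\) one vertex at a time by scanning all crossing edges ; a minimal C++ sketch using a binary heap with lazy deletion is given below . The adjacency-list type and the name dijkstra are choices of this sketch .

    #include <vector>
    #include <queue>
    #include <limits>
    #include <functional>
    using namespace std;

    using Edge = pair<int, long long>;      // (neighbour v, length l(e))
    const long long INF = numeric_limits<long long>::max();

    // Returns d[u] = dist(s, u) for every u , assuming all edge lengths are >= 0.
    vector<long long> dijkstra(const vector<vector<Edge>>& adj, int s) {
        int n = adj.size();
        vector<long long> d(n, INF);
        d[s] = 0;
        // Min-heap of (tentative distance, vertex); popping a non-stale entry = adding that vertex to S.
        priority_queue<pair<long long, int>, vector<pair<long long, int>>, greater<>> pq;
        pq.push({0, s});
        while (!pq.empty()) {
            auto [du, u] = pq.top(); pq.pop();
            if (du != d[u]) continue;       // stale entry : u was already finalized
            for (auto [v, len] : adj[u])
                if (du + len < d[v]) {      // relax edge (u, v)
                    d[v] = du + len;
                    pq.push({d[v], v});
                }
        }
        return d;
    }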