Algorithm Design 11

发表于 2023-12-29 分类于课程笔记，算法设计阅读次数：本文字数： 1.4k 阅读时长 ≈ 5 分钟

[2023 秋] 算法设计第十一课课堂笔记 / [2023 Fall] Algorithm Design Lecture 11 Notes

Chapter 8 Network Flow

8.3 Application

8.3.7 Project Selection Problem

Description

A set of projects , project \(i\) has profit \(p_i\) ( \(p_i\) can be positive or negative)

A set of procedure constraints : given by DAG , \(e=(i,j)\) means \(i\) can be selected only if \(j\) is selected .

Goal : Select a subset of projects \(A\) , s.t. \(\sum_{i\in A}p_i\) is maximized .
Flow Construction

Min-Cut Model
- \((i,j)\) : give \(\infty\) capacity , then it cannot be a cut . Then it cannot be \(i\) in \(s\) part while \(j\) in \(t\) part .
- Define in \(s\) part : selected , in \(t\) part : not selected .
- If \(p_i\ge 0\) , then have edge \((s,i,p_i)\) ; If \(p_i<0\) , then have edge \((i,t,-p_i)\) .
- Answer is \(\sum_{p_i\ge 0}p_i-\mathtt{MinCut}\)
Formal construction :
- Vertices : \(V'=V\cup \{s,t\}\)
- Edges :
  - \(\forall (i,j)\in E\) , \((i,j,\infty)\)
  - \(\forall i\in V,p_i\ge 0\) , \((s,i,p_i)\)
  - \(\forall i\in V,p_i<0\) , \((i,t,-p_i)\)
- Find a Min-Cut in \(G'\) : \((A,V'-A)\)
  
  The minimum value of cut \((A,V'-A)\) is \[ \begin{aligned} &\quad \sum_{p_i\ge 0,i\notin A}p_i+\sum_{p_i<0,i\in A}(-p_i)\\ &=\sum_{p_i\ge 0}p_i-\left(\sum_{p_i\ge 0,i\in A}p_i+\sum_{p_i<0,i\in A}p_i\right) \end{aligned} \] Which is equivalent to maximize \[ \sum_{p_i\ge 0,i\in A}p_i+\sum_{p_i<0,i\in A}p_i=\sum_{i\in A}p_i \]

8.3.8 Densest Subgraph Problem

Description

Given undirected graph \(G=(V,E)\) . Density of induced subgraph \(G[S]\) , \(S\subseteq V\) is defined as \(\frac{|E\cap G[S]|}{|S|}\)

Goal : find a subgraph of maximum density .
More generalized (but easier problem) : Baseball Elimination

Input :
- A set \(S\) of teams . For each \(x\in S\) , its current score \(w_x\) .
- For \(x,y\in S\) , they will need to play \(g_{x,y}\) more games .
- For each game , winner gains \(1\) point , loser gains \(0\) point . Not consider draw case .
Consider a special team \(z\) , determine theoretically whether \(z\) can win the league ( have the highest score ) (not necessarily unique)

WOLG , we can let \(z\) win all games it needs to play , and the current score of \(z\) becomes \(m\) .

\(z\) cannot win \(\iff\) \(\exists T\subseteq S\) , \(\sum\limits_{x\in T}w_x+\sum\limits_{x,y\in T}g_{x,y}>m|T|\)
Flow Construction

\(\iff\) \(\exists T\subseteq S\) , \(\sum\limits_{x\in T}(w_x-m)+\sum\limits_{x,y\in T}g_{x,y}>0\)

Min-Cut Model

Construction :
- Vertices : \(V'=V\cup E\cup \{s,t\}\)
- Edges :
  - \(\forall (u,v)\in E\) , \((s,(u,v),g_{u,v})\)
  - \(\forall (u,v)\in E\) , \(((u,v),u,\infty)\) , \(((u,v),v,\infty)\)
  - \(\forall v\in V\) , \((v,t,m-w_v)\)
Analysis
1. \(\max\limits_{T\subseteq S}\sum\limits_{x\in T}(w_x-m)+\sum\limits_{x,y\in T}g_{x,y}>0\) \(\iff\) \(\sum_{x,y}g_{x,y}-\mathtt{MinCut}>0\)
  
  Min-Cut in \(G'\) : \((A,V'-A)\) , let \(T=A\cap V\) . The min cut value is : \[ \begin{aligned} &\quad \sum_{x\in T} (m-w_x)+\sum_{x\notin T\lor y\notin T} g_{x,y}\\ &=\sum_{x,y}g_{x,y}-\left(\sum_{x,y\in T}g_{x,y}+\sum_{x\in T}(w_x-m)\right) \end{aligned} \]
2. \(\sum_{x,y}g_{x,y}-\mathtt{MinCut}>0\) \(\iff\) \(z\) cannot win
  
  Consider as MaxFlow : greedily allocate all games , MaxFlow obviously \(\le \sum_{x,y}g_{x,y}\) .
  
  If \(<\) , we cannot allocate all games that satisfies \(f_v+w_v\le m\) , so \(z\) must lose in all possibilities .
  
  If \(=\) , we can find a valid game-result allocation , so \(z\) still have possibility to win .
Find maximum density
- Binary search : OK but slow
- parametric diagram
- Densest Subgraph problem can be solved by parametric flow algorithm in poly-time .

8.4 MaxFlow and LP

LP form of MaxFlow

Goal : maximize \(\sum_{v}f_{s,v}\)

Constraints :
- \(\forall v\in V\backslash\{s,t\}\) , \(\sum_{u}f_{u,v}=\sum_{u}f_{v,u}\)
- \(\forall f_e\in E\) , \(0\le f_e\le c_e\)
Integral Polytope

Claim : If \(c_e\in \mathbb Z_+\) , \(P\) is integral ( vertices of \(P\) are integral vector )

Proof : By FF algorithm

General Result :
- LP constraints : \(Ax\le b\)
- \(A\) is totally unimodular matrix \(\to\) Integral Polytope

Chapter 9 Randomized Algorithm

Complexity

P , RP , ZPP , BPP

Conjecture : P=RP ? (current guess : YES)

9.1 MaxCut Problem

Description

Given undirected \(G=(V,E)\)

Find a cut \((A,V-A)\) , s.t. \(|\{(u,v)\in E:u\in A,v\in V-A\}|\) is maximized
Randomized \(2\)-approximation algorithm

\(\forall v\in V\) , \(\begin{cases}v\in A&w.p.\frac{1}{2}\\ v\notin A&w.p.\frac{1}{2}\end{cases}\)

Analysis : \[ \begin{aligned} &\quad E[cut(A,V-A)]\\ &=\sum_{e\in E} \Pr\{e\in cut(A,V-A)\}\\ &=\sum_{(u,v)\in E} \Pr\{(u\in A,v\notin A)\lor (u\notin A,v\in A)\}\\ &=\sum_{(u,v)\in E} \frac{1}{2}\\ &=\frac{|E|}{2}\ge \frac{OPT}{2} \end{aligned} \]
Similarly , for Max-3-SAT problem , randomly assign each variable

\(E[SOL]\ge \frac{7}{8}OPT\)
Derandomization : Conditional Expectation method

Goal : design a deterministic \(\frac{8}{7}\)-approximation algorithm for Max-3-SAT \[ E_{x_i\in \{0,1\}}[SOL]=\sum_{C}\Pr\{C\text{ is satisfied}\}=\frac{7}{8}|C| \]

\[ E_{x_i\in \{0,1\}}[SOL]=\frac{1}{2}E[SOL|x_1=0]+\frac{1}{2}E[SOL|x_1=1] \]

We can still compute \(E[SOL|x_1=0]\) and \(E[SOL|x_1=1]\) by computing \(\Pr\{C\text{ is satisfied}\}\) for each clause .

Either \(E[SOL|x_1=0]\ge \frac{7}{8}\) or \(E[SOL|x_1=1]\ge \frac{7}{8}\)

9.2 Randomized D&C

\(k\)-th smallest number

Randomized algorithm

int select(S,k){ // find k-th smallest number in S
    a<- randomly choosed element in S ;
    S_l=S.filter(x:x<a);
    S_r=S.filter(x:x>=a);
    if(|S_l|==k-1) return a;
    else{
        if(|S_l|>=k) return select(S_l,k);
        else return select(S_r,k-|S_l|);
    }
}

Analysis

Goal : prove expected time is \(\mathcal O(n)\) .

Intuition :
- If \(\frac{1}{4}|S|\le |S_l|\le \frac{3}{4}|S|\) , do the function . Otherwise , do not divide and re-choose pivot \(a\) .
- This algorithm is obviously slower than our algorithm .
Analysis :
- Define phase \(j\) : \(n(\frac{3}{4})^j\le |S|\le n(\frac{3}{4})^{j+1}\)
  
  We have \(\mathcal O(\log n)\) phases
- Key : In each phase , #iterations to find a valid pivot is expected bounded .
  
  In one iteration : \(\Pr\{\text{succeed in choosing a pivot element}\}\ge \frac{1}{2}\) .
  
  Therefore , \(E[|\text{iterations in this phase}|]\le 2\) .
- \[ \begin{aligned} &\quad E[\text{running time}]\\ &\le \sum_{j=1}^{\mathcal O(\log n)} E[\text{running time of phase }j]\\ &\le \sum_{j=1}^{\mathcal O(\log n)} (\frac{3}{4})^{j+1} n E[\text{iterations in this phase}]\\ &\le \mathcal O(n) \end{aligned} \]
Derandomization

Divide into \(\frac{n}{k}\) groups , find medium for each group , then find medium for medium of each group as pivot .

We can ensure that this pivot is not far from middle rank \(\to\) good separation .

9.3 Hashing

Intuition

Find a "good" function \(h:U\to\{0,1,\cdots,n-1\}\) , where \(|U|> n\) . ( \(|U|\) is unaffordable in time/memory , but \(n\) is acceptable in time/memory)

Use \(h(u)\) to represent \(u\) .

Good performance : Insert : \(\mathcal O(1)\) , Delete : \(\mathcal O(1)\) , Find : \(\mathcal O(1)\) .

Problem : collision : \(u\neq v\in U\) , \(h(u)=h(v)\) .

Collision resolution -
- Chaining : maintain a link-list for each \(i\in \{0,1,\cdots,n-1\}\) .
  - If \(h(v)\) is allocated , insert \(v\) at the end of \(h(v)\)-th link-list .
  - When find/delete , iterate the whole \(h(v)\)-th link-list .
- Open Address method : ( #element stored \(<N\) )
  - \(h:U\times\{0,\cdots,N-1\}\to \{0,1,\cdots,N-1\}\)
  - We probe \(h(x,0),h(x,1),\cdots\) in order , until find an empty cell
  - Linear probing : \(h':U\to \{0,1,\cdots,N-1\}\) , \(h(x,i)=(h'(x)+i)\bmod N\)
  - Drawback : continuous allocation \(\to\) more possible to allocate near
    
    Ideal hashing function : like random
    
    Primary Cluster : Bad in performance
  - Improvement :
    - Quadratic probing \(h(x,i)=(h'(x)+c_1i+c_2i^2)\bmod N\)
    - Double Hashing : \(h(x,i)=(h_1(x)+ih_2(x))\bmod N\)