Algorithm Design 8
Chapter 6 Approximation Algorithms
6.4 Knapsack
Description
\(n\) items, each with value \(v_i\) and weight \(w_i\le W\). Knapsack capacity \(W\).
Find a subset \(I\subseteq [n]\) , s.t. \(\sum_{i\in I}w_i\le W\) , and \(\sum_{i\in I}v_i\) is maximized .
By DP: \(\mathcal O(nW)\) time (pseudo-polynomial, since \(W\) can be exponential in the input size)
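The \(\mathcal O(nW)\) DP can be sketched as follows (a minimal sketch; the one-dimensional `best[w]` table is my own compression of the usual \(OPT(i,w)\) formulation):

```python
def knapsack_dp(values, weights, W):
    """O(nW) DP: best[w] = max value achievable with total weight <= w."""
    best = [0] * (W + 1)
    for v, wt in zip(values, weights):
        # iterate capacities downward so each item is used at most once
        for w in range(W, wt - 1, -1):
            best[w] = max(best[w], best[w - wt] + v)
    return best[W]
```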
Discretization
Let \(b=\frac{\epsilon}{2n}\cdot \max_{i\in [n]}v_i\) and round each value up: \(\bar v_i=\lceil\frac{v_i}{b}\rceil b\). The scaled values \(\hat v_i:=\lceil \frac{v_i}{b}\rceil=\bar v_i/b\) are integers in \([0,\frac{2n}{\epsilon}]\).
New DP method :
\(OPT(i,V)\) : the smallest total weight achievable using the first \(i\) items so that the total scaled value \(\sum \hat v_j\ge V\).
Originally: \(OPT(i,w)=\max v\); now: \(OPT(i,V)=\min w\) (the roles of value and weight are swapped)
\[ OPT(i,V)=\min\begin{cases}OPT(i-1,V)\\OPT(i-1,V-\hat v_i)+w_i\end{cases} \]
Time complexity: the value coordinate ranges up to \(V_{\max}=n\cdot \frac{2n}{\epsilon}\), so the table has \(\mathcal O(n\cdot n^2/\epsilon)\) entries and the total time is \(\mathcal O(n^3/\epsilon)\).
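The rounded DP might look like this in code (a sketch under my own naming; the auxiliary `true_v` array is my addition, tracking the unrounded value of the chosen set so the returned objective is the actual \(\sum_{i\in S}v_i\)):

```python
import math

def knapsack_fptas(values, weights, W, eps):
    """(1 - eps)-approximation via value rounding: DP over total rounded
    value V, where min_w[V] = smallest weight reaching rounded value V."""
    n = len(values)
    b = eps * max(values) / (2 * n)            # rounding granularity
    vhat = [math.ceil(v / b) for v in values]  # scaled values, each <= 2n/eps
    Vmax = sum(vhat)
    INF = float("inf")
    min_w = [0] + [INF] * Vmax    # min weight for each rounded value
    true_v = [0] * (Vmax + 1)     # true value of the set achieving it
    for v, vh, wt in zip(values, vhat, weights):
        for V in range(Vmax, vh - 1, -1):      # downward: 0/1 choice per item
            if min_w[V - vh] + wt < min_w[V]:
                min_w[V] = min_w[V - vh] + wt
                true_v[V] = true_v[V - vh] + v
    best_V = max(V for V in range(Vmax + 1) if min_w[V] <= W)
    return true_v[best_V]
```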
Approximation analysis
Goal : let \(S^*\) be the optimal solution and \(S\) our solution; then \(\sum_{i\in S}v_i\ge (1-\epsilon)\sum_{i\in S^*} v_i\)
Proof : \[ \begin{aligned} \sum_{i\in S^*}v_i &\le \sum_{i\in S^*}\bar v_i &&(\bar v_i\ge v_i)\\ &\le \sum_{i\in S}\bar v_i &&(S\text{ maximizes the rounded value})\\ &\le \sum_{i\in S}(v_i+b) &&(\bar v_i\le v_i+b)\\ &\le nb+\sum_{i\in S}v_i &&(|S|\le n)\\ &= \frac{\epsilon}{2}\max_i v_i +\sum_{i\in S}v_i &&(b=\tfrac{\epsilon}{2n}\max_i v_i)\\ &\le \frac{\epsilon}{2}\sum_{i\in S^*}v_i+\sum_{i\in S}v_i &&(w_i\le W\text{, so the single most valuable item is feasible}) \end{aligned} \] Therefore , \(\sum_{i\in S}v_i\ge (1-\epsilon/2) \sum_{i\in S^*}v_i\)
We reach a \((1-\epsilon)\)-approximation (in fact \((1-\epsilon/2)\))
FPTAS : running time \(\mathcal O(\mathrm{poly}(n,1/\epsilon))\).
Note: here we over-estimate the value of each item and find a feasible solution (\(\sum w_i\le W\)). Because we do not over-estimate by much, our solution is not much worse than the optimal one.
Conversely, we could under-estimate the weight of each item and find a solution at least as valuable (\(\sum v_i\ge V\)). Because we do not under-estimate by much, our solution does not exceed the weight restriction by much (\(\sum w_i\le (1+\epsilon)W\)).
6.5 Set Cover
Description
Universe \(U\) , \(|U|=n\) . \(\mathcal F=\{S_1,\cdots,S_m\}\) , \(S_i\subseteq U\) , each with weight \(w_i\ge 0\) .
Find a set cover \(I\) , s.t. \(\bigcup_{i\in I}S_i=U\) , and minimize \(\sum_{i\in I}w_i\) .
Greedy for SC
Greedily pick the most "cost-effective" set, i.e. the one with the best ratio like \(|S_i|/w_i\)
Algorithm :
Repeat until \(U\) is covered: choose the subset \(S_i\) such that \(w_i/|\{\text{elements newly covered by }S_i\}|\) is smallest
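A minimal sketch of this greedy (names and the set-based representation are mine; the instance is assumed coverable):

```python
def greedy_set_cover(universe, sets, weights):
    """Repeatedly pick the set minimizing weight / #newly-covered elements."""
    uncovered = set(universe)
    chosen, cost = [], 0
    while uncovered:
        # best ratio among sets covering at least one remaining element
        j = min(
            (i for i, S in enumerate(sets) if uncovered & S),
            key=lambda i: weights[i] / len(uncovered & sets[i]),
        )
        chosen.append(j)
        cost += weights[j]
        uncovered -= sets[j]
    return chosen, cost
```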
Approximation analysis
Goal : the greedy achieves \(\alpha=H_n=\sum_{i=1}^n\frac1i\le \ln n+1\)
Proof :
Suppose our solution picks \(S_{i_1},S_{i_2},\cdots,S_{i_k}\). Let \(R_j:=U\backslash \bigcup_{l\le j}S_{i_l}\) and \(n_j=|R_j|\) (so \(n_0=n\) and \(n_k=0\)).
Let \(T_j:=R_{j-1}\cap S_{i_j}\) and \(t_j=|T_j|\). (\(T_j\) is the set of elements newly covered in the \(j\)-th iteration)
Key inequality : \(w_{i_j}/t_j\le OPT/n_{j-1}\) (*)
Therefore \(w_{i_j}\le OPT\cdot t_j/n_{j-1}\). Since \(t_j=n_{j-1}-n_j\), we have \[ \begin{aligned} SOL&= \sum_{j=1}^{k}w_{i_j}\\ &\le OPT\sum_{j=1}^k (t_j/n_{j-1})\\ &= OPT \sum_{j=1}^k \frac{n_{j-1}-n_j}{n_{j-1}}\\ &= OPT\sum_{j=1}^k \sum_{i=n_j+1}^{n_{j-1}}\frac{1}{n_{j-1}}\\ &\le OPT \sum_{j=1}^k \sum_{i=n_j+1}^{n_{j-1}} \frac{1}{i}\\ &=OPT\sum_{i=1}^n \frac{1}{i}=OPT\cdot H_n \end{aligned} \] (the last equality holds because the ranges \((n_j,n_{j-1}]\) partition \([1,n]\), as \(n_0=n\) and \(n_k=0\)). Proof of (*): in round \(j\), the sets of the optimal solution not yet picked, say \(OPT\backslash\{i_1,\cdots,i_{j-1}\}=\{o_1,\cdots,o_l\}\), still cover all of \(R_{j-1}\), so \(\sum_o |S_o\cap R_{j-1}|\ge n_{j-1}\) while \(\sum_o w_o\le OPT\). By averaging, some \(o\) has \(w_o/|S_o\cap R_{j-1}|\le OPT/n_{j-1}\); since the greedy picks the smallest such ratio, \(w_{i_j}/t_j\le w_o/|S_o\cap R_{j-1}|\le OPT/n_{j-1}\).
Approximating Set Cover within \((1-\delta)\ln n\) is NP-hard (for any constant \(\delta>0\))
Proof idea : PCP Theorem ( probabilistically checkable proofs )
\(NP=PCP(\mathcal O(\log n),\mathcal O(1))\)
PCP verifier : uses \(\mathcal O(\log n)\) random bits and reads \(\mathcal O(1)\) bits of the proof, then decides whether to accept
Used for hardness of approximation : the PCP theorem makes it easy to create a gap instance distinguishing objective value \(h\) from \(\alpha h\)
Extension : Pricing Model ( see KT's Book )
For a partition-style problem: partition the instance into parts, each with a "center". Consider every point as dominated by some center \(c\), where each center has a dominating set \(D_c\).
Then amortize the cost of part \(c\) over all points in \(D_c\), and use this amortized cost as the greedy criterion: in each iteration, consider only the remaining points and choose the part with the smallest amortized cost. One can then usually derive an inequality like (*), leading to an \(\mathcal O(\log n)\) approximation.
6.6 Linear Programming & Approximation
Description
\(n\) variables, \(m\) linear constraints \(\sum_{j=1}^n a_{i,j}x_j\ge b_i\). Maximize \(\sum_{j=1}^n c_j x_j\).
Variants:
- \(\sum_{j=1}^n a_{i,j}x_j\le b_i\) \(\to\) \(\sum_{j=1}^n (-a_{i,j})x_j \ge -b_i\)
- \(\sum_{j=1}^n a_{i,j}x_j=b_i\) \(\to\) \(\sum_{j=1}^n a_{i,j}x_j\ge b_i\) & \(\sum_{j=1}^n a_{i,j}x_j\le b_i\)
- Minimize \(\sum_{j=1}^n c_jx_j\) \(\to\) Maximize \(\sum_{j=1}^n (-c_j)x_j\)
Geometry view
One constraint \(\to\) a half-space bounded by an \((n-1)\)-dim hyperplane
Finally: the feasible region is a polyhedron / polytope (bounded polyhedron)
To optimize a linear objective, it suffices to check the vertices of the polytope.
The facets of an \(n\)-dim polytope are \((n-1)\)-dim polytopes; two adjacent \(k\)-dim faces meet in a \((k-1)\)-dim face.
Each vertex: the unique solution of the linear system formed by \(n\) tight constraints
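For intuition, here is a brute-force illustration of "check the vertices" in two dimensions (a toy sketch of mine, not how real solvers work): every candidate vertex is the intersection of two tight constraints, solved by Cramer's rule, and we keep the feasible one with the best objective.

```python
from itertools import combinations

def lp_by_vertices(A, b, c):
    """Maximize c.x over {x : A x >= b} in 2D by enumerating vertices."""
    best = None
    for (a1, b1), (a2, b2) in combinations(zip(A, b), 2):
        det = a1[0] * a2[1] - a1[1] * a2[0]
        if abs(det) < 1e-12:
            continue                      # parallel constraints: no vertex
        # Cramer's rule for the tight system a1.x = b1, a2.x = b2
        x = (b1 * a2[1] - a1[1] * b2) / det
        y = (a1[0] * b2 - b1 * a2[0]) / det
        # keep the vertex only if it satisfies ALL constraints
        if all(ai[0] * x + ai[1] * y >= bi - 1e-9 for ai, bi in zip(A, b)):
            val = c[0] * x + c[1] * y
            if best is None or val > best:
                best = val
    return best
```

For example, maximizing \(x+y\) over the box \(0\le x\le 2\), \(0\le y\le 3\) (written as four \(\ge\) constraints) returns the corner value \(5\).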
theoretical results
[Dantzig] simplex algorithm
idea : start at a vertex, then repeatedly move along an edge that increases the objective
worst case : exponential time (may visit exponentially many vertices)
OPEN : a pivot rule (which edge to choose) that makes simplex run in poly-time
Current results : a poly-length path between vertices exists ; sub-exponential-time pivot rules are known
interior point method : poly-time in theory, practical
ellipsoid method : poly-time in theory, not practical
Vertex Cover \(\to\) integer linear programming
integer linear programming
\(x_v=\begin{cases}0&v\notin VC\\1&v\in VC\end{cases}\)
constraints : \(\forall e=(u,v)\in E\) , \(x_u+x_v\ge 1\)
integer constraint : \(x_v\in \{0,1\}\)
goal : minimize \(\sum_{v\in V} w_vx_v\)
Linear Programming relaxation : relax ILP to LP
replace \(x_v\in \{0,1\}\) by \(0\le x_v\le 1\)
feasible region of LP contains feasible region of ILP
Obs: \(OPT(LP)\le OPT(ILP)\)
Approximation Algorithm
Step 1 : solve \(LP\) , denote the fractional solution by \(\vec x\)
Step 2 : rounding : \(\bar x_v=\begin{cases}1&x_v\ge 0.5\\0&\text{otherwise}\end{cases}\)
Proof 1 : \(\bar x_v\) is a valid solution for ILP
Since \(x_u+x_v\ge 1\) , at least one of \(x_u\) and \(x_v\) is \(\ge 0.5\) , then at least one of them is rounded to \(1\) .
Proof 2 : \(SOL\le 2\cdot OPT\) \[ \begin{aligned} SOL&=\sum_{v}w_v\bar x_v\\ &\le \sum_{v}w_v 2x_v\\ &=2\cdot OPT(LP)\\ &\le 2\cdot OPT(ILP) \end{aligned} \]
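The two proofs can be checked on a toy instance; the sketch below hardcodes the fractional solution instead of calling an LP solver. I use the triangle graph, a standard example where the LP relaxation optimum sets every \(x_v=1/2\) (LP value \(1.5\) vs. ILP value \(2\)).

```python
def round_vc(frac_x, edges):
    """Deterministic rounding: take v iff x_v >= 0.5.
    Valid because each edge has x_u + x_v >= 1, so one endpoint rounds up."""
    cover = {v for v, x in frac_x.items() if x >= 0.5}
    assert all(u in cover or v in cover for u, v in edges)
    return cover

# Triangle graph with the (assumed) LP-optimal fractional solution.
frac = {"a": 0.5, "b": 0.5, "c": 0.5}
edges = [("a", "b"), ("b", "c"), ("a", "c")]
cover = round_vc(frac, edges)
```

Here the rounded cover takes all three vertices, cost \(3\le 2\cdot OPT(LP)=3\), matching the factor-2 bound.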
Assuming the Unique Games Conjecture (UGC), it is hard to improve to \(\alpha=2-\epsilon\) for any constant \(\epsilon>0\).
Set Cover \(\to\) integer linear programming
integer linear programming
constraints : \(\forall e\in U\) , \(\sum_{i:e\in S_i} x_i\ge 1\)
integer constraint : \(x_i\in \{0,1\}\)
goal : minimize \(\sum_{i=1}^m w_ix_i\)
How to round ?
randomized rounding scheme : independently take \(\bar x_i=\begin{cases}1&\text{w.p. } x_i\\0&\text{w.p. }1-x_i\end{cases}\)
\(E[SOL]=\sum_{i}w_i\Pr\{\text{we choose }S_i\}=\sum_{i}w_ix_i=OPT(LP)\le OPT\)
feasible probability
\(\Pr\{e\text{ is not covered}\}=\prod_{i:e\in S_i}(1-x_i)\le \exp\left(-\sum_{i:e\in S_i}x_i\right)\le \frac{1}{e}\) (using \(1-x\le e^{-x}\) and the LP constraint \(\sum_{i:e\in S_i}x_i\ge 1\))
We repeat the rounding \(T=2\ln n\) times and take the union of all rounds.
\(\Pr\{e\text{ is not covered in all }T\text{ rounds}\}\le \frac{1}{e^T}=\frac{1}{n^2}\) .
\(\Pr\{\exists e:e\text{ is not covered in all }T\text{ rounds}\}\le \frac{1}{n}\) .
feasible probability : \(1-\frac{1}{n}\)
approximation : \(\alpha=2\ln n\) in expectation (expected cost \(\le T\cdot OPT(LP)\) by linearity over the \(T\) rounds)
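The repeated rounding can be sketched as follows (a sketch with a hardcoded fractional solution; in practice `frac_x` would come from an LP solver):

```python
import math
import random

def randomized_round_sc(frac_x, sets, universe, seed=0):
    """Include S_i w.p. x_i; repeat T = ceil(2 ln n) rounds and union them,
    so each element stays uncovered w.p. at most e^{-T} <= 1/n^2."""
    rng = random.Random(seed)
    n = len(universe)
    T = max(1, math.ceil(2 * math.log(n)))
    chosen = set()
    for _ in range(T):
        for i, x in enumerate(frac_x):
            if rng.random() < x:
                chosen.add(i)
    covered = set().union(*(sets[i] for i in chosen)) if chosen else set()
    return chosen, covered == set(universe)
```

With high probability the union is a feasible cover; if not, one can simply repeat the whole procedure.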