Algorithm Design 8
Chapter 6 Approximation Algorithms
6.4 Knapsack
Description
\(n\) items, each with value \(v_i\) and weight \(w_i\le W\). Knapsack capacity \(W\).
Find a subset \(I\subseteq [n]\) , s.t. \(\sum_{i\in I}w_i\le W\) , and \(\sum_{i\in I}v_i\) is maximized .
By DP: \(\mathcal O(nW)\) time (pseudo-polynomial, since \(W\) can be exponential in the input size)
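The \(\mathcal O(nW)\) DP can be sketched as follows (a minimal sketch; the one-dimensional `best[w]` table is my own compression of the usual \(OPT(i,w)\) formulation):

```python
def knapsack_dp(values, weights, W):
    """O(nW) DP: best[w] = max value achievable with total weight <= w."""
    best = [0] * (W + 1)
    for v, wt in zip(values, weights):
        # iterate capacities downward so each item is used at most once
        for w in range(W, wt - 1, -1):
            best[w] = max(best[w], best[w - wt] + v)
    return best[W]
```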
Discretization
Let \(b=\frac{\epsilon}{2n}\cdot \max_{i\in [n]}v_i\) and round each value up: \(\bar v_i=\lceil\frac{v_i}{b}\rceil b\). The scaled values \(\hat v_i:=\lceil \frac{v_i}{b}\rceil=\bar v_i/b\) are integers in \([0,\frac{2n}{\epsilon}]\).
New DP method :
\(OPT(i,V)\) : the smallest total weight achievable using the first \(i\) items so that the total scaled value \(\sum \hat v_j\ge V\).
Originally: \(OPT(i,w)=\max v\); now: \(OPT(i,V)=\min w\) (the roles of value and weight are swapped)
\[ OPT(i,V)=\min\begin{cases}OPT(i-1,V)\\OPT(i-1,V-\hat v_i)+w_i\end{cases} \]
Time complexity: the value coordinate ranges up to \(V_{\max}=n\cdot \frac{2n}{\epsilon}\), so the table has \(\mathcal O(n\cdot n^2/\epsilon)\) entries and the total time is \(\mathcal O(n^3/\epsilon)\).
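The rounded DP might look like this in code (a sketch under my own naming; the auxiliary `true_v` array is my addition, tracking the unrounded value of the chosen set so the returned objective is the actual \(\sum_{i\in S}v_i\)):

```python
import math

def knapsack_fptas(values, weights, W, eps):
    """(1 - eps)-approximation via value rounding: DP over total rounded
    value V, where min_w[V] = smallest weight reaching rounded value V."""
    n = len(values)
    b = eps * max(values) / (2 * n)            # rounding granularity
    vhat = [math.ceil(v / b) for v in values]  # scaled values, each <= 2n/eps
    Vmax = sum(vhat)
    INF = float("inf")
    min_w = [0] + [INF] * Vmax    # min weight for each rounded value
    true_v = [0] * (Vmax + 1)     # true value of the set achieving it
    for v, vh, wt in zip(values, vhat, weights):
        for V in range(Vmax, vh - 1, -1):      # downward: 0/1 choice per item
            if min_w[V - vh] + wt < min_w[V]:
                min_w[V] = min_w[V - vh] + wt
                true_v[V] = true_v[V - vh] + v
    best_V = max(V for V in range(Vmax + 1) if min_w[V] <= W)
    return true_v[best_V]
```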
Approximation analysis
Goal : let \(S^*\) be the optimal solution and \(S\) our solution; then \(\sum_{i\in S}v_i\ge (1-\epsilon)\sum_{i\in S^*} v_i\)
Proof : \[ \begin{aligned} \sum_{i\in S^*}v_i &\le \sum_{i\in S^*}\bar v_i &&(\bar v_i\ge v_i)\\ &\le \sum_{i\in S}\bar v_i &&(S\text{ maximizes the rounded value})\\ &\le \sum_{i\in S}(v_i+b) &&(\bar v_i\le v_i+b)\\ &\le nb+\sum_{i\in S}v_i &&(|S|\le n)\\ &= \frac{\epsilon}{2}\max_i v_i +\sum_{i\in S}v_i &&(b=\tfrac{\epsilon}{2n}\max_i v_i)\\ &\le \frac{\epsilon}{2}\sum_{i\in S^*}v_i+\sum_{i\in S}v_i &&(w_i\le W\text{, so the single most valuable item is feasible}) \end{aligned} \] Therefore , \(\sum_{i\in S}v_i\ge (1-\epsilon/2) \sum_{i\in S^*}v_i\)
We reach a \((1-\epsilon)\)-approximation (in fact \((1-\epsilon/2)\))
FPTAS : running time \(\mathcal O(\mathrm{poly}(n,1/\epsilon))\).
Note: here we over-estimate the value of each item and find a feasible solution (\(\sum w_i\le W\)). Because we do not over-estimate by much, our solution is not much worse than the optimal one.
Conversely, we could under-estimate the weight of each item and find a solution at least as valuable (\(\sum v_i\ge V\)). Because we do not under-estimate by much, our solution does not exceed the weight restriction by much (\(\sum w_i\le (1+\epsilon)W\)).
6.5 Set Cover
Description
Universe \(U\) , \(|U|=n\) . \(\mathcal F=\{S_1,\cdots,S_m\}\) , \(S_i\subseteq U\) , each with weight \(w_i\ge 0\) .
Find a set cover \(I\) , s.t. \(\bigcup_{i\in I}S_i=U\) , and minimize \(\sum_{i\in I}w_i\) .
Greedy for SC
Greedily pick the most "cost-effective" set, i.e. the one with the best ratio like \(|S_i|/w_i\)
Algorithm :
Repeat until \(U\) is covered: choose the subset \(S_i\) such that \(w_i/|\{\text{elements newly covered by }S_i\}|\) is smallest
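A minimal sketch of this greedy (names and the set-based representation are mine; the instance is assumed coverable):

```python
def greedy_set_cover(universe, sets, weights):
    """Repeatedly pick the set minimizing weight / #newly-covered elements."""
    uncovered = set(universe)
    chosen, cost = [], 0
    while uncovered:
        # best ratio among sets covering at least one remaining element
        j = min(
            (i for i, S in enumerate(sets) if uncovered & S),
            key=lambda i: weights[i] / len(uncovered & sets[i]),
        )
        chosen.append(j)
        cost += weights[j]
        uncovered -= sets[j]
    return chosen, cost
```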
Approximation analysis
Goal : the greedy achieves \(\alpha=H_n=\sum_{i=1}^n\frac1i\le \ln n+1\)
Proof :
Suppose our solution picks \(S_{i_1},S_{i_2},\cdots,S_{i_k}\). Let \(R_j:=U\backslash \bigcup_{l\le j}S_{i_l}\) and \(n_j=|R_j|\) (so \(n_0=n\) and \(n_k=0\)).
Let \(T_j:=R_{j-1}\cap S_{i_j}\) and \(t_j=|T_j|\). (\(T_j\) is the set of elements newly covered in the \(j\)-th iteration)
Key inequality : \(w_{i_j}/t_j\le OPT/n_{j-1}\) (*)
Therefore \(w_{i_j}\le OPT\cdot t_j/n_{j-1}\). Since \(t_j=n_{j-1}-n_j\), we have \[ \begin{aligned} SOL&= \sum_{j=1}^{k}w_{i_j}\\ &\le OPT\sum_{j=1}^k (t_j/n_{j-1})\\ &= OPT \sum_{j=1}^k \frac{n_{j-1}-n_j}{n_{j-1}}\\ &= OPT\sum_{j=1}^k \sum_{i=n_j+1}^{n_{j-1}}\frac{1}{n_{j-1}}\\ &\le OPT \sum_{j=1}^k \sum_{i=n_j+1}^{n_{j-1}} \frac{1}{i}\\ &=OPT\sum_{i=1}^n \frac{1}{i}=OPT\cdot H_n \end{aligned} \] (the last equality holds because the ranges \((n_j,n_{j-1}]\) partition \([1,n]\), as \(n_0=n\) and \(n_k=0\)). Proof of (*): in round \(j\), the sets of the optimal solution not yet picked, say \(OPT\backslash\{i_1,\cdots,i_{j-1}\}=\{o_1,\cdots,o_l\}\), still cover all of \(R_{j-1}\), so \(\sum_o |S_o\cap R_{j-1}|\ge n_{j-1}\) while \(\sum_o w_o\le OPT\). By averaging, some \(o\) has \(w_o/|S_o\cap R_{j-1}|\le OPT/n_{j-1}\); since the greedy picks the smallest such ratio, \(w_{i_j}/t_j\le w_o/|S_o\cap R_{j-1}|\le OPT/n_{j-1}\).
Approximating Set Cover within \((1-\delta)\ln n\) is NP-hard (for any constant \(\delta>0\))
Proof idea : PCP Theorem ( probabilistically checkable proofs )
\(NP=PCP(\mathcal O(\log n),\mathcal O(1))\)
PCP verifier : uses \(\mathcal O(\log n)\) random bits and reads \(\mathcal O(1)\) bits of the proof, then decides whether to accept
Used for hardness of approximation : the PCP theorem makes it easy to create a gap instance distinguishing objective value \(h\) from \(\alpha h\)
Extension : Pricing Model ( see KT's Book )
For a partition-style problem: partition the instance into parts, each with a "center". Consider every point as dominated by some center \(c\), where each center has a dominating set \(D_c\).
Then amortize the cost of part \(c\) over all points in \(D_c\), and use this amortized cost as the greedy criterion: in each iteration, consider only the remaining points and choose the part with the smallest amortized cost. One can then usually derive an inequality like (*), leading to an \(\mathcal O(\log n)\) approximation.
6.6 Linear Programming & Approximation
Description
\(n\) variables, \(m\) linear constraints \(\sum_{j=1}^n a_{i,j}x_j\ge b_i\). Maximize \(\sum_{j=1}^n c_j x_j\).
Variants:
- \(\sum_{j=1}^n a_{i,j}x_j\le b_i\) \(\to\) \(\sum_{j=1}^n (-a_{i,j})x_j \ge -b_i\)
- \(\sum_{j=1}^n a_{i,j}x_j=b_i\) \(\to\) \(\sum_{j=1}^n a_{i,j}x_j\ge b_i\) & \(\sum_{j=1}^n a_{i,j}x_j\le b_i\)
- Minimize \(\sum_{j=1}^n c_jx_j\) \(\to\) Maximize \(\sum_{j=1}^n (-c_j)x_j\)
Geometry view
One constraint \(\to\) a half-space bounded by an \((n-1)\)-dim hyperplane
Finally: the feasible region is a polyhedron / polytope (bounded polyhedron)
To optimize a linear objective, it suffices to check the vertices of the polytope.
The facets of an \(n\)-dim polytope are \((n-1)\)-dim polytopes; two adjacent \(k\)-dim faces meet in a \((k-1)\)-dim face.
Each vertex: the unique solution of the linear system formed by \(n\) tight constraints
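For intuition, here is a brute-force illustration of "check the vertices" in two dimensions (a toy sketch of mine, not how real solvers work): every candidate vertex is the intersection of two tight constraints, solved by Cramer's rule, and we keep the feasible one with the best objective.

```python
from itertools import combinations

def lp_by_vertices(A, b, c):
    """Maximize c.x over {x : A x >= b} in 2D by enumerating vertices."""
    best = None
    for (a1, b1), (a2, b2) in combinations(zip(A, b), 2):
        det = a1[0] * a2[1] - a1[1] * a2[0]
        if abs(det) < 1e-12:
            continue                      # parallel constraints: no vertex
        # Cramer's rule for the tight system a1.x = b1, a2.x = b2
        x = (b1 * a2[1] - a1[1] * b2) / det
        y = (a1[0] * b2 - b1 * a2[0]) / det
        # keep the vertex only if it satisfies ALL constraints
        if all(ai[0] * x + ai[1] * y >= bi - 1e-9 for ai, bi in zip(A, b)):
            val = c[0] * x + c[1] * y
            if best is None or val > best:
                best = val
    return best
```

For example, maximizing \(x+y\) over the box \(0\le x\le 2\), \(0\le y\le 3\) (written as four \(\ge\) constraints) returns the corner value \(5\).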
theoretical results
[Dantzig] simplex algorithm
idea : start at a vertex, then repeatedly move along an edge that increases the objective
worst case : exponential time (may visit exponentially many vertices)
OPEN : a pivot rule (which edge to choose) that makes simplex run in poly-time
Current results : a poly-length path between vertices exists ; sub-exponential-time pivot rules are known
interior point method : poly-time in theory, practical
ellipsoid method : poly-time in theory, not practical
Vertex Cover \(\to\) integer linear programming
integer linear programming
\(x_v=\begin{cases}0&v\notin VC\\1&v\in VC\end{cases}\)
constraints : \(\forall e=(u,v)\in E\) , \(x_u+x_v\ge 1\)
integer constraint : \(x_v\in \{0,1\}\)
goal : minimize \(\sum_{v\in V} w_vx_v\)
Linear Programming relaxation : relax ILP to LP
replace \(x_v\in \{0,1\}\) by \(0\le x_v\le 1\)
feasible region of LP contains feasible region of ILP
Obs: \(OPT(LP)\le OPT(ILP)\)
Approximation Algorithm
Step 1 : solve \(LP\) , denote the fractional solution by \(\vec x\)
Step 2 : rounding : \(\bar x_v=\begin{cases}1&x_v\ge 0.5\\0&\text{otherwise}\end{cases}\)
Proof 1 : \(\bar x_v\) is a valid solution for ILP
Since \(x_u+x_v\ge 1\) , at least one of \(x_u\) and \(x_v\) is \(\ge 0.5\) , then at least one of them is rounded to \(1\) .
Proof 2 : \(SOL\le 2\cdot OPT\) \[ \begin{aligned} SOL&=\sum_{v}w_v\bar x_v\\ &\le \sum_{v}w_v 2x_v\\ &=2\cdot OPT(LP)\\ &\le 2\cdot OPT(ILP) \end{aligned} \]
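The two proofs can be checked on a toy instance; the sketch below hardcodes the fractional solution instead of calling an LP solver. I use the triangle graph, a standard example where the LP relaxation optimum sets every \(x_v=1/2\) (LP value \(1.5\) vs. ILP value \(2\)).

```python
def round_vc(frac_x, edges):
    """Deterministic rounding: take v iff x_v >= 0.5.
    Valid because each edge has x_u + x_v >= 1, so one endpoint rounds up."""
    cover = {v for v, x in frac_x.items() if x >= 0.5}
    assert all(u in cover or v in cover for u, v in edges)
    return cover

# Triangle graph with the (assumed) LP-optimal fractional solution.
frac = {"a": 0.5, "b": 0.5, "c": 0.5}
edges = [("a", "b"), ("b", "c"), ("a", "c")]
cover = round_vc(frac, edges)
```

Here the rounded cover takes all three vertices, cost \(3\le 2\cdot OPT(LP)=3\), matching the factor-2 bound.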
Assuming the Unique Games Conjecture (UGC), it is hard to improve to \(\alpha=2-\epsilon\) for any constant \(\epsilon>0\).
Set Cover \(\to\) integer linear programming
integer linear programming
constraints : \(\forall e\in U\) , \(\sum_{i:e\in S_i} x_i\ge 1\)
integer constraint : \(x_i\in \{0,1\}\)
goal : minimize \(\sum_{i=1}^m w_ix_i\)
How to round ?
randomized rounding scheme : independently take \(\bar x_i=\begin{cases}1&\text{w.p. } x_i\\0&\text{w.p. }1-x_i\end{cases}\)
\(E[SOL]=\sum_{i}w_i\Pr\{\text{we choose }S_i\}=\sum_{i}w_ix_i=OPT(LP)\le OPT\)
feasible probability
\(\Pr\{e\text{ is not covered}\}=\prod_{i:e\in S_i}(1-x_i)\le \exp\left(-\sum_{i:e\in S_i}x_i\right)\le \frac{1}{e}\) (using \(1-x\le e^{-x}\) and the LP constraint \(\sum_{i:e\in S_i}x_i\ge 1\))
We repeat the rounding \(T=2\ln n\) times and take the union of all rounds.
\(\Pr\{e\text{ is not covered in all }T\text{ rounds}\}\le \frac{1}{e^T}=\frac{1}{n^2}\) .
\(\Pr\{\exists e:e\text{ is not covered in all }T\text{ rounds}\}\le \frac{1}{n}\) .
feasible probability : \(1-\frac{1}{n}\)
approximation : \(\alpha=2\ln n\) in expectation (expected cost \(\le T\cdot OPT(LP)\) by linearity over the \(T\) rounds)
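The repeated rounding can be sketched as follows (a sketch with a hardcoded fractional solution; in practice `frac_x` would come from an LP solver):

```python
import math
import random

def randomized_round_sc(frac_x, sets, universe, seed=0):
    """Include S_i w.p. x_i; repeat T = ceil(2 ln n) rounds and union them,
    so each element stays uncovered w.p. at most e^{-T} <= 1/n^2."""
    rng = random.Random(seed)
    n = len(universe)
    T = max(1, math.ceil(2 * math.log(n)))
    chosen = set()
    for _ in range(T):
        for i, x in enumerate(frac_x):
            if rng.random() < x:
                chosen.add(i)
    covered = set().union(*(sets[i] for i in chosen)) if chosen else set()
    return chosen, covered == set(universe)
```

With high probability the union is a feasible cover; if not, one can simply repeat the whole procedure.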