Algorithm Design 8

Chapter 6 Approximation Problem

6.4 Knapsack

  1. Description

    \(n\) items , each with value \(v_i\) and weight \(w_i\le W\) . The knapsack has maximum capacity \(W\) .

    Find a subset \(I\subseteq [n]\) , s.t. \(\sum_{i\in I}w_i\le W\) , and \(\sum_{i\in I}v_i\) is maximized .

  2. By DP : \(\mathcal O(nW)\) ( pseudo-polynomial , since \(W\) may be exponential in the input size )

  3. Discretization

    Let \(b=\frac{\epsilon}{2n}\cdot \max_{i\in [n]}v_i\) , let \(\bar v_i=\lceil\frac{v_i}{b}\rceil b\) . Therefore , \(\hat v_i:=\lceil \frac{v_i}{b}\rceil \in [0,\frac{2n}{\epsilon}]\) .

    New DP method :

    \(OPT(i,V)\) : the smallest weight one can obtain from the first \(i\) items with total scaled value \(\sum \hat v_i\ge V\) .

    Originally : \(OPT(i,w)=\max v\) , now : \(OPT(i,v)=\min w\)

    \[ OPT(i,V)=\min\begin{cases}OPT(i-1,V)\\OPT(i-1,V-\hat v_i)+w_i\end{cases} \]

    Time complexity : \(\max V=n\cdot 2n/\epsilon\) , so total time is \(\mathcal O(n^3/\epsilon)\) .
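The scaled DP above can be sketched as follows (a minimal Python sketch, assuming a 2-D table indexed by item count and exact scaled value; the answer is the largest scaled value whose minimum weight fits in \(W\)):

```python
import math

def fptas_knapsack(values, weights, W, eps):
    """FPTAS for 0/1 knapsack via value scaling, as in the notes.

    Values are rounded up to multiples of b = (eps/2n) * max_i v_i,
    then a min-weight DP runs over scaled total values.
    """
    n = len(values)
    b = (eps / (2 * n)) * max(values)
    vhat = [math.ceil(v / b) for v in values]      # scaled values, each <= ceil(2n/eps)
    Vmax = sum(vhat)
    INF = float("inf")
    # dp[i][V] = min weight from the first i items with scaled value exactly V
    dp = [[INF] * (Vmax + 1) for _ in range(n + 1)]
    dp[0][0] = 0
    for i in range(1, n + 1):
        for V in range(Vmax + 1):
            dp[i][V] = dp[i - 1][V]                # skip item i
            if V >= vhat[i - 1] and dp[i - 1][V - vhat[i - 1]] + weights[i - 1] < dp[i][V]:
                dp[i][V] = dp[i - 1][V - vhat[i - 1]] + weights[i - 1]
    # best feasible scaled value, then reconstruct the chosen items
    best_V = max(V for V in range(Vmax + 1) if dp[n][V] <= W)
    picked, V = [], best_V
    for i in range(n, 0, -1):
        if dp[i][V] != dp[i - 1][V]:               # item i was taken
            picked.append(i - 1)
            V -= vhat[i - 1]
    return sum(values[i] for i in picked), sorted(picked)
```

The returned value is measured in the original (unscaled) \(v_i\), so it can be compared directly against \((1-\epsilon)\,OPT\).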

  4. Approximation analysis

    Goal : let \(S^*\) be the optimal solution and \(S\) our solution ; then \(\sum_{i\in S}v_i\ge (1-\epsilon)\sum_{i\in S^*} v_i\)

    Proof : \[ \begin{aligned} &\quad \sum_{i\in S^*}v_i\\ &\le \sum_{i\in S^*}\bar v_i\\ &\le \sum_{i\in S}\bar v_i\\ &\le \sum_{i\in S}(v_i+b)\\ &\le nb+\sum_{i\in S}v_i\\ &\le \frac{\epsilon}{2}\max_i v_i +\sum_{i\in S}v_i\\ &\le \frac{\epsilon}{2}\sum_{i\in S^*}v_i+\sum_{i\in S}v_i \end{aligned} \] The second inequality holds because \(S\) is optimal for the rounded values \(\bar v_i\) while \(S^*\) remains feasible ; the last one uses \(\max_i v_i\le \sum_{i\in S^*}v_i\) , which holds since every single item fits ( \(w_i\le W\) ) . Therefore , \(\sum_{i\in S}v_i\ge (1-\epsilon/2) \sum_{i\in S^*}v_i\)

    This yields a \((1-\epsilon)\)-approximation ( since \(1-\epsilon/2\ge 1-\epsilon\) )

    FPTAS , running time \(\mathcal O(poly(n,1/\epsilon))\) .

    Note : here we over-estimate the value of each item and find a feasible solution ( \(\sum w_i\le W\) ) . Then , because we do not over-estimate by much , our solution is not much worse than the optimal one .

    Conversely , we can also under-estimate the weight of each item and find a solution meeting a value target ( \(\sum v_i\ge V\) ) . Then , because we do not under-estimate by much , our solution does not exceed the weight restriction by much ( \(\sum w_i\le (1+\epsilon)W\) )

6.5 Set Cover

  1. Description

    Universe \(U\) , \(|U|=n\) . \(\mathcal F=\{S_1,\cdots,S_m\}\) , \(S_i\subseteq U\) , each with weight \(w_i\ge 0\) .

    Find a set cover \(I\) , s.t. \(\bigcup_{i\in I}S_i=U\) , and minimize \(\sum_{i\in I}w_i\) .

  2. Greedy for SC

    Pick the most "cost-effective" set , i.e. the one with the best ratio like \(|S_i|/w_i\)

    Algorithm :

    repeat until \(U\) is covered : choose the subset \(S_i\) s.t. \(w_i/|\{\text{elements newly covered by }S_i\}|\) is smallest
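The greedy rule above can be sketched as follows (a minimal Python sketch; `sets` is assumed to be a list of Python sets, and the universe is assumed to be coverable):

```python
def greedy_set_cover(universe, sets, weights):
    """Greedy weighted set cover: repeatedly pick the set minimizing
    weight / (# newly covered elements), as in the notes."""
    uncovered = set(universe)
    picked, cost = [], 0
    while uncovered:
        # consider only sets that cover something new; min ratio wins
        best = min(
            (i for i in range(len(sets)) if sets[i] & uncovered),
            key=lambda i: weights[i] / len(sets[i] & uncovered),
        )
        uncovered -= sets[best]
        picked.append(best)
        cost += weights[best]
    return picked, cost
```

Each iteration recomputes the "newly covered" counts against the remaining elements, which is exactly the per-round ratio used in the analysis below.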

  3. Approximation analysis

    Goal : greedy achieves \(\alpha=H_n=\sum_{i=1}^n\frac 1i\approx \ln n\)

    Proof :

    Suppose our solution picks \(S_{i_1},S_{i_2},\cdots,S_{i_k}\) , \(R_j:=U\backslash \bigcup_{l\le j}S_{i_l}\) , let \(n_j=|R_j|\) .

    Let \(T_j:=R_{j-1}\cap S_{i_j}\) , \(t_j=|T_j|\) . ( \(T_j\) is newly added elements in \(j\)-th iteration)

    Key inequality : \(w_{i_j}/t_j\le OPT/n_{j-1}\) (*)

    Therefore , \(w_{i_j}\le OPT\cdot t_j/n_{j-1}\) . Since \(t_j=n_{j-1}-n_j\) , we have \[ \begin{aligned} SOL&= \sum_{j=1}^{k}w_{i_j}\\ &\le OPT\sum_{j=1}^k (t_j/n_{j-1})\\ &= OPT \sum_{j=1}^k \frac{n_{j-1}-n_j}{n_{j-1}}\\ &= OPT\sum_{j=1}^k \sum_{i=n_j+1}^{n_{j-1}}\frac{1}{n_{j-1}}\\ &\le OPT \sum_{j=1}^k \sum_{i=n_j+1}^{n_{j-1}} \frac{1}{i}\\ &=OPT\sum_{i=1}^n \frac{1}{i} \end{aligned} \]

    Proof of (*) : in round \(j\) , the sets of \(OPT\) can still cover \(R_{j-1}\) . Let \(OPT\backslash\{i_1,\cdots,i_{j-1}\}=\{o_1,\cdots,o_l\}\) ; by the greedy choice , \(\forall o\) , \(w_o/|S_o\cap R_{j-1}|\ge w_{i_j}/t_j\) . Summing over \(o\) gives \(OPT\ge \sum_o w_o\ge \frac{w_{i_j}}{t_j}\sum_o |S_o\cap R_{j-1}|\ge \frac{w_{i_j}}{t_j}\cdot n_{j-1}\) , which is (*) .

  4. Approximating Set Cover within \((1-\delta)\ln n\) is NP-Hard

    Proof : PCP Theorem ( probabilistic checkable proof )

    \(NP=PCP(\mathcal O(\log n),\mathcal O(1))\)

    PCP : the verifier uses \(\mathcal O(\log n)\) random bits to choose \(\mathcal O(1)\) bits of the proof to read , then decides whether to accept

    Used for hardness of approximation : the PCP theorem makes it easy to create a gap , i.e. it becomes hard to distinguish optimum \(h\) from optimum \(\alpha h\)

  5. Extension : Pricing Method ( see the Kleinberg–Tardos book )

    For a partition-style problem : partition the original instance into multiple parts , each with a "center" . Every point is dominated by some center \(c\) , and each center has a dominating set \(D_c\) .

    Then we amortize the cost of part \(c\) over all points in \(D_c\) , and use this amortized cost as the greedy criterion . In each iteration , we consider only the remaining points and choose the part with the minimum amortized cost . This usually yields an inequality like (*) , leading to an \(\mathcal O(\log n)\) approximation .

6.6 Linear Programming & Approximation

  1. Description

    \(n\) variables , \(m\) linear constraints \(\sum_{j=1}^n a_{i,j}x_j\ge b_i\) . Maximize \(\sum_{j=1}^n c_j x_j\) .

    Variants:

    • \(\sum_{j=1}^n a_{i,j}x_j\le b_i\) \(\to\) \(\sum_{j=1}^n (-a_{i,j})x_j \ge -b_i\)
    • \(\sum_{j=1}^n a_{i,j}x_j=b_i\) \(\to\) \(\sum_{j=1}^n a_{i,j}x_j\ge b_i\) & \(\sum_{j=1}^n a_{i,j}x_j\le b_i\)
    • Minimize \(\sum_{j=1}^n c_jx_j\) \(\to\) Maximize \(\sum_{j=1}^n (-c_j)x_j\)
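The three rewrites above can be sketched as a small normalization routine (a Python sketch under an assumed representation: each constraint is a tuple `(coeffs, op, b)` with `op` one of `">="`, `"<="`, `"=="`):

```python
def to_standard_form(constraints, objective, maximize=True):
    """Normalize an LP to: maximize c.x subject to A x >= b.

    Applies the three rewrites from the notes:
      <=  -> negate both sides to get >=
      ==  -> split into a >= and a <= (then negate the <=)
      min -> negate the objective and maximize
    """
    rows = []
    for a, op, b in constraints:
        if op == ">=":
            rows.append((list(a), b))
        elif op == "<=":                      # -a.x >= -b
            rows.append(([-aj for aj in a], -b))
        elif op == "==":                      # a.x >= b  and  -a.x >= -b
            rows.append((list(a), b))
            rows.append(([-aj for aj in a], -b))
        else:
            raise ValueError(f"unknown op: {op}")
    c = list(objective) if maximize else [-cj for cj in objective]
    return rows, c
```

The output pairs `(row, b)` are all "\(\ge\)" constraints, matching the canonical form in the description.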
  2. Geometry view

    One constraint \(\to\) one half-space , bounded by an \((n-1)\)-dim hyperplane

    Finally : form a polyhedron / polytope (bounded polyhedron)

    Only need to check the vertices of polytope .

    An \(n\)-dim polytope : its facets are \((n-1)\)-dim polytopes ; two adjacent \(k\)-dim faces meet in a \((k-1)\)-dim face .

    Each vertex : the unique solution of a linear system formed by \(n\) linearly independent tight constraints

  3. theoretical results

    [Dantzig] simplex algorithm

    idea : find one vertex , then repeatedly move along an edge that improves the objective .

    worst case : exponential time ( may visit exponentially many vertices )

    OPEN : is there a pivot rule ( choice of edge ) that makes simplex run in poly-time ?

    Current results : a path of polynomial length between vertices exists ; best known pivot rules run in sub-exponential time

    interior point method : poly-time in theory , and practical

    ellipsoid method : poly-time in theory , but not practical

  4. Vertex Cover \(\to\) integer linear programming

    1. integer linear programming

      \(x_v=\begin{cases}0&v\notin VC\\1&v\in VC\end{cases}\)

      constraints : \(\forall e=(u,v)\in E\) , \(x_u+x_v\ge 1\)

      integer constraint : \(x_v\in \{0,1\}\)

      goal : minimize \(\sum_{v\in V} w_vx_v\)

    2. Linear Programming relaxation : relax ILP to LP

      replace \(x_v\in \{0,1\}\) by \(0\le x_v\le 1\)

      feasible region of LP contains feasible region of ILP

      Obs: \(OPT(LP)\le OPT(ILP)\)

    3. Approximation Algorithm

      Step 1 : solve \(LP\) , denote the fractional solution by \(\vec x\)

      Step 2 : rounding : \(\bar x_v=\begin{cases}1&x_v\ge 0.5\\0&otherwise\end{cases}\)

      1. Proof 1 : \(\bar x_v\) is a valid solution for ILP

        Since \(x_u+x_v\ge 1\) , at least one of \(x_u\) and \(x_v\) is \(\ge 0.5\) , then at least one of them is rounded to \(1\) .

      2. Proof 2 : \(SOL\le 2\cdot OPT\) \[ \begin{aligned} SOL&=\sum_{v}w_v\bar x_v\\ &\le \sum_{v}w_v 2x_v\\ &=2\cdot OPT(LP)\\ &\le 2\cdot OPT(ILP) \end{aligned} \]

    4. Assuming UGC , it is hard to improve to \(\alpha=2-\epsilon\) for any \(\epsilon>0\) .
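The threshold rounding of Step 2 can be sketched as follows (a Python sketch assuming a feasible fractional LP solution `x_frac` is already given; solving the LP itself is omitted):

```python
def round_vertex_cover(x_frac, edges, weights):
    """Deterministic 0.5-threshold rounding for the vertex-cover LP.

    x_frac: dict v -> fractional value, assumed feasible for the LP
    (x_u + x_v >= 1 on every edge).
    """
    cover = {v for v, xv in x_frac.items() if xv >= 0.5}
    # Proof 1: every edge has an endpoint with x >= 0.5, hence in the cover
    assert all(u in cover or v in cover for (u, v) in edges)
    lp_value = sum(weights[v] * xv for v, xv in x_frac.items())
    sol = sum(weights[v] for v in cover)
    return cover, sol, lp_value   # Proof 2 guarantees sol <= 2 * lp_value
```

On a triangle with all weights 1, the LP optimum is \(x_v=0.5\) everywhere (value 1.5), and rounding returns all three vertices (value 3), exactly meeting the factor-2 bound.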

  5. Set Cover \(\to\) integer linear programming

    1. integer linear programming

      constraints : \(\forall e\in U\) , \(\sum_{i:e\in S_i} x_i\ge 1\) ( where \(x_i=1\) iff \(S_i\) is chosen )

      integer constraint : \(x_i\in \{0,1\}\)

      goal : minimize \(\sum_{i=1}^m w_ix_i\)

    2. How to round ?

      randomized rounding scheme : take \(\bar x_i=\begin{cases}1&w.p.\ x_i\\0&w.p.\ 1-x_i\end{cases}\)

      \(E[SOL]=\sum_{i}w_i\Pr\{\text{we choose }S_i\}=\sum_{i}w_ix_i=OPT(LP)\le OPT\)

      Feasibility analysis :

      \(\Pr\{e\text{ is not covered}\}=\prod_{i:e\in S_i}(1-x_i)\le \exp\left(\sum_{i:e\in S_i}-x_i\right)\le \frac{1}{e}\)

      We need to repeat \(T=2\ln n\) times , take the union of all rounds .

      \(\Pr\{e\text{ is not covered in all }T\text{ rounds}\}\le \frac{1}{e^T}=\frac{1}{n^2}\) .

      \(\Pr\{\exists e:e\text{ is not covered in all }T\text{ rounds}\}\le \frac{1}{n}\) .

      So the result is feasible with probability at least \(1-\frac{1}{n}\)

      approximation : \(\alpha=2\ln n\) in expectation , since \(E[SOL]\le T\cdot OPT(LP)=2\ln n\cdot OPT(LP)\le 2\ln n\cdot OPT\)
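The whole randomized rounding scheme can be sketched as follows (a Python sketch assuming a feasible fractional solution `x_frac` is given; `seed` is a hypothetical parameter added for reproducibility):

```python
import math
import random

def randomized_round_set_cover(x_frac, sets, weights, universe, seed=0):
    """Randomized rounding for the set-cover LP, as in the notes.

    Runs T = ceil(2 ln n) independent rounds; in each round, set i is
    taken with probability x_i. Returns the union of all rounds.
    """
    rng = random.Random(seed)
    n = len(universe)
    T = math.ceil(2 * math.log(n))
    picked = set()
    for _ in range(T):
        for i, xi in enumerate(x_frac):
            if rng.random() < xi:             # include S_i w.p. x_i
                picked.add(i)
    covered = set().union(*(sets[i] for i in picked)) if picked else set()
    feasible = covered >= set(universe)       # may fail w.p. <= 1/n
    cost = sum(weights[i] for i in picked)
    return picked, cost, feasible
```

Note the scheme is Monte Carlo: with probability up to \(1/n\) the union may fail to cover \(U\), in which case one can simply rerun.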