Algorithm Design 7

  1. NP-Hardness

    • NP Completeness : Decision Problem

    • NP Hardness : Optimization Problem

    • Problem in NP-Hard : not in NPC , but the decision version is NPC

  2. How to almost solve NP-Hard problem ?

    • Special Restrictions
    • Approximation
    • Randomization

Chapter 5 Extending the Limits of Tractability

Solving NP-Hard problem under some special restrictions

  1. Max Independent Set

    • general graphs : NP-Hard
    • interval graph : P
    • tree / graph with bounded tree width : P
  2. FPT (fixed parameter tractable)

    Intuition : Consider the parametrization of an instance

    Def [ FPT ]: Input size \(n\) , parameter \(k\) . A problem is FPT if there exists an algorithm that solve the problem in \(\mathcal O(f(k)poly(n))\).

    \(k\) can be viewed similar to constant , \(\mathcal O(2^kpoly(n))\) is still FPT , but \(\mathcal O(n^k)\) is not FPT ( \(poly(n)\) is independent from \(k\) ).

  3. Vertex Cover , \(k=|VC|\)

    We want to construct an algorithm in \(\mathcal O(f(k)poly(n))\) , which is better than trivial algorithm in \(\mathcal O(n^{k})\) .

    1. Key Observation : \(e=(u,v)\) , \(VC(G)\le k\iff VC(G-u)\le k-1\lor CV(G-v)\le k-1\) .

    2. Algorithm : check \((G,k)\)

      1. If \(G\) has no edge , then the empty set is a VC , return true .
      2. If \(|E|> k|V|\) , then \(\not\exists VC,|VC|\le k\) , return false .
      3. Let \(e=(u,v)\in E\) , return check(G-u,k-1)||check(G-v,k-1) .
    3. Running time : By recursive tree , at most \(k\) levels , level \(i\) has \(2^{i-1}\) possibilities , with \(\mathcal O(n(k-i+1))\) computation in each node.

      Total time : \(\mathcal O(2^kkn)\) .

    4. Best FPT : \(\mathcal O(1.2738^kpoly(n))\) .

      Method : reduce more in Key Observation (e.g. \(k\to k-1\) improve to \(k\to k-2\)) by considering more edges.

      Kernel method : \(\mathcal O(|E|+(5^{1/4})^k k^2)\) .

Chapter 6 Approximation Algorithm

6.1 Approximation Criteria

  1. Approximation ration \(\alpha\ge 1\)

    minimization problem : \(\alpha:=\sup_I \frac{SOL(I)}{OPT(I)}\).

    maximization problem : \(\alpha:=\sup_I \frac{OPT(I)}{SOL(I)}\).

  2. Some examples

    • Exact solution : \(\alpha=1\)
    • Constant approximation : \(\alpha=\mathcal O(1)\)
    • Logarithm approximation : \(\alpha=\mathcal O(\log n)\)
    • \(\alpha=\mathcal O(n)\) : usually not interesting.
    • Poly-Time Approximation Scheme ( PTAS ) : \(\alpha=1+\epsilon\) , Time : poly when \(\epsilon\) is constant.
    • Fully Poly-Time Approximation Scheme ( FPTAS ) : Time : \(\mathcal O(poly(\frac{1}{\epsilon},n))\).

6.2 Load Balancing Problem

  1. Description :

    Input : \(n\) jobs , \(t_i\) : time for job \(i\) , \(m\) machines

    Output : assign each job to one machine , say machine \(j\) is assigned jobs \(S_j\) , minimize \(\max_{j=1}^m \left\{\sum_{i\in S_j}t_i\right\}\) .

  2. Load Balancing Problem is NP-Hard

    Prove 1 : special subset sum \(\le_p\) Load Balancing Problem

    special subset sum : \(W=\frac{\sum_{i=1}^n w_i}{2}\) . This is still NPC.

    Prove 2 : 3-partition problem \(\le_p\) Load Balancing Problem

    3-partition problem :

    Given \(3n\) numbers \(a_1,\cdots,a_{3n}\) , partition them into \(n\) groups , with each group \(3\) numbers.

    Decide whether we can find a partition , s.t. the sum of each group is the same.

    Theorem : 3-partition problem is NPC ( even if \(a_i\) is poly-sized )

  3. Greedy Algorithm 1

    Process jobs in arbitrary order one-by-one . Assign the job to the machine with current minimal load.

    1. Analysis : \(\alpha=2\).

    2. Proof :

      load of OPT : \(T^*\).

      • Observation 1 : \(T^*\ge \max_{i=1}^n t_i\).
      • Observation 2 : \(T^*\ge \frac{1}{m}\sum_{i=1}^n t_i\).

      Let's prove that \(SOL\le \max_{i=1}^n t_i+\frac{1}{m}\sum_{i=1}^n t_i\) , so \(SOL\le 2T^*\).

      Let \(SOL=\sum_{i\in S_j} t_i\) , and \(t_k\) is the last job assigned to machine \(j\) . Therefore , we can prove that before assigning \(t_k\) , the load of machine \(j\) is \(\le \frac{1}{m}\sum_{i=1}^m t_i\) ( since machine \(j\) with current minimal load ) . \(t_k\le \max_{i=1}^n t_i\) . Therefore , \(SOL\le \frac{1}{m}\sum_{i=1}^m t_i +\max_{i=1}^n t_i\).

    3. Worst case :

      \(n=m(m-1)+1\) , with \(m(m-1)\) jobs \(t_j=1\) , and \(1\) job \(t_j=m\).

      Then \(SOL=2m-1\) , but \(OPT=m\). \(\alpha=\frac{2m-1}{m}\to 2\) .

  4. Greedy Algorithm 2 ( improvement )

    Process jobs in decreasing order.

    1. Analysis : \(\alpha=1.5\) ( not tight )

    2. Proof :

      1. Case 1 : \(n\le m\) , \(OPT=SOL\)

      2. Case 2 : \(n>m\) , so \(t_k\le t_{m+1}\le \frac{OPT}{2}\).

        \(t_{m+1},t_i\) must be both in one machine for some \(1\le i\le m\).

6.3 \(k\)-Center Problem

  1. Description

    Input : a metric graph \(G=(V,E,d)\) , number of centers \(k\).

    Def [ metric graph ] : \(d\) satisfies \(\begin{cases}d(u,u)=0&u\in V\\d(u,v)=d(v,u)&u,v\in V\\d(s,t)\le d(s,w)+d(w,t)&s,t,w\in V\end{cases}\).

    Output : Choose \(k\) centers \(C=\{c_1,\cdots,c_k\}\subseteq V\). Minimize max cluster radius. \[ d(v,C):=\min_{i=1}^k \{d(v,c_i)\} \] Minimize \(\max_{v\in V}d(v,C)\).

  2. Remark : Applications in ML

    unsupervised learning.

    \(k\)-means : minimize \(\sum_{v\in V}d(v,C)^2\).

    \(k\)-median : minimize \(\sum_{v\in V}d(v,C)\).

  3. Greedy Algorithm with \(\alpha=2\)

    1. Algorithm

      1. Guess optimal radius \(r\) ( at most \(|E|\) possibilities )

      2. Choose an arbitrary vertex \(v\in S\) , \(C\gets C\cup\{v\}\)

      3. Let \(D=\{u\in S:d(u,v)\le 2r\}\) , \(S\gets S\backslash D\)

      4. If \(S\neq \varnothing\) , go to (2)

      5. Finally , if \(|C|>k\) , our guess is too small ( fail )

        If \(|C|\le k\) , our guess succeed.

      Hopefully , we want if we success at \(r\) , then \(OPT\ge r\) and \(SOL\le 2r\) .

    2. Proof

      Key Observation : If \(r\ge OPT\) , then we will definitely success.

      Suppose \(C_{OPT}=\{c_1,\cdots,c_{opt}\}\) , and \(b_v=\arg\min_{1\le i\le OPT} d(v,c_i)\) .

      In each step , suppose that we choose \(v\) , then \(\forall u\in V,b_u=b_v\) , \(d(u,v)\le d(u,c_{b_v})+d(c_{b_v},v)\le 2r\) . Therefore , \(\{u:b_u=b_v\}\subseteq D_{v}\) .

      That is , each step we can remove at least one optimal cluster .

  4. \(\alpha=2\) is the best result

    Inapproximability : \(\forall \alpha<2\) , this problem is NP-Hard .

    Promise Problem : For a given instance , such that the optimal value is \(\le h\) or \(> \alpha h\) . Decide which case it is.

    Claim : Any \(\alpha\)-approximation algorithm can be used to solve Promise Problem.

    Approximation answer : \(\le \alpha h\) , then \(OPT\le \alpha h\) , then \(OPT\le h\) , output true

    Approximation answer : $>h $ , then \(OPT>h\) , then \(OPT>\alpha h\) , output false

  5. THM : Promise \(k\)-center for any \(\alpha<2\) is NPC

    Proof : Dominating Set \(\le_p\) Promise \(k\)-center for \(\alpha<2\)

    \(G\) for Dominating Set , construct \(G'\) for Promise \(k\)-center for \(\alpha<2\)

    \(G'\) : \(E'\) complete graph , \(w(e)=\begin{cases}1&e\in E\\2&e\notin E\end{cases}\)

    \((G,k)\in DS\iff\) the answer of \(G'\) is true for \((G',k,h=1)\) .