### Question

In the *uncapacitated facility location problem*, we have a set of clients $D$ and a set of facilities $F$. For each client $j \in D$ and facility $i \in F$, there is a cost $c_{ij}$ of assigning client $j$ to facility $i$. Furthermore, there is a cost $f_i$ associated with each facility $i \in F$. The goal of the problem is to choose a subset of facilities $F' \subseteq F$ so as to minimize the total cost of the facilities in $F'$ and the cost of assigning each client $j \in D$ to the nearest facility in $F'$. In other words, we wish to find $F'$ so as to minimize $\sum_{i \in F'} f_i + \sum_{j \in D}\min_{i \in F}c_{ij}$.

- Show that there exists some $c$ such that there is no $(c \ln |D|)$-approximation algorithm for the uncapacitated facility location problem unless P = NP.
- Give an $O(\ln |D|)$-approximation algorithm for the uncapacitated facility location problem.

### Answers

**1.** We show an approximation preserving reduction from the set cover problem:

Given an instance of the set cover problem $(E, \mathcal{S})$ we create the following instance of the facility location problem:$(E, \mathcal{S})$ and set

$c_{ij} = \begin{cases} 0 & \text{if}\; i \in S_j \\ 1 & \text{otherwise} \end{cases}$We also set $f_i$ to be 1 for each $i$.Clearly every solution to the facility location can be transformed to a solution to the set cover problem of the same price and vice versa.

**2.** We now formalize the facility location problem as a set cover problem. In this formulation there will be an exponential number of sets but we will argue that we can use the greedy algorithm for this instance of set cover in polynomial time.

Given a facility location instance $(F, D, c, f)$ we define the following set cover instance $(\mathcal{S}, D, w)$ where $\mathcal{S} = F \times 2^D$. We let the set $(i, D' \subseteq D) \in \mathcal{S}$ covers the elements of $D'$ and define its cost to be $f_i + \sum_{j \in D}c_{ij}$.

Clearly an optimal solution to the above set cover instance is an optimal solution to the corresponded facility location instance.

Now we argue that, in each iteration, we can find the set that minimizes the cost per element in polynomial time. For two number $k \in [|F|]$ and $l \in [|D|]$ let the set corresponded to $k$ and $l$ be $(k, D')$ where $D'$ are the $l$ clients closest to facility $k$ there are $kl$ such sets that can be found in polynomial time and one can convinced himself that the set that minimizes the cost per uncovered client is among those sets. Thus we have a $O(\ln D)$-approximation algorithm for the facility location problem.