We consider the problem of placing a specified number (p) of facilities on the nodes of a network so as to minimize some measure of the distances between facilities. This formulation models a number of problems arising in facility location, statistical clustering, pattern recognition, and also a processor allocation problem in multiprocessor systems.
We consider the problem under three different objectives, namely minimizing the diameter, minimizing the average distance, and minimizing the variance. The problem is NP-hard under any of the objectives even when the edge weights obey triangle inequality. We observe that in general, even obtaining a relative approximation for any of the objectives is NP-hard.
For problem instances in which the edge weights satisfy triangle inequality, a general framework for approximating the minimum cost compact location problem for each of the above measures is presented. Our framework can be extended to the case when there are both node weights and edge weights, and the resulting approximation algorithms yield the same performance guarantee as in the edge weighted case. Our algorithms can be further generalized to the case when we are also given a set of distinguished nodes, and we are required to place a facility at each distinguished site plus p additional facilities so as to minimize any of the three objectives. Our algorithms are easy to parallelize. We have also developed polynomial time algorithms when the given network is a tree.