On the Approximability of Some Haplotyping Problems

John Abraham; Zhixiang Chen; Richard Fowler; Bin Fu; Binhai Zhu

doi:10.1007/978-3-642-02158-9_3

On the Approximability of Some Haplotyping Problems

John Abraham, Zhixiang Chen, Richard Fowler, Bin Fu, Binhai Zhu

Source

Lecture Notes in Computer Science > Algorithmic Aspects in Information and Management > 3-14

Abstract

In this paper, we study several versions of optimization problems related to haplotype reconstruction/identification. The input to the first problem is a set C ₁ of haplotypes, a set C ₂ of haplotypes, and a set G of genotypes. The objective is to select the minimum number of haplotypes from C ₂ so that together with haplotypes in C ₁ they resolve all (or the maximum number of) genotypes in G. We show that this problem has a factor-O(logn) polynomial time approximation. We also show that this problem does not admit any approximation with a factor better than O(logn) unless P=NP. For the corresponding reconstruction problem, i.e., when C ₂ is not given, the same approximability results hold.

The other versions of the haplotype identification problem are based on single individual haplotyping, including the well-known Minimum Fragment Removal (MFR) and Minimum SNP Removal (MSR), which have both shown to be APX-hard previously. We show in this paper that MFR has a polynomial time O(logn)-factor approximation. We also consider Maximum Fragment Identification (MFI), which is the complementary version of MFR; and Maximum SNP Identification (MSI), which is the complementary version of MSR. We show that, for any positive constant ε< 1, neither MFI nor MSI has a factor-n ^1 − ε polynomial time approximation algorithm unless P=NP.