We study the sequence lengths required by neighbor-joining, greedy parsimony, and a phylogenetic reconstruction method (DCMNJ+MP) based on disk-covering and the maximum parsimony criterion. We use extensive simulations based on random birth-death trees, with controlled deviations from ultrametricity, to collect data on the scaling of sequence-length requirements...
We address the problem of estimating the degree to which the evolutionary history of a set of molecular sequences violates a strong molecular clock hypothesis. We quantify this deviation formally, by defining the “stretch” of a model tree with respect to the underlying ultrametric tree (indicated by time). We then define the “minimum stretch” of a dataset for a tree and show how this can be computed...
We study the convergence rates of neighbor-joining and several new phylogenetic reconstruction methods on families of trees of bounded diameter. Our study presents theoretically obtained convergence rates, as well as an empirical study based upon simulation of evolution on random birth-death trees. We find that the new phylogenetic methods offer an advantage over the neighbor-joining method, except...
Many large-scale phylogenetic reconstruction methods attempt to solve hard optimization problems such as Maximum Parsimony (MP) and Maximum Likelihood (ML), but they are severely limited by the number of taxa that they can handle in a reasonable timeframe. A standard heuristic approach to this problem is the divide-and-conquer strategy: decompose the data set into smaller subsets, solve the subsets...
The ranking of SNPs and prediction of phenotypes in continuous genome-wide association studies is a subject of increasing interest, with applications in personalized medicine and in animal and plant breeding. The ranking of SNPs in case-control (discrete label) genome-wide association studies has been examined in several previous studies with machine learning techniques, but this is poorly explored for...
The era of genomics brings the potential for better DNA-based risk prediction and treatment. While genome-wide association studies have been studied extensively for risk prediction, the potential of using whole exome data for this purpose is unclear. We explore this problem for chronic lymphocytic leukemia, which has one of the largest whole exome datasets, with 186 cases and 169 controls, available from the NIH...
Background: Determining interacting SNPs in genome-wide association studies is computationally expensive yet of considerable interest in genomics. Findings: We present a program, Chi8, that calculates the chi-square test with 8 degrees of freedom between all pairs of SNPs by brute force on a Graphics Processing Unit. We analyze each of the seven WTCCC genome-wide association studies that have about...
Background: Programs based on hash tables and the Burrows-Wheeler transform are very fast for mapping short reads to genomes but have low accuracy in the presence of mismatches and gaps. Such reads can be aligned accurately with the Smith-Waterman algorithm, but it can take hours or days to map millions of reads, even for bacterial genomes. Results: We introduce a GPU program called MaxSSmap with the aim of achieving...