The Infona portal uses cookies, i.e. strings of text saved by a browser on the user's device. The portal can access those files and use them to remember the user's data, such as their chosen settings (screen view, interface language, etc.), or their login data. By using the Infona portal the user accepts automatic saving and using this information for portal operation purposes. More information on the subject can be found in the Privacy Policy and Terms of Service. By closing this window the user confirms that they have read the information on cookie usage, and they accept the privacy policy and the way cookies are used by the portal. You can change the cookie settings in your browser.
Sequence overlap graphs, constructed based on suffix-prefix relationships between pairs of sequences, are an important data structure in computational biology. High throughput sequencers can read several million to a few billion DNA fragments in a single experiment, making the construction of overlap graphs for such datasets compute-intensive. In this paper, we present a Locality-Sensitive Hashing...
Background Alignment-free sequence comparison approaches have been garnering increasing interest in various data- and compute-intensive applications such as phylogenetic inference for large-scale sequences. While k-mer based methods are predominantly used in real applications, the average common substring (ACS) approach is emerging as one of the prominent alignment-free approaches. This ACS approach...
Background Hepatitis C is a major public health problem in the United States and worldwide. Outbreaks of hepatitis C virus (HCV) infections associated with unsafe injection practices, drug diversion, and other exposures to blood are difficult to detect and investigate. Molecular analysis has been frequently used in the study of HCV outbreaks and transmission chains; helping identify a cluster of sequences...
We present an efficient parallel algorithm for the following problem: Given an input collection D of n sequences of total length N, a length threshold f and a mismatch threshold κ, report all κ-mismatch maximal common substrings of length at least f over all pairs of strings in D. This problem is motivated by clustering and assembly applications in computational biology, where D is a collection of...
Reverse-engineering genome-scale gene networks from gene expression data is a principal challenge in systems biology. Mutual information (MI) based methods are favored because of their ability to recover non-linear relationships, low algorithmic complexity, and their successful use in various biological applications such as gene function prediction. In this paper, we present the first ever construction...
Accurate estimation of evolutionary distance between two sequences is fundamental and critical to phylogenetic analysis aiming to reconstruct the correct evolutionary history and estimate the time of divergence between species. Conventionally, sequence alignment is usually used for this purpose due to its high accuracy. However, alignment-based approaches involve high computational cost. Therefore,...
Hepatitis C is a major public health problem in the United States and worldwide. Outbreaks of hepatitis C virus (HCV) infections associated with unsafe injection practices, drug diversion, and other exposures to blood are difficult to detect and investigate. Molecular analysis has been frequently used in the study of HCV outbreaks and transmission chains; helping identify a cluster of sequences as...
Learning Bayesian networks is NP-hard. Even with recent progress in heuristic and parallel algorithms, modeling capabilities still fall short of the scale of the problems encountered. In this paper, we present a massively parallel method for Bayesian network structure learning, and demonstrate its capability by constructing genome-scale gene networks of the model plant Arabidopsis thaliana from over...
Set the date range to filter the displayed results. You can set a starting date, ending date or both. You can enter the dates manually or choose them from the calendar.