Background Because of the short read length of high-throughput sequencing data, errors are introduced during genome assembly, which may adversely affect downstream data analysis. Several tools have been developed to eliminate these errors by either 1) comparing the assembled sequences with a similar reference genome, or 2) analyzing paired-end reads aligned to the assembled sequences...
Background Despite the large volume of genome sequencing data produced by next-generation sequencing technologies and the highly sophisticated software dedicated to handling these types of data, gaps are commonly found in draft genome assemblies. The existence of gaps compromises our ability to take full advantage of the genome data. This study aims to identify a practical approach for biologists...
This paper presents a new method for keyword spotting in degraded document images. Two prevalent word-indexing techniques, OCR and word shape coding, are combined compactly based on recognition confidence evaluation. The basic procedure is as follows. First, OCR candidates are used for OCR indexing. Second, a new stroke feature and a convex-concave feature of words are adopted for word shape coding. Furthermore,...
Identification of salient patterns for the classification of gene expression profiles is a useful step in examining the biological significance and correlation of genes with disease states. We propose a clustering-based approach in which feature selection is first carried out to identify influential genes and then salient patterns are determined to characterize each of the different classes. The proposed...
On the basis of the assumption that diseases with similar phenotypes are caused by functionally related genes, we describe a candidate gene prioritization method based entirely on the mutual information of human protein-protein interactions and disease phenotypic similarities, and develop a tool named GENEDIG to calculate the mutual information concordance score between the human protein network...
Aiming at clustering problems with varied distributions in a diffusion model over all data points, this paper provides a new clustering algorithm (CDD) based on the change of density. CDD searches for core points using a typical density-based clustering algorithm (DBSCAN), then calculates the direction, speed, and acceleration of density diffusion by analyzing the diffusion rule of the data samples and...
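The core-point search that the abstract attributes to DBSCAN can be illustrated with a minimal sketch. It covers only the standard DBSCAN core-point test (a point is a core point if at least `min_pts` points, itself included, lie within distance `eps`); it is not the CDD algorithm, and the function name `core_points` and its parameters are hypothetical names chosen for illustration.

```python
import math


def core_points(points, eps, min_pts):
    """Return indices of DBSCAN core points (hypothetical helper, not CDD itself).

    A point counts as a core point when at least `min_pts` points,
    including the point itself, lie within Euclidean distance `eps`.
    """
    core = []
    for i, p in enumerate(points):
        # Count neighbors within eps; the point is its own neighbor.
        neighbors = sum(1 for q in points if math.dist(p, q) <= eps)
        if neighbors >= min_pts:
            core.append(i)
    return core


if __name__ == "__main__":
    sample = [(0, 0), (0, 1), (1, 0), (10, 10)]
    print(core_points(sample, eps=1.5, min_pts=3))
```

This brute-force version compares every pair of points, which is quadratic in the number of points; it is meant only to make the definition concrete.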
In classical visual object tracking, the observation model and the inference model are two essential parts for efficient tracking. Although strongly discriminative features have been used to build the observation model in tracking, most of these features are high-dimensional, which limits tracking speed. In this paper, we introduce a pyramid random subspace to build the observation model,...
Disease gene identification is key to curing human genetic diseases. This paper describes a new method for predicting human genetic disease genes, based on the relations between clinical manifestations and the protein-protein interaction network. A new prediction model based on associated probability and the Pearson correlation coefficient is also described. This mathematical...
Recently, object tracking has been viewed as a foreground/background two-class classification problem. In this paper, we propose a nonparametric, semi-supervised approach that builds the observation model for tracking via a graph. More specifically, the topological structure of the graph is carefully designed to reflect the properties of the sample distribution during tracking. In prediction, the...
When designing practical algorithms for learning from examples, one has to deal with several optimization problems. The major ones are smallest feature subset selection, smallest decision tree induction, and smallest k-DNF induction. In this paper, we show that all of these optimization problems are NP-hard, and we present new greedy algorithms for solving these...
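To make the smallest feature subset selection problem concrete, here is a minimal sketch of a generic greedy heuristic in the style of greedy set cover. It is not the algorithm from the paper, whose details are not given in the abstract. It assumes examples are tuples of discrete feature values, and the name `greedy_feature_subset` is hypothetical. The goal is a small set of features such that any two examples with different labels differ on at least one selected feature.

```python
from itertools import combinations


def greedy_feature_subset(examples, labels):
    """Greedily pick features until all differently labeled pairs are separated.

    Generic set-cover-style heuristic (an assumption for illustration,
    not the paper's algorithm). Returns feature indices in pick order.
    """
    n_features = len(examples[0])
    # Pairs of examples that must be told apart: those with different labels.
    uncovered = {
        (i, j)
        for i, j in combinations(range(len(examples)), 2)
        if labels[i] != labels[j]
    }

    def gain(f):
        # Number of still-unseparated pairs that feature f would separate.
        return sum(1 for i, j in uncovered if examples[i][f] != examples[j][f])

    selected = []
    while uncovered:
        best = max(range(n_features), key=gain)
        if gain(best) == 0:
            # Two examples agree on every feature but carry different labels.
            raise ValueError("inconsistent data: no feature separates the remaining pairs")
        selected.append(best)
        # Keep only the pairs the chosen feature fails to separate.
        uncovered = {
            (i, j) for i, j in uncovered if examples[i][best] == examples[j][best]
        }
    return selected


if __name__ == "__main__":
    xs = [(0, 0, 1), (0, 1, 1), (1, 0, 0), (1, 1, 0)]
    ys = [0, 0, 1, 1]
    print(greedy_feature_subset(xs, ys))
```

Because the underlying problem is NP-hard, a greedy choice like this gives no guarantee of finding the smallest subset; it only guarantees that the returned features separate all differently labeled pairs when the data are consistent.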