Dimensionality reduction applied to gene expression data is challenging for machine learning algorithms because of the small number of samples and the high number of attributes. This paper proposes a preprocessing phase for microarray data based on the random projection method. Experimental results are promising and show that the use of this method improves the performance of classification algorithms.
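As an illustration of the preprocessing idea in the abstract above, here is a minimal sketch of a Gaussian random projection in plain Python. It is not the paper's implementation: the function name `gaussian_random_projection`, the choice of a Gaussian projection matrix, the target dimension of 50, and the toy data are all assumptions made for this example.

```python
import math
import random


def gaussian_random_projection(samples, n_components, seed=0):
    """Project each row of `samples` onto `n_components` random directions.

    Hypothetical helper for illustration; the abstract does not say which
    variant of random projection the paper uses.
    """
    rng = random.Random(seed)
    n_features = len(samples[0])
    # Entries are drawn from N(0, 1/n_components), so squared Euclidean
    # lengths are preserved in expectation.
    scale = 1.0 / math.sqrt(n_components)
    matrix = [[rng.gauss(0.0, 1.0) * scale for _ in range(n_components)]
              for _ in range(n_features)]
    # Multiply the (n_samples x n_features) data by the projection matrix.
    return [[sum(row[i] * matrix[i][j] for i in range(n_features))
             for j in range(n_components)]
            for row in samples]


# Toy stand-in for microarray data: 4 samples, 1000 attributes each.
data_rng = random.Random(1)
expression = [[data_rng.random() for _ in range(1000)] for _ in range(4)]

# Reduce every sample to 50 attributes before handing it to a classifier.
reduced = gaussian_random_projection(expression, n_components=50)
```

The projection matrix depends only on the seed and the dimensions, not on the class labels, so the same matrix can be reused for training and test samples.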
In this paper, we present a graph-based approach to automatically detect defective zebrafish embryos. Here, the zebrafish is segmented from the background using a texture descriptor and morphological operations. In this way, we can represent the embryo shape as a graph, for which we propose a vectorisation method to recover clique histogram vectors for classification. The clique histogram represents...
Based on the chaos game representation of protein sequences, a new method to predict the subcellular location of apoptosis protein sequences was put forward. With a support vector machine tested on a known dataset of 317 apoptosis proteins, higher predictive success rates were obtained. Our prediction results showed that the chaos game representation of protein sequences were...
An important limitation of current protein secondary structure prediction tools is their poor performance in locating secondary structure boundaries. Efficiently utilizing the residue position-specific preference around secondary structure boundaries can help to resolve this problem. TLSSP (two level secondary structure predictor), proposed in this study, used a two-level strategy to utilize these...
Selecting a subset of genes with strong discriminative power is a very important step in classification problems based on gene expression data. Lasso is known to have automatic variable selection ability in linear regression analysis. This paper uses Lasso to select the most informative genes to represent the class label as a linear function of the gene expression data. The selected genes are further...
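To make the variable-selection property mentioned in the abstract above concrete, the following sketch fits a Lasso model by cyclic coordinate descent and keeps the genes with non-zero coefficients. This is an assumption-laden illustration, not the paper's procedure: the solver, the function names, the penalty value `alpha=0.1`, and the toy data are all hypothetical.

```python
def soft_threshold(value, threshold):
    """Shrink `value` toward zero by `threshold`; return 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimise (1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1 over w."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    residual = list(y)  # equals y - Xw while w is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(d)]
    for _ in range(n_iter):
        for j in range(d):
            if col_sq[j] == 0.0:
                continue  # a constant-zero gene carries no information
            # Correlation of gene j with the residual that excludes gene j.
            rho = sum(X[i][j] * (residual[i] + w[j] * X[i][j])
                      for i in range(n)) / n
            new_w = soft_threshold(rho, alpha) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= delta * X[i][j]
                w[j] = new_w
    return w


# Toy data: 6 samples, 4 "genes"; the label depends on gene 0 only.
X = [
    [1.0, 1.0, 0.0, 1.0],
    [-1.0, 1.0, 0.0, 1.0],
    [1.0, -1.0, 1.0, 1.0],
    [-1.0, -1.0, 1.0, 1.0],
    [1.0, 0.0, -1.0, 1.0],
    [-1.0, 0.0, -1.0, 1.0],
]
y = [2.0 * row[0] for row in X]

weights = lasso_coordinate_descent(X, y, alpha=0.1)
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
```

The L1 penalty sets the coefficients of uninformative genes exactly to zero, which is what makes the non-zero set usable as a gene subset; a larger `alpha` yields a smaller subset.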
Supervised learning methods have recently been exploited to learn gene regulatory networks from gene expression data. They consist of building a binary classifier from feature vectors composed of the expression levels of a set of known regulatory connections, available in public databases (e.g. RegulonDB, TRRD, Transfac, IPA), and using such a classifier to predict new unknown connections. The input to...
We have explored correlations between the measured efficiency of the RNAi process and several computed signatures that characterize the equilibrium secondary structure of the participating mRNA, siRNA, and their complexes. A previously published data set of 609 experimental points (with efficiency represented as the percentage of remaining mRNA) was used for the analysis. While virtually no correlation with...
In feature gene selection, the filtering model addresses classification accuracy while ignoring the gene redundancy problem. On the other hand, gene clustering finds correlated genes without considering their predictive abilities. It is valuable to combine the two so that each improves the other. We report a new feature gene extraction algorithm, namely double-thresholding extraction of feature gene (DEFG),...
Synthetic lethal genetic interactions are of interest because they can be used to predict the function of unknown proteins and to find drug targets or drug combinations. In this study, we applied a support vector machine (SVM) classifier to predict synthetic lethal genetic interactions in Saccharomyces cerevisiae based on domain information in proteins. We found that our method can predict synthetic lethal genetic...
The classification of Raman spectra is useful in identification and diagnosis applications. We have obtained Raman spectra from bacterial samples using three different species of bacteria. Before any form of classification can be carried out on the Raman spectra it is important that some form of normalization is used. This is due to the nature of the readings obtained by the acquisition equipment...
This paper presents a texture analysis method for digital chest radiographs to distinguish pneumoconiosis from normal chests. First, two lung fields are segmented from a digital chest X-ray image by the active shape model (ASM) method and regions of interest (ROIs) are selected in inter-rib areas along the outer and middle zones of the lung fields. Second, the chest image is preprocessed by multi-scale...
Ontology learning aims to facilitate the construction of ontologies by decreasing the amount of effort required to produce an ontology for a new domain. However, there are few studies that attempt to automate the entire ontology learning process from the collection of domain-specific literature, to text mining to build new ontologies or enrich existing ones. In this paper, we present a complete framework...
A novel method of feature extraction from protein sequences, structures and physicochemical properties is proposed; it obtains better classification results by combining the key eigenvector obtained from knowledge reduction with a support vector machine. Based on jackknife tests, the comprehensive classification results 78.3% and 90.9% for all-α, all-β, α+β and...
Protein methylation was discovered half a century ago but has been studied far less than other modifications. Computational analysis has recently been introduced to discover unknown methylation sites based on the few known ones. To predict possible methylation effectively, a sophisticated classification strategy should be devised. In this paper, we first extracted informative features...
This paper presents a novel fuzzy rule-based gene ranking algorithm for extracting salient genes from a large set of microarray data, which helps to reduce the computational effort of the model building process. The proposed algorithm is an unsupervised approach and does not require class information for gene ranking. Microarray data has been used to form a robust fuzzy rule base which helps...
Flow cytometry (FCM) is widely used in health research and is a technique to measure cell properties such as phenotype, cytokine expression, etc., for up to millions of cells from a sample. FCM data analysis is a highly tedious, subjective and manually time-consuming (to the level of impracticality for some data) process that is based on intuition rather than standardized statistical inference. This...
Neural activity is a very important source for data mining and can be used as a control signal for brain-computer interfaces (BCIs). In particular, the magnetic signals of neurons are rich in information about the movement of different parts of the body, such as wrist movement. In this paper, we use MEG (magnetoencephalography) signals of two subjects recorded during a wrist movement task in four directions...
O-glycosylation of mammalian proteins is studied. It is serine or threonine specific, though no consensus sequence is known yet. We have applied support vector machines (SVM) to the prediction of O-glycosylation sites from various kinds of protein information, aiming to investigate the conditions for glycosylation and elucidate its mechanisms. In the present study, we focus on the distribution...