Selecting relevant features in data modeling is critical to ensure effective and accurate prediction of future effects. The problem becomes compounded when the relevance of previously selected features cannot be guaranteed due to changes in the underlying dataset. We propose an algorithm based on the statistical plaid model for the discovery and tracking of feature relevance scores in datasets that...
Alternative polyadenylation (APA) can lead to various mRNA isoforms differing in their 3' UTR, which contributes to the dynamics of gene regulation, including the stability, localization and translation of mRNA. However, clustering of genes using poly(A) site data has not been extensively studied. Here we constructed a two-layer model based on canonical correlation analysis (CCA) to explore the clustering...
Microarray gene expression data are voluminous, and very few genes in a dataset are informative for disease analysis. Selecting those genes from the whole dataset is a challenging task. Researchers have used many optimization techniques for gene subset selection, but none of them provides a globally optimal solution for all gene datasets. In this paper, we propose a strength Pareto...
Cluster analysis is a main task of exploratory data mining and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics. This paper proposes an efficient algorithm for evaluating the results of cluster analysis when classifying patients with tumors. The Prediction Strength, combined with...
Clustering gene expression data is an important task in bioinformatics research and biomedical applications. In this paper, we present an effective clustering algorithm for gene expression data. The clustering algorithm is based on the analysis of data's density distribution. We propose an intersecting partition of gene expression data into the supports of data points. Density clusters are maximally...
Cluster analysis of gene expression data is one of the most useful tools for identifying biologically relevant groups of genes. However, gene expression data suffer severely from measurement noise, the curse of dimensionality, and high redundancy between genes, and the functional annotation of genes is incomplete and imprecise. These properties lead to most of the traditional clustering algorithms...
Microarray studies are used in molecular biology to explore patterns of expression of thousands of genes. This methodology has developed considerably in recent decades, and so has the need for appropriate methods for analyzing the high-throughput data generated by such experiments. Identifying sets of genes and samples characterized by similar values of expression and validating these results are two...
Cluster validation techniques are essential tools within cluster analysis, helpful to the interpretation of clustering results. In this study, the validation ability of Dunn's index in gene clustering was investigated with public gene expression datasets clustered by hierarchical clustering, K-means and Self-organizing maps. It was made clear that Dunn's index would give misleading validity results...
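For orientation, Dunn's index is commonly defined as the smallest distance between points in different clusters divided by the largest distance between points in the same cluster, so higher values indicate compact, well-separated clusters. The following is a minimal sketch of that common definition, assuming Euclidean distance and point-to-point cluster distances. The function name `dunn_index` and the toy data are illustrative only and are not taken from the study.

```python
import numpy as np

def dunn_index(points, labels):
    """Smallest between-cluster distance divided by largest within-cluster distance.

    Assumes at least two clusters and at least one cluster with two or more
    distinct points (otherwise the ratio is undefined).
    """
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    # Pairwise Euclidean distances between all points.
    diffs = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=-1))
    # Boolean mask: True where both points carry the same cluster label.
    same = labels[:, None] == labels[None, :]
    max_diameter = dist[same].max()      # largest within-cluster distance
    min_separation = dist[~same].min()   # smallest between-cluster distance
    return min_separation / max_diameter

# Toy data (hypothetical): two tight pairs of points, far apart.
points = [[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]]
good = dunn_index(points, [0, 0, 1, 1])  # clusters match the two pairs
bad = dunn_index(points, [0, 1, 0, 1])   # clusters cut across the pairs
print(good, bad)
```

Because the index depends on a single minimum and a single maximum, one noisy point can change it substantially, which is one reason it may be sensitive on noisy expression data.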
This paper presents a comprehensive analysis of a novel temporal dataset of Shewanella oneidensis. Here we propose to cluster the temporal gene expression data of Shewanella oneidensis to define its molecular response at different time intervals following exposure to acidic and basic pH, and to find relations among these temporal data under different environmental conditions. A mapping between those clusters...
Cluster analysis is an important tool for discovering the structures and patterns hidden in gene expression data. In this paper, a new algorithm for clustering gene expression profiles is proposed. In this method, we find natural clusters in the data based on a competitive learning strategy. Using partially known modes as constraints in our method, we reduce the sensitivity of the clustering procedure...
Normalization before clustering is often needed for proximity indices, such as Euclidean distance, which are sensitive to differences in the magnitude or scales of the attributes. The goal is to equalize the size or magnitude and the variability of these features. This can also be seen as a way to adjust the relative weighting of the attributes. In this context, we present a first large-scale data...
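As an illustration of the point about scale sensitivity, the sketch below applies z-score standardization (one common normalization, not necessarily one evaluated in the paper) and compares Euclidean distances before and after. The helper `zscore` and the sample values are hypothetical.

```python
import numpy as np

def zscore(data):
    """Standardize each column to zero mean and unit standard deviation."""
    data = np.asarray(data, dtype=float)
    mean = data.mean(axis=0)
    std = data.std(axis=0)
    std[std == 0] = 1.0  # constant columns become all zeros instead of dividing by zero
    return (data - mean) / std

# Hypothetical attributes on very different scales.
raw = np.array([[1.0, 1000.0],
                [2.0, 3000.0],
                [3.0, 2000.0]])
scaled = zscore(raw)

# Euclidean distance between the first two rows, before and after scaling.
raw_dist = np.linalg.norm(raw[0] - raw[1])
scaled_dist = np.linalg.norm(scaled[0] - scaled[1])
print(raw_dist, scaled_dist)
```

On the raw data the distance is dominated by the second attribute; after standardization both attributes contribute on a comparable scale.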