Recently, pattern matching with flexible gap constraints has attracted extensive attention, especially in biological sequence analysis and in mining patterns from sequences. One issue is searching for a Maximal Pattern Matching with Gaps and the One-Off Condition (MPMGOOC). We first introduce the concept of MPMGOOC. To solve the problem, we propose some special concepts of Nettree which is different...
In this paper, we present a fast biological data mining algorithm named IRTM, based on embedded frequent subtrees. We also propose a string encoding method for representing the trees, a scope-list for extending all substrings, and some pruning rules that further reduce the computational time and space cost. Experimental results show that the IRTM algorithm can achieve significant performance improvement...
Here we apply the graph-theoretic concept of betweenness centrality to a class of protein repeats, e.g., Armadillo (ARM) and HEAT. The betweenness of a node measures how often that node lies on the shortest paths between all pairs of nodes i, j in the network, and thus gives the contribution of each node to the network. These repeats are not easily detectable at the sequence level because of...
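The betweenness definition in the abstract above can be illustrated with a small sketch. The following is a minimal Python version of Brandes' algorithm for an unweighted, undirected graph. It is not the authors' code: the function name `betweenness_centrality`, the adjacency-dictionary input format, and the toy chain graph are all assumptions made for illustration.

```python
from collections import deque


def betweenness_centrality(adjacency):
    """Unnormalized betweenness for an undirected, unweighted graph.

    adjacency: dict mapping each node to an iterable of its neighbours
    (hypothetical input format chosen for this sketch).
    """
    centrality = {node: 0.0 for node in adjacency}
    for source in adjacency:
        # Breadth-first search from `source`, counting shortest paths.
        order = []
        predecessors = {node: [] for node in adjacency}
        path_count = {node: 0 for node in adjacency}
        distance = {node: -1 for node in adjacency}
        path_count[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if distance[w] < 0:  # first time w is reached
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:  # v precedes w on a shortest path
                    path_count[w] += path_count[v]
                    predecessors[w].append(v)
        # Accumulate pair dependencies, starting from the farthest nodes.
        dependency = {node: 0.0 for node in adjacency}
        for w in reversed(order):
            for v in predecessors[w]:
                dependency[v] += path_count[v] / path_count[w] * (1.0 + dependency[w])
            if w != source:
                centrality[w] += dependency[w]
    # In an undirected graph every pair (i, j) is visited from both ends.
    return {node: value / 2.0 for node, value in centrality.items()}


# Toy chain A - B - C - D (illustrative only, not data from the paper).
chain = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
scores = betweenness_centrality(chain)
```

In the chain, the two interior nodes lie on the shortest paths between other pairs and receive nonzero scores, while the end nodes score zero. When several shortest paths connect a pair, each intermediate node receives the corresponding fraction of that pair's contribution.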
In this article, we present a novel algorithm for measuring protein similarity based on three-dimensional structure (protein tertiary structure). The PROSIMA algorithm uses suffix trees to discover common parts of the main chains of all proteins appearing in the current NCSB protein data bank (PDB). By identifying these common parts we build a vector model and then use classical information retrieval...
Though DNA microarray technology simultaneously measures the expression levels of thousands of genes, only a few underlying gene features may account for significant data variation in gene classification problems. Selecting features from a huge data set is difficult, so dimension reduction of the gene expression data set is essential in order to determine important features, which play a key role...
RNAi is a naturally occurring, highly conserved phenomenon of RNA-mediated gene silencing among multicellular organisms. RNAi has been successfully applied in functional genomics, therapeutics and new drug target identification in mammals and other eukaryotes. Its uniqueness lies in sequence-specific gene knockdown, which has made RNAi an indispensable technology. In the mechanism of RNAi,...
This paper presents an incremental clustering algorithm based on DGC, a density-based algorithm we developed earlier. We experimented with real-life datasets, and both methods performed satisfactorily. The methods have been compared with some well-known clustering algorithms, and they perform well in terms of the z-score cluster validity measure.
Peptide computing is a novel way of computing that uses the interaction between peptides and antibodies as a computational model. Since several copies of peptides and antibodies can interact at the same time, this computing model is massively parallel and highly non-deterministic in nature. Due to these advantages this computational model helps us to solve some of the very hard combinatorial problems...
Microarray technology has emerged as a robust methodology for quantitatively analyzing the expression of thousands of genes simultaneously. Experimental design, image processing and data analysis are the three major stages of microarray-based analysis. The main goal of array image processing is to measure the intensity of the spots and quantify the gene expression values based on these...
The investigation of potential microarray markers, which in turn will speed up molecular analysis and provide reliable results for the benefit of patient care, is of significant importance. Feature selection techniques, which aim at minimizing the dimensionality of the microarray data by keeping the most significant genes according to their expression values, are a necessary component towards this...
The proper application of statistics, machine learning, and data-mining techniques in routine clinical diagnostics to classify diseases using their genetic expression profile is still a challenge. One critical issue is the overall inability of most state-of-the-art classifiers to identify out-of-class samples, i.e., samples that do not belong to any of the available classes. This paper shows a possible...
Cluster analysis offers a suite of powerful unsupervised methods, commonly used as exploratory data analysis tools. Such tools can prove especially useful when we need to analyze large data sets and want to gain intuitive insight into subtle correlations between instances of the data. In this work, we demonstrate that simple hierarchical clustering approaches (based on compositional...
It has been shown previously that glucocorticoids exert a dual mechanism of action, meaning cytotoxic and mitogenic as well as mitogenic and anti-apoptotic, in a dose-dependent manner on CCRF-CEM cells at 72 h. The early gene expression response also suggested a dose-dependent dual mechanism of action of prednisolone, which is apparently reflected in the cell state after 72 h of treatment. The present work...
DNA microarrays have dramatically reshaped modern biological research by deriving genome-wide expression profiles of living organisms and producing an unprecedented wealth of quantitative data. Given this characteristic, microarray experiments are considered high-throughput both in terms of data (data intensive) and processing (computationally intensive). GRISSOM enables exploitation of GRID resources...
Ribonucleic acid (RNA) has important structural and functional roles in the cell and takes part in many stages of protein synthesis. The structure of RNA largely determines its function. Current physical methods for structure determination are time-consuming and expensive, so methods for the computational prediction of structure are necessary. Various algorithms that have been used for RNA...
This work addresses the segmentation of two-dimensional polyacrylamide gel electrophoresis images containing overlapping protein spots. A novel segmentation approach is proposed, which is capable of detecting spot boundaries within the region of overlap. The proposed approach is based on the observation that the spot boundaries in the overlap region are associated with local intensity minima. The...
One of the crucial tasks in many inference problems is the extraction of an underlying sparse graphical model from a given number of high-dimensional measurements. In machine learning, this is frequently achieved using, as a penalty term, the Lp norm of the model parameters, with p ≤ 1 for efficient dilution. Here we propose a statistical-mechanics analysis of the problem in the setting of perceptron...
Most biclustering algorithms for the analysis of high-dimensional gene expression data use some distance measure or correlation coefficient between a pair of genes as the similarity measure. These measures capture only linear relationships between the genes, but nonlinear relationships may exist among them. Mutual information is a more general measure to investigate relationships (positive,...
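The contrast drawn in the abstract above, between correlation-based similarity and mutual information, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' biclustering method: the helper names (`pearson`, `equal_width_bins`, `mutual_information`), the equal-width binning with four bins, and the synthetic expression values are all hypothetical.

```python
import math
from collections import Counter


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def equal_width_bins(values, n_bins):
    """Map each value to the index of an equal-width bin (assumed discretization)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # guard against constant input
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def mutual_information(x, y, n_bins=4):
    """Plug-in estimate of mutual information (in bits) from binned values."""
    bx, by = equal_width_bins(x, n_bins), equal_width_bins(y, n_bins)
    n = len(x)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    return sum(
        (c / n) * math.log2(c * n / (px[i] * py[j]))
        for (i, j), c in joint.items()
    )


# Synthetic profiles (not real expression data): gene_b depends on gene_a
# through a symmetric, nonlinear relationship.
gene_a = [-4, -3, -2, -1, 0, 1, 2, 3, 4]
gene_b = [a * a for a in gene_a]

correlation = pearson(gene_a, gene_b)
information = mutual_information(gene_a, gene_b)
```

For this quadratic relationship the Pearson coefficient is essentially zero, while the binned mutual information estimate is clearly positive. Plug-in estimates like this one depend on the number of bins and on sample size, so the binning here should be read as one possible choice.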
A well-structured and controlled design methodology, along with a supporting hierarchical design system, has been developed to optimally support the development effort on several programs requiring gate array and semi-custom Very Large Scale Integration (VLSI) design. In this paper, we present an application of VLSI in Systems Biology. This work examines signaling networks that control the survival...
Mining sequential patterns in biological data has attracted a great deal of attention in the last couple of years. Biologists are interested in finding the frequent orderly arrangement of motifs that may be responsible for similar expression of a group of genes. The size of the output space can be greatly reduced if only the maximal frequent patterns are reported. In this paper we present maximal...