Biomedical text mining is the process of extracting high-quality information from biomedical text. It has many applications in genetics-related studies. Information about gene-disease associations is very important in drug design. Laboratory-based methods for extracting gene-disease associations demand considerable effort and time. Literature mining is a good method for generating a candidate set of genes...
With the advent of modern scientific methods for data collection, huge volumes of biological data are now accumulating in various data banks. The enormity of such data and the complexity of biological networks greatly increase the challenges of understanding and interpreting the underlying data. Effective and efficient data mining techniques are essential to unearth useful information from...
The Self-Organizing Map (SOM) is an important algorithmic method for visualizing high-dimensional data spaces. Accurate analysis of the input data requires a well-trained SOM. Many measures are used in practice to assess the quality of the map. One of the most commonly used measures is the quantization error. A trained SOM grid with minimum quantization error may not be topologically well preserved...
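The quantization error mentioned above is commonly defined as the mean distance between each input vector and its best-matching unit (the closest SOM weight vector). The following is a minimal sketch under that standard definition, not the paper's own implementation; the function name `quantization_error` and the flat `codebook` list of weight vectors are assumptions made for illustration.

```python
import math


def quantization_error(data, codebook):
    """Mean Euclidean distance from each sample to its best-matching unit.

    data     -- iterable of input vectors (sequences of floats)
    codebook -- iterable of SOM weight vectors, flattened from the grid
                (hypothetical layout chosen for this sketch)
    """
    data = list(data)
    codebook = list(codebook)
    if not data or not codebook:
        raise ValueError("data and codebook must both be non-empty")
    total = 0.0
    for x in data:
        # The best-matching unit is the weight vector closest to x.
        total += min(math.dist(x, w) for w in codebook)
    return total / len(data)
```

Because this measure only looks at distances to the nearest unit and ignores where that unit sits on the grid, a low value says nothing about whether neighbouring inputs map to neighbouring units, which is the limitation the abstract points to.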
The findings of the Human Genome Project revealed Single Nucleotide Polymorphisms (SNPs) as the most common form of genetic variation in humans. They also demonstrated the active role of SNPs in the genesis of many system disorders. Thus an SNP may be related to one or more diseases. SNP-gene interactions across multiple diseases are a novel area of research which can result in understanding their commonality or...
Understanding the role of genetics is very important for the in-depth study of a disease. Even though a great deal of information about gene-disease associations is available, it is difficult even for an expert user to extract it manually from the huge volume of literature. Therefore, this work introduces a novel extraction tool that can identify disease-associated genes from the literature using text-mining...
The bioinformatics field, which now deals with vast amounts of data such as protein patterns and gene expression data, with much more information still to be unraveled, uses basic data mining techniques and tools to retrieve useful information from huge biological databases. Clustering is a popular data mining technique that is used extensively. The K-means clustering...
Clustering is one of the most widely used unsupervised methods for interpreting and analyzing huge amounts of data in the field of bioinformatics. One of the major issues in clustering is handling growing data so that cluster quality does not decrease as the size of the data increases. In this work, we compare promising clustering algorithms on various cancer domains and suggest improvements...
The identification of new therapeutic uses of existing drugs, or drug repositioning, offers the possibility of faster drug development, reduced risk, lower cost and shorter paths to approval. Several computational methods have been proposed in the literature that make use of publicly available transcriptional data to reposition drugs against diseases. In this work, we carry out a data mining process...
This paper deals with the problem of extracting acronym-definition pairs from biomedical text. We propose an improved text mining system based on a pattern matching method and space reduction heuristics, which increases both recall and precision. Three metrics were used for evaluating the system: recall (a measure of how much relevant data the system has extracted from the text), precision (a measure of how...
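As a rough illustration of the two metrics named above, the sketch below computes precision and recall for a set of extracted acronym-definition pairs against a reference set, using the standard definitions. The function name `precision_recall` and the example pairs are hypothetical and are not taken from the paper; the third metric is truncated in the abstract and is therefore not shown.

```python
def precision_recall(extracted, gold):
    """Return (precision, recall) for extracted pairs against a gold standard.

    extracted -- pairs the system produced, e.g. ("RNA", "ribonucleic acid")
    gold      -- pairs a human annotator marked as correct
    """
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    # Precision: share of extracted pairs that are correct.
    precision = true_positives / len(extracted) if extracted else 0.0
    # Recall: share of correct pairs that the system found.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical example data, for illustration only.
gold_pairs = {
    ("HCV", "Hepatitis C Virus"),
    ("HCC", "Hepatocellular Carcinoma"),
    ("SNP", "Single Nucleotide Polymorphism"),
    ("SOM", "Self-Organizing Map"),
}
system_pairs = {
    ("HCV", "Hepatitis C Virus"),
    ("SOM", "Self-Organizing Map"),
    ("SOM", "Some Other Meaning"),
}
precision, recall = precision_recall(system_pairs, gold_pairs)
```

In this made-up example, two of the three extracted pairs are correct and two of the four reference pairs are found.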
Identification of structural and sequence motifs in genomic sequences is gaining much attention nowadays. Ribonucleic acid (RNA) is an important biomolecule whose secondary structure defines its functionality. Soft computing techniques such as genetic programming have been used for motif identification. In this paper, we propose a method for identifying common structural motifs in a set of...
Hepatitis C Virus (HCV) has become a major risk factor for the development of Hepatocellular Carcinoma (HCC). A framework comprising clustering, feature selection and classification has been developed to identify genomic markers in HCV sequences that are associated with HCC. A new feature extraction method for genomic sequences, based on hash tables, has been proposed. It requires less memory compared...
Clustering is a data mining technique that groups a set of observations into several clusters based on some similarity measure. The most commonly used partitioning-based clustering algorithm is K-means. However, the K-means algorithm has several drawbacks. The algorithm generates a locally optimal solution that depends on the randomly chosen initial centroids. A recently developed metaheuristic optimization...
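The dependence on initial centroids can be seen in a small sketch of the standard K-means (Lloyd) iteration. This illustrates only the drawback described above, not the optimization method the abstract goes on to introduce; the function names `sq_dist` and `kmeans` and the four sample points are assumptions chosen for the example.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, centroids, max_iter=100):
    """Plain Lloyd iteration; returns (final centroids, sum of squared errors)."""
    centroids = [tuple(float(v) for v in c) for c in centroids]
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members;
        # an empty cluster keeps its previous centroid.
        new_centroids = [
            tuple(sum(col) / len(members) for col in zip(*members))
            if members else old
            for members, old in zip(clusters, centroids)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    sse = sum(min(sq_dist(p, c) for c in centroids) for p in points)
    return centroids, sse


# Two tight vertical pairs, far apart horizontally (illustrative data).
points = [(0, 0), (0, 1), (4, 0), (4, 1)]
_, sse_good = kmeans(points, [(0, 0), (4, 0)])  # one seed in each pair
_, sse_poor = kmeans(points, [(0, 0), (0, 1)])  # both seeds in the same pair
```

With one seed in each pair the iteration settles on the left/right split, whereas with both seeds in the left pair it converges to a top/bottom split with a much larger error and never leaves it. This is the kind of local optimum that motivates better initialization or a global search strategy.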