Search results

Items from 1 to 20 out of 63 results

chapter

Using Randomised Vectors in Transcription Factor Binding Site Predictions

F Rezwan, Yi Sun, N Davey, R Adams, more

2010 Ninth International Conference on Machine Learning and Applications > 523 - 527

2010 Ninth International Conference on Machine Learning and Applications (ICMLA 2010)

Finding the location of binding sites in DNA is a difficult problem. Although the locations of some binding sites have been experimentally identified, other parts of the genome may or may not contain binding sites. This poses problems with negative data in a trainable classifier. Here we show that using randomized negative data gives a large boost in classifier performance when compared to the original...

chapter

A New Approach to Classification with the Least Number of Features

S Klement, T Martinetz

2010 Ninth International Conference on Machine Learning and Applications > 141 - 146

2010 Ninth International Conference on Machine Learning and Applications (ICMLA 2010)

Recently, the so-called Support Feature Machine (SFM) was proposed as a novel approach to feature selection for classification, based on minimisation of the zero norm of a separating hyperplane. We propose an extension for linearly non-separable datasets that allows a direct trade-off between the number of misclassified data points and the number of dimensions. Results on toy examples as well as...

chapter

Feature Selection Based on Genetic Algorithm for Classification of Pre-miRNAs

Ke Han

2010 2nd International Conference on Information Engineering and Computer Science > 1 - 4

2010 2nd International Conference on Information Engineering and Computer Science (ICIECS)

Precursor miRNAs (pre-miRNAs) are usually extracted to obtain quite a lot of global and intrinsic folding features that include some redundant and useless features. Therefore, it is essential to select the most representative feature subset, which contributes to improving the classification efficiency. We propose a novel feature selection method based on genetic algorithm. The information gain of feature...

chapter

Several New Tools for Cancer Classification Combined with PLSDR Base on High-Dimensional Gene Expression Profile

JianGeng Li, Hui Li

2010 2nd International Conference on Information Engineering and Computer Science > 1 - 4

2010 2nd International Conference on Information Engineering and Computer Science (ICIECS)

It is known that Logistic Regression coupled with Partial Least Squares dimension reduction (PLSDR-LD) is capable of extracting a great deal of useful information for classification from gene expression profile and getting a rather high classification accuracy rate. In this study, we replace the logistic function of Logistic Regression with several functions which are similar to logistic function...

chapter

Sparse representation based feature selection for mass spectrometry data

Jiqing Ke, Lei Zhu, Bin Han, Qi Dai, more

2010 IEEE International Conference on Bioinformatics and Biomedicine Workshops (BIBMW) > 57 - 62

2010 IEEE International Conference on Bioinformatics and Biomedicine Workshops (BIBMW 2010)

Mass spectrometry (MS) data has been widely analyzed for the detection of early stage cancers. Its potential for seeking proteomic biomarkers has received a great deal of attention in recent years. In the sparse representation classification (SRC) framework, a testing sample is represented as a sparse linear combination of training samples. The coefficient vector of representation is obtained by a...

chapter

DNA microarray classification by means of weighted voting based on rough set classifier

Przemyslaw Górecki, Piotr Artiemjew

2010 International Conference of Soft Computing and Pattern Recognition > 269 - 272

2010 International Conference of Soft Computing and Pattern Recognition (SoCPaR 2010)

In this paper we present a new approach for classification of microarray data. Our methodology consists of two steps: an attribute selection, which aims at selection of the most informative genes, and a classification of expression profiles, which is carried out by weighted voting, a novel instance-based classifier based on Rough Set Theory. Attribute selection consists of two stages - initial selection,...

chapter

Abstraction Augmented Markov Models

C Caragea, A Silvescu, D Caragea, V Honavar

2010 IEEE International Conference on Data Mining > 68 - 77

2010 10th IEEE International Conference on Data Mining (ICDM 2010)

High accuracy sequence classification often requires the use of higher order Markov models (MMs). However, the number of MM parameters increases exponentially with the range of direct dependencies between sequence elements, thereby increasing the risk of overfitting when the data set is limited in size. We present abstraction augmented Markov models (AAMMs) that effectively reduce the number of numeric...

chapter

Truncation of protein sequences for fast profile alignment with application to subcellular localization

Man-Wai Mak, Wei Wang, Sun-Yuan Kung

2010 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) > 115 - 120

2010 IEEE International Conference on Bioinformatics and Biomedicine (BIBM 2010)

We have recently found that the computation time of homology-based subcellular localization can be substantially reduced by aligning profiles up to the cleavage site positions of signal peptides, mitochondrial targeting peptides, and chloroplast transit peptides [1]. While the method can reduce the profile alignment time by as much as 20-fold, it cannot reduce the computation time spent on creating...

chapter

Projecting partial least square and principle component regression across microarray studies

Chi-Cheng Haung, Shin-Hsin Tu, Heng-Hui Lien, Ching-Shui Huang, more

2010 IEEE International Conference on Bioinformatics and Biomedicine Workshops (BIBMW) > 506 - 511

2010 IEEE International Conference on Bioinformatics and Biomedicine Workshops (BIBMW 2010)

The study compared principal component (PC) versus partial least squares (PLS) regression, the former unsupervised and the latter supervised gene component analysis, for highly complicated and correlated microarray gene expression profiles. Projection of derived classifiers into independent samples for clinical phenotype prediction was evaluated as well. Previous studies had suggested that PLS...

chapter

Effects of partial reporting of classification results

M R Yousefi, Jianping Hua, Chao Sima, E R Dougherty

2010 IEEE International Workshop on Genomic Signal Processing and Statistics (GENSIPS) > 1 - 4

2010 IEEE International Workshop on Genomic Signal Processing and Statistics (GENSIPS)

When proposing a new classification scheme, perhaps in the form of a classification rule or feature selection method, modelers in the bioinformatics literature typically report its performance on data sets of interest, such as gene-expression microarrays. These data sets often include thousands of features but a small number of sample points, which increases variability in feature selection and error...

chapter

RMS bounds and sample size considerations for error estimation in linear discriminant analysis

A Zollanvari, U M Braga-Neto, E R Dougherty

2010 IEEE International Workshop on Genomic Signal Processing and Statistics (GENSIPS) > 1 - 4

2010 IEEE International Workshop on Genomic Signal Processing and Statistics (GENSIPS)

The validity of a classifier depends on the precision of the error estimator used to estimate its true error. This paper considers the necessary sample size to achieve a given validity measure, namely RMS, for resubstitution and leave-one-out error estimators in the context of LDA. It provides bounds for the RMS between the true error and both the resubstitution and leave-one-out error estimators...

chapter

Evaluation of Genetic Algorithms for tuning SVM parameters in multi-class problems

F Samadzadegan, A Soleymani, R A Abbaspour

2010 11th International Symposium on Computational Intelligence and Informatics (CINTI) > 323 - 328

2010 11th International Symposium on Computational Intelligence and Informatics (CINTI 2010)

Support Vector Machine (SVM) is a useful technique for data classification with successful applications in different fields of bioinformatics, image segmentation, data mining, etc. A key problem of these methods is how to choose an optimal kernel and how to optimize its parameters in the learning process of SVM. The objective of this study is to propose a Genetic Algorithm approach for parameter optimization...

chapter

A novel ensemble approach to prediction of protein subcellular location

Chen Yue-hui, Liu Li-yuan, Ma Bing-xian

2010 International Conference on Computer Application and System Modeling (ICCASM 2010) > 9 > V9-544 - V9-547

2010 International Conference on Computer Application and System Modeling (ICCASM 2010)

Much attention has been paid to the technical research and practical application of protein subcellular location prediction, since many previous works have shown the close relationship between a protein's function and its location, and since the human genome project was successfully completed in recent decades. With the rapid progress in computing speed, computational intelligence...

chapter

Hierarchical Multilabel Classification Using Top-Down Label Combination and Artificial Neural Networks

R Cerri, A C P L F de Carvalho

2010 Eleventh Brazilian Symposium on Neural Networks > 253 - 258

2010 Eleventh Brazilian Symposium on Neural Networks (SBRN 2010)

Hierarchical Multilabel Classification is a classification problem where the classes of the examples are hierarchically structured and, additionally, each example can simultaneously belong to two or more classes in the same hierarchical level. This paper proposes a new Top-Down classification method based on a label combination process, using Artificial Neural Networks as base classifiers. The experimental...

chapter

A Markov Chain Monte Carlo Sampling Relevance Vector Machine Model for Recognizing Transcription Start Sites

Huang Juncai, Wang Fengbi, Mao Huanzhang, Zhou Mingtian

2010 International Conference on Artificial Intelligence and Computational Intelligence > 3 > 185 - 188

2010 International Conference on Artificial Intelligence and Computational Intelligence (AICI 2010)

The task of finding transcription start sites (TSSs) can be modeled as a classification problem. Relevance vector machines (RVMs) are a family of machine learning methods that represent a Bayesian approach to the training of general linear models (GLM). Based on the Markov chain Monte Carlo (MCMC) sampler, we propose a model for using the RVM to explore very large numbers of candidate features. The model...

chapter

Biomarker Identification Based on the L1 + L1 Penalized Model

Meng-Yun Wu, Dao-Qing Dai, Yu Shi, Hong Yan

2010 Chinese Conference on Pattern Recognition (CCPR) > 1 - 5

2010 Chinese Conference on Pattern Recognition (CCPR 2010)

Penalized feature selection and classification techniques are promising in bioinformatics studies of high-dimensional microarray data. The penalized objective function of penalization methods includes two parts: classification objective function and penalty terms. We propose a novel L₁ + L₁ model. The classification objective function is chosen as the negative log-likelihood function based on the...

chapter

Prediction of O-Glycosylation Sites in Protein Sequence by Kernel Principal Component Analysis

Xue-mei Yang, Xue-wei Cui, Xue-zhu Yang

2010 International Conference on Computational Aspects of Social Networks > 267 - 270

2010 International Conference on Computational Aspects of Social Networks (CASoN 2010)

O-glycosylation is one of the main types of mammalian protein glycosylation; it occurs at particular serine and threonine sites. It is important to predict the O-glycosylation site. In this paper, we propose a new method of kernel principal component analysis (KPCA) to predict the O-glycosylation site with window size w=9. The samples for experiment are encoded by the sparse coding and projected...

chapter

Protein classification using family profiles

YuGang Li, Yao Lu, Fa Zhang, ZhenGe Qiu, more

2010 Seventh International Conference on Fuzzy Systems and Knowledge Discovery > 5 > 2212 - 2216

2010 Seventh International Conference on Fuzzy Systems and Knowledge Discovery (FSKD)

Protein classification plays an important role in the research in Bioinformatics. Many discriminative methods, including the SVM based algorithms are used to do this job. In order to use these methods, variable length protein sequences must be converted into fixed-length dimensional vectors. The current work presents a new method of converting sequences into vectors. The method first constructs profile...

chapter

A new fuzzy membership computation method for fuzzy support vector machines

Trung Le, Dat Tran, Wanli Ma, D Sharma

International Conference on Communications and Electronics 2010 > 153 - 157

2010 Third International Conference on Communications and Electronics (ICCE 2010)

Support vector machine (SVM) considers all data points with the same importance in classification problems, therefore SVM is very sensitive to noisy data or outliers. Current fuzzy approach to two-class SVM introduces a fuzzy membership to each data point in order to reduce the sensitivity of less important data, however computing fuzzy memberships is still a challenge. It has been found that the...

chapter

Utilization of Dynamic Reducts to Improve Performance of the Rule-Based Similarity Model for Highly-Dimensional Data

Andrzej Janusz

2010 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology > 3 > 432 - 435

2010 IEEE/ACM International Conference on Web Intelligence-Intelligent Agent Technology (WI-IAT)

This paper presents an extension to the Rule-Based Similarity (RBS) model, a novel rough set approach to the problem of learning a similarity relation from data. The original model, proposed in [1], applied the notion of Tversky's feature contrast model in a rough set framework to facilitate an accurate case-based classification. In the dynamic RBS model, a dynamic reducts technique is used to broaden...

Keywords:
TRAINING
PATTERN CLASSIFICATION

INFONA - science communication portal
