In this paper, we report a classifier ensemble technique using the search capability of a genetic algorithm (GA) for Named Entity Recognition (NER) in the biomedical domain. We use the Maximum Entropy (ME) framework to build a number of classifiers depending upon the various representations of a set of features. The proposed technique is evaluated with the JNLPBA 2004 data sets that yield the overall recall,...
This paper concentrates on studying the use of interval type-2 fuzzy sets for the pattern classification problem. Even though researchers recognize that type-2 fuzzy sets are more difficult to understand and use than type-1 fuzzy sets, the interest in the study is motivated by the additional power to represent uncertainty in different levels. The work developed here relies on the recent advances concerning...
This paper aims to address the problem of finding accurate and relevant rules for the task of classification. The goal is to improve the accuracy, or at least to provide a comparable accuracy measure, for classification algorithms implemented so far. Because the task of classification must be as accurate as possible, the paper proposes a method based on genetic algorithms to enhance the speed and...
The relevance vector machine (RVM) is a state-of-the-art technique for regression and classification, a sparse Bayesian extension of the support vector machine. The selection of a kernel and its associated parameters is a critical step in applying the RVM. Real-world applications and recent research have emphasized the need for multiple kernel learning, in order to boost the fitting accuracy...
The learning of Fuzzy Rule-Based Classification Systems for High-Dimensional problems suffers from exponential growth of the fuzzy rule search space when the number of patterns and/or variables becomes high. In this work, we propose a fuzzy association rule-based classification method with genetic rule selection for high-dimensional problems to obtain an accurate and compact fuzzy rule-based classifier...
This paper proposes the use of a local feature selection scheme for the effective selection of relevant features when designing Genetic Fuzzy Rule-Based Classification Systems (GFRBCSs). The method relies on providing the genetic search with deterministic information about the quality of each feature with respect to its classification ability, directing the evolution in selecting the most useful...
The relevance vector machine (RVM) is a state-of-the-art technique for regression and classification, a sparse Bayesian extension of the support vector machine. Kernel function and parameter selection is a key problem in RVM research. Real-world applications and recent research have emphasized the need for multiple kernel learning. This paper proposes a novel regression...
A novel approach based on applying a modern meta-heuristic, Gene Expression Programming (GEP), to detecting Web application attacks is presented in the paper. This class of attacks relates to malicious activity of an intruder against applications that use a database for storing data. The application uses SQL to retrieve data from the database and Web server mechanisms to put them in a Web browser...
In this contribution we analyse the significance of the granularity level (number of labels) in Fuzzy Rule-Based Classification Systems in the scenario of data-sets with a high imbalance degree. We refer to imbalanced data-sets when the class distribution is not uniform, a situation that is present in many real application areas. The aim of this work is to adapt the number of fuzzy labels for each...
Gene expression data usually contains a large number of genes (several thousand or more) but a small number of samples (usually <100). Among all the genes, many are irrelevant, insignificant or redundant to the discriminant problem under investigation. Hence the identification of informative genes, which have the greatest power for classification, is of fundamental and practical importance to the...
This paper presents an approach for designing classifiers for a multiclass problem using Genetic Programming (GP). The proposed approach takes an integrated view of all classes when GP evolves. An individual of the population will be represented using multiple trees. The GP is trained with a set of N training samples in steps. A concept of unfitness of a tree is used in order to improve genetic evolution...
In this paper, the classification rule-mining problem is considered as a multi-objective problem rather than a uni-objective one. Metrics like predictive accuracy and comprehensibility, used for evaluating a rule, can be thought of as different criteria of this problem. Predictive accuracy measures the accuracy of the rules extracted from the dataset, whereas comprehensibility is measured by the number...
This paper applies a genetic algorithm-based method to a fuzzy rule-based system for fuzzy classification with a minimum number of fuzzy rules, while enhancing or maintaining the performance of the fuzzy classification system. That is, the optimization includes both the minimization of the number of extracted fuzzy rules and the maximization of the performance of the...
For the microarray data classification problem, selecting relevant genes from microarray data poses a formidable challenge to researchers due to the high dimensionality of the features, the multiple class categories involved, and the usually small sample size. In order to correctly analyze microarray data, the goal of feature (gene) selection is to select those subsets of differentially expressed genes that...
In this work our aim is to increase the performance of fuzzy rule-based classification systems in the framework of imbalanced data-sets by means of the application of a genetic tuning step. We focus on the imbalanced data-set problem since it appears in many real application areas and, for this reason, it has become a relevant topic in the area of machine learning. This problem occurs when the number...
Automatic categorization of documents into pre-defined taxonomies is a crucial step in data mining and knowledge discovery. Standard machine learning techniques like support vector machines (SVM) and related large margin methods have been successfully applied to this task. Unfortunately, the high dimensionality of input feature vectors reduces the classification speed. The kernel parameters setting...
This survey presents the state of the art in genetic algorithm (GA) based clustering techniques. Clustering is a fundamental and widely applied method for understanding and exploring a data set. Interest in clustering has increased recently due to the emergence of several new areas of application, including data mining, bioinformatics, web use data analysis, image analysis, etc. To enhance the performance...