E-learning has attracted great interest from corporations, educational institutions and individuals alike. As an educational model, e-learning systems have become increasingly popular. E-learning commonly refers to teaching delivered through computers in order to impart knowledge in a non-traditional classroom environment. As a prerequisite for an effective development of...
Classifier selection aims to reduce the size of an ensemble of classifiers in order to improve its efficiency and classification accuracy. Recently an information-theoretic view was presented for feature selection. This view derives a space of possible selection criteria and shows that several feature selection criteria in the literature are points within this continuous space. The contribution of this paper...
In this research we experimented with two feature selection methods (eta square and stepwise selection) on two classification models, a back propagation neural network (BPNN) and a general regression neural network (GRNN), to study their effects on the correctness of firm bankruptcy classification. The correctness includes the average classification correctness and the power of bankruptcy classification...
Development of a feature ranking method based upon the discriminative power of features and unbiased towards classifiers is of interest. We have studied a consensus feature ranking method, based on multiple classifiers, and have shown its superiority to well known statistical ranking methods. In a target environment such as a medical dataset, missing values and an unbalanced distribution of data must...
The classification of graph data has become an important and active research topic in the last decade, with a wide variety of real-world applications, e.g. drug activity prediction and kinase inhibitor discovery. Current research on graph classification focuses on single-label settings. However, in many applications, each graph can be assigned a set of multiple labels simultaneously...
A computational mutagenesis is detailed whereby each single residue substitution in a protein chain of primary sequence length N is represented as a sparse N-dimensional feature vector, whose M ≪ N nonzero components locally quantify environmental perturbations occurring at the mutated position and its neighbors in the protein structure. The methodology makes use of both the Delaunay tessellation...
Graph classification is important for different scientific applications; it can be exploited in various problems related to bioinformatics and cheminformatics. Given their graphs, there is increasing need for classifying small molecules to predict their properties such as activity, toxicity or mutagenicity. Using subtrees as feature set for graph classification in kernel methods has been shown to...
This paper proposes a new feature-selection strategy by integrating the Rough Set Theory (RST) and Particle Swarm Optimisation (PSO) algorithms to generate a set of discriminatory features for the classification problem. The proposed method is seen as a marriage between filter and wrapper approaches in which the RST is used to pre-reduce the feature set before optimisation by PSO, a meta-heuristic...
Support Vector Machine (SVM) ensembles have been widely used to improve classification accuracy in complicated pattern recognition tasks. In this work we propose to apply an ensemble of SVMs coupled with feature-subset selection methods to alleviate the curse of dimensionality associated with expression-based classification of DNA microarray data. We compare the single SVM classifier to SVM ensembles...
Text classification is an important research field within data mining. This article presents a feature selection method based on mutual information and information entropy pairs (MIIEP_FS), grounded in the theory of information entropy and the information entropy pair concept. The method measures the classification effect of a feature by mutual information and shows the extent of difference between the features...
Feature selection continues to grow in importance in many areas of science and engineering, as large datasets become increasingly common. In particular, bioscience and medical datasets routinely contain several thousands of features. For effective data mining in such datasets, tools are required that can reliably distinguish the most relevant features. The latter is a useful goal in itself (e.g. such...
As of 1997, when a special issue on relevance including several papers on feature selection was published, few domains explored used more than 40 features. The situation has changed considerably in the past few years and, in this special issue, most papers explore domains with hundreds to tens of thousands of variables or features: new techniques are proposed to address these challenging tasks involving...
Most previous research on sentiment analysis concentrates on the binary distinction of positive vs. negative. This paper presents the multi-class sentiment classification problem, which attempts to mine the implied rating information from reviews. We use four machine learning methods and two feature selection methods to find out whether or not the multi-class sentiment classification problem...
Hepatitis patients need continuous special medical treatment to reduce the mortality rate. Using clinical test data and machine learning technology such as Support Vector Machines (SVM), their life prognosis can be classified and predicted. However, we cannot guarantee that all the feature values in the data are correlated with each other. Therefore, we incorporate...
Feature selection is viewed as an important preprocessing step for pattern recognition, machine learning and data mining. It is used to find an optimal subset to reduce computational cost, increase classification accuracy and improve the comprehensibility of results. In this paper, a weighted distance learning approach is introduced to minimize leave-one-out classification error using a gradient descent...
Feature selection studies reveal how to select a subset or list of attributes or variables that are used to construct models describing data. A feature-selection algorithm is part of the classification rule. This is why feature selection must be included when using cross-validation error estimation. Rough Sets theory provides a new mathematical tool to deal with uncertainty and vagueness of an information...
Research on malware detection using machine learning methods has attracted much attention in recent years. However, most research has focused on signature-based code analysis or on analysis of system call sequences in the Linux environment. All methods have their strengths and weaknesses. In this paper, we concentrate on detecting Trojan horses using operating system information in the Windows environment...
Craters are important geographical features caused by the impacts of meteoroids. Craters have been widely studied because they contain crucial information about the age and geologic formations of planets. This paper discusses an automated crater-detection framework using a knowledge discovery and data mining (KDD) process including sampling, feature selection and creation, and supervised learning methods...
This paper deals with the optimization of sensor arrangement and feature selection for sensor-based activity recognition of people living alone. We suggest an algorithm which picks up from several thousand to millions of characteristic sensor reactions as feature candidates, and selects the best feature combinations and corresponding sensor arrangements for classification with as small numbers of...
Extracting knowledge out of qualitative data is an ever-growing issue in our networking world. In contrast to the widespread trend of extending general classification methods to zero/one-valued qualitative variables, we explore here another path: we first build a specific representation for these data, respectful of the non-occurrence as well as presence of an item, and making the interactions...