The choice of feature selection algorithm has a great influence on the accuracy of text categorization. The traditional information gain (IG) feature selection algorithm usually selects features that rarely appear in the specified categories but frequently appear in other categories. To overcome this drawback, on the basis of in-depth analysis of the related algorithms, an improved IG feature selection method...
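For reference, the traditional IG score that this abstract criticizes can be sketched as below. This is a minimal illustration of the textbook definition, IG(t) = H(C) - [P(t) H(C|t) + P(not t) H(C|not t)], and not the improved method the paper proposes (the abstract is truncated before describing it). The function names and the representation of documents as sets of terms are assumptions made for the example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def information_gain(docs, labels, term):
    """Textbook IG of a term: H(C) minus the expected entropy of C
    after splitting the documents on presence/absence of the term.

    docs   -- list of sets of terms (hypothetical representation)
    labels -- list of class labels, one per document
    """
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = (
        (len(with_term) / n) * entropy(with_term)
        + (len(without_term) / n) * entropy(without_term)
    )
    return entropy(labels) - conditional


# Toy corpus: "goal" separates the classes perfectly, "team" not at all.
docs = [{"goal", "match"}, {"goal", "team"}, {"vote", "law"}, {"vote", "team"}]
labels = ["sport", "sport", "politics", "politics"]
scores = {t: information_gain(docs, labels, t) for t in ("goal", "team")}
```

Because the score is symmetric in presence and absence, a term that never occurs in a category can still score highly for it, which is the behavior the abstract describes as a drawback.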
With the exponential increase of the data scale, the problem of feature selection has become a focus in statistical pattern recognition. In this paper, a new modified forward deep floating searching algorithm (SDFFS) is proposed to select a feature subset of d features from the original candidate set of D features (d < D), which is an improvement of the state-of-the-art SFFS algorithm. The SDFFS...
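As background, the baseline SFFS (sequential floating forward selection) procedure that SDFFS is said to improve on can be sketched roughly as follows. This is only an illustration of the commonly described baseline, not the SDFFS algorithm itself, whose details are cut off in the abstract. The `score` callable (any criterion where higher is better) and the function name `sffs` are assumptions made for the example.

```python
def sffs(features, score, d):
    """Sketch of sequential floating forward selection.

    features -- list of candidate feature identifiers (D items)
    score    -- hypothetical criterion: callable taking a list of
                features and returning a number (higher is better)
    d        -- target subset size, with d < len(features)
    """
    selected = []
    best = {}  # subset size -> (best score seen, subset)
    while len(selected) < d:
        # Forward step: add the feature that maximizes the criterion.
        candidates = [f for f in features if f not in selected]
        f_best = max(candidates, key=lambda f: score(selected + [f]))
        selected = selected + [f_best]
        s = score(selected)
        k = len(selected)
        if k not in best or s > best[k][0]:
            best[k] = (s, list(selected))
        # Conditional backward steps: drop the least useful feature
        # while doing so beats the best subset known at that size.
        while len(selected) > 2:
            f_worst = max(
                selected,
                key=lambda f: score([x for x in selected if x != f]),
            )
            reduced = [x for x in selected if x != f_worst]
            s_red = score(reduced)
            if s_red > best[len(reduced)][0]:
                selected = reduced
                best[len(reduced)] = (s_red, list(reduced))
            else:
                break
    return best[d][1]


# Toy criterion: additive feature weights (purely illustrative).
weights = {"a": 3.0, "b": 2.0, "c": 1.0, "e": 0.5}
subset = sffs(list(weights), lambda fs: sum(weights[f] for f in fs), 3)
```

With an additive criterion like the toy one above, the backward step never fires; it only matters when features interact, so that a feature added early becomes redundant once others are present.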
In data mining, the well-known “Curse of Dimensionality” problem arises from the presence of a large number of dimensions in a dataset. This problem reduces the accuracy of machine learning classifiers because the dataset contains many insignificant and irrelevant dimensions or features. Data mining applications such as bioinformatics, risk management, forensics, etc. generally involve...
Identifying the minimum number of local regions of a handwritten character image that contain well-defined discriminating features, sufficient for a minimal but complete description of the character, is a challenging task. A new region selection technique based on an enhanced Harmony Search methodology is proposed here. The powerful framework of Harmony Search has been utilized...
Two factors characterize a good feature selection algorithm: its accuracy and stability. This paper aims at introducing a new approach to stable feature selection algorithms. The innovation of this paper centers on a class of stable feature selection algorithms called feature weighting as regularized energy-based learning (FREL). Stability properties of FREL using L1 or L2 regularization are investigated...
Most data sets contain some redundant, irrelevant, and even noisy features. Medical data sets usually have many features, and the correlation among them is strong. Feature selection for medical data sets has therefore attracted great attention in recent years. RELIEFF is an effective feature selection algorithm, but it cannot remove redundant features. RS is a mathematical approach...
Data mining applications such as bioinformatics, risk management, forensics, etc. involve very high-dimensional datasets. The large number of dimensions gives rise to the well-known “Curse of Dimensionality” problem. This problem lowers the accuracy of machine learning classifiers because the dataset contains many insignificant and irrelevant dimensions or features. There are many methodologies...
With the growth of internet services and usage, and with open access to sensitive data, securing these systems has become the need of the hour. Intrusion Detection Systems (IDSs) provide an important layer of security for computer systems and networks, and are becoming more and more crucial. To detect attacks on the network, it is essential to properly monitor the flow...
Business and research organizations are continuously generating huge amounts of high-dimensional data. They need to analyze this data in real time at minimum cost. Data pre-processing techniques in combination with dimensionality reduction techniques are widely used by researchers to improve the quality of data and reduce the time and cost required to analyze it. But standard methods are not available...
Feature selection, or variable reduction, is a fundamental problem in data mining; it refers to the process of identifying the few most important features for application of a learning algorithm. The best subset contains the minimum number of dimensions while retaining suitably high classifier accuracy in representing the original features. The objective of the proposed approach is to reduce the number...
For classification of high-dimensional data, feature selection is the most important step for obtaining an optimal result with respect to the processing power required and time taken. Feature selection is a method by which the most relevant features are selected from a set containing redundant and irrelevant features, thereby reducing the load on the classification algorithm. This paper proposes...
In today's networked environment, a massive volume of data is being generated, gathered, and stored in databases across the world. This trend is growing very fast, year after year. Today it is normal to find databases with terabytes of data, in which vital information and knowledge are hidden. The unseen information in such databases cannot feasibly be mined without efficient mining techniques for extracting...
Today, feature selection is an active research area in machine learning. The main idea of feature selection is to select a subset of the available features by eliminating features with little or no predictive information. This paper presents a hybrid model with a new local search technique based on reinforcement learning for feature selection. We combined the particle swarm optimization (PSO) with support...
In this paper, we hybridize the improved gravitational search algorithm (IGSA) with kernel based extreme learning machine (KELM) method. Based on this, a novel hybrid system IGSA-KELM is proposed to improve the generalization performance for classification problems. In this system, IGSA is designed by combining the search strategy of particle swarm optimization and GSA to effectively reduce the problem...
In original data, there may exist redundant, irrelevant, and noisy features besides informative features. Extracting informative features while eliminating the others is the goal of feature selection. This paper proposes a new feature selection algorithm based on the Relief algorithm and the SVM-RFE algorithm, and it is strongly targeted at eliminating the unnecessary features. Finally, we test...
Dimensionality reduction as a preprocessing step to machine learning is effective in removing irrelevant and redundant data, increasing learning accuracy, and improving result comprehensibility. However, the recent increase of dimensionality of data poses a severe challenge to many existing feature selection and feature extraction methods with respect to efficiency and effectiveness. In the field...
Proper parameter settings of support vector machine (SVM) and feature selection are of great importance to its efficiency and accuracy. In this paper, we propose a parallel adaptive particle swarm optimization algorithm to simultaneously perform the parameter optimization and feature selection for SVM, termed PTVPSO-SVM. It is implemented in an efficient parallel environment using PVM (Parallel Virtual...
Selecting a feature subset with strong discriminative power is a critical process for high dimensional data analysis, which has attracted much attention in many application domains, such as text categorization and genome projects. Since traditional feature selection methods provide limited contributions to classification, many researchers resort to hybrid or elaborate approaches to choose interesting...
Manufacturing data is an important source of knowledge that can be used to enhance the production capability. The detection of the causes of defects may possibly lead to an improvement in production. However, the production records generally contain an enormous set of features. It is almost impossible in practice to monitor all features at once. This research proposes the feature reduction technique,...
Finding an appropriate set of features from high-dimensional data for building an accurate classification model is a well-known NP-hard computational problem. Unfortunately, in data mining, some big data are not only big in volume but are also described by a large number of features. Many feature subset selection algorithms have been proposed in the past, but they are nevertheless far from perfect...