chapter

Feature selection and extraction in data mining

Aparna U.R., Shaiju Paul

2016 Online International Conference on Green Engineering and Technologies (IC-GET) > 1 - 3

2016 Online International Conference on Green Engineering and Technologies (IC-GET)

Data mining is the process of extracting relevant information from a collection of data. Information related to a particular concept is mined on the basis of the features of the data. Accessing these features for data retrieval can therefore be termed the feature extraction mechanism. Different types of feature extraction methods are in use. The feature selection algorithm should...

chapter

Attribute reduction using backward elimination algorithm

M Karnan, P Kalyani

2010 IEEE International Conference on Computational Intelligence and Computing Research > 1 - 4

2010 IEEE International Conference on Computational Intelligence and Computing Research (ICCIC 2010)

Attribute reduction of an information system is a key problem in rough set theory and its applications. This paper proposes a new feature selection mechanism based on the backward elimination algorithm to solve the attribute reduction problem in rough set theory. It is the most promising technique in rough set theory, a new mathematical approach to reduct car and cancer dataset using backward elimination...

chapter

When classifier selection meets information theory: A unifying view

M F A Hady, F Schwenker, G Palm

2010 International Conference of Soft Computing and Pattern Recognition > 314 - 319

2010 International Conference of Soft Computing and Pattern Recognition (SoCPaR 2010)

Classifier selection aims to reduce the size of an ensemble of classifiers in order to improve its efficiency and classification accuracy. Recently an information-theoretic view was presented for feature selection. It derives a space of possible selection criteria and shows that several feature selection criteria in the literature are points within this continuous space. The contribution of this paper...

chapter

A New Attribute Dependency Function in Information System

Guang-ming Lang, Qing-Guo Li

2010 International Conference on Computational Intelligence and Software Engineering > 1 - 4

2010 International Conference on Computational Intelligence and Software Engineering (CiSE 2010)

The attribute dependency function is very important for feature selection in data mining, pattern recognition and machine learning. However, Pawlak's definition is inadequate for some information systems, and Daisuke's definition applies only to categorical attributes. In this paper, we introduce a new definition based on partitions for numerical attributes. The advantage of the definition is that heterogeneous features...

chapter

Applying Permutation Tests for Assessing the Statistical Significance of Wrapper Based Feature Selection

A Airola, Tapio Pahikkala, J Boberg, T Salakoski

2010 Ninth International Conference on Machine Learning and Applications > 989 - 994

2010 Ninth International Conference on Machine Learning and Applications (ICMLA 2010)

Feature selection is commonly used in bioinformatics applications, such as gene selection from DNA microarray data. Recently, wrapper methods have been proposed as an improvement over traditionally used filter-based feature selection methods. In wrapper methods, the goodness of a feature set is often measured using the cross-validation performance of a machine learning method trained with the features...

chapter

Consensus Feature Ranking in Datasets with Missing Values

S Fakhraei, H Soltanian-Zadeh, F Fotouhi, K Elisevich

2010 Ninth International Conference on Machine Learning and Applications > 771 - 775

2010 Ninth International Conference on Machine Learning and Applications (ICMLA 2010)

Development of a feature ranking method based upon the discriminative power of features and unbiased towards classifiers is of interest. We have studied a consensus feature ranking method, based on multiple classifiers, and have shown its superiority to well known statistical ranking methods. In a target environment such as a medical dataset, missing values and an unbalanced distribution of data must...

chapter

Web spam detection based on discriminative content and link features

M Mahmoudi, A Yari, S Khadivi

2010 5th International Symposium on Telecommunications > 542 - 546

2010 5th International Symposium on Telecommunications (IST)

Spam detection is a crucial task in web information retrieval systems. The dynamic nature of information resources, as well as the continuous changes in the information demands of users, makes web spam detection a challenging topic. So far, researchers with different backgrounds have proposed many different methods to tackle the problem of spam web pages. In...

chapter

Anomaly Detection Using an Ensemble of Feature Models

K Noto, C Brodley, D Slonim

2010 IEEE International Conference on Data Mining > 953 - 958

2010 10th IEEE International Conference on Data Mining (ICDM 2010)

We present a new approach to semi-supervised anomaly detection. Given a set of training examples believed to come from the same distribution or class, the task is to learn a model that will be able to distinguish examples in the future that do not belong to the same class. Traditional approaches typically compare the position of a new data point to the set of "normal" training data points in a chosen...

chapter

Integrating Rough Set Theory and Particle Swarm Optimisation in feature selection

S Abdul-Rahman, Z Mohamed-Hussein, A A Bakar

2010 10th International Conference on Intelligent Systems Design and Applications > 1009 - 1014

10th International Conference on Intelligent Systems Design and Applications (ISDA 2010)

This paper proposes a new feature-selection strategy by integrating the Rough Set Theory (RST) and Particle Swarm Optimisation (PSO) algorithms to generate a set of discriminatory features for the classification problem. The proposed method is seen as a marriage between filter and wrapper approaches in which the RST is used to pre-reduce the feature set before optimisation by PSO, a meta-heuristic...

chapter

Attribute Selection and Imbalanced Data: Problems in Software Defect Prediction

T M Khoshgoftaar, Kehan Gao, N Seliya

2010 22nd IEEE International Conference on Tools with Artificial Intelligence > 1 > 137 - 144

2010 22nd International Conference on Tools with Artificial Intelligence (ICTAI 2010)

The data mining and machine learning community is often faced with two key problems: working with imbalanced data and selecting the best features for machine learning. This paper presents a process involving a feature selection technique for selecting the important attributes and a data sampling technique for addressing class imbalance. The application domain of this study is software engineering,...

chapter

A mutual information and information entropy pair based feature selection method in text classification

Zhili Pei, Yuxin Zhou, Lisha Liu, Lihua Wang, more

2010 International Conference on Computer Application and System Modeling (ICCASM 2010) > 6 > V6-258 - V6-261

2010 International Conference on Computer Application and System Modeling (ICCASM 2010)

Text classification is an important research field within data mining. This article presents a feature selection method based on mutual information and information entropy pairs (MIIEP_FS), built on the theory of information entropy and the information entropy pair concept. The method measures the classification effect of a feature using mutual information and shows the extent of the difference between the features...

chapter

Feature selection and classification in bioscience/medical datasets: study of parameters and multi-objective approach in Two-Phase EA/k-NN method

M S B Dissanayake, D W Corne

2010 UK Workshop on Computational Intelligence (UKCI) > 1 - 6

2010 UK Workshop on Computational Intelligence (UKCI)

Feature selection continues to grow in importance in many areas of science and engineering, as large datasets become increasingly common. In particular, bioscience and medical datasets routinely contain several thousands of features. For effective data mining in such datasets, tools are required that can reliably distinguish the most relevant features. The latter is a useful goal in itself (e.g. such...

chapter

A comparison study of multi-class sentiment classification for Chinese reviews

Dongmei Zhang, Shengen Li, Cuiling Zhu, Xiaofei Niu, more

2010 Seventh International Conference on Fuzzy Systems and Knowledge Discovery > 5 > 2433 - 2436

2010 Seventh International Conference on Fuzzy Systems and Knowledge Discovery (FSKD)

Most previous research on sentiment analysis concentrates on the binary distinction of positive vs. negative. This paper presents the multi-class sentiment classification problem, which attempts to mine the implied rating information from reviews. We use four machine learning methods and two feature selection methods to find out whether or not the multi-class sentiment classification problem...

chapter

Feature selection based on modified minimize entropy principle

Jr-Shian Chen, Hung-Lieh Chou, D W Tai

2010 International Conference on Electronics and Information Engineering > 1 > V1-10 - V1-13

2010 International Conference on Electronics and Information Engineering (ICEIE 2010)

Feature selection has seen growing importance in statistics, pattern recognition, machine learning and data mining. Researchers have shown interest in methods for improving the performance of their forecasting results. Therefore, this study proposes a feature selection approach that is based on the minimize entropy principle. Experimental results have shown that the proposed...

chapter

Prediction of hepatitis prognosis using Support Vector Machines and Wrapper Method

A H Roslina, A Noraziah

2010 Seventh International Conference on Fuzzy Systems and Knowledge Discovery > 5 > 2209 - 2211

2010 Seventh International Conference on Fuzzy Systems and Knowledge Discovery (FSKD)

Hepatitis patients need continuous special medical treatment to reduce the mortality rate. Using clinical test findings and machine learning technology such as Support Vector Machines (SVM), their life prognosis can be classified and predicted. However, we cannot guarantee that all the feature values in the data are correlated with each other. Therefore, we incorporate...

chapter

Feature selection via minimizing nearest neighbor classification error

Peng-Fei Zhu, Tian-Hang Meng, Yun-Long Zhao, Rui-Xian Ma, more

2010 International Conference on Machine Learning and Cybernetics > 1 > 506 - 511

2010 International Conference on Machine Learning and Cybernetics (ICMLC 2010)

Feature selection is viewed as an important preprocessing step for pattern recognition, machine learning and data mining. It is used to find an optimal subset to reduce computational cost, increase the classification accuracy and improve result comprehensibility. In this paper, a weighted distance learning approach is introduced to minimize Leaving-One-Out classification error using a gradient descent...

chapter

Detecting Trojan horses based on system behavior using machine learning method

Yu-Feng Liu, Li-Wei Zhang, Jian Liang, Sheng Qu, more

2010 International Conference on Machine Learning and Cybernetics > 2 > 855 - 860

2010 International Conference on Machine Learning and Cybernetics (ICMLC 2010)

Research on detecting malware using machine learning methods has attracted much attention in recent years. However, most of this research has focused on code analysis, which is signature-based, or on analysis of system call sequences in the Linux environment. Obviously, all methods have their strengths and weaknesses. In this paper, we concentrate on detecting Trojan horses using operating system information in the Windows environment...

chapter

Analyzing the dynamics of the simultaneous feature and parameter optimization of an evolving Spiking Neural Network

Stefan Schliebs, Michael Defoin-Platel, Nikola Kasabov

The 2010 International Joint Conference on Neural Networks (IJCNN) > 1 - 8

2010 International Joint Conference on Neural Networks (IJCNN 2010)

This study investigates the characteristics of the Quantum-inspired Spiking Neural Network (QiSNN) feature selection and classification framework. The self-adapting nature of QiSNN due to the simultaneous optimization of network parameters and feature subsets represents a highly desirable characteristic in the context of machine learning and knowledge discovery. In this paper, the evolution of the...

chapter

Effective classification for crater detection: A case study on Mars

Jue Wang, Wei Ding, B Fradkin, C H Pham, more

9th IEEE International Conference on Cognitive Informatics (ICCI'10) > 688 - 695

2010 9th IEEE International Conference on Cognitive Informatics (ICCI)

Craters are important geographical features caused by the impacts of meteoroids. Craters have been widely studied because they contain crucial information about the age and geologic formations of planets. This paper discusses an automated crater-detection framework using knowledge discovery and data mining (KDD) process including sampling, feature selection and creation, and supervised learning methods...

chapter

An efficient fashion-driven learning approach to model user preferences in on-line shopping scenarios

Orhan Camoglu, Tianli Yu, Luca Bertelli, Diem Vu, more

2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Workshops > 28 - 34

2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPR Workshops)

In this work we tackle the problem of search personalization for on-line soft goods shopping. By learning what the user likes and what the user does not like, better search rankings and therefore a better overall shopping experience can be obtained. The first contribution of the work is in terms of feature selection: given the specific nature of the domain, we combine the traditional visual and text...

INFONA - science communication portal

