Taghi M. Khoshgoftaar

chapter

A survey of stability analysis of feature subset selection techniques

Taghi M. Khoshgoftaar, Alireza Fazelpour, Huanjing Wang, Randall Wald

2013 IEEE 14th International Conference on Information Reuse & Integration (IRI) > 424 - 431

2013 IEEE 14th International Conference on Information Reuse & Integration (IRI)

With the proliferation of high-dimensional datasets across many application domains in recent years, feature selection has become an important data mining task due to its capability to improve both performance and computational efficiencies. The chosen feature subset is important not only due to its ability to improve classification performance, but also because in some domains, knowing the most important...

chapter

The Effect of Number of Iterations on Ensemble Gene Selection

Wael Awada, Taghi M. Khoshgoftaar, David Dittman, Randall Wald

2012 11th International Conference on Machine Learning and Applications > 2 > 198 - 203

2012 Eleventh International Conference on Machine Learning and Applications (ICMLA)

Dimensionality-reducing techniques such as gene selection have become commonplace in order to reduce the high dimensionality found within bioinformatics datasets such as DNA microarray datasets. The degree of dimensionality is reduced by identifying and removing redundant and irrelevant features or genes and leaving only an optimum subset of features for subsequent analysis. However, a number of feature...

chapter

Comparing Two New Gene Selection Ensemble Approaches with the Commonly-Used Approach

David J. Dittman, Taghi M. Khoshgoftaar, Randall Wald, Amri Napolitano

2012 11th International Conference on Machine Learning and Applications > 2 > 184 - 191

2012 Eleventh International Conference on Machine Learning and Applications (ICMLA)

Ensemble feature selection has recently become a topic of interest for researchers, especially in the area of bioinformatics. The benefits of ensemble feature selection include increased feature (gene) subset stability and usefulness as well as comparable (or better) classification performance compared to using a single feature selection method. However, existing work on ensemble feature selection...

chapter

An Empirical Study on the Stability of Feature Selection for Imbalanced Software Engineering Data

Huanjing Wang, Taghi M. Khoshgoftaar, Amri Napolitano

2012 11th International Conference on Machine Learning and Applications > 1 > 317 - 323

2012 Eleventh International Conference on Machine Learning and Applications (ICMLA)

In software quality modeling, software metrics are collected during the software development cycle. However, not all metrics are relevant to the class attribute (software quality). Metric (feature) selection has become the cornerstone of many software quality classification problems. Selecting software metrics that are important for software quality classification is a necessary and critical step...

chapter

Impact of noise and data sampling on stability of feature ranking techniques for biological datasets

Ahmad Abu Shanab, Taghi M. Khoshgoftaar, Randall Wald, Amri Napolitano

2012 IEEE 13th International Conference on Information Reuse & Integration (IRI) > 415 - 422

2012 IEEE 13th International Conference on Information Reuse & Integration (IRI)

Feature selection is an important preprocessing step when learning from bioinformatics datasets. Since these datasets often have high dimensionality (a large number of features), selecting the most important ones both improves performance and reduces computation time. In addition, when the features in question are genes (as is the case for microarray datasets), knowing the important genes is useful...

chapter

A novel dataset-similarity-aware approach for evaluating stability of software metric selection techniques

Huanjing Wang, Taghi M. Khoshgoftaar, Randall Wald, Amri Napolitano

2012 IEEE 13th International Conference on Information Reuse & Integration (IRI) > 1 - 8

2012 IEEE 13th International Conference on Information Reuse & Integration (IRI)

Software metric (feature) selection is an important pre-processing step before building software defect prediction models. Although much research has been done analyzing the classification performance of feature selection methods, fewer works have focused on their stability (robustness). Stability is important because feature selection methods which reliably produce the same results despite changes...

chapter

Measuring Stability of Threshold-Based Feature Selection Techniques

Huanjing Wang, Taghi M. Khoshgoftaar

2011 IEEE 23rd International Conference on Tools with Artificial Intelligence > 986 - 993

2011 IEEE 23rd International Conference on Tools with Artificial Intelligence (ICTAI)

Feature selection has been applied in many domains, such as text mining and software engineering. Ideally a feature selection technique should produce consistent outputs regardless of minor variations in the input data. Researchers have recently begun to examine the stability (robustness) of feature selection techniques. The stability of a feature selection method is defined as the degree of agreement...

chapter

Impact of Data Sampling on Stability of Feature Selection for Software Measurement Data

Kehan Gao, Taghi M. Khoshgoftaar, Amri Napolitano

2011 IEEE 23rd International Conference on Tools with Artificial Intelligence > 1004 - 1011

2011 IEEE 23rd International Conference on Tools with Artificial Intelligence (ICTAI)

Software defect prediction can be considered a binary classification problem. Generally, practitioners utilize historical software data, including metric and fault data collected during the software development process, to build a classification model and then employ this model to predict new program modules as either fault-prone (fp) or not-fault-prone (nfp). Limited project resources can then be...

chapter

A noise-based stability evaluation of threshold-based feature selection techniques

Wilker Altidor, Taghi M. Khoshgoftaar, Amri Napolitano

2011 IEEE International Conference on Information Reuse & Integration > 240 - 245

2011 IEEE International Conference on Information Reuse & Integration (IRI)

This paper presents a noise-based stability performance evaluation approach for feature selection techniques. For the stability assessment, a similarity-based measure is used to quantify the degree of agreement between a filter's output on a clean dataset and its outputs on the same dataset corrupted with different combinations of noise level and noise distribution. Experiments are conducted with...

chapter

Measuring robustness of Feature Selection techniques on software engineering datasets

Huanjing Wang, Taghi M. Khoshgoftaar, Randall Wald

2011 IEEE International Conference on Information Reuse & Integration > 309 - 314

2011 IEEE International Conference on Information Reuse & Integration (IRI)

Feature Selection is a process which identifies irrelevant and redundant features from a high-dimensional dataset (that is, a dataset with many features), and removes these before further analysis is performed. Recently, the robustness (e.g., stability) of feature selection techniques has been studied, to examine the sensitivity of these techniques to changes in their input data. In this study, we...

INFONA - science communication portal

Search results for: Taghi M. Khoshgoftaar

A survey of stability analysis of feature subset selection techniques

The Effect of Number of Iterations on Ensemble Gene Selection

Comparing Two New Gene Selection Ensemble Approaches with the Commonly-Used Approach

An Empirical Study on the Stability of Feature Selection for Imbalanced Software Engineering Data

Impact of noise and data sampling on stability of feature ranking techniques for biological datasets

A novel dataset-similarity-aware approach for evaluating stability of software metric selection techniques

Measuring Stability of Threshold-Based Feature Selection Techniques

Impact of Data Sampling on Stability of Feature Selection for Software Measurement Data

A noise-based stability evaluation of threshold-based feature selection techniques

Measuring robustness of Feature Selection techniques on software engineering datasets

Filter options

Publication date

Keywords

INFONA - science communication portal

Search results for: Taghi M. Khoshgoftaar

Add recipient

Sending message cancelled

Are you sure you want to cancel sending this message?

Send message

Filter options

Publication date

Date range setting

Set the date range to filter the displayed results. You can set a starting date, ending date or both. You can enter the dates manually or choose them from the calendar.

Keywords

Reporting an error / abuse

Sending the report failed

Accessibility options