Search results for: T.M. Khoshgoftaar

Items from 1 to 9 out of 9 results

chapter

Feature Selection with Imbalanced Data for Software Defect Prediction

T.M. Khoshgoftaar, Kehan Gao

2009 International Conference on Machine Learning and Applications > 235 - 240

Eighth International Conference on Machine Learning and Applications (ICMLA 2009)

In this paper, we study the learning impact of data sampling followed by attribute selection on the classification models built with binary class imbalanced data within the scenario of software quality engineering. We use a wrapper-based attribute ranking technique to select a subset of attributes, and the random undersampling technique (RUS) on the majority class to alleviate the negative effects...

chapter

Wrapper-Based Feature Ranking for Software Engineering Metrics

W. Altidor, T.M. Khoshgoftaar, A. Napolitano

2009 International Conference on Machine Learning and Applications > 241 - 246

Eighth International Conference on Machine Learning and Applications (ICMLA 2009)

The application of feature ranking to software engineering datasets is rare at best. In this study, we consider wrapper-based feature ranking where nine performance metrics aided by a particular learner are evaluated. We consider five learners and take two different approaches, each in conjunction with one of two different methodologies: 3-fold Cross-Validation (CV) and 3-fold Cross-Validation Risk...

chapter

An Empirical Study on Wrapper-Based Feature Ranking

W. Altidor, T.M. Khoshgoftaar, J. Van Hulse

2009 21st IEEE International Conference on Tools with Artificial Intelligence > 75 - 82

2009 21st IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2009)

Feature selection has become the cornerstone of many classification problems. It has been applied in many domains such as Web mining, text categorization, gene expression microarray analysis, image analysis, and combinatorial chemistry. One type of well-studied feature selection methodology is filtering, which is typically divided into ranking and subset evaluation. This work provides an empirical...

chapter

An empirical comparison of repetitive undersampling techniques

J. Van Hulse, T.M. Khoshgoftaar, A. Napolitano

2009 IEEE International Conference on Information Reuse & Integration > 29 - 34

2009 IEEE International Conference on Information Reuse & Integration (IRI 2009)

A common problem for data mining and machine learning practitioners is class imbalance. When examples of one class greatly outnumber examples of the other class(es), traditional machine learning algorithms can perform poorly. Random undersampling is a technique that has shown great potential for alleviating the problem of class imbalance. However, undersampling leads to information loss which can...
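As a rough illustration of the random undersampling idea named in this abstract, the sketch below discards randomly chosen majority-class examples until every class is as small as the smallest one. The function name `random_undersample`, the fully balanced target ratio, and the toy data are assumptions made for illustration; they are not taken from the paper, which studies repetitive variants of the technique.

```python
import random
from collections import Counter


def random_undersample(examples, labels, seed=0):
    """Hypothetical helper: keep a random subset of each class so that
    all classes end up the size of the smallest class (assumed 1:1 target)."""
    rng = random.Random(seed)

    # Group the examples by their class label.
    by_class = {}
    for x, y in zip(examples, labels):
        by_class.setdefault(y, []).append(x)

    # The smallest class sets the number of examples kept per class.
    target = min(len(xs) for xs in by_class.values())

    kept_examples, kept_labels = [], []
    for y, xs in by_class.items():
        # Sampling without replacement: the discarded examples are the
        # information loss the abstract refers to.
        kept = rng.sample(xs, target)
        kept_examples.extend(kept)
        kept_labels.extend([y] * target)
    return kept_examples, kept_labels


# Toy data: 10 minority examples (label 1) and 90 majority examples (label 0).
examples = list(range(100))
labels = [1 if i < 10 else 0 for i in examples]

balanced_examples, balanced_labels = random_undersample(examples, labels)
class_counts = Counter(balanced_labels)
```

In this toy run every minority example is retained while 80 of the 90 majority examples are dropped, which is why repeating the draw with different seeds can recover some of the discarded information.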

chapter

RUSBoost: Improving classification performance when training data is skewed

C. Seiffert, T.M. Khoshgoftaar, J. Van Hulse, A. Napolitano

2008 19th International Conference on Pattern Recognition > 1 - 4

ICPR 2008 19th International Conference on Pattern Recognition

Constructing classification models using skewed training data can be a challenging task. We present RUSBoost, a new algorithm for alleviating the problem of class imbalance. RUSBoost combines data sampling and boosting, providing a simple and efficient method for improving classification performance when training data is imbalanced. In addition to performing favorably when compared to SMOTEBoost (another...

chapter

Improving Learner Performance with Data Sampling and Boosting

C. Seiffert, T.M. Khoshgoftaar, J. Van Hulse, A. Napolitano

2008 20th IEEE International Conference on Tools with Artificial Intelligence > 1 > 452 - 459

2008 20th IEEE International Conference on Tools with Artificial Intelligence (ICTAI)

Learning from imbalanced datasets is a well known problem in the data mining community. Many techniques have been proposed to alleviate the problems associated with class imbalance, including data sampling and boosting. While data sampling has received the bulk of the attention from the research community, our results show that boosting often results in better classification performance than even...

chapter

Addressing Class Imbalance in Non-binary Classification Problems

N. Seliya, Zhiwei Xu, T.M. Khoshgoftaar

2008 20th IEEE International Conference on Tools with Artificial Intelligence > 1 > 460 - 466

2008 20th IEEE International Conference on Tools with Artificial Intelligence (ICTAI)

The problem of class imbalance in machine learning is quite real and cumbersome when it comes to building a useful and practical classification model. We present a unique insight into addressing class imbalance for classification problems that involve three or more categories, i.e. non-binary. This study is different than related works in the literature because most works focus on addressing class...

chapter

Resampling or Reweighting: A Comparison of Boosting Implementations

C. Seiffert, T.M. Khoshgoftaar, J. Van Hulse, A. Napolitano

2008 20th IEEE International Conference on Tools with Artificial Intelligence > 1 > 445 - 451

2008 20th IEEE International Conference on Tools with Artificial Intelligence (ICTAI)

Boosting has been shown to improve the performance of classifiers in many situations, including when data is imbalanced. There are, however, two possible implementations of boosting, and it is unclear which should be used. Boosting by reweighting is typically used, but can only be applied to base learners which are designed to handle example weights. On the other hand, boosting by resampling can be...
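The difference between the two implementations mentioned in this abstract can be sketched as follows, under stated assumptions. With reweighting, the boosting weights are handed directly to a base learner that accepts them; with resampling, a new training set is drawn with probability proportional to those weights, so any learner can be used. The function name `resample_by_weight` and the toy weights are hypothetical and are not taken from the paper.

```python
import random


def resample_by_weight(examples, weights, seed=0):
    """Hypothetical helper: draw a training set of the same size, with
    replacement, where each example is picked in proportion to its weight."""
    rng = random.Random(seed)
    return rng.choices(examples, weights=weights, k=len(examples))


# Toy boosting round: examples "c" and "d" were misclassified earlier and
# carry all of the weight; "a" and "b" carry none (an extreme, assumed case).
examples = ["a", "b", "c", "d"]
weights = [0.0, 0.0, 0.5, 0.5]

# Resampling: the base learner sees an unweighted sample that reflects the weights.
resampled = resample_by_weight(examples, weights)

# Reweighting: the base learner would instead receive (example, weight) pairs
# and must itself know how to use the weights.
weighted_pairs = list(zip(examples, weights))
```

The resampled set contains only heavily weighted examples here, which shows how resampling lets a weight-unaware learner focus on the same examples that a weight-aware learner would emphasize.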

chapter

Using Imputation Techniques to Help Learn Accurate Classifiers

Xiaoyuan Su, T.M. Khoshgoftaar, R. Greiner

2008 20th IEEE International Conference on Tools with Artificial Intelligence > 1 > 437 - 444

2008 20th IEEE International Conference on Tools with Artificial Intelligence (ICTAI)

It is difficult to learn good classifiers when training data is missing attribute values. Conventional techniques for dealing with such omissions, such as mean imputation, generally do not significantly improve the performance of the resulting classifier. We proposed imputation-helped classifiers, which use accurate imputation techniques, such as Bayesian multiple imputation (BMI), predictive mean...

Filter options

Keywords:
TRAINING



INFONA - science communication portal
