Search results for: T.M. Khoshgoftaar

Items from 21 to 36 out of 36 results

chapter

Improving Learner Performance with Data Sampling and Boosting

C. Seiffert, T.M. Khoshgoftaar, J. Van Hulse, A. Napolitano

2008 20th IEEE International Conference on Tools with Artificial Intelligence > 1 > 452 - 459

2008 20th IEEE International Conference on Tools with Artificial Intelligence (ICTAI)

Learning from imbalanced datasets is a well known problem in the data mining community. Many techniques have been proposed to alleviate the problems associated with class imbalance, including data sampling and boosting. While data sampling has received the bulk of the attention from the research community, our results show that boosting often results in better classification performance than even...

chapter

Addressing Class Imbalance in Non-binary Classification Problems

N. Seliya, Zhiwei Xu, T.M. Khoshgoftaar

2008 20th IEEE International Conference on Tools with Artificial Intelligence > 1 > 460 - 466

2008 20th IEEE International Conference on Tools with Artificial Intelligence (ICTAI)

The problem of class imbalance in machine learning is quite real and cumbersome when it comes to building a useful and practical classification model. We present a unique insight into addressing class imbalance for classification problems that involve three or more categories, i.e. non-binary. This study is different than related works in the literature because most works focus on addressing class...

chapter

Resampling or Reweighting: A Comparison of Boosting Implementations

C. Seiffert, T.M. Khoshgoftaar, J. Van Hulse, A. Napolitano

2008 20th IEEE International Conference on Tools with Artificial Intelligence > 1 > 445 - 451

2008 20th IEEE International Conference on Tools with Artificial Intelligence (ICTAI)

Boosting has been shown to improve the performance of classifiers in many situations, including when data is imbalanced. There are, however, two possible implementations of boosting, and it is unclear which should be used. Boosting by reweighting is typically used, but can only be applied to base learners which are designed to handle example weights. On the other hand, boosting by resampling can be...

chapter

Using Imputation Techniques to Help Learn Accurate Classifiers

Xiaoyuan Su, T.M. Khoshgoftaar, R. Greiner

2008 20th IEEE International Conference on Tools with Artificial Intelligence > 1 > 437 - 444

2008 20th IEEE International Conference on Tools with Artificial Intelligence (ICTAI)

It is difficult to learn good classifiers when training data is missing attribute values. Conventional techniques for dealing with such omissions, such as mean imputation, generally do not significantly improve the performance of the resulting classifier. We proposed imputation-helped classifiers, which use accurate imputation techniques, such as Bayesian multiple imputation (BMI), predictive mean...

chapter

Software quality modeling: The impact of class noise on the random forest classifier

A. Folleco, T.M. Khoshgoftaar, J. Van Hulse, L. Bullard

2008 IEEE Congress on Evolutionary Computation (IEEE World Congress on Computational Intelligence) > 3853 - 3859

2008 IEEE Congress on Evolutionary Computation (CEC)

This study investigates the impact of increasing levels of simulated class noise on software quality classification. Class noise was injected into seven software engineering measurement datasets, and the performance of three learners, random forests, C4.5, and Naive Bayes, was analyzed. The random forest classifier was utilized for this study because of its strong performance relative to well-known...

article

Assuring Timeliness in an e-Science Service-Oriented Architecture

J.C. Sloan, T.M. Khoshgoftaar, V. Raghav

Computer > 2008 > 41 > 8 > 56 - 62

An improvement to public-resource e-science portals shows promise in solving a well-known dilemma: how to dynamically discover a provider PC that is ready to deliver computing power when the scientific community requires it.

chapter

Using evolutionary sampling to mine imbalanced data

D.J. Drown, T.M. Khoshgoftaar, R. Narayanan

Sixth International Conference on Machine Learning and Applications (ICMLA 2007) > 363 - 368

2007 International Conference on Machine Learning and Applications

Class imbalance tends to cause inferior performance in data mining learners. Evolutionary sampling is a technique which seeks to counter this problem by using genetic algorithms to evolve a reduced sample of a complete dataset to train a classification model. Evolutionary sampling works to remove noisy and duplicate instances so that the sampled training data will produce a superior classifier. We...

chapter

Learning with limited minority class data

T.M. Khoshgoftaar, C. Seiffert, J. Van Hulse, A. Napolitano, et al.

Sixth International Conference on Machine Learning and Applications (ICMLA 2007) > 348 - 353

2007 International Conference on Machine Learning and Applications

A practical problem in data mining and machine learning is the limited availability of data. For example, in a binary classification problem it is often the case that examples of one class are abundant, while examples of the other class are in short supply. Examples from one class, typically the positive class, can be limited due to the financial cost or time required to collect these examples. This...

chapter

An application of a rule-based model in software quality classification

L.A. Bullard, T.M. Khoshgoftaar, Kehan Gao

Sixth International Conference on Machine Learning and Applications (ICMLA 2007) > 204 - 210

2007 International Conference on Machine Learning and Applications

A new rule-based classification model (RBCM) and rule-based model selection technique are presented. The RBCM utilizes rough set theory to significantly reduce the number of attributes, discretization to partition the domain of attribute values, and Boolean predicates to generate the decision rules that comprise the model. When the domain values of an attribute are continuous and relatively large, rough...

chapter

Hybrid Collaborative Filtering Algorithms Using a Mixture of Experts

Xiaoyuan Su, R. Greiner, T.M. Khoshgoftaar, Xingquan Zhu

IEEE/WIC/ACM International Conference on Web Intelligence (WI'07) > 645 - 649

2007 IEEE/WIC/ACM International Conference on Web Intelligence

Collaborative filtering (CF) is one of the most successful approaches for recommendation. In this paper, we propose two hybrid CF algorithms, sequential mixture CF and joint mixture CF, each combining advice from multiple experts for effective recommendation. These proposed hybrid CF models work particularly well in the common situation when data are very sparse. By combining multiple experts to form...

chapter

Mining Data with Rare Events: A Case Study

C. Seiffert, T.M. Khoshgoftaar, J. Van Hulse, A. Napolitano

19th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2007) > 2 > 132 - 139

2007 19th IEEE International Conference on Tools with Artificial Intelligence

The performance of classification models can be negatively impacted if the data on which they are trained contains very rare events. While recent research has investigated the issue of class imbalance, few if any studies address issues related to the handling of extreme imbalance (rare events), where the minority class can account for as little as 0.1% of the training data. This work investigates...

chapter

An Empirical Study of Learning from Imbalanced Data Using Random Forest

T.M. Khoshgoftaar, M. Golawala, J. Van Hulse

19th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2007) > 2 > 310 - 317

2007 19th IEEE International Conference on Tools with Artificial Intelligence

This paper discusses a comprehensive suite of experiments that analyze the performance of the random forest (RF) learner implemented in Weka. RF is a relatively new learner, and to the best of our knowledge, only preliminary experimentation on the construction of random forest classifiers in the context of imbalanced data has been reported in previous work. Therefore, the contribution of this study...

chapter

Incomplete-Case Nearest Neighbor Imputation in Software Measurement Data

J. Van Hulse, T.M. Khoshgoftaar

2007 IEEE International Conference on Information Reuse and Integration > 630 - 637

2007 IEEE International Conference on Information Reuse and Integration

Missing values are commonly encountered in software measurement data, and k nearest neighbor imputation (kNNI) is one of the most popular imputation procedures used by researchers and practitioners in empirical software engineering. Imputation techniques are used to replace missing values with one or more alternatives. Traditionally, kNNI uses only complete cases as possible donors for imputation...

article

Unsupervised multiscale color image segmentation based on MDL principle

Qiming Luo, T.M. Khoshgoftaar

IEEE Transactions on Image Processing > 2006 > 15 > 9 > 2755 - 2761

We present an unsupervised multiscale color image segmentation algorithm. The basic idea is to apply mean shift clustering to obtain an over-segmentation and then merge regions at multiple scales to minimize the minimum description length criterion. The performance on the Berkeley segmentation benchmark compares favorably with some existing approaches.

chapter

Analyzing software quality with limited fault-proneness defect data

N. Seliya, T.M. Khoshgoftaar, S. Zhong

Ninth IEEE International Symposium on High-Assurance Systems Engineering (HASE'05) > 89 - 98

Ninth IEEE International Symposium on High-Assurance Systems Engineering

Assuring whether the desired software quality and reliability is met for a project is as important as delivering it within scheduled budget and time. This is especially vital for high-assurance software systems where software failures can have severe consequences. To achieve the desired software quality, practitioners utilize software quality models to identify high-risk program modules: e.g., software...

chapter

A clustering approach to wireless network intrusion detection

Shi Zhong, T.M. Khoshgoftaar, S.V. Nath

17th IEEE International Conference on Tools with Artificial Intelligence (ICTAI'05) > 7 pp. - 196

ICTAI 2005. 17th IEEE International Conference on Tools with Artificial Intelligence

Intrusion detection in wireless networks has become an indispensable component of any useful wireless network security systems, and has recently gained attention in both research and industry communities due to widespread use of wireless local area networks (WLANs). This paper focuses on detecting intrusions or anomalous behaviors in WLANs with data clustering techniques. We first explore the security...


INFONA - science communication portal
