Search results for: Taghi M. Khoshgoftaar

Items from 1 to 13 out of 13 results

chapter

Analysis of Transfer Learning Performance Measures

Karl R. Weiss, Taghi M. Khoshgoftaar

2017 IEEE International Conference on Information Reuse and Integration (IRI) > 338 - 345

In machine learning applications, there are scenarios of having no labeled training data because the data is rare or too expensive to obtain. In these cases, it is desirable to use readily available labeled data that is similar to, but not the same as, the domain application of interest. Transfer learning algorithms are used to build high-performance classifiers when the training data has different...

chapter

Investigating Transfer Learners for Robustness to Domain Class Imbalance

Karl R. Weiss, Taghi M. Khoshgoftaar

2016 15th IEEE International Conference on Machine Learning and Applications (ICMLA) > 207 - 213

A transfer learning environment is characterized by a machine learning algorithm being trained with data from one domain (the source domain) and being tested on data from a different domain (the target domain). In a transfer learning scenario, the class probability of the source domain may be different from the class probability of the target domain, which is referred to as "domain class imbalance"...

chapter

An Investigation of Transfer Learning and Traditional Machine Learning Algorithms

Karl R. Weiss, Taghi M. Khoshgoftaar

2016 IEEE 28th International Conference on Tools with Artificial Intelligence (ICTAI) > 283 - 290

Previous research focusing on the evaluation of transfer learning algorithms has predominantly used real-world datasets to measure an algorithm's performance. A test with a real-world dataset exposes an algorithm to a single instance of distribution difference between the training (source) and test (target) datasets. These previous works have not measured performance over a wide range of source and...

chapter

Designing a Testing Framework for Transfer Learning Algorithms (Application Paper)

Karl R. Weiss, Taghi M. Khoshgoftaar, Oneeb Rehman

2016 IEEE 17th International Conference on Information Reuse and Integration (IRI) > 152 - 159

Most works covering the topic of transfer learning propose an algorithm to solve a given domain adaptation problem, then test the algorithm using real-world datasets. A test with a real-world dataset represents a single transfer learning test condition, which partially measures an algorithm's performance. Previous research has placed little emphasis on developing a comprehensive and uniform test for...

chapter

Cross-Domain Sentiment Analysis: An Empirical Investigation

Brian Heredia, Taghi M. Khoshgoftaar, Joseph Prusa, Michael Crawford

2016 IEEE 17th International Conference on Information Reuse and Integration (IRI) > 160 - 165

Understanding the sentiment conveyed by a person is a crucial task in any social interaction. Moreover, it can be used to gain insight and understanding of views held by many people. Sentiment classification is not limited to human interaction, as text can also convey the sentiment of the author. Opinion mining in text is a long-studied field in machine learning. This study focuses on two of the many...

chapter

Using Random Undersampling to Alleviate Class Imbalance on Tweet Sentiment Data

Joseph Prusa, Taghi M. Khoshgoftaar, David J. Dittman, Amri Napolitano

2015 IEEE International Conference on Information Reuse and Integration > 197 - 202

2015 IEEE International Conference on Information Reuse and Integration (IRI)

Sentiment classification of tweets is used for a variety of social sensing tasks and provides a means of discerning public opinion on a wide range of topics. A potential concern when performing sentiment classification is that the training data may contain class imbalance, which can negatively affect classification performance. A classifier trained on imbalanced data may be biased in favor of the...

chapter

A Novel Noise-Resistant Boosting Algorithm for Class-Skewed Data

Jason Van Hulse, Taghi M. Khoshgoftaar, Amri Napolitano

2012 11th International Conference on Machine Learning and Applications > 2 > 551 - 557

2012 Eleventh International Conference on Machine Learning and Applications (ICMLA)

Boosting methods have been successfully applied in a wide variety of machine learning applications. In the context of data quality issues, a number of variants of the standard boosting method have been proposed and evaluated. To address the problem of mislabeled examples, ORBoost was developed to prevent overfitting to noisy examples. Our research group has recently proposed RUSBoost as an enhancement...

chapter

A Novel Noise Filtering Algorithm for Imbalanced Data

Jason Van Hulse, Taghi M. Khoshgoftaar, Amri Napolitano

2010 Ninth International Conference on Machine Learning and Applications > 9 - 14

2010 Ninth International Conference on Machine Learning and Applications (ICMLA 2010)

Noise filtering is a commonly used methodology to improve the performance of learners built using low-quality data. A common type of noise filtering is a data preprocessing technique called classification filtering. In classification filtering, a classifier is built and evaluated on the training dataset (typically using cross-validation) and any misclassified instances are considered noisy. The strategies...
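
The classification filtering idea described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' algorithm: the 1-nearest-neighbour learner, the fold assignment, and the names `nearest_label` and `classification_filter` are all hypothetical choices made for the example.

```python
import random

def nearest_label(train, point):
    """1-nearest-neighbour prediction (an assumed learner, squared Euclidean distance)."""
    best = min(train, key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], point)))
    return best[1]

def classification_filter(data, k=5, seed=0):
    """Return indices of instances misclassified under k-fold cross-validation."""
    indices = list(range(len(data)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k roughly equal folds
    noisy = []
    for fold in folds:
        held_out = set(fold)
        # Train on every instance outside the current fold.
        train = [data[i] for i in indices if i not in held_out]
        for i in fold:
            features, label = data[i]
            # A held-out instance whose prediction disagrees with its label is flagged.
            if nearest_label(train, features) != label:
                noisy.append(i)
    return sorted(noisy)

# Toy data: two well-separated clusters plus one deliberately mislabeled point.
data = [((float(x), 0.0), "a") for x in range(5)]
data += [((float(x), 0.0), "b") for x in range(100, 105)]
data.append(((2.0, 5.0), "b"))  # index 10: lies near cluster "a" but is labeled "b"
print(classification_filter(data))  # prints [10]
```

In this toy run only the mislabeled instance is flagged. With real data, a filter like this can also flag correctly labeled instances that the chosen learner happens to misclassify.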

chapter

A Comparative Study of Threshold-Based Feature Selection Techniques

Huanjing Wang, Taghi M. Khoshgoftaar, Jason Van Hulse

2010 IEEE International Conference on Granular Computing > 499 - 504

2010 IEEE International Conference on Granular Computing (GrC-2010)

Given high-dimensional software measurement data, researchers and practitioners often use feature (metric) selection techniques to improve the performance of software quality classification models. This paper presents our newly proposed threshold-based feature selection techniques, comparing the performance of these techniques by building classification models using five commonly used classifiers...

chapter

Evaluating the impact of data quality on sampling

Jason Van Hulse, Taghi M. Khoshgoftaar, Amri Napolitano

2010 IEEE International Conference on Information Reuse & Integration > 31 - 36

2010 IEEE International Conference on Information Reuse & Integration (IRI 2010)

Three important data characteristics that can substantially impact a data mining project are class imbalance, poor data quality and the size of the training dataset. Data sampling is a commonly used method for improving learner performance when data is imbalanced. However, little effort has been put forth to investigate the performance of data sampling techniques when data is both noisy and imbalanced...

chapter

Active learning with neural networks for intrusion detection

Naeem Seliya, Taghi M. Khoshgoftaar

2010 IEEE International Conference on Information Reuse & Integration > 49 - 54

2010 IEEE International Conference on Information Reuse & Integration (IRI 2010)

This paper presents a neural-network-based active learning procedure for computer network intrusion detection. Applying data mining and machine learning techniques to network intrusion detection often faces the problem of very large training dataset size. For example, the training dataset commonly used for the DARPA KDD-1999 offline intrusion detection project contained approximately five hundred...

article

Supervised Neural Network Modeling: An Empirical Investigation Into Learning From Imbalanced Data With Labeling Errors

Taghi M. Khoshgoftaar, Jason Van Hulse, Amri Napolitano

IEEE Transactions on Neural Networks > 2010 > 21 > 5 > 813 - 830

Neural network algorithms such as multilayer perceptrons (MLPs) and radial basis function networks (RBFNets) have been used to construct learners which exhibit strong predictive performance. Two data-related issues that can have a detrimental impact on supervised learning initiatives are class imbalance and labeling errors (or class noise). Imbalanced data can make it more difficult for the neural...

chapter

Hybrid sampling for imbalanced data

Chris Seiffert, Taghi M. Khoshgoftaar, Jason Van Hulse

2008 IEEE International Conference on Information Reuse and Integration > 202 - 207

2008 IEEE International Conference on Information Reuse and Integration (2008 IRI)

Decision tree learning in the presence of imbalanced data is an issue of great practical importance, as such data is ubiquitous in a wide variety of application domains. We propose hybrid data sampling, which uses a combination of two sampling techniques, such as random oversampling and random undersampling, to create a balanced dataset for use in the construction of decision tree classification models...
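
As a rough illustration of combining random undersampling with random oversampling, the sketch below balances a toy dataset. It is not taken from the paper: the function name `hybrid_sample`, the `target_per_class` parameter, and the rule for choosing between the two techniques are assumptions made for the example.

```python
import random
from collections import Counter

def hybrid_sample(rows, labels, target_per_class, seed=0):
    """Balance classes by undersampling large classes and oversampling small
    ones until each has `target_per_class` examples (hypothetical parameter)."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        if len(members) >= target_per_class:
            # Random undersampling: draw without replacement.
            chosen = rng.sample(members, target_per_class)
        else:
            # Random oversampling: keep every example, then add duplicates
            # drawn with replacement until the target is reached.
            extra = rng.choices(members, k=target_per_class - len(members))
            chosen = members + extra
        out_rows.extend(chosen)
        out_labels.extend([label] * target_per_class)
    return out_rows, out_labels

# Toy data: 90 majority-class rows and 10 minority-class rows.
rows = list(range(100))
labels = ["neg"] * 90 + ["pos"] * 10
balanced_rows, balanced_labels = hybrid_sample(rows, labels, target_per_class=50)
print(Counter(balanced_labels))
```

Picking a target between the two class sizes means the majority class loses fewer examples than pure undersampling would discard, and the minority class is duplicated less than pure oversampling would require.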
