The growing scale of server infrastructure in large datacenters has led to an increased need for effective server workload prediction mechanisms. Two main challenges in the server workload prediction task are the lack of large-scale training data and changes in the underlying distribution of server workloads caused by events such as a change in the dominant applications of servers or a change in the allocation of servers...
Real-world data mining applications such as Mine Countermeasure Missions (MCM) involve learning from imbalanced data sets, which contain very few instances of the minority classes and many instances of the majority class. For instance, the number of naturally occurring clutter objects (such as rocks) that are detected typically far outweighs the relatively rare event of detecting a mine. In this paper...
Synthetic Minority Oversampling TEchnique (SMOTE) is a popular oversampling method that was proposed to improve random oversampling but its behavior on high-dimensional data has not been thoroughly investigated. In this paper we evaluate the performance of SMOTE on high-dimensional data, using gene expression microarray data. We observe that SMOTE does not attenuate the bias towards the classification...
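For readers unfamiliar with SMOTE, its commonly described core step is to create a synthetic minority sample by interpolating between a minority sample and one of its nearest minority-class neighbours. The following is a minimal sketch of that step only, not the implementation evaluated in the paper above; the function name `smote` and its parameters are hypothetical, and Euclidean distance is assumed.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Sketch of SMOTE-style interpolation (hypothetical helper).

    minority: list of equal-length numeric feature vectors (minority class).
    Returns n_synthetic new vectors lying on segments between a sample
    and one of its k nearest minority-class neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        x = minority[i]
        # k nearest minority neighbours of x, excluding x itself
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(x, p))[:k]
        nb = rng.choice(neighbours)
        u = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([a + u * (b - a) for a, b in zip(x, nb)])
    return synthetic
```

Because every synthetic point lies between two existing minority samples, the sketch never produces values outside the range already spanned by the minority class.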
Learning with imbalanced datasets has been a major topic of study for many years. In this paper, we focus on a type of imbalance called imbalance due to rare instances. Such imbalances occur in a variety of domains. Rare instances have received less focus in prediction problems and we wish to draw attention to how accuracy can be improved in the presence of rare data. We discuss an approach to regression...
In this paper we present a Dynamic Sampling Framework for use with multi-class imbalanced data containing any number of classes. The framework makes use of existing sampling techniques such as RUS, ROS, and SMOTE and ties the classification algorithm into the sampling process in a wrapper-like manner. In doing so, the framework is able to search for a desirably sampled training set, thus eliminating...
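The two simplest sampling techniques named above, random oversampling (ROS) and random undersampling (RUS), can be sketched as follows. This is only an illustration of the building blocks, assuming each class is held as a non-empty list of samples; the function names are hypothetical and the wrapper search of the framework itself is not shown.

```python
import random


def random_oversample(samples, target, seed=0):
    """ROS sketch: duplicate randomly chosen samples until `target` is reached."""
    rng = random.Random(seed)
    n_extra = max(0, target - len(samples))
    extra = [rng.choice(samples) for _ in range(n_extra)]
    return list(samples) + extra


def random_undersample(samples, target, seed=0):
    """RUS sketch: keep a random subset of `target` samples without replacement."""
    rng = random.Random(seed)
    if target >= len(samples):
        return list(samples)
    return rng.sample(list(samples), target)
```

A wrapper-style search would call such functions with different target sizes per class and keep the sampled training set on which the classifier scores best.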
The goal of class prediction studies is to develop rules to accurately predict the class membership of new subjects. The classifiers differ in the way they combine the values of the variables available for each subject. Frequently the classifiers are developed using class-imbalanced data, where the number of samples in each class is not equal. Standard classification methods used on class-imbalanced...
Multi-class classification has become a challenging problem in bioinformatics research. The problem becomes more difficult as the number of classes increases. Decomposing the problem into a set of binary problems can be a good solution in some cases. One popular approach is to build a hierarchical tree structure in which a binary classifier is used at each node of the tree. This paper...
Even though facial expressions have universal meaning in communication, their appearance varies widely due to many factors, such as different image acquisition setups, ages, genders, and cultural backgrounds. Because collecting enough annotated samples for each target domain is impractical, this paper investigates the problem of facial expression recognition in...
Despite early success in automatic chord recognition, recent efforts are yielding diminishing returns while basically iterating over the same fundamental approach. Here, we abandon typical conventions and adopt a different perspective of the problem, where several seconds of pitch spectra are classified directly by a convolutional neural network. Using labeled data to train the system in a supervised...
More than a decade of research has produced numerous representations and similarity measures to support time series classification and clustering. Yet most of the work in the field is so focused on the representation or similarity measure that it ignores the possibility of improving performance using ensembles of representations or classifiers. This paper explores ways of exploiting representational...
Ensembles of neural networks have been the focus of extensive studies over the past two decades. Effectively encouraging diversity remains a key element in yielding improved performance from such ensembles. Negatively correlated learning (NCL) has emerged as a promising framework for concurrently training an ensemble of learners while emphasizing the cooperation among them. The NCL methodology relies...