Random Forests and their many variations have developed into one of the most successful instruments for automatically analysing image data. One of the most crucial parts is the definition and selection of node tests within the individual trees, which among other things allow for trade-offs between accuracy and computational load. This paper discusses several different approaches to test creation and compares...
Two algorithms for building classification trees, based on Tsallis and Rényi entropy, are proposed and applied to the customer churn problem. The modeling dataset has a highly unbalanced proportion of the two classes, which is common in real-world applications and may degrade the classification performance of the algorithms. The quality measures for the obtained trees are compared...
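As an illustration of the two entropies named in the abstract above, the following minimal sketch computes the standard Tsallis entropy, S_q = (1 - Σ p_i^q) / (q - 1), and Rényi entropy, H_α = log(Σ p_i^α) / (1 - α), for the class distribution at a tree node. How the paper actually uses them inside its tree-building algorithms is not visible in the truncated abstract, so treating them as node impurity measures is an assumption, and the function names are hypothetical.

```python
import math
from collections import Counter


def class_probs(labels):
    """Empirical class probabilities of the labels reaching a node."""
    counts = Counter(labels)
    n = len(labels)
    return [c / n for c in counts.values()]


def tsallis_entropy(probs, q):
    """Tsallis entropy S_q = (1 - sum(p_i^q)) / (q - 1), defined for q != 1."""
    if q == 1:
        raise ValueError("q = 1 is the Shannon limit and is not handled here")
    return (1.0 - sum(p ** q for p in probs if p > 0)) / (q - 1.0)


def renyi_entropy(probs, alpha):
    """Renyi entropy H_a = log(sum(p_i^a)) / (1 - a), defined for a != 1."""
    if alpha == 1:
        raise ValueError("alpha = 1 is the Shannon limit and is not handled here")
    return math.log(sum(p ** alpha for p in probs if p > 0)) / (1.0 - alpha)


# Hypothetical unbalanced node: 9 samples of one class, 1 of the other.
node = class_probs([0] * 9 + [1])
impurity_tsallis = tsallis_entropy(node, q=2)   # equals the Gini index for q = 2
impurity_renyi = renyi_entropy(node, alpha=2)
```

With q = 2 the Tsallis entropy reduces to the Gini index, which is one reason these parametric entropies are natural candidates for split criteria; both measures are zero for a pure node.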
Ensemble techniques have been widely used for improving the classification performance, and recent studies show that ensembling classifiers through multi-modal perturbation can further improve the classification performance. In this paper, we propose a selective ensemble algorithm based on multi-modal perturbation (called SE_MP). In SE_MP, we devise a multi-modal perturbation method based on sampling...
Support Vector Machine (SVM) is a popular machine learning technique for classification. SVM becomes computationally infeasible on large datasets due to its long training time. In this paper we compare three different methods for reducing the training time of SVM. Different combinations of Decision Tree (DT), Fisher Linear Discriminant (FLD), QR Decomposition (QRD) and Modified Fisher Linear Discriminant...
Hyperspectral image classification is a challenging classification problem: obtaining complete and representative training sets is costly; pixels can belong to unknown classes; and it is generally an ill-posed problem. The need to achieve high classification accuracy surpasses the need to classify the entire image. To achieve this, we use classification with rejection by providing the classifier an...
Text categorization plays an important role in the fields of information retrieval, machine learning, natural language processing, data mining and others. With the development of computer and information technology, many classification algorithms have emerged. Each text classification algorithm produces results with differing speed and efficiency, depending on the features of the test text. It has been...
Random Forest is a well-known ensemble learning method that achieves high recognition accuracies while preserving a fast training procedure. To construct a Random Forest classifier, several decision trees are arranged in a forest, and a majority vote yields the final decision. In order to split each node of a decision tree into two children, several possible variables are randomly selected while...
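The majority-voting step described in the abstract above can be sketched as follows. This is only an illustration of the general idea, not the paper's implementation: the trees are stand-in callables, and all names are hypothetical.

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Return the label predicted by most trees.

    Ties go to the label that was encountered first (Counter ordering).
    """
    return Counter(tree_predictions).most_common(1)[0][0]


def forest_predict(trees, sample):
    """Ask every tree for a label and combine the answers by majority vote."""
    return majority_vote([tree(sample) for tree in trees])


# Hypothetical "trees": each is a one-feature threshold test (a decision stump).
trees = [
    lambda x: "pos" if x[0] > 0.5 else "neg",
    lambda x: "pos" if x[1] > 0.5 else "neg",
    lambda x: "pos" if x[0] + x[1] > 1.2 else "neg",
]
label = forest_predict(trees, (0.9, 0.2))
```

Here two of the three stumps vote "neg" for the sample, so the forest returns "neg".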
Multi-index hashing (MIH) is the state-of-the-art method for indexing binary codes, as it divides long codes into substrings and builds multiple hash tables. However, MIH assumes that the dataset codes are uniformly distributed, and loses efficiency when dealing with non-uniformly distributed codes. Besides, many results share the same Hamming distance to a query, which makes...
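The basic indexing idea mentioned in the abstract above (split each code into substrings, one hash table per substring position) can be sketched as below. This is a simplified illustration under stated assumptions, not the paper's method: codes are plain bit strings of equal length, the search radius r must be smaller than the number of substrings m, and only exact substring matches are probed. Under that restriction the pigeonhole principle guarantees that any code within Hamming distance r of the query agrees with it exactly on at least one substring. All function names are hypothetical.

```python
from collections import defaultdict


def split_code(code, m):
    """Split a bit string into m contiguous substrings of near-equal length."""
    n = len(code)
    bounds = [i * n // m for i in range(m + 1)]
    return [code[bounds[i]:bounds[i + 1]] for i in range(m)]


def build_tables(codes, m):
    """Build one hash table per substring position: substring -> code indices."""
    tables = [defaultdict(list) for _ in range(m)]
    for idx, code in enumerate(codes):
        for table, sub in zip(tables, split_code(code, m)):
            table[sub].append(idx)
    return tables


def hamming(a, b):
    """Hamming distance between two equal-length bit strings."""
    return sum(x != y for x, y in zip(a, b))


def search(query, codes, tables, r):
    """Return indices of codes within Hamming distance r of the query (r < m)."""
    m = len(tables)
    if r >= m:
        raise ValueError("this sketch only supports r < m (exact substring probes)")
    candidates = set()
    for table, sub in zip(tables, split_code(query, m)):
        candidates.update(table.get(sub, []))
    # Candidates only share one substring; verify the full distance.
    return sorted(i for i in candidates if hamming(query, codes[i]) <= r)


codes = ["00000000", "00000001", "11111111", "00110000"]
tables = build_tables(codes, m=4)
neighbours = search("00000000", codes, tables, r=1)
```

The candidate set is usually much smaller than the dataset when substrings are spread evenly over the buckets; when the codes are non-uniformly distributed, some buckets grow large and the verification step dominates, which is the inefficiency the abstract refers to.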
In this paper, a novel cultural event classification algorithm based on convolutional neural networks is proposed. The proposed method first extracts regions that contain meaningful information. Then, convolutional neural networks are trained to classify the extracted regions. The final classification of a scene is performed by combining the classification results of each extracted region of the...
In this paper, we consider multi-sensor classification when there is a large number of unlabeled samples. The problem is formulated under the multi-view learning framework and a Consensus-based Multi-View Maximum Entropy Discrimination (CMV-MED) algorithm is proposed. By iteratively maximizing the stochastic agreement between multiple classifiers on the unlabeled dataset, the algorithm simultaneously...
Large-scale distributed learning plays an ever-increasing role in modern computing. However, whether using a compute cluster with thousands of nodes, or a single multi-GPU machine, the most significant bottleneck is that of communication. In this work, we explore the effects of applying quantization and encoding to the parameters of distributed models. We show that, for a neural network, this...
Image classification generally relies on a feature extraction mechanism. In every domain of image analysis, classification accuracy depends on how well the feature set is generated, since it helps the machine learn and predict the class label of an unknown sample. In this paper, a novel feature extraction mechanism is proposed and named Counting Label Occurrence Matrix...
The voting-based Extreme Learning Machine was recently proposed to reduce the error due to variance in the Extreme Learning Machine. This paper further refines the algorithm by using entropy-based ensemble pruning. The results obtained show a significant improvement in performance along with a reduction in computational and storage requirements.
We analyze the performance of classification schemes on information collected from social conversations posted on Twitter among audiences of a popular US-based TV show. In this research, we consider entropy as a measure of information exchange in a group conversation that is related to social, temporal, and second screen device features. The group conversations are identified by hashtags present in...
A weighted least squares scheme based on an empirical survival error potential function is proposed in this paper. The empirical survival error potential function provides an error compensation scheme for noise distributions far from being Gaussian. This error compensation procedure is efficiently implemented via a weighted least squares formulation where an analytical solution form is obtained. The...
To address properties of remote sensing image data such as high dimensionality, nonlinearity and massive numbers of unlabeled samples, a probability least squares support vector machine (PLSSVM) classification method based on hybrid entropy and the L1 norm was proposed. Firstly, hybrid entropy was designed by combining quasi-entropy with entropy difference, which was used to select the most "valuable"...
Cross-situational learning, the ability to learn word meanings across multiple scenes consisting of multiple words and referents, is thought to be an important tool for language acquisition. The ability has been studied in infants, children, and adults, and yet there is much debate about the basic storage and retrieval mechanisms that operate during cross-situational word learning. It has been difficult...
This paper presents an empirical study on machine learning based sentiment analysis for Vietnamese, in which we focus on the task of sentiment classification. We investigate the task with regard to both the learning model and linguistic features. We also introduce an annotated corpus for sentiment classification extracted from hotel reviews in Vietnamese and conduct a series of experiments and analyses...
Attacks against web servers and web-based applications remain a serious global network security threat. Attackers are able to compromise web services, collect confidential information from web databases, and interrupt or completely paralyze web servers. In this study, we consider the analysis of HTTP logs for the detection of network intrusions. First, a training set of HTTP requests which does not contain...
This paper introduces a novel method for forming efficient one-class classifier ensembles. A common problem in one-class classification is a complex structure of the target class, which often leads to the creation of an overly expanded decision boundary. We propose to employ a clustering step in order to partition the target class into atomic subsets and to use these as input for one-class classifiers. By...