The quality of the training data used in a supervised image classification can affect the accuracy of the resulting thematic map. Here the effects of mis-labeled training cases on the accuracy of classifications by discriminant analysis and a support vector machine were explored. The accuracy of both classifiers varied with the amount and nature of mis-labeled training cases. In particular,...
We study the problem of online multitask learning for solving multiple related classification tasks in parallel, aiming at classifying every sequence of data received by each task accurately and efficiently. One practical example of online multitask learning is the micro-blog sentiment detection on a group of users, which classifies micro-blog posts generated by each user into emotional or non-emotional...
The growth of the microblogging service “Twitter” is dramatic among online users in Thailand. Communication on Twitter is very lively and up-to-date, since users often express their feelings and sentiments in Twitter posts related to current topics or newly growing topics. While sentiment analysis on Twitter faces challenges in language-related issues, such as short message length and word usage variation, it also...
The automatic insertion of diacritics in electronic texts is necessary for a number of languages, including French, Romanian, Croatian, Sindhi, Vietnamese, etc. When diacritics are removed from a word and the resulting string of characters is not a word, it is easy to recover the diacritics. However, sometimes the resulting string is also a word, possibly with different grammatical properties or a...
Our research involves an original method for monitoring the intensity of speech defects in child patients with developmental dysphasia. We have drawn upon a body of knowledge consisting of phonetics, acoustics and ANN applications. The aim of the paper is to compare two methods based on the vowel detector, both of which classify the parameter of developmental dysphasia, with the results of the speech...
Classification and prediction are effective tools in anomaly and fault detection. They can be used in development of a continuous learning prediction method. Inaccurate classification will result in either too many or too few anomalies or under- and over-diagnosis. The confidence of prediction relies on the accurate determination of class centers and borders based on the adequate training data. This...
Information has great value; to use existing information, we need to store it in a manner that allows it to be retrieved easily when needed. Classifying the available information therefore becomes essential. In addition to the existing supervised and unsupervised paradigms of classification, the paper attempts to exploit the semi-supervised learning paradigm. Semi-supervised learning is...
In this paper we present a new One-Versus-All or OVA-based scheme for multi-class classification problems, aiming to reduce the training time when applying support vector machines (SVMs), particularly on large datasets. The experimental results on ten benchmark datasets show that the performance of the proposed scheme, referred to as "VINE", is comparable to that of its predecessor OVA scheme,...
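As a rough illustration of the plain One-Versus-All idea mentioned in this abstract (not the "VINE" scheme, whose details are not given here), the sketch below trains one binary scorer per class and predicts the class whose scorer gives the highest score. To stay dependency-free it swaps the SVM for a simple difference-of-class-means linear scorer; the function names and the toy data are assumptions made for the example only.

```python
# One-Versus-All (OVA) sketch: one "class vs. rest" linear scorer per class.
# Assumption: a difference-of-means linear scorer stands in for the SVM.

def mean(points):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def fit_binary(pos, neg):
    """Linear scorer w.x + b that is zero midway between the two class means."""
    mp, mn = mean(pos), mean(neg)
    w = [a - b for a, b in zip(mp, mn)]
    mid = [(a + b) / 2 for a, b in zip(mp, mn)]
    b = -sum(wi * mi for wi, mi in zip(w, mid))
    return w, b

def fit_ova(X, y):
    """Train one binary scorer per label, treating all other labels as negatives."""
    models = {}
    for label in sorted(set(y)):
        pos = [x for x, t in zip(X, y) if t == label]
        neg = [x for x, t in zip(X, y) if t != label]
        models[label] = fit_binary(pos, neg)
    return models

def predict_ova(models, x):
    """Return the label whose scorer assigns x the highest score."""
    def score(label):
        w, b = models[label]
        return sum(wi * xi for wi, xi in zip(w, x)) + b
    return max(models, key=score)

# Hypothetical toy data: three well-separated 2-D clusters.
X = [[0, 0], [1, 0], [0, 1],
     [10, 0], [11, 0], [10, 1],
     [0, 10], [1, 10], [0, 11]]
y = [0, 0, 0, 1, 1, 1, 2, 2, 2]

models = fit_ova(X, y)
print(predict_ova(models, [9, 1]))
```

With K classes, OVA trains K binary models on the full training set each time, which is why the cost of each binary training run matters on large datasets.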
Coreference is a common linguistic phenomenon in natural language understanding; it plays an important role in simplifying expression and linking up the context. In this paper, the support vector machine algorithm is applied to the problem of Chinese coreference; we fully consider the important characteristics that relate to coreference and integrate them effectively to build model...
Feature selection and weighting are common ways to improve the KNN classification algorithm. In this paper, we use the reverse cloud algorithm to map the training samples into clouds. Each attribute is mapped to a cloud vector. The reverse cloud algorithm is not sensitive to noise in data sets and can effectively eliminate the impact of noise on classification. By comparing the similarity of clouds...
Our aim in this paper is to propose a rule-weight learning algorithm in fuzzy rule-based classifiers. The proposed algorithm is presented in two modes: first, all training examples are assumed to be equally important and the algorithm attempts to minimize the error-rate of the classifier on the training data by adjusting the weight of each fuzzy rule in the rule-base, and second, a weight is assigned...
As reliance on Internet connected systems expands, the threat of damage from malicious actors, especially undetected actors, rises. Masquerade attacks, where one individual or system poses as another, are among the most harmful and difficult to detect types of intrusion. Previous efforts to detect masquerade attacks have focused on host-based approaches, including command line, system call, and GUI...
The SVDD (support vector data description) is one of the most well-known one-class support vector learning methods, in which one tries the strategy of utilizing balls defined on the feature space in order to distinguish a set of normal data from all other possible abnormal objects. The usual strategy of the SVDD depends on the process of finding the region for the normal-class training data with somewhat...
Support vector machines have been widely used in classification problems. This paper proposes a new cascade support vector machine classification algorithm, CSVM, which combines the AdaBoost algorithm framework with the support vector machine (SVM) to deal with the problem of multiple classifiers. For the problem of time consumption in multi-class classification problems with support vector machines, this paper...
Class posterior distributions have recently been used quite successfully in Automatic Speech Recognition (ASR), either for frame or phone level classification or as acoustic features, which can be further exploited (usually after some “ad hoc” transformations) in different classifiers (e.g., in Gaussian Mixture based HMMs). In the present paper, we present preliminary results showing that it may be possible...
We present a new and computationally efficient scheme for classifying signals into a fixed number of known classes. We model classes as subspaces in which the corresponding data is well represented by a dictionary of features. In order to ensure low misclassification, the subspaces should be incoherent so that features of a given class cannot represent efficiently signals from another. We propose...
Although an improvement of hierarchical text classification can be achieved by using hierarchical structure information, existing hierarchical text classification methods suffer from two problems: data skew (especially in large-scale hierarchy) and error propagation. In this paper, we first define the concept of path-based semantic vector for the presentation of categories. Then a set of additional...
In this work we propose a novel method for automatic discrete speech recognition composed of two steps. In the first step, discrete speech features are extracted by means of Mel Frequency Cepstral Coefficients (MFCCs) followed by vector quantization (VQ). Then, in the second step, the obtained features are fed to a Tree distribution classifier which provides the class-label associated with each feature...
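The vector quantization step named in this abstract can be sketched as mapping each continuous feature vector to the index of its nearest codebook entry, which turns a sequence of frames into a sequence of discrete symbols. The sketch below is only an illustration under stated assumptions: the codebook is fixed by hand rather than learned, the two-dimensional frames are placeholders for real MFCC vectors, and the names `quantize`, `CODEBOOK`, and `FRAMES` are hypothetical.

```python
# Vector quantization sketch: replace each feature vector by the index of
# its nearest codeword (squared Euclidean distance).

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def quantize(frames, codebook):
    """Return, for each frame, the index of the closest codebook entry."""
    return [
        min(range(len(codebook)), key=lambda k: squared_distance(frame, codebook[k]))
        for frame in frames
    ]

# Hypothetical hand-picked codebook; in practice it would be learned from data.
CODEBOOK = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]

# Hypothetical 2-D stand-ins for MFCC frames.
FRAMES = [[0.1, -0.2], [0.9, 1.2], [3.5, 0.3], [0.6, 0.6]]

SYMBOLS = quantize(FRAMES, CODEBOOK)
print(SYMBOLS)  # [0, 1, 2, 1]
```

The resulting symbol sequence is the kind of discrete input a downstream classifier over discrete features could consume.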
When collecting network connection information, we cannot obtain a complete data set at once, which results in insufficient SVM training and a high prediction error rate. To solve this problem, this paper proposes a new method that combines the support vector machine with a clustering algorithm, based on analyzing the relation between boundary support vectors and the KKT condition. In the method, firstly,...
SVM-based classification needs a lot of labeled data to train the classifier model, but labeling a training dataset is a time-consuming and labor-intensive task. Furthermore, the feature space is commonly sparse because of the high dimensionality of text. All of these factors can influence classification performance. We propose an SVM-based text classification with the SSK-means clustering algorithm where little...