This paper describes a novel approach to pattern classification that combines Parzen window and support vector machines. Pattern classification is usually performed in universes where all possible categories are defined. Most of the current supervised learning classification techniques do not account for undefined categories. In a universe that is only partially defined, there may be objects that...
This paper investigates a method for instance selection in the context of supervised classification adapted to large databases. Based on the scale-up concept, the method reduces the time required to perform the selection procedure by enabling the application of known condensation instance techniques to only small data sets instead of the whole set. The novelty of our approach lies in the way of...
The daily increase in malware exploiting the Internet has become a serious threat. Manual heuristic inspection in malware analysis is no longer considered effective or efficient given the high spreading rate of malware. Hence, automated behavior-based malware detection using machine learning techniques is considered a profound solution. The behavior of each malware on an emulated...
Cloud computing is an elastic computing model in which users can lease resources from a rentable infrastructure. Cloud computing is gaining popularity due to its lower cost, high reliability and huge availability. To utilize the powerful and huge capability of cloud computing, this paper brings it into the data mining and machine learning field. As one of the most influential and open competition...
The performance of classification methods using machine learning techniques largely depends on the quality of the data used in learning. Transformation techniques are used to increase the efficiency of classification because each type of data is suitable for different classification techniques. This study aims to provide a comparative performance of different classification techniques by changing...
Supervised Learning (SL) is a machine learning research area which aims at developing techniques able to take advantage of labeled training samples to make decisions over unseen examples. Recently, a lot of tools have been presented in order to perform machine learning in a more straightforward and transparent manner. However, one problem that is increasingly present in most of the SL problems being...
While advances in sensor and signal processing techniques have provided effective tools for quantitative research on traditional Chinese pulse diagnosis (TCPD), the automatic classification of pulse waveforms remains a difficult problem. To address this issue, this paper proposes a novel edit distance with real penalty (ERP)-based k-nearest neighbors (KNN) classifier by referring to recent progresses...
Financial distress is the most synthetic form of business crisis, and financial distress prediction (FDP) has been a widely and continually studied topic in the field of corporate finance. This paper attempts to put forward OR-CBR within a K-nearest neighbors model, which can serve as the implementation of the corresponding algorithm.
In a 2006 TPAMI paper, Wang proposed the neighborhood counting measure (NCM), a similarity measure for the k-NN algorithm. In his paper, Wang mentioned the minimum risk metric (MRM), an early distance measure based on the minimization of the risk of misclassification. Wang did not compare NCM to MRM because of its allegedly excessive computational load. In this comment paper, we complete the comparison...
Stuttering is a speech disorder in which the normal flow of speech is disrupted by occurrences of dysfluencies, such as repetitions, interjections and so on. There is a high proportion of repetitions and prolongations in stuttered speech, usually at the beginning of sentences. Consequently, acoustic analysis can be used to classify the stuttered events. This paper describes particular stuttering events...
This paper proposes an efficient training strategy for one-class support vector machines. The strategy exploits the fact that a trained one-class SVM uses only points residing in the exterior region of the data distribution as support vectors. Thus the proposed training set reduction method selects the so-called extreme points which sit on the boundary of the data distribution, through local geometry...
In this paper the Ex(α) Reinforcement Learning algorithm is presented. This algorithm is designed to deal with problems where the use of continuous actions has clear advantages over the use of fine-grained discrete actions. This new algorithm is derived from a baseline discrete actions algorithm implemented within a kind of κ-nearest neighbors approach in order to produce a probabilistic representation...
Red tides pose a significant environmental and economic threat in the Gulf of Mexico. Timely detection of red tides is important for understanding this phenomenon. In this paper, learning approaches based on k-nearest neighbors, random forests and support vector machines have been evaluated for red tide detection from MODIS satellite images. Detection results from our algorithms were compared with...
In distribution systems, the determination of the load is relatively simple when measurements are available. Frequently, due to various causes such as metering and transmission equipment failures, data are missing for part or all of a day. In these cases, an estimate of a "corrected" value must be made. The paper presents two methods (k-Nearest Neighbors (kNN) and clustering methods) for treatment...
We introduce a text-based image feature and demonstrate that it consistently improves performance on hard object classification problems. The feature is built using an auxiliary dataset of images annotated with tags, downloaded from the Internet. We do not inspect or correct the tags and expect that they are noisy. We obtain the text feature of an unannotated image from the tags of its k-nearest neighbors...
Accurate probability estimation generated by learning models is desirable in some practical applications, such as medical diagnosis. In this paper, we empirically study traditional decision-tree learning models and their variants in terms of probability estimation, measured by conditional log likelihood (CLL). Furthermore, we also compare decision tree learning with other kinds of representative learning:...
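Several of the abstracts listed above build on k-nearest neighbors classification. As a point of reference only, a minimal sketch of plain k-NN with Euclidean distance and majority voting might look like the following. It does not reproduce any of the listed papers' methods; the function name `knn_predict` and the toy data are hypothetical and chosen purely for illustration.

```python
from collections import Counter
import math


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    Illustrative sketch only: plain Euclidean distance, unweighted votes.
    """
    # Distance from the query to every training point, paired with its label
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbors
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Majority vote over the neighbor labels
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated clusters
points = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (0.9, 1.1)]
labels = ["a", "a", "b", "b"]
```

Calling `knn_predict(points, labels, (0.05, 0.1), k=3)` assigns the query to the cluster holding most of its three closest neighbors. The abstracts above vary this basic scheme, for example by swapping in a different distance or similarity measure.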