Individual protection, whether physical or mental, is very important for anyone living in a risky environment. Insurance is one form of individual protection against accident, fire, critical illness, or death. Insurance companies play a critical role in providing competitive insurance products with flexible features that depend on customer requirements. In order to compete with other competitors and fulfill...
This paper deals with the problem of designing efficient classifiers for a special case of incremental concept drift. We focus on classification based on a multiple classifier system. For the problem under consideration, we propose four simple methods of combining classification and evaluate them via computer experiments.
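The abstract does not say which four combination methods the paper proposes. As a generic illustration of how a multiple classifier system can merge the outputs of its members, here is a minimal Python sketch of two common combiners, plain majority voting and weighted voting. The function names and the idea of using weights (for example, each member's recent accuracy under drift) are assumptions made for illustration, not the paper's methods.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by most members.

    Ties are broken in favour of the label that appears first
    in the list of predictions.
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


def weighted_vote(predictions, weights):
    """Return the label with the largest total weight.

    `weights` is a hypothetical per-member score, e.g. accuracy
    measured on the most recent data chunk.
    """
    scores = {}
    for label, weight in zip(predictions, weights):
        scores[label] = scores.get(label, 0.0) + weight
    return max(scores, key=scores.get)


if __name__ == "__main__":
    member_outputs = ["a", "b", "a"]
    print(majority_vote(member_outputs))
    print(weighted_vote(member_outputs, [0.2, 0.9, 0.3]))
```

With weighted voting, a single member that has tracked the drifting concept well can outvote several stale members, which is one reason such schemes are often considered for drifting data.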
Nonparametric Wilcoxon regressors, which generalize the rank-based Wilcoxon approach for linear parametric regression problems to nonparametric neural networks, were recently developed aiming at improving robustness against outliers in nonlinear regression problems. It is natural to investigate if the Wilcoxon approach can also be generalized to nonparametric classification problems. Motivated by...
This paper provides a short review of various association rule classifiers (ARC) that have been developed over the past decade and the common structure behind most ARCs. Furthermore, different pruning and classification schemes used in various ARCs are reviewed and two ARCs are discussed which break with the standard structure behind ARC.
Random forest is an excellent ensemble learning method, which is composed of multiple decision trees grown on random input samples and splitting nodes on a random subset of features. Due to its good classification and generalization ability, random forest has achieved success in various domains. However, random forest will generate many noisy trees when it learns from a data set that has high dimension...
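The two sources of randomness mentioned in this abstract can be sketched in a few lines of Python. This is an illustrative sketch only: it shows the bootstrap sampling of training rows for each tree and the random choice of candidate features at a split, and it does not grow actual trees. The function names and the toy data are assumptions.

```python
import random


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement (the random input sample for one tree)."""
    return [rng.choice(rows) for _ in rows]


def candidate_features_for_split(n_features, k, rng):
    """Pick k distinct feature indices to consider at one node split."""
    return sorted(rng.sample(range(n_features), k))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    # Toy data: three numeric features followed by a class label.
    rows = [(1.0, 0.2, 5.1, "x"), (0.4, 1.3, 4.8, "y"), (0.9, 0.7, 5.5, "x")]
    for tree_id in range(3):
        sample = bootstrap_sample(rows, rng)
        features = candidate_features_for_split(n_features=3, k=2, rng=rng)
        print(tree_id, len(sample), features)
```

When the data has many features and only a few are informative, a small random candidate set will often contain no informative feature at all, which is one way to see how noisy trees can arise in the high-dimensional setting the abstract refers to.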
Linear regression and classification techniques are very common in statistical data analysis, but they can often extract only linear models from data, which can be a limitation in real-data contexts. The aim of this study is to build an innovative procedure to overcome this defect. Initially, a multiple linear regression analysis using the best-subset algorithm was performed to determine the variables...
The Ant Colony Optimization (ACO) technique was inspired by the ants' behaviour throughout their exploration for food. The use of this technique has been very successful for several problems. Besides, Data Mining (DM) has emerged as an important technology with numerous practical applications, due to the wide availability of a vast amount of data. The collaborative use of ACO and DM is very promising...
Gene selection is a very important problem in the classification of serious diseases in clinical information systems. A limitation of gene selection methods is that they may result in gene sets with some redundancy and yield an unnecessarily large number of candidate genes for classification analysis. In the current work, a hybrid approach is presented in order to classify diseases, such as colon...
Stemming is a fundamental step in processing textual data preceding the tasks of text mining, Information Retrieval (IR), and natural language processing (NLP). The common goal of stemming is to standardize words by reducing a word to its base (root or stem), so it can also be considered a feature reduction technique. This paper aims at presenting a new dictionary-free, content-based Arabic stemmer...
In this paper, a modified XCS is proposed to reduce the number of learned rules. XCS is a type of learning classifier system and has been proven able to find accurate, maximal generalizations. However, XCS usually produces so many rules that the readability of the classification model is greatly reduced. As a result, XCS users may not be able to obtain the desired knowledge or useful information...
This study proposes a novel classification technique, GA/k-prototypes, which uses a genetic algorithm to take advantage of the k-prototypes clustering mechanism for classification. The genetic algorithm adjusts the weights applied to input attributes so that a majority of the data records in each cluster share the same outcome class. We conduct...
The sparsity-inducing character of the 1-norm penalty term of the Least Absolute Shrinkage and Selection Operator (LASSO) can be applied to automatic feature selection. Since the 1-norm SVM is also designed with a 1-norm (LASSO) penalty term, this study labels it as LASSO for classification. This paper introduces a smoothing technique into the 1-norm SVM and calls the result smooth LASSO for classification (SLASSO) to provide simultaneous...
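For reference, the standard soft-margin 1-norm SVM can be written as follows. The notation is assumed here (training pairs $(x_i, y_i)$ with $y_i \in \{-1, +1\}$, weight vector $w$, bias $b$, slack variables $\xi_i$, and trade-off parameter $C > 0$), and this is the usual textbook formulation rather than the smoothed variant the paper proposes, which the truncated abstract does not spell out.

```latex
\begin{aligned}
\min_{w,\, b,\, \xi} \quad & \|w\|_1 + C \sum_{i=1}^{n} \xi_i \\
\text{subject to} \quad & y_i \left( w^{\top} x_i + b \right) \ge 1 - \xi_i, \\
& \xi_i \ge 0, \qquad i = 1, \dots, n .
\end{aligned}
```

Because the penalty is $\|w\|_1 = \sum_j |w_j|$ rather than the squared 2-norm, many components of the optimal $w$ tend to be exactly zero, and the features with zero weight are effectively discarded.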
The analysis and classification of county-level basic public services play an important role in county economic growth and benefit the healthy development of urbanization in China. Because county-level basic public services data are large-scale and imbalanced, this paper presents a support vector machine model to classify county-level basic public services. The method...
Dataflow programming has been used to describe signal processing applications for many years, traditionally with cyclostatic dataflow (CSDF) or synchronous dataflow (SDF) models that restrict expressive power in favor of compile-time analysis and predictability. Dynamic dataflow is not restricted with respect to expressive power, but it does require runtime scheduling in the general case. Fortunately,...
Classification is a widely researched area in the machine learning and fuzzy communities with several approaches proposed by both communities. Some of the most relevant rule-based approaches from the machine learning community might include decision trees and rule inducers. The fuzzy community has also proposed many rule-based approaches, such as fuzzy decision trees and genetic fuzzy systems. This...
Simplified Silhouette Filter (SSF) is a recently introduced feature selection method that automatically estimates the number of features to be selected. To do so, a sampling strategy is combined with a clustering algorithm that seeks clusters of correlated (potentially redundant) features. It is well known that the choice of a similarity measure may have a great impact on clustering results. As a consequence,...
In this paper, classification of audio sources is presented to supplement current work on an existing system for localization of audio sources. Achieving audio classification depends on conveniently discriminating the feature vectors in the feature vector space. Characteristics based on frequency analysis were chosen and used as the feature vector. An artificial neural network was applied...
A new method is presented which combines a deterministic analytical method and a probabilistic measure to classify rock types on the basis of their hyperspectral curve shape. This method is a supervised learning algorithm using Gaussian Processes (GPs) and the Observation Angle Dependent (OAD) covariance function. The OAD covariance function makes use of the properties of the Spectral Angle Mapper...
We give sublinear-time approximation algorithms for some optimization problems arising in machine learning, such as training linear classifiers and finding minimum enclosing balls. Our algorithms can be extended to some kernelized versions of these problems, such as SVDD, hard margin SVM, and L2-SVM, for which sublinear-time algorithms were not known before. These new algorithms use a combination...
Data mining, or knowledge discovery, is seen as an increasingly important tool by modern businesses to transform data into an informational advantage. Mining is a process of finding correlations among dozens of fields in large relational databases and extracting useful information that can be used to increase revenue, cut costs, or both. Classification is a supervised machine learning procedure and an...