Tracking concept drifts in data streams has recently become a hot topic in data mining. Most of the existing work is built on a single-window-based mechanism to detect concept drifts. Due to the inherent limitation of the single-window-based mechanism, it is a challenge to handle different types of drifts. Motivated by this, a new classification algorithm based on a double-window mechanism for handling...
In machine learning classification, the classifier can be described by a set of rules, and the rules can be expressed by fuzzy granules corresponding to fuzzy concepts. In this paper we introduce fuzzy information granulation into the process of building a fuzzy classifier. Furthermore, we present an optimized machine learning classification algorithm based on information granulation. Experiments carried...
Classification on noisy data streams has recently become one of the most important topics in streaming data mining. In this paper, a Classification algorithm for mining Data Streams based on Mixture Models of C4.5 and NB, called CDSMM, is proposed. In this algorithm, C4.5 is used as the base classifier, a hypothesis testing method is introduced for the detection of concept drifts, and a Naïve Bayes...
“Gain-Based Separation” is a novel heuristic that modifies the standard multiclass decision tree learning algorithm to produce forests that can describe an example or object with multiple classifications. When the information gain at a node would be higher if all examples of a particular classification were removed, those examples are reserved for another tree. In this way, the algorithm performs...
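The comparison described in this abstract, the information gain at a node with and without the examples of one class, can be illustrated with a minimal Python sketch. The abstract does not give the paper's exact criterion, so the function names (`entropy`, `information_gain`, `gain_without_class`) and the binary split representation are assumptions made for illustration only.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(labels, split):
    """Gain of a binary split; split[i] is True if example i goes left."""
    n = len(labels)
    left = [y for y, s in zip(labels, split) if s]
    right = [y for y, s in zip(labels, split) if not s]
    return (entropy(labels)
            - (len(left) / n) * entropy(left)
            - (len(right) / n) * entropy(right))

def gain_without_class(labels, split, cls):
    """Gain of the same split after removing every example of class cls."""
    kept = [(y, s) for y, s in zip(labels, split) if y != cls]
    if not kept:
        return 0.0
    ys, ss = zip(*kept)
    return information_gain(list(ys), list(ss))

# Hypothetical node: class "c" is spread across both branches.
labels = ["a", "a", "b", "b", "c", "c"]
split = [True, True, False, False, True, False]
full_gain = information_gain(labels, split)
reduced_gain = gain_without_class(labels, split, "c")
```

In this toy node the split separates "a" from "b" perfectly once "c" is set aside, so `reduced_gain` exceeds `full_gain`; under the heuristic as summarized above, the "c" examples would then be candidates for another tree.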
A census can provide fundamental population data for the whole nation. Census data are rich in hidden information that can be used to investigate national conditions and national power. Data Mining aims at extracting implicit, previously unknown, and potentially useful knowledge from voluminous, incomplete, fuzzy, stochastic data. Using Data Mining in census data can make full...
A new method for detecting and classifying loudspeaker faults is presented in this paper. The total response of high-order harmonic groups is measured and used as the defect features of the loudspeaker. Based on the support vector machine (SVM), we built a classification system that combines a one-class SVM with a Directed Acyclic Graph SVM (DAGSVM). Compared with the K-nearest neighbor (k-NN) classifier, the accuracy...
A new clustering classification approach based on the fuzzy closeness relationship (FCR) is studied in this paper. Fuzzy clustering classification is one of the important and valid methods for knowledge discovery. One problem in fuzzy clustering classification is to determine a certain fuzzy sample classification in a given limited sample space. Another is its validity, that is to say, if the...
This paper gives a detailed discussion of the contributions of feature attributes to the classifying attribute in the nonlinear classification model based on the Choquet integral. The work provides a new understanding of the geometric structure of the model in terms of the contribution rates of the feature attributes towards the classification, as well as the interaction among them.
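For readers unfamiliar with the aggregation underlying this model, the following is a minimal sketch of the discrete Choquet integral with respect to a fuzzy measure. It is not the paper's classifier; the function name `choquet_integral`, the dictionary representation of the measure, and the example values are assumptions, and the inputs are assumed to be nonnegative.

```python
def choquet_integral(x, mu):
    """Discrete Choquet integral of nonnegative values x w.r.t. measure mu.

    x  : list of feature values, x[i] >= 0
    mu : dict mapping frozenset of feature indices to a measure value
    """
    # Visit features in ascending order of their values.
    order = sorted(range(len(x)), key=lambda i: x[i])
    total = 0.0
    prev = 0.0
    for k, i in enumerate(order):
        # Set of features whose value is at least x[i].
        upper = frozenset(order[k:])
        total += (x[i] - prev) * mu[upper]
        prev = x[i]
    return total

# Hypothetical two-feature measure: mu({0}) + mu({1}) < mu({0, 1}),
# so the two features interact positively.
mu = {
    frozenset({0}): 0.3,
    frozenset({1}): 0.5,
    frozenset({0, 1}): 1.0,
}
value = choquet_integral([0.2, 0.6], mu)  # 0.2 * 1.0 + 0.4 * 0.5
```

Because the measure is defined on subsets rather than on single features, the result depends on how features combine, which is the kind of interaction the abstract refers to.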
Accuracy is a very important criterion for a classifier in the process of classification. In this paper, a unified paradigm for calculating the accuracy of different classifiers, using topological covering-based granular computing, is presented under a given sample space and different ideal classification assumptions. Corresponding examples for the calculation of accuracy in different...
The Travel Information Service (TIS) system is an important component of an Intelligent Transportation System (ITS). By studying the relationship among the three factors in traffic, namely travelers, vehicles and roads, this paper presents a new classification method of TIS for public travel and transforms each type of service into a common representation using XML (eXtensible Markup Language), so that the services...
This paper presents an extension to the Rule-Based Similarity (RBS) model, a novel rough set approach to the problem of learning a similarity relation from data. The original model, proposed in [1], applied the notion of Tversky's feature contrast model in a rough set framework to facilitate an accurate case-based classification. In the dynamic RBS model, a dynamic reducts technique is used to broaden...
This work describes the development and evaluation of a recognizer for different levels of cognitive workload in the car. We collected multiple biosignal streams (skin conductance, pulse, respiration, EEG) during an experiment in a driving simulator in which the drivers performed a primary driving task and several secondary tasks of varying difficulty. From this data, an SVM based workload classifier...
We point out a problem inherent in the optimization scheme of many popular feature selection methods. It follows from the implicit assumption that a higher feature selection criterion value always indicates a more preferable subset, even if the value difference is marginal. This assumption ignores the reliability of particular feature preferences, over-fitting, and feature acquisition cost. We propose...
Classifier ensembles based on selection-fusion strategy have recently aroused enormous interest. The main idea underlying this strategy is to use miniensembles instead of monolithic base classifiers in an ensemble in order to improve the overall performance. This paper proposes a classifier selection method to be used in selection-fusion strategies. The method involves first splitting the original...
Multi-way data analysis is a multivariate data analysis technique with wide application in some fields. Nevertheless, the development of classification tools for this type of representation is still incipient. In this paper we study the dissimilarity representation for the classification of three-way data, as dissimilarities allow the representation of multi-dimensional objects in a natural way...
In this paper, we address the problem of finding the best wavelet basis in wavelet packet analysis for applications based on classification. We implement and evaluate our proposed method in the design of a self-paced 2-state mental task-based brain-computer interface (BCI) as one possible type of classification-based applications. The autoregressive coefficients of the best wavelet basis are concatenated...
The standard 2-norm support vector machine (SVM for short) is known for its good performance in classification and regression problems. In this paper, the 1-norm support vector machine is considered, and a novel smoothing function method for Support Vector Classification (SVC) and Regression (SVR) is proposed in an attempt to overcome some drawbacks of the former methods which are complex, subtle,...
We consider the problem of classification in nonadaptive dimensionality reduction. Specifically, we bound the increase in classification error of Fisher's Linear Discriminant classifier resulting from randomly projecting the high dimensional data into a lower dimensional space and both learning the classifier and performing the classification in the projected space. Our bound is reasonably tight,...
Decomposition methods are multiclass classification schemes where the polychotomy is reduced into several dichotomies. Each dichotomy is addressed by a classifier trained on a training set derived from the original one on the basis of the decomposition rule adopted. These new training sets may present a disproportion between the classes, harming the global recognition accuracy. Indeed, traditional...
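The imbalance mentioned in this abstract can be seen in a small sketch. It assumes a one-per-class (one-vs-rest) decomposition rule, which is only one of the possible rules and is not confirmed as the one the paper uses; the helper names `one_vs_rest_splits` and `imbalance_ratio` are hypothetical.

```python
from collections import Counter

def one_vs_rest_splits(labels):
    """Relabel a multiclass label list into one binary problem per class."""
    classes = sorted(set(labels))
    return {c: [1 if y == c else 0 for y in labels] for c in classes}

def imbalance_ratio(binary_labels):
    """Majority-to-minority size ratio of a binary label list."""
    counts = Counter(binary_labels)
    minority = min(counts[0], counts[1])
    majority = max(counts[0], counts[1])
    return majority / minority if minority else float("inf")

# A perfectly balanced four-class training set (illustrative data).
labels = ["a"] * 10 + ["b"] * 10 + ["c"] * 10 + ["d"] * 10
splits = one_vs_rest_splits(labels)
ratios = {c: imbalance_ratio(b) for c, b in splits.items()}
```

Even though the original set is balanced, every derived dichotomy has 10 positives against 30 negatives, so the skew grows with the number of classes.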
In this paper, we propose a tree-structured multi-class classifier to identify annotations and overlapping text in machine-printed documents. Each node of the tree-structured classifier is a binary weak learner. Unlike a normal decision tree (DT), which only considers a subset of the training data at each node and is susceptible to over-fitting, we boost the tree using all training data at each node with...