In this paper we propose an approach for Chinese question analysis and answer extraction. A general question analysis process consists of keyword extraction and question classification. Question classification plays a crucial role in automatic question answering. To implement question classification, we have carried out experiments with Support Vector Machines (SVM) using four kinds of features:...
Normal support vector machine (SVM) algorithms are not suitable for classification of large data sets because of high training complexity. This paper introduces a novel SVM classification approach for large data sets. It has two phases. In the first phase, an approximate classification is obtained by SVM using fast clustering techniques to select the training data from the original data set. In the...
Recently, a huge wave of social media has had a significant impact on people's perceptions of technological domains. These perceptions are captured in several blogs/forums, where the themes relate to products of several companies. A company may be interested in tracking these themes as a source of customer perceptions and in detecting user sentiment. The keyword-based approaches for identifying such themes...
In this paper, we propose a touching round grain segmentation technique based on the center of each individual grain and the concavity of the image boundary. The objective of this work is to identify single grains and detect the position of touching round grains in a binary image. First, the center of each grain is detected by using conventional techniques such as a Morphological algorithm and Color component labeling. From the detected...
The k-means method is a widely used clustering technique because of its simplicity and speed. However, the clustering result depends heavily on the chosen initial value. In this report, we propose a seeding method with independent component analysis for the k-means method. Using a benchmark dataset, we evaluate the performance of our proposed method and compare it with other seeding methods.
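The dependence on the initial value mentioned above can be illustrated with a small sketch. It is not the ICA-based seeding method of the abstract; it is a plain Lloyd-style k-means on a hypothetical one-dimensional toy dataset, run from two hand-picked sets of initial centers (the function name `kmeans_1d` and all values are assumptions chosen for illustration).

```python
def kmeans_1d(points, centers, max_iter=100):
    """Plain Lloyd iteration on 1-D data; returns final centers and the
    sum of squared errors (SSE). Illustrative sketch only."""
    centers = list(centers)
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: (p - centers[i]) ** 2)
            clusters[nearest].append(p)
        # Update step: each center moves to the mean of its cluster
        # (an empty cluster keeps its previous center).
        new_centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    sse = sum(min((p - c) ** 2 for c in centers) for p in points)
    return centers, sse


# Hypothetical toy data: three well-separated groups on a line.
points = [0, 1, 2, 10, 11, 12, 20, 21, 22]

# One seed per group versus all seeds inside the first group.
good_centers, good_sse = kmeans_1d(points, [1, 11, 21])
bad_centers, bad_sse = kmeans_1d(points, [0, 1, 2])

print("good init:", good_centers, good_sse)
print("bad init: ", bad_centers, bad_sse)
```

With the second initialization the iteration settles in a local optimum that merges two of the groups, which is the kind of outcome a seeding method aims to avoid.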
As the AHP is a relatively crude sorting method, this paper aims to improve the construction of the AHP hierarchical structure model by clustering each layer according to the grey clustering of grey system theory, in order to reduce both the inconsistency and the number of judgment matrices. At the same time, the improvement not only simplifies the hierarchical structure, but also...
This paper presents an enhanced method of partitioning a dataset into clusters when dealing with the handwritten signature recognition problem. The goal of the present system is improving the performance of two previously developed systems. In the first version of our system we dealt with data extraction from signature images and obtained a recognition rate of 91.04% using the Naïve Bayes classifier...
An approach to identifying the phishing target of a given (suspicious) webpage is proposed, based on clustering the webpage set consisting of all its associated webpages and the given webpage itself. We first find its associated webpages, and then explore their relationships to the given webpage as their features for clustering. Such relationships include link relationship, ranking relationship, text...
We propose a method for multi-object segmentation in a projection plane. Our algorithm requires a stereo camera system called Subtraction Stereo, which extracts foreground information with a fixed stereo camera. The main contribution of this paper is how the image sequences that include partial occlusion of the foreground objects can be accurately segmented using mean shift clustering in real-time...
Fusion of multiple information sources can yield significant benefits to accomplishing certain learning tasks. This paper exploits the sparse representation of signals for the problem of data clustering. The method is built within the framework of spectral clustering algorithms, which convexly combines a real graph constructed from the given physical features with a virtual graph constructed from...
An inherent problem of unsupervised texture segmentation is the absence of previous knowledge regarding the texture patterns present in the images to be segmented. A new efficient methodology for unsupervised image segmentation based on texture is proposed. It takes advantage of a supervised pixel-based texture classifier trained with feature vectors associated with a set of texture patterns initially...
Based on the fuzzy C-means method and the characteristics of kernel-based methods, a kernel-based fuzzy clustering algorithm is presented, in which the objective function of fuzzy C-means is replaced by a Gaussian kernel objective function. The kernel-based fuzzy C-means clustering approach is used in the classification and recognition of remote sensing images, and the result shows that it can...
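As a rough illustration of replacing the fuzzy C-means objective with a Gaussian kernel objective, the sketch below follows one common kernel fuzzy C-means formulation from the literature, in which the distance term becomes 1 - K(x, v) with K(x, v) = exp(-(x - v)^2 / sigma^2). Whether the abstract's algorithm uses exactly these update rules is an assumption; the function name `kfcm_1d`, the parameter values, and the toy data are hypothetical.

```python
import math


def kfcm_1d(xs, centers, m=2.0, sigma=2.0, iters=50, eps=1e-12):
    """Kernel fuzzy C-means on 1-D data with a Gaussian kernel.
    Sketch of one common formulation, not necessarily the paper's."""
    v = list(centers)
    u = []
    n_clusters, n_points = len(v), len(xs)
    for _ in range(iters):
        # Gaussian kernel between every center and every point.
        k = [[math.exp(-((x - vi) ** 2) / sigma ** 2) for x in xs] for vi in v]
        # Membership update: u_ij is proportional to (1 - K)^(-1/(m-1));
        # eps guards against division by zero when a point equals a center.
        w = [[(1.0 - kij + eps) ** (-1.0 / (m - 1.0)) for kij in row] for row in k]
        totals = [sum(w[i][j] for i in range(n_clusters)) for j in range(n_points)]
        u = [[w[i][j] / totals[j] for j in range(n_points)] for i in range(n_clusters)]
        # Center update: mean of the points weighted by u^m * K.
        v = [
            sum(u[i][j] ** m * k[i][j] * xs[j] for j in range(n_points))
            / sum(u[i][j] ** m * k[i][j] for j in range(n_points))
            for i in range(n_clusters)
        ]
    return v, u


# Hypothetical toy data with two groups and rough initial centers.
xs = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
centers, memberships = kfcm_1d(xs, [0.5, 9.0])
print(centers)
```

Each column of `memberships` sums to one, so every point is shared between the clusters rather than assigned to exactly one, as in ordinary fuzzy C-means.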
In this paper, an easily implemented semi-supervised graph learning method is presented for dimensionality reduction and clustering, making the most of prior knowledge from limited pairwise constraints. We extend instance-level constraints to space-level constraints to construct a more meaningful graph. By decomposing the (normalized) Laplacian matrix of this graph, using the bottom eigenvectors leads...
Most existing dimensionality reduction methods obtain the low-dimensional embedding by preserving a certain property of the data, such as locality or neighborhood relationships. However, the intrinsic cluster structure of data, which plays a key role in analyzing and utilizing the data, has been ignored by the state-of-the-art dimensionality reduction methods. Hence, in this paper we propose a novel...
A new forest leaf area index (LAI) inversion method from multisource and multi-angle data, combined with a radiative transfer model and a strategy of k-means clustering and artificial neural network (ANN), is discussed. Four different temporal satellite images from the Landsat-5 TM (L5TM) and Beijing-1 microsatellite multispectral sensors (BJI) were selected to construct multisource and multi-angle data...
This paper addresses the problem of improving the accuracy of character recognition with a limited quantity of data. The key ideas are twofold. One is distortion-tolerant template matching via hierarchical global/partial affine transformation (GAT/PAT) correlation to absorb both linear and nonlinear distortions in a parametric manner. The other is the use of multiple templates per category obtained by...
Real-time classification of Internet traffic according to application types is vital for network management and surveillance. Identifying emerging applications based on well-known port numbers is no longer reliable. While deep packet inspection (DPI) solutions can be accurate, they require constant updates of signatures and become infeasible for encrypted payloads, especially in multimedia applications...
A genetic algorithm (GA) can be applied to various search or optimization problems. However, evaluating a large number of individuals is too costly. To deal with this problem, a fitness approximation method is needed that reduces the cost of evaluation while achieving performance similar to that of the general GA. We propose a fitness approximation using a combination of...
In traditional machine learning applications, only labeled data is used to train the classifier. In several real applications, labeled data are difficult, expensive, and time-consuming to obtain, and require human experts. Semi-supervised learning addresses this issue. Semi-supervised learning uses a large amount of unlabeled data, combined with the labeled data, to build better classifiers. The semi-supervised...
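The idea of combining a few labeled examples with many unlabeled ones can be sketched with self-training, one simple semi-supervised strategy. This is not claimed to be the method of the abstract: the nearest-centroid classifier, the function names `centroids` and `self_train`, and the toy data are all assumptions made for illustration.

```python
def centroids(labeled):
    """Mean of the 1-D feature value for each class label."""
    sums, counts = {}, {}
    for x, y in labeled:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def self_train(labeled, unlabeled):
    """Self-training with a nearest-centroid classifier (needs >= 2 classes).
    Each round pseudo-labels the unlabeled point the classifier is most
    confident about, then refits the centroids."""
    labeled = list(labeled)
    pool = list(unlabeled)
    while pool:
        cents = centroids(labeled)

        def margin(x):
            # Confidence: gap between the two closest class centroids.
            d = sorted(abs(x - c) for c in cents.values())
            return d[1] - d[0]

        best = max(pool, key=margin)
        label = min(cents, key=lambda y: abs(best - cents[y]))
        labeled.append((best, label))
        pool.remove(best)
    return centroids(labeled)


# Hypothetical toy data: one labeled example per class, eight unlabeled points.
labeled = [(0.0, "a"), (10.0, "b")]
unlabeled = [1.0, 2.0, 3.0, 4.0, 6.0, 7.0, 8.0, 9.0]

final = self_train(labeled, unlabeled)
print(final)
```

Starting from centroids at the two labeled points, the unlabeled data gradually pull each centroid toward the middle of its group, which is the sense in which unlabeled data can refine a classifier trained on very few labels.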
A new methodology to learn descriptive linguistic Fuzzy Rule-based System Knowledge Bases from examples based on the combination of fuzzy clustering and evolutionary simultaneous rule selection and membership functions tuning is presented in this work. Fuzzy clustering is used to achieve a preliminary description of the data, in other words to obtain information on the definition of the linguistic...