With the rapid development of society and technology, home service robots are becoming cheaper and smarter. Facing the difficulties of an aging population and a shortage of labor, we can use a home service robot (HSR) as a good companion and servant. However, security and reliability problems have become bottlenecks in this field. Research on fault diagnosis of HSRs is therefore meaningful. Due to its excellent...
With the evolution of large-scale computer data, every corner of society is filled with a variety of text information. Indeed, manual management of such large volumes of data can no longer keep pace with the rapid development of society. Therefore, technology for efficient management and accurate positioning of vast quantities of text information has become a hot topic in the research community. In...
In this work we derive a novel clustering scheme for hyperspectral pixels according to the material they sense. We utilize statistical correlations that pixels sensing the same material exhibit. Specifically, kernel learning is combined with a norm-one regularized canonical correlations framework that can perform data clustering on nonlinearly dependent data. To tackle the derived minimization formulation...
The classification performance of the traditional one-class support vector machine (OCSVM) and its variants is often unsatisfactory when outliers are complex. In this case, assigning smaller weights to these outliers may reduce their influence on the classification boundary and enhance the robustness of OCSVM. In this paper, a novel adaptive-weighted one-class support vector machine...
Nonstationary streaming data are characterized by changes in the underlying distribution between subsequent time steps. Learning in such environments becomes even more challenging when labeled data are available only at the initial time step, and the algorithm is provided unlabeled data thereafter, a scenario referred to as extreme verification latency. Our previously introduced COMPOSE framework...
In clustering applications, multiple views of the data are often available. Although clustering could be done within each view independently, exploiting information across views promises to improve clustering accuracy. A common assumption in the field of multi-view learning is that the clustering results from multiple views should be consistent with a latent clustering. However, the potential...
The imbalanced learning problem is becoming pervasive in today's data mining applications. This problem refers to the uneven distribution of instances among the classes which poses difficulty in the classification of rare instances. Several undersampling as well as oversampling methods were proposed to deal with such imbalance. Many undersampling techniques do not consider distribution of information...
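As a point of reference for the undersampling idea mentioned above, here is a minimal sketch of plain random undersampling in Python. It is a generic baseline, not the method of the truncated abstract, and it ignores how information is distributed within each class. The function name `random_undersample` and the toy data are assumptions made for illustration.

```python
import random
from collections import Counter

def random_undersample(samples, labels, seed=0):
    """Randomly drop instances so every class keeps as many as the rarest class."""
    rng = random.Random(seed)
    # Group the instances by class label.
    by_class = {}
    for s, y in zip(samples, labels):
        by_class.setdefault(y, []).append(s)
    # The smallest class size is the target size for every class.
    n_min = min(len(items) for items in by_class.values())
    out_samples, out_labels = [], []
    for y, items in by_class.items():
        for s in rng.sample(items, n_min):  # sample without replacement
            out_samples.append(s)
            out_labels.append(y)
    return out_samples, out_labels

# Toy data (hypothetical): nine majority instances, three minority instances.
samples = list(range(12))
labels = [0] * 9 + [1] * 3
balanced_samples, balanced_labels = random_undersample(samples, labels)
class_counts = Counter(balanced_labels)
```

After the call, `class_counts` holds three instances per class, and all three minority instances are retained.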
We address the problem of how to design a more effective co-training scheme to tackle multi-view spectral clustering. The conventional co-training procedure treats information from all views equally and often converges to a compromised consensus view that does not fully utilize the multi-view information. We instead propose to learn an augmented view and construct its corresponding affinity matrix...
The disadvantages of the bag-of-words (BOW) model for image classification include the large amount of data involved in generating a codebook by clustering and redundant code words that may affect the classification results. The BOW classification process can be improved by adding Laplace weights to an improved fuzzy C-means algorithm, obtaining a codebook with more ability to distinguish between...
The dynamic ranking learning problem is considered for the case where the training sample is a data stream consisting of a sequence of series of objects, each characterized by a set of features and relative ranks within its series. The problem is reduced to preference learning to rank on clusters in the feature space of ranked objects, while an aggregated training dataset is formed from the centers of clusters and estimates...
Clustering algorithms are often used to analyze communication data in network intrusion detection systems. However, network communication data are mixed, containing both numerical and categorical attributes. This paper first puts forward a method for representing the cluster center (prototype) of mixed-type data. Then respectively in combination with the continuity characteristic of the numerical attributes...
Fault diagnosis is an important procedure for ensuring equipment efficiency and stability. The diagnosis process is essentially a pattern recognition process, and the fault samples usually lack fault-type labels. In this case, unsupervised learning methods are more applicable, and kernel clustering is one of the most effective. In this paper, a novel electromagnetic particle swarm...
Stochastic gradient descent (SGD) based methods offer a viable solution for training on large-scale datasets. However, traditional SGD-based methods cannot benefit from the distribution or geometry information carried in the data, because they sample the next data point for updating the model uniformly from the entire training set. We address...
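To make the uniform-sampling step concrete, the following minimal sketch runs plain SGD on a one-parameter least-squares problem, drawing each training point uniformly at random. It illustrates only the traditional baseline described above, not the abstract's proposed alternative. The function name `sgd_uniform`, the learning rate, and the noise-free toy data are assumptions made for illustration.

```python
import random

def sgd_uniform(xs, ys, lr=0.05, steps=2000, seed=0):
    """Fit y ~ w * x by SGD, picking the next sample uniformly at random."""
    rng = random.Random(seed)
    w = 0.0
    n = len(xs)
    for _ in range(steps):
        i = rng.randrange(n)  # uniform over the entire training set
        # Gradient of the squared error (w * x_i - y_i)^2 with respect to w.
        grad = 2.0 * (w * xs[i] - ys[i]) * xs[i]
        w -= lr * grad
    return w

# Noise-free toy data (hypothetical) generated with a true weight of 3.
xs = [0.5, 1.0, 1.5, 2.0]
ys = [3.0 * x for x in xs]
w_hat = sgd_uniform(xs, ys)
```

Because every index is equally likely, the sampler never uses where the points lie in feature space, which is the limitation the abstract points out.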
Kriging or Gaussian Process Regression has been successfully applied in many fields. One of the major bottlenecks of Kriging is the complexity in both processing time (cubic) and memory (quadratic) in the number of data points. To overcome these limitations, a variety of approximation algorithms have been proposed. One of these approximation algorithms is Optimally Weighted Cluster Kriging (OWCK)...
In order to better model complex real-world data and to develop robust features that capture relevant information, we usually employ unsupervised feature learning to learn a layer of feature representations from unlabeled data. However, developing domain-specific features for each task is expensive, time-consuming and requires expertise in the data. In this paper, we introduce multi-instance clustering...
In traditional multiple instance learning (MIL), both positive and negative bags are required to learn a prediction function. However, a high human cost is needed to know the label of each bag—positive or negative. Only positive bags contain our focus (positive instances) while negative bags consist of noise or background (negative instances). So we do not expect to spend too much to label the negative...
The difficulties of data streams, i.e. infinite length, the occurrence of concept drift and the possible emergence of novel classes, are topics of high relevance in the field of recognition systems. To overcome all of these problems, the system should be updated continuously with new data while the amount of processing time should be kept small. We propose an incremental Parzen window kernel density...
Information flow detection is dedicated to tracking the dynamics and evolution of Web information spreading across the entire web over time. How to choose a suitable information granularity for detection and how to track information evolution from one to another are the main challenges. Besides, the technical problem of doing this efficiently on large-scale information is yet to be solved....
Support vector machines (SVMs) are a widely used machine learning technique, but they suffer from high time and memory training complexity, a significant drawback especially in big data problems. SVMs incorporate kernel functions, which requires selecting the kernel and induces additional computational effort. In this paper, we address these issues and propose an SVM framework...
Label tree-based classification is one of the most popular approaches for reducing the testing complexity to sublinear in the number of classes. A popular way to generate the label tree is to recursively apply a spectral clustering algorithm to an affinity matrix to partition the set of class labels into subsets, where each subset corresponds to a child node of the tree. To obtain...
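For readers unfamiliar with Parzen window estimation, here is a minimal one-dimensional sketch with a Gaussian kernel that accepts samples one at a time. It is a textbook estimator, not the incremental scheme of the truncated abstract. In particular it stores every sample, so it does not address the bounded processing time the abstract calls for. The class name `ParzenDensity` and its methods are assumptions made for illustration.

```python
import math

class ParzenDensity:
    """Gaussian Parzen window density estimate over a growing 1-D sample set."""

    def __init__(self, bandwidth):
        if bandwidth <= 0:
            raise ValueError("bandwidth must be positive")
        self.h = bandwidth
        self.samples = []

    def add(self, x):
        """Update the estimator with one new observation from the stream."""
        self.samples.append(x)

    def density(self, x):
        """Return (1 / (n h)) * sum_i K((x - x_i) / h) with a Gaussian kernel K."""
        if not self.samples:
            return 0.0
        norm = 1.0 / (len(self.samples) * self.h * math.sqrt(2.0 * math.pi))
        return norm * sum(
            math.exp(-0.5 * ((x - s) / self.h) ** 2) for s in self.samples
        )

# Feed two observations (hypothetical values) and query between them.
kde = ParzenDensity(bandwidth=1.0)
kde.add(0.0)
density_one_sample = kde.density(0.0)
kde.add(2.0)
density_two_samples = kde.density(1.0)
```

Each call to `add` costs constant time here, but each `density` query is linear in the number of stored samples, so a streaming variant would need some form of sample merging or pruning.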