As many real-world data can elegantly be represented as graphs, various graph kernels and methods for computing them have been proposed. Surprisingly, many of the recent graph kernels do not employ the kernel trick anymore but rather compute an explicit feature map and report higher efficiency. So, is there really no benefit of the kernel trick when it comes to graphs? Triggered by this question,...
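The contrast between the kernel trick and an explicit feature map can be illustrated with a deliberately simple kernel. The sketch below is an assumption made for illustration only, not one of the kernels studied in the paper: each graph is reduced to its list of vertex labels (edges are ignored), and the hypothetical functions `kernel_explicit` and `kernel_implicit` compute the same vertex-label kernel in the two ways.

```python
from collections import Counter

def kernel_explicit(labels_g1, labels_g2):
    """Explicit feature map: phi(G) is the histogram of vertex labels,
    and the kernel is the dot product of the two histograms."""
    phi1, phi2 = Counter(labels_g1), Counter(labels_g2)
    return sum(count * phi2[label] for label, count in phi1.items())

def kernel_implicit(labels_g1, labels_g2):
    """Implicit computation: compare every pair of vertices directly
    and count the pairs whose labels match. No feature vector is built."""
    return sum(1 for u in labels_g1 for v in labels_g2 if u == v)

# Two toy "graphs", given only by their vertex labels (illustrative data).
g1 = ["C", "C", "O"]
g2 = ["C", "O", "O", "N"]

k_explicit = kernel_explicit(g1, g2)
k_implicit = kernel_implicit(g1, g2)
print(k_explicit, k_implicit)
```

Both functions return the same value. The explicit version costs time linear in the graph sizes once the histograms exist, while the implicit version compares all vertex pairs, which hints at why explicit maps can be more efficient when the feature space is small.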
As an important branch of biomedical information extraction, Protein-Protein Interaction extraction (PPIe) from the biomedical literature has been widely researched, and machine learning methods have achieved great success for this task. However, the word features generally adopted in existing methods suffer badly from the vocabulary gap and data sparseness, weakening classification performance....
The internet and Web 2.0 gave rise to a wide variety of user-generated content. This caused a massive growth in the amount and availability of opinionated information. This collection of complex, unstructured information is often referred to as Big Data. A common practical application of such Big Data is social media sentiment analysis. The general aim of sentiment analysis is to determine/extract...
Chemoinformatics aims to predict molecular properties using informational methods. The computer science research fields concerned with this domain are machine learning and graph theory. An interesting approach consists in using graph kernels, which make it possible to combine the graph theory and machine learning frameworks. Graph kernels define a similarity measure between molecular graphs corresponding to...
Spectral unmixing is the process of identifying pure spectral signatures, called endmembers, from hyperspectral data, and then expressing each pixel vector in terms of the fractional abundances of these endmembers. Most of the endmember extraction methods in the literature use only the spectral information, whereas the spatial composition of the data is disregarded. Spatial preprocessing methods,...
This paper discusses in detail the behavior of the basic k-means algorithm alongside four new algorithms with varied distance measures on gene expression data. In data mining, k-means clustering is a method which aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean. The traditional k-means is one of the most popular clustering methods...
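The k-means definition quoted in this abstract (each observation belongs to the cluster with the nearest mean) can be sketched as plain Lloyd iterations. This is a minimal illustration using squared Euclidean distance; the function names, the toy data, and the random initialization are assumptions, and the four distance-measure variants the paper compares are not reproduced here.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=100, seed=0):
    """Basic k-means: alternate assignment and mean-update steps."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # assumed init: k distinct data points
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to the cluster with the nearest mean.
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                  for p in points]
        # Update step: each center becomes the mean of its assigned points.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:  # converged: means no longer move
            break
        centers = new_centers
    return labels, centers

# Toy data: two well-separated groups of three points each.
data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
        (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
labels, centers = kmeans(data, k=2)
print(labels)
```

Swapping `sq_dist` for another distance function is the natural place to experiment with different distance measures, although the mean is only the optimal center for squared Euclidean distance.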
The support vector machine approach is an effective technique for poly-dimensional outlier detection, which can avoid the curse of dimensionality and has higher accuracy. One-class support vector machine-based outlier detection techniques take advantage of spatial and temporal correlations that exist between sensor data to cooperatively identify outliers. However, for large scale training...
A hypergraph is a data structure that captures many-to-many relations. It comes up in various contexts, one of those being the task of detecting fraudulent users of an on-line system given known associations between the users and the types of activities they take part in. In this work we explore three approaches for applying general-purpose machine learning methods to such data. We evaluate the proposed...
With the recent emergence of mobile platforms capable of executing increasingly complex software and the rising ubiquity of mobile platforms in sensitive applications such as banking, there is a rising danger associated with malware targeted at mobile devices. The problem of detecting such malware presents unique challenges due to the limited resources available and limited privileges granted...
Kernel machines (such as support vector machines) have demonstrated excellent performance in numerous areas of pattern recognition. However, traditional kernel machines do not make efficient use of both labeled training data and unlabeled testing data. Moreover, high-dimensional and nonlinearly distributed data generally degrade the performance of kernel classifiers due to the curse of dimensionality...
This paper proposes a new hybrid intelligent method for probabilistic short-term load forecasting (STLF) in power systems. It consists of Relevance Vector Machine (RVM) of the statistical learning method called Kernel Machine and regression tree (RT) of data mining. As the preconditioned technique of data, RT is used to classify learning data into some clusters with the data similarity. After classifying...
In this paper, we propose a new similarity-based k-partitions clustering approach, called CAWP. Given the similarities of pairs of objects in the dataset, CAWP groups these objects into K non-overlapping clusters. Each cluster is represented by multiple objects with different weights, called prototype weights. The more representative an object is with respect to a cluster, the larger prototype weight...
Support vector machines (SVMs) often contain a large number of support vectors, which reduce the run-time speed of decision functions. In addition, this might cause an overfitting effect where the resulting SVM adapts itself to the noise in the training set rather than the true underlying data distribution and will probably fail to correctly classify unseen examples. To obtain faster and more accurate...
To address the curse of dimensionality in the pattern recognition process, a fault diagnosis method based on nonlinear kernel feature extraction is presented here. The Fisher linear discriminant analysis method is extended to the nonlinear case by kernel techniques: the original feature space is mapped into an observation space for feature linearization. The fault pattern is classified by Fisher linear discriminant...
An algorithm is presented for clustering sequential data in which each unit is a collection of vectors. An example of such a type of data is speaker data in a speaker clustering problem. The algorithm first constructs affinity matrices between each pair of units, using a modified version of the Point Distribution algorithm, which was originally developed for mining patterns between vector and item data...
Document clustering algorithms usually use the vector space model (VSM) as their underlying model for document representation. VSM assumes that terms are independent and accordingly ignores any semantic relations between them. This results in mapping documents to a space where the proximity between document vectors does not reflect their true semantic similarity. In this paper, we propose the use of semantic...
Note that network coding allows intermediate nodes to mix information from different data flows. On the basis of this observation, we prove that under suitable conditions, communicators can communicate with perfect secrecy over wiretap networks. Our method of secure communication over wiretap networks without keys is fundamentally distinct from cryptographic means. Furthermore, a linear network code...
Continuous wavelet transforms of multivariable vector function spaces are discussed. In the weak topology we get the reconstruction formulas of the continuous wavelet transforms of multivariable vector function spaces produced by the integral kernel of the transform multivariable vector functions and those of it produced by the integral kernel of the multivariable vector functions which are different...
Support Vector Machines have been widely used in pattern recognition, regression estimation, and operator inversion. The optimization algorithm is the bottleneck of Support Vector Machines: it determines their performance and affects their practical application in many fields. Ordinary algorithms cannot predict which vectors the Support Vector Machines will be sensitive to. This paper introduces a...
In this paper, we present a geodesic discriminant analysis (GDA) algorithm, which generalizes linear discriminant analysis (LDA) from linear manifold space to curved Riemannian manifold space with the aid of the Riemannian logarithmic map. Compared with LDA, GDA is better suited to data that lie on a curved manifold. We show that GDA is the generalization of LDA, and LDA is the special case of GDA:...