Multi-task feature learning aims to identify the features shared among tasks to improve generalization. Recent works have shown that non-convex learning models often return better solutions than their convex alternatives. Thus a non-convex model based on the capped-ℓ1,ℓ1 regularization was proposed in [1], and the corresponding efficient multi-stage multi-task feature learning algorithm (MSMTFL)...
Classifier fusion is a well-studied problem in which decisions from multiple classifiers are combined at the score, rank, or decision level to obtain better results than a single classifier. Subsequently, various techniques for combining classifiers at each of these levels have been proposed in the literature. Many popular methods entail scaling and normalizing the scores obtained by each classifier...
Image registration is an important and fundamental problem in computer vision and image processing. Although there are currently a large number of image registration algorithms, such as RANSAC and its extensions, image registration under very noisy conditions remains difficult when too few correct corresponding points can be obtained. This paper solves this issue by introducing a random resample...
The use of different evaluation measures for classification tasks has gained a significant amount of attention in the past decade, especially for problems with multiple and imbalanced classes [1], [2]. However, the optimization of classifiers with respect to these measures is still heuristic, using ad-hoc rules with classical accuracy-optimized classifiers. We propose a classifier designed specifically...
In this paper, we introduce methods for mining spatiotemporal event sequences from event datasets with evolving region objects. Spatiotemporal event sequences are ordered lists of event types whose event instances frequently follow each other in a spatiotemporal context. Two Apriori-based algorithms are designed for the task of spatiotemporal event sequence mining. We provide explanations for interestingness...
We present a unified approach for simultaneous clustering and outlier detection in data. We utilize some properties of a family of quadratic optimization problems related to dominant sets, a well-known graph-theoretic notion of a cluster which generalizes the concept of a maximal clique to edge-weighted graphs. Unlike most (all) of the previous techniques, in our framework the number of clusters arises...
This paper addresses a problem in which we learn a regression model from sets of training data. Each set has only a single label, and only one of the training examples in the set reflects that label. This is particularly the case when the label is attached to a group of data, such as time-series data. The label is not attached to the point of the sequence but rather attached to particular time...
In clustering applications, multiple views of the data are often available. Although clustering could be done within each view independently, exploiting information across views promises to improve clustering accuracy. A common assumption in the field of multi-view learning is that the clustering results from multiple views should be consistent with a latent clustering. However, the potential...
In this paper we consider the problem of training a Support Vector Machine (SVM) online using a stream of data in random order. We provide a fast online training algorithm for general SVM on very large datasets. Based on the geometric interpretation of SVM known as the polytope distance, our algorithm uses a gradient descent procedure to solve the problem. With high probability our algorithm outputs...
Neighborhood Covering Reduction (NCR) is an effective tool to learn rules from structural data for classification. However, the existing neighborhood covering model is not robust enough. A neighborhood is constructed according to the nearest heterogeneous samples. This strategy focuses too heavily on the boundary samples and makes the model sensitive to noise. To tackle this problem, we propose a Rough...
Recording the activity of people working in an office or in a living room is important for several goals, such as designing an evacuation route, measuring the degree of ADL (Activity of Daily Living) of elderly persons living alone, and analyzing the work activities of people. Camera systems are available for these goals, but they are sensitive to changes in lighting conditions (not available at...
Graph-based semi-supervised learning has recently come into focus due to its two defining phases: graph construction, which converts the data into a graph, and label inference, which predicts the appropriate labels for unlabeled data using the constructed graph. Label inference is based on the smoothness assumption of semi-supervised learning. In this study, we propose an enhanced label inference...
In a Bayesian Network (BN), a target node is independent of all other nodes given its Markov Blanket (MB). By finding the MB, many problems can be solved directly or indirectly. There are predominantly two different approaches to finding the MB: the score-based and the constraint-based algorithms. We introduce a new Markov Blanket learning algorithm, Hybrid Markov Blanket (HMB) discovery, by combining...
Kernel principal component analysis (kPCA) learns nonlinear modes of variation in the data by nonlinearly mapping the data to kernel feature space and performing (linear) PCA in the associated reproducing kernel Hilbert space (RKHS). However, several widely-used Mercer kernels map data to a Hilbert sphere in RKHS. For such directional data in RKHS, linear analyses can be unnatural or suboptimal. Hence,...
The existence of reliable evaluation datasets for cell image registration algorithms is crucial for quantitative comparison of registration approaches. A new technique for creating real live cell image sequences for this purpose was introduced recently. These datasets contain stable structures bleached by an argon laser in the cell nucleus. In this work, we propose an approach for automatic detection...
The annotation of cellular nuclei in images of tissue sections is a time-consuming but crucial task in quantitative microscopy. We present a machine learning framework incorporating expert knowledge that enables biologists to annotate a large number of nuclear images in a reasonable time. The proposed system is designed to generate three successive levels of annotation, each presenting more details until...
In this paper we present a novel unsupervised feature representation by extracting salient symmetries in RGB-D images using the proposed moment-based symmetric patch detector. A fast indexing structure is also derived to group local symmetric patches into semantically meaningful symmetric parts. Given an RGB-D image, the hash-based symmetric patch indexing speeds up the searches of symmetric patch...
Real-world datasets consist of data representations (views) from different sources which often provide information complementary to each other. Multi-view learning algorithms aim at exploiting the complementary information present in different views for clustering and classification tasks. Several multi-view clustering methods that aim at partitioning objects into clusters based on multiple representations...
Algorithmic methods are demonstrated for information extraction from table header elements, including data categories and data hierarchies. The table headers are found with the Minimum Index Point Search algorithm. The header-path alignment and header completion algorithms yield database-ready table content and configuration statistics on a random sample of 400 diverse tables with ground truth and...
Document layout segmentation and recognition is an important task in the creation of digitized document collections, especially when dealing with historical documents. This paper presents a hybrid approach to layout segmentation as well as a strategy to classify document regions, which is applied to the process of digitization of a historical encyclopedia. Our layout analysis method merges a classic...