Time-evolving graphs are becoming increasingly abundant in a wide variety of application domains. While several classes of advanced frequent patterns in time-evolving graphs have been proposed, this paper develops correlation and contrast patterns on link formations, which can be regarded as nontrivial extensions of the corresponding patterns in itemset mining to the graph domain. More concretely,...
Recently, constraint programming has been proposed as a declarative framework for constraint-based pattern mining. In constraint programming, a problem is modelled in terms of constraints and search is done by a general solver. Similar to most pattern mining algorithms, these solvers typically employ exhaustive depth-first search, where constraints are used to prune the search space and make the search...
This paper addresses the problem of learning-based single-image super-resolution. Previous research on this problem relies on a human user to provide a set of reference images that are similar to the target image. The super-resolution algorithm then learns from the provided reference images to predict the high-resolution details of the target image. We propose a fully automatic scheme, which...
Heterogeneous Feature Fusion Machines (HFFM) is a kernel-based logistic regression model that effectively fuses multiple features for visual recognition tasks. However, its batch-mode solution suffers from inefficiency and poor scalability, as batch algorithms commonly do. In this paper, we develop a novel algorithm based on multiple kernels and the group LASSO technique to solve this model, called online...
In this work, we present a novel indexing method for mobile image retrieval that combines visual phrase quantization, a two-dimensional inverted index and Approximate RANSAC (ARANSAC). First, visual phrase quantization is proposed, which maps each SIFT descriptor to an ordered pair of visual words and represents the image as a bag of visual phrases. Then, a two-dimensional inverted index is developed,...
Support vector clustering (SVC) is a flexible clustering method inspired by support vector machines (SVM). Because it can discover clusters of arbitrary shapes, it has been widely used in many applications. However, one bottleneck that restricts the scalability of the method is its high time complexity. Both of its main stages, namely, sphere construction and cluster...
Customer transactions tend to change over time as customer behaviour patterns change. Classifier models, however, are often designed to perform prediction on data that is assumed to be static. The performance of these models thus deteriorates over time when they predict on evolving data. Robust adaptive classification models are therefore needed to detect and adjust to the kind...
In this paper, we propose a hybrid approach for music recommendation. First, we describe an approach for creating music recommendations based on user-supplied tags that are augmented with a hierarchical structure extracted for top-level genres from DBpedia. In this structure, each genre is represented by its stylistic origins, typical instruments, derivative forms, subgenres and fusion genres....
Multi-task feature selection refers to the problem of selecting a common predictive set of features over multiple related learning tasks. The problem is encountered, for example, in applications where one can afford only a limited set of feature extractors for solving several tasks. In this work, we present a regularized least-squares (RLS) based algorithm for multi-task greedy forward feature selection...
In this paper, we present two novel multivariate time series representations to classify physiological data of different lengths. The representations may be applied to any group of multivariate time series data that examine the state or health of an entity. Multivariate Bag-of-Patterns and Stacked Bags-of-Patterns improve on their univariate counterpart, inspired by the bag-of-words model, by using...
Face indexing and retrieval are basic tasks of search engines. Most current search engines use text information such as keywords and captions rather than visual content for indexing. This approach returns many irrelevant results, since faces and names are not usually aligned in video data. We propose an unsupervised framework for indexing faces in video archives of broadcast news. First, the faces...
An interesting challenge in data stream mining is the detection of events, where events are generally defined as anything previously unknown in the data. Outliers, as well as model changes or drifts, can therefore be considered possible events. Various methods for event detection have been proposed for different types of events. In this paper, we describe a more general framework for event detection...
Discovering infrequent causal relationships can help us prevent or correct negative outcomes caused by their antecedents. In this paper, we propose an innovative data mining framework and apply it to mine potential causal associations in electronic patient datasets where the drug-related events of interest occur infrequently. Specifically, we created a novel interestingness measure, exclusive causal-leverage,...
In active learning algorithms, informative samples are usually queried for true labels according to the disagreement of existing hypotheses. However, we observed that when the streaming dataset has skewed class membership, active learning suffers from the imbalanced data classification problem. The minority class is overwhelmed by the majority class when generating the hypotheses. In this paper, for...
The ventricular system inside the brain is known to enlarge and change shape under conditions such as Alzheimer's disease. This change in shape may provide a way to assess the level of cognitive impairment of a patient, as well as other intellectual characteristics. This paper describes the use of trees to represent the 3D space containing the third and lateral ventricles, and classification of these...
As Singapore's population ages rapidly, the number of geriatric inpatients in Singapore is expected to rise significantly. This will put greater pressure on the efficient management of hospital resources. Hospital length of stay (LOS) is an important indicator of hospital activity and management because of its direct relation to resource consumption. Planning of hospital resources...
Biclustering is a popular method for microarray dataset analysis. It clusters condition sets and gene sets simultaneously. However, noisy data in microarrays may disturb the mining results. In order to reduce the influence of noise and find more biological biclusters, we propose an algorithm, FT Cluster, to mine fault-tolerant biclusters in microarray datasets. Unlike traditional...
Pattern Recognition and Data Mining can provide invaluable insights in the field of neuro-oncology. This is because the clinical analysis of brain tumors requires the use of non-invasive methods that generate complex data in electronic format. Magnetic resonance, in the modalities of imaging and spectroscopy, is one such method that has been widely applied for this purpose. The heterogeneity of...
Whenever new sequences of DNA or proteins have been decoded, it is almost compulsory to look at similar sequences and the papers describing them, both to collect relevant information about the function and activity of the new sequences and to learn what is already known about similar sequences that might be useful in the explanation of the function or activity of the newly discovered...
Protein domains are minimal structural units that can independently fold and carry out discrete biological functions. Evolutionary divergence amongst proteins not only causes considerable sequence changes in protein domains of similar folds and functions, but can also give rise to remarkable length variations. Rapid and heuristic sequence search algorithms are generally sensitive and effective in recognizing...