As video summarization techniques have attracted more and more attention for efficient multimedia data management, quality assessment of video summaries is needed. To address the lack of automatic evaluation techniques, this paper proposes a novel framework including several new algorithms to assess the quality of a video summary against a given reference. First, we partition the reference video...
We propose a probabilistic model for the relevance feedback of users looking for target images. This model takes into account user errors and user uncertainty about distinguishing similarly relevant images. Based on this model, we have developed an algorithm, which selects images to be presented to the user for further relevance feedback until a satisfactory image is found. In each query session,...
We introduce a novel technique to detect anomalies in images. The notion of normalcy is given by a baseline of images, under the assumption that the majority of such images are normal. The key to our approach is a featureless probabilistic representation of images, based on the length of the codeword necessary to represent each image. These codeword lengths are then used for anomaly detection based...
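The idea of using code length as a featureless representation can be illustrated with a small sketch. It is not the authors' method: zlib's compressed size stands in for the codeword length, raw byte strings stand in for images, and the median/MAD decision rule in the hypothetical `flag_anomalies` helper is an assumption, since the abstract is truncated before it describes the detection step.

```python
import random
import statistics
import zlib


def code_length(data: bytes) -> int:
    """Compressed size in bytes, used here as a stand-in for codeword length."""
    return len(zlib.compress(data, 9))


def flag_anomalies(baseline, candidates, threshold=3.0):
    """Flag candidates whose code length is far from the baseline median.

    Distance is measured in units of the median absolute deviation (MAD).
    This decision rule is an assumption made for illustration only.
    """
    lengths = [code_length(item) for item in baseline]
    median = statistics.median(lengths)
    # Fall back to 1.0 when all baseline lengths coincide (MAD of zero).
    mad = statistics.median(abs(n - median) for n in lengths) or 1.0
    return [abs(code_length(c) - median) / mad > threshold for c in candidates]


# Toy baseline: highly regular byte patterns standing in for "normal" images.
baseline = [bytes((x + shift) % 8 for x in range(1024)) for shift in range(8)]

# One regular candidate and one made of pseudo-random noise.
rng = random.Random(0)
regular = bytes((x + 3) % 8 for x in range(1024))
noise = bytes(rng.randrange(256) for _ in range(1024))

flags = flag_anomalies(baseline, [regular, noise])
```

In this toy setup the noise item compresses far worse than the regular patterns, so its code length sits far from the baseline median.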
In contrast with most Internet topology measurement research, our concern here is not to obtain a map as complete and precise as possible of the whole Internet. Instead, we claim that each machine's view of this topology, which we call ego-centered view, is an object worthy of study in itself. We design and implement an ego-centered measurement tool, and perform radar-like measurements consisting of...
Recently, a new temporal dataset has been made public: it is made of a series of twelve 100M-page snapshots of the .uk domain. The Web graphs of the twelve snapshots have been merged into a single time-aware graph that provides constant-time access to temporal information. In this paper we present the first statistical analysis performed on this graph, with the goal of checking whether the information...
Personal blogs are one of the most interconnected and socially networked types of social media. The capability of placing "comments" on blog posts makes the blogosphere a rather complex environment. In this paper, we study the behavior of bloggers who place comments on others' posts and examine if it is possible to detect spam comments. We look at the functionality of different network motif profiles...
We propose a dynamic graph-based relational mining approach using graph-rewriting rules to learn patterns in networks that structurally change over time. A dynamic graph containing a sequence of graphs over time represents dynamic properties as well as structural properties of the network. Our approach discovers graph-rewriting rules, which describe the structural transformations between two sequential...
In a typical realistic scenario, there exist some past data about the structure of the network which are analyzed with respect to some possibly future spreading process, such as behavior, opinion, disease, or computer malware. How sensitive are the predictions made about spread and spreaders to the changes in the structure of the network? We investigate the answer to this question by considering seven...
High-dimensional data presents a significant challenge to a broad spectrum of pattern recognition and machine-learning applications. Dimensionality reduction (DR) methods serve to remove unwanted variance and make such problems tractable. Several nonlinear DR methods, such as the well-known ISOMAP algorithm, rely on a neighborhood graph to compute geodesic distances between data points. These graphs...
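The neighborhood-graph step mentioned above can be sketched as follows. This is a minimal illustration, not the paper's contribution: it builds a symmetric k-nearest-neighbor graph and approximates geodesic distances with Dijkstra shortest paths. The function names, the toy points, and the choice of k = 2 are all assumptions made for the example.

```python
import heapq
import math


def knn_graph(points, k):
    """Build a symmetric k-nearest-neighbor graph with Euclidean edge weights."""
    graph = {i: {} for i in range(len(points))}
    for i, p in enumerate(points):
        # Distances from point i to every other point, nearest first.
        nearest = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )[:k]
        for d, j in nearest:
            graph[i][j] = d
            graph[j][i] = d  # keep the graph undirected
    return graph


def geodesic_distances(graph, source):
    """Dijkstra shortest-path lengths from source; unreachable nodes stay at inf."""
    dist = {node: math.inf for node in graph}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


# Toy data: five points along an arc of the unit circle (angles in radians).
points = [(math.cos(t), math.sin(t)) for t in (0.0, 0.4, 0.8, 1.2, 1.5)]
graph = knn_graph(points, k=2)
geo = geodesic_distances(graph, 0)
straight = math.dist(points[0], points[4])
```

On this toy arc, the graph distance `geo[4]` lies between the straight-line distance `straight` and the true arc length of 1.5, which is the sense in which shortest paths on the neighborhood graph approximate distances along the underlying curve.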
We present a demo of ESTER, a search engine that combines the ease of use, speed and scalability of full-text search with the powerful semantic capabilities of ontologies. ESTER supports full-text queries, ontological queries and combinations of these, yet its interface is as easy as can be: A standard search field with semantic information provided interactively as one types. ESTER works by reducing...
We present Graphite, a system that allows the user to visually construct a query pattern, finds both its exact and approximate matching subgraphs in large attributed graphs, and visualizes the matches. For example, in a social network where a person's occupation is an attribute, the user can draw a 'star' query for "finding a CEO who has interacted with a Secretary, a Manager, and an Accountant,...
We present an interactive system to query, explore and navigate data according to a hierarchical knowledge model that has been automatically populated from unstructured textual data. Our system differs from systems assisting in the navigation of domain ontologies and mining between pairs of concepts in that it enables access to unstructured data by abstract concepts and relations between them. Concepts...
This paper presents G-REX, a versatile data mining framework based on genetic programming. What distinguishes G-REX from other GP frameworks is that it doesn't strive to be a general-purpose framework. This allows G-REX to include more functionality specific to data mining, such as preprocessing, evaluation and optimization methods, but also a multitude of predefined classification and regression models. Examples...
Although a huge amount of remote sensing data has been provided by Earth observation satellites, few techniques for data manipulation and information extraction in large data sets have been developed. In this context, the present paper aims to show a new system for spatial data mining, and two test cases applied to land use change in the Brazilian Amazon region. We present the operational environment...
This demonstration concerns a system designed and implemented to automatically build multimodal aggregations of informative news items coming from the two domains of digital television and the Web. Though in recent times several technological solutions have addressed the problem of clustering online articles, little is available which is capable of integrating these two sources of information. The...
Nowadays, small and medium enterprises (SMEs) are forced to compete on a global market and to make strategic decisions in short periods of time. In order to allow SMEs access to information technologies which can support their competition on a global scale, public administrations are fostering the setting up of digital districts. In this paper, we describe a distributed collaborative data mining platform,...
On-line data stream mining has attracted much research interest, but systems that can be used as a workbench for online mining have not been researched, since they pose many difficult research challenges. The proposed system addresses these challenges by an architecture based on three main technical advances: (i) introduction of new constructs and synoptic data structures whereby complex KDD queries...