A sensor network data gathering and visualization infrastructure is demonstrated, comprising global sensor networks (GSN) middleware and Microsoft SensorMap. Users are invited to actively participate in the process of monitoring real-world deployments and can inspect measured data in the form of contour plots overlaid onto a high-resolution map and a digital topographic model. Users can go back...
Keyword search on relational databases provides users with insights that they cannot easily observe using the traditional RDBMS techniques. Here, an l-keyword query is specified by a set of l keywords, {k1, k2, ···, kl}. It finds how the tuples that contain the keywords are connected in a relational database via the possible foreign key references. Conceptually, it is to find some...
Increasing prevalence of large-scale distributed monitoring and computing environments such as sensor networks, scientific federations, Grids etc., has led to a renewed interest in the area of distributed query processing and optimization. In this paper we address a general, distributed multi-query processing problem motivated by the need to minimize the communication cost in these environments. Specifically...
Updating applications in SaaS is expected to be a continuous effort and cannot be done overnight or over the weekend. In such migration efforts, users are trained and shifted from an existing version to a new version successively. There is a long period of time when both versions of applications co-exist. Supporting two systems at the same time is not a cost-efficient option and two systems may suffer...
In this paper, we have postulated the problem of using discrete speech utterances to annotate an image as that of disambiguation across multiple N-best lists. Our solution is based on the Maximum Entropy approach and uses correlations between tags in an existing corpus of images to set up the constraints of the corresponding constrained optimization problem. Our experiments suggest that the proposed...
Mediator-based data integration systems resolve exploratory queries by joining data elements across sources. In the presence of uncertainties, such multiple expansions can quickly lead to spurious connections and incorrect results. The BioRank project investigates formalisms for modeling uncertainty during scientific data integration and for ranking uncertain query results. Our motivating application...
While models for data provenance have been extensively studied in the literature, the efficient evaluation of the resulting provenance queries remains an open problem. Traditional query optimization techniques, like the use of general-purpose indexes, or the materialization of provenance data, fail on different fronts to address the problem. Provenance-specific optimization techniques, like the use...
Sponsored search auctions form a multibillion-dollar industry. Search providers auction advertisement slots on search result pages to advertisers who are charged only if the end-user clicks on the advertiser's ad. The high volume of searches presents an opportunity for sharing the work required to resolve multiple auctions that occur simultaneously. We provide techniques for efficiently resolving sponsored...
Consider multiple users searching for a hotel room, based on size, cost, distance to the beach, etc. Users may have variable preferences expressed by different weights on the attributes of the searched objects. Although individual preference queries can be evaluated by selecting the object in the database with the highest aggregate score, in the case of multiple requests at the same time, a single...
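As a rough illustration of the individual preference query described above, the following minimal sketch scores each object by a weighted sum of its attributes and returns the highest-scoring one. The hotel data, the attribute names, the normalization of every attribute to [0, 1] with higher meaning better, and the function names `aggregate_score` and `best_object` are all assumptions made for this example; they are not taken from the paper, and the sketch does not address the multi-user case the abstract goes on to discuss.

```python
from typing import Dict, List, Tuple

# Hypothetical data: attributes are assumed to be normalized to [0, 1],
# with higher values always meaning "better" (e.g. cost = affordability).
Attributes = Dict[str, float]

HOTELS: List[Tuple[str, Attributes]] = [
    ("A", {"size": 0.9, "cost": 0.2, "beach": 0.5}),
    ("B", {"size": 0.4, "cost": 0.9, "beach": 0.6}),
    ("C", {"size": 0.6, "cost": 0.6, "beach": 0.9}),
]


def aggregate_score(obj: Attributes, weights: Dict[str, float]) -> float:
    """Weighted sum of the object's attributes under one user's weights."""
    return sum(w * obj[attr] for attr, w in weights.items())


def best_object(objects: List[Tuple[str, Attributes]],
                weights: Dict[str, float]) -> str:
    """Return the name of the object with the highest aggregate score."""
    name, _ = max(objects, key=lambda item: aggregate_score(item[1], weights))
    return name


# Three hypothetical users with different preference weights.
size_lover = {"size": 0.7, "cost": 0.1, "beach": 0.2}
bargain_hunter = {"size": 0.1, "cost": 0.7, "beach": 0.2}
beach_fan = {"size": 0.2, "cost": 0.2, "beach": 0.6}

for user in (size_lover, bargain_hunter, beach_fan):
    print(best_object(HOTELS, user))
```

With these made-up weights, each user is matched to a different hotel, which shows how the same database yields different answers under different preference functions.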
A large number of online databases are hidden behind form-like interfaces which allow users to execute search queries by specifying selection conditions in the interface. Most of these interfaces return restricted answers (e.g., only top-k of the selected tuples), while many of them also accompany each answer with the COUNT of the selected tuples. In this paper, we propose techniques which leverage...
When dealing with massive quantities of data, top-k queries are a powerful technique for returning only the k most relevant tuples for inspection, based on a scoring function. The problem of efficiently answering such ranking queries has been studied and analyzed extensively within traditional database settings. The importance of the top-k is perhaps even greater in probabilistic databases, where...
Developing powerful paradigms for programming sensor networks is critical to realize the full potential of sensor networks as collaborative data processing engines. In this article, we motivate and develop a deductive framework for programming sensor networks, extending the prior vision of viewing a sensor network as a distributed database. The deductive programming approach is declarative, very expressive,...
Consider an information repository whose content is categorized. A data item (in the repository) can belong to multiple categories and new data is continuously added to the system. In this paper, we describe a system, CS*, which takes a keyword query and returns the relevant top-K categories. In contrast, traditional keyword search returns the top-K documents (i.e., data items) relevant to a user...
Nowadays, Internet content such as weblogs, Wikipedia and news sites has become "live". Notifying users of relevant content and delivering it to them has become a challenge. Unlike conventional Web search technology or the RSS feed, this paper envisions a personalized full-text content filtering and dissemination system in a highly distributed environment such as a Distributed Hash Table (DHT). Users...
Temporal data analysis in data warehouses and data streaming systems often uses time decay to reduce the importance of older tuples, without eliminating their influence, on the results of the analysis. While exponential time decay is commonly used in practice, other decay functions (e.g. polynomial decay) are not, even though they have been identified as useful. We argue that this is because the usual...
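To make the notion of time decay concrete, here is a minimal sketch of a decayed sum under two common textbook decay functions. The specific forms used, exp(-λ·age) for exponential decay and (age + 1)^(-α) for polynomial decay, as well as the names `exponential_decay`, `polynomial_decay` and `decayed_sum`, are assumptions for illustration and are not necessarily the definitions or the method used in the paper.

```python
import math
from typing import Callable, Iterable, Tuple


def exponential_decay(age: float, lam: float) -> float:
    """Weight exp(-lam * age); an assumed textbook form of exponential decay."""
    return math.exp(-lam * age)


def polynomial_decay(age: float, alpha: float) -> float:
    """Weight (age + 1)^(-alpha); an assumed textbook form of polynomial decay."""
    return (age + 1.0) ** (-alpha)


def decayed_sum(tuples: Iterable[Tuple[float, float]],
                now: float,
                decay: Callable[[float], float]) -> float:
    """Sum of values, each scaled by the decay weight of its age (now - timestamp)."""
    return sum(value * decay(now - ts) for ts, value in tuples)


# Hypothetical stream of (timestamp, value) pairs.
stream = [(0.0, 4.0), (1.0, 8.0)]

# Older tuples still contribute, but with reduced weight.
print(decayed_sum(stream, now=1.0, decay=lambda a: polynomial_decay(a, 2.0)))
print(decayed_sum(stream, now=1.0, decay=lambda a: exponential_decay(a, 0.5)))
```

Both functions assign weight 1 to a tuple of age 0 and a smaller, nonzero weight to older tuples, so older data is discounted rather than discarded.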
We present SMART, a self-tuning, bandwidth-aware monitoring system that maximizes result precision of continuous aggregate queries over dynamic data streams. While prior approaches minimize bandwidth cost under fixed precision constraints, they may still overload a monitoring system during traffic bursts. To facilitate practical deployment of monitoring systems, SMART therefore bounds the worst-case...
In community web management systems (CWMS), storage structures inspired by universal tables are being used increasingly to manage sparse datasets. Such a sparse wide table (SWT) typically embodies thousands of attributes, with many of them being undefined in each tuple, and low-dimensional structured similarity search on a combination of numerical and text attributes is a common operation. However,...
In recent years, there has been a growing interest in peer-to-peer (P2P) based computing and applications. One of the most important challenges in P2P environments is to quickly locate relevant data across many participating peers. In this demonstration, we present psiX, which is an Internet-scale service for publishing and locating XML documents. This service runs on several PlanetLab nodes geographically...