We formulate three intuitive semantic properties for top-k queries in probabilistic databases, and propose Global-Topk query semantics which satisfies all of them. We provide a dynamic programming algorithm to evaluate top-k queries under Global-Topk in simple probabilistic relations. For general probabilistic relations, we show a polynomial reduction to the simple case. Our analysis shows that the...
We would like to discuss with workshop participants the iTrails framework for pay-as-you-go information integration, which was recently presented at VLDB 2007 (M. A. V. Salles et al., 2007). iTrails allows users to provide mini-mappings on their data that sharply increase the quality of search results. The core idea is to extend the semantics of a standard graphical search engine such that the quality...
In this article, we are interested in accelerating similarity search in high dimensional vector spaces. The presented approach, called HiPeR, is based on a hierarchy of subspaces and indexes: it performs nearest neighbors search across spaces of different dimensions, by beginning with the lowest dimensions up to the highest ones, with the aim of minimizing the effects of the curse of dimensionality...
In order to better fit a variety of pattern recognition problems over strings, using a normalised version of the edit or Levenshtein distance is considered to be an appropriate approach. The goal of normalisation is to take into account the lengths of the strings. We define a new normalisation, contextual, where each edit operation is divided by the length of the string on which the edit operation...
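As background for the abstract above, the following is a minimal sketch of the plain Levenshtein distance together with one simple length-based normalisation (dividing by the length of the longer string). This is not the contextual normalisation the authors define, whose details are cut off in the excerpt; the function names `levenshtein` and `length_normalised` are illustrative assumptions.

```python
def levenshtein(s, t):
    """Unit-cost edit distance between strings s and t (row-by-row DP)."""
    prev = list(range(len(t) + 1))  # distances from the empty prefix of s
    for i, cs in enumerate(s, start=1):
        curr = [i]  # deleting the first i characters of s
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def length_normalised(s, t):
    """Assumed simple normalisation: divide by the longer string's length.

    This is NOT the contextual normalisation from the abstract.
    """
    longest = max(len(s), len(t))
    return 0.0 if longest == 0 else levenshtein(s, t) / longest
```

With this sketch, `levenshtein("kitten", "sitting")` is 3, and the normalised value is 3/7, so the same three edits weigh less on long strings than on short ones.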
Several pieces of information can be used to improve the efficiency of similarity searches on high-dimensional data. The most commonly used is the distribution of the data itself, but choosing dimensions based on the information in the query, as well as on the parameters of the distribution, can provide an effective improvement in the query processing speed...
For computer scientists the problem of biological data retrieval has become synonymous with homology-based retrieval of primary gene sequence data and their associated protein products. This perspective is accessible to computer scientists, as primary sequence data is modeled as strings and fundamental algorithmic tools can be applied. However, by sticking with this formative foundation, we computer...
In this environment each mobile peer has a local database that stores and manages a collection of reports (where each report is a product record, or a coupon), and all the local databases maintained by the mobile peers form a mobile e-commerce (or, more generally, a mobile P2P) database. Queries on this database are posed by potential customers, and they search for coupons and products pertaining...
This paper discusses our ongoing research on processing multi-scale queries in service-oriented environments. A multi-scale query is a query over traditional datasets in conjunction with streaming data, and it may involve spatio-temporal aspects. In a service-oriented environment, classic and streaming data are made available and processed through services, which we refer to as...
The database research community has recently recognized the usefulness of the skyline query. As an extension of existing database operators, the skyline query is valuable for multi-criteria decision making. However, current research tends to assume that the skyline operator is applied to a single table, which is not true for many applications on Web databases. In Web databases, tables are distributed in different...
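To make the skyline operator concrete, here is a minimal sketch of the single-table case that the abstract contrasts with the distributed setting. It assumes every attribute is to be minimised and uses a naive quadratic scan; the names `dominates` and `skyline` are illustrative and do not come from the paper.

```python
def dominates(a, b):
    """True if a is no worse than b on every attribute and strictly better on one.

    Assumption: smaller values are better on all attributes.
    """
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def skyline(points):
    """Return the points not dominated by any other point (naive O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical hotels as (price, distance_to_beach) tuples.
hotels = [(50, 8), (80, 2), (60, 9), (90, 1), (100, 5)]
result = skyline(hotels)
```

Here `result` keeps `(50, 8)`, `(80, 2)` and `(90, 1)`: `(60, 9)` is dominated by `(50, 8)`, and `(100, 5)` by `(80, 2)`.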
Recent years have seen a proliferation of work on the Semantic Web, an initiative to enable intelligent agents to reason about and utilize World Wide Web content and services. Concurrently, the networking community has developed a concept of the knowledge plane, using artificial intelligence to reason about and manage network behavior. These two efforts have progressed independently despite potential...
All pivot-based algorithms for similarity search use a set of reference points called pivots. The pivot-based search algorithm precomputes some distances to these reference points, which are used to discard objects during a search without comparing them directly with the query. Though most of the algorithms proposed to date select these reference points at random, previous works have shown the importance...
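The discarding step described above can be sketched as follows. This is a generic illustration of pivot-based filtering through the triangle inequality, assuming Euclidean distance and a range query; it is not the pivot selection technique the paper studies, and the names `build_pivot_table` and `range_search` are hypothetical.

```python
import math


def build_pivot_table(objects, pivots, dist=math.dist):
    """Precompute d(object, pivot) for every object and every pivot."""
    return [[dist(o, p) for p in pivots] for o in objects]


def range_search(query, radius, objects, pivots, table, dist=math.dist):
    """Return (objects within radius of query, number of direct comparisons)."""
    q_to_pivots = [dist(query, p) for p in pivots]
    results = []
    compared = 0
    for obj, row in zip(objects, table):
        # Triangle inequality: |d(q, p) - d(o, p)| <= d(q, o).
        # If the lower bound exceeds the radius, obj cannot be a result.
        if any(abs(qp - op) > radius for qp, op in zip(q_to_pivots, row)):
            continue
        compared += 1
        if dist(query, obj) <= radius:
            results.append(obj)
    return results, compared


# Hypothetical data: four 2-D points and a single pivot at the origin.
objects = [(0, 0), (1, 1), (5, 5), (10, 10)]
pivots = [(0, 0)]
table = build_pivot_table(objects, pivots)
found, compared = range_search((1, 0), 1.5, objects, pivots, table)
```

In this example two of the four objects are discarded using only the precomputed table, so `compared` is 2. How many objects a pivot set can discard depends on where the pivots lie relative to the data, which is why the choice of pivots matters.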
In this paper we introduce a new M-tree building method that uses the classic idea of forced reinsertions. When a leaf is about to split, some distant objects are removed from the leaf (reducing the covering radius) and then reinserted into the M-tree in the usual way. A regular leaf split is performed only after a series of unsuccessful reinsertion attempts. We expect the forced reinsertions...
Applications that perform on-the-fly integration of data inside a database (DB) with "exo-DB data" (that is, data outside a DB) are becoming common. Ease of integration and speed of performance are key considerations in such applications. To this end, we have devised a means to mix data in a traditional DB with references to fine-grained exo-DB data of arbitrary formats, and a means to...