In this paper, we consider the problem of recovery from committed malicious transactions in distributed databases. We define several useful dependency relations among transactions and, based on them, present an online recovery scheme for restoring the consistency of a database
Energy conservation and access efficiency are two fundamental though competing goals in broadcast wireless networks. To tackle the energy penalty from sequential searching, the interleaving of index with data items has been proposed. Although important contributions exist on broadcast indexes, they suffer from one or more of the following problems. Firstly, all of them assume total ordering...
Sequential pattern mining is very important because it is the basis of many applications. Yet implementing the mining efficiently is difficult because of an inherent characteristic of the problem: the large size of the dataset. Although there has been a great deal of effort on sequential pattern mining in recent years, its performance is still far from satisfactory. In this paper, we have proposed...
In our era, knowledge is no longer "just" information; it is an asset. Data mining can be used to extract important knowledge from large databases. These days, it is often the case that such databases are distributed among several organizations that would like to cooperate in order to extract global knowledge, but at the same time, privacy concerns may prevent the parties from directly...
Large information systems (IS) comprise several independent applications that share a common set of resources and data. Usually, there are implicit and subtle dependencies across these applications that are not specifically captured. This is especially so if the applications are bought off the shelf or are developed by independent third parties. Dependencies or global semantic constraints are difficult...
Data marts storing pre-aggregated data, prepared for further roll-ups, play an essential role in data warehouse environments and lead to significant performance gains in query evaluation. However, in order to ensure the completeness of query results on the data mart without accessing the underlying data warehouse, null values need to be stored explicitly; this process is denoted as negative caching...
In this paper, we present SURCH, a novel decentralized algorithm for efficient processing of queries generated in sensor networks. Unlike existing techniques, SURCH is fully distributed and does not require the existence or construction of a communication infrastructure. It exploits the broadcast nature of wireless communication to optimize query propagation and evaluation. In SURCH, partial results...
As data stream management systems (DSMSs) become more and more popular, there is an increasing need to protect such systems from adversaries. In this paper, we present an approach to securing DSMSs: we propose a general security framework and an access control model. We implement our framework in an existing data stream management system and show that our approach not only works, by...
Unlike numerical preferences, preferences on attribute values do not show an inherent total order, but skyline computation has to rely on partial orderings explicitly stated by the user. In such orders many object values are incomparable, hence skyline sizes become impractical. However, the Pareto semantics can be modified to benefit from indifferences: skyline result sizes can be essentially reduced...
Histograms are used as non-parametric selectivity estimators for one-dimensional data. For high-dimensional data it is common to either compute one-dimensional histograms for each attribute or to compute a multi-dimensional equi-width histogram for a set of attributes. This yields either small, low-quality histograms or large, high-quality ones. In this paper we introduce HIRED (high-dimensional histograms...
One of the main problems in peer data management systems (PDMS), which evolved from heterogeneous database systems, is distributed query processing. In the absence of global knowledge, query processing strategies have to focus on routing the query efficiently to only those peers that are most likely to contribute to the final result. Using routing indexes is one possibility to achieve this. Since data may change over...
Multi-dimensional data structures are applied in many real indexing applications, e.g. data mining, indexing multimedia data, indexing of text documents and so on. Many index structures and algorithms have been proposed. There are two major approaches to multi-dimensional indexing: data structures for indexing metric spaces and data structures for indexing vector spaces. R-trees, R*-trees and (B)UB-trees are representatives of the vector...
We propose an incremental algorithm for discovering clusters of duplicate tuples in large databases. The core of the approach is the use of an indexing technique which, for any newly arrived tuple mu, allows efficient retrieval of a set of tuples in the database which are most similar to mu, and which are likely to refer to the same real-world entity that is associated with mu. The proposed...
The recently increased amount of information stored in XML format has led to the development and wide deployment of so-called native XML database management systems (XML DBMS). In parallel, (object-)relational DBMS remain well known, approved and widely used for persistent storage of data. There are many research and industrial areas, including virtual enterprises, Web portals, digital libraries,...
In this paper, we report on a parallel implementation of XQuery. As XQuery is being used for processing large datasets, and/or for compute-intensive applications, efficiency of XQuery implementations is becoming an important issue. Our work has specifically focused on scientific data processing and data mining applications. Parallelization of this class of XQuery queries involves a number of challenges,...
The paper proposes a logic framework for modeling the interaction among deductive databases and computing consistent answers to logic queries in a P2P environment. As usual, data are exchanged among peers by using logical rules, called mapping rules. The novelty of our approach is that only data not violating integrity constraints are exchanged. The (declarative) semantics of a P2P system is defined...
Diseases such as avian influenza, severe acute respiratory syndrome (SARS) and Creutzfeldt-Jakob disease represent a new era of biological threats. Nowadays, these hazards breed, mutate and evolve at tremendous speed. Furthermore, they may spread at the same speed at which we travel. This reveals an urgent need for an agent capable of dealing with such threats. Data warehouses are databases which...