The problem of mining sequential patterns was recently introduced in [3]. We are given a database of sequences, where each sequence is a list of transactions ordered by transaction-time, and each transaction is a set of items. The problem is to discover all sequential patterns with a user-specified minimum support, where the support of a pattern is the number of data-sequences that contain the pattern...
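The support definition in the abstract above can be illustrated with a small sketch. It assumes the basic notion of containment, in which each element of the pattern must be a subset of a distinct transaction and the matched transactions must appear in the same order; the truncated abstract does not spell this out, and the names `contains` and `support` as well as the toy database are hypothetical, not taken from the paper.

```python
from typing import FrozenSet, Sequence

Itemset = FrozenSet[str]


def contains(data_sequence: Sequence[Itemset], pattern: Sequence[Itemset]) -> bool:
    """Return True if `pattern` is contained in `data_sequence`.

    Assumed containment rule: every pattern element is a subset of a
    distinct transaction, and the matched transactions keep their order.
    Greedy left-to-right matching is sufficient for this rule.
    """
    pos = 0
    for element in pattern:
        # Skip transactions that do not include all items of this element.
        while pos < len(data_sequence) and not element <= data_sequence[pos]:
            pos += 1
        if pos == len(data_sequence):
            return False
        pos += 1  # the next element must match a strictly later transaction
    return True


def support(database: Sequence[Sequence[Itemset]], pattern: Sequence[Itemset]) -> int:
    """Count the data-sequences in `database` that contain `pattern`."""
    return sum(1 for seq in database if contains(seq, pattern))


# Toy database: three data-sequences, each a time-ordered list of transactions.
database = [
    [frozenset({"a"}), frozenset({"b", "c"}), frozenset({"d"})],
    [frozenset({"a", "b"}), frozenset({"d"})],
    [frozenset({"d"}), frozenset({"a"})],
]
pattern = [frozenset({"a"}), frozenset({"d"})]

print(support(database, pattern))  # 2: the third sequence has "d" before "a"
```

A pattern would then count as frequent when this number reaches the user-specified minimum support.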
Classification is an important problem in the emerging field of data mining. Although classification has been studied extensively in the past, most of the classification algorithms are designed only for memory-resident data, thus limiting their suitability for data mining large data sets. This paper discusses issues in building a scalable classifier and presents the design of SLIQ, a new classifier...
Active databases are an important topic of current database research. However, the semantics of the underlying mechanism, event-condition-action rules (or ECA rules for short), has so far remained unclear. In order to define a clear semantics for sets of active rules, we first derive the requirements such a semantics must fulfill. Since currently no semantics fulfills these requirements, we continue with the...
In this paper, we extend event types supported by Chimera, an active object-oriented database system. Chimera rules currently support disjunctive expressions of set-oriented, elementary event types; our proposal introduces instance-oriented event types, arbitrary boolean expressions (including negation), and precedence operators. Thus, we introduce a new event calculus, whose distinguishing feature...
We describe the development of a tool, called MDM, for the management of multiple models and the translation of database schemes. This tool can form the basis of an integrated CASE environment, supporting the analysis and design of information systems, that allows different representations for the same data schemes. We first present a graph-theoretic framework that allows us to formally investigate...
Many new non-standard database management systems (NDBMSs) and data models have been proposed with the promise to facilitate the construction of better engineering environments and tools and to solve integration problems in environments. However, there is hardly any evidence or experience showing to what extent these goals are actually met. This paper summarizes experience gained in several major experiments...
We develop a formal basis of correct schema transformations. Schemas are formalized as abstract data types, and correct schema transformations are formalized as information-preserving signature interpretations. Our formalism captures transformations of all schema components, making it possible to transform uniformly constraints and queries along with structures. In addition, our formalism captures...
In this paper we define the concept of self-maintainable views: views that can be maintained using only the contents of the view and the database modifications, without accessing any of the underlying databases. We derive tight conditions under which several types of select-project-join views are self-maintainable upon insertions, deletions and updates. Self-maintainability is a desirable property...
We describe Monet, a novel database system, designed to get maximum performance out of today's workstations and symmetric multiprocessors. Monet is a type- and algebra-extensible database system using the Decomposed Storage Model (DSM) and employing shared memory parallelism. It applies purely main-memory algorithms for processing and uses OS virtual memory primitives for handling large data...
Complex queries with aggregates, views, and nested subqueries are important in decision-support applications. Such queries are represented as multi-block queries where a query block may be a view definition containing aggregates or a correlated nested subquery. Beyond transformations that propagate predicates across blocks, the problem of optimizing such queries has not been addressed adequately....
Efficient query processing is one of the key promises of database technology. With the evolution of supported data models—from relational via nested relational to object-oriented—the need for such efficiency has not diminished, and the general problem has increased in complexity. In this paper, we present a heuristics-based, extensible algorithm for the translation of object-oriented query...
ARC II is a learning system that discovers relationships from symbolic data. The learning strategy is based on probabilistic induction and produces dependence relationships between a fact and a set of facts. The system also takes into account dated facts or events in order to produce causal relationships between an event (effect), and a set of facts (cause) including at least one event. Relationships...
Many distributed databases use an epidemic approach to manage replicated data. In this approach, user operations are executed on a single replica. Asynchronously, a separate activity performs periodic pair-wise comparison of data item copies to detect and bring up to date obsolete copies. The overhead due to comparison of data copies grows linearly with the number of data items in the database, which...
Derived data is maintained in a database system to correlate and summarize base data which record real world facts. As base data changes, derived data needs to be recomputed. A high performance system should execute all these updates and recomputations in a timely fashion so that the data remains fresh and useful, while at the same time executing user transactions quickly. This paper studies the intricate...
Queries on partitioned signature files, namely Quick Filter (QF), can require retrieving a large number of blocks from disk, depending on the specific query pattern. In order to reduce the overall retrieval time, we consider multi-block read schedules that, provided contiguous allocation of blocks of the file on the disk surface is guaranteed by the storage system, transfer more than one block at a time...
We propose a uniform and flexible mechanism to make reference links from SGML documents to database objects. In addition to typical document logical structures such as sections and paragraphs, our mechanism allows arbitrary character strings in documents as source of these links. By using this mechanism, SGML attributes and their values of marked-up words can be transparently stored as database attributes,...
A query to a nucleotide database is a DNA sequence. Answers are similar sequences, that is, sequences with a high-quality local alignment. Existing techniques for finding answers use exhaustive search, but it is likely that, with increasing database size, these algorithms will become prohibitively expensive. We have developed a partitioned search approach, in which local alignment string matching...