Web databases are now pervasive. Such a database can be accessed only via its query interface (usually an HTML query form). Extracting Web query interfaces, which creates a formal representation of a query form by extracting the set of query conditions in it, is a critical step in data integration across multiple Web databases. This paper presents a novel approach to extracting Web query interfaces. In...
Search queries on biomedical databases like PubMed often return a large number of results, only a small subset of which is relevant to the user. Ranking and categorization, which can also be combined, have been proposed to alleviate this information overload problem. Results categorization for biomedical databases is the focus of this work. A natural way to organize biomedical citations is according...
Multi-tenant data management is a form of software as a service (SaaS), whereby a third party service provider hosts databases as a service and provides its customers with seamless mechanisms to create, store and access their databases at the host site. One of the main problems in such a system, as we shall discuss in this paper, is scalability, namely the ability to serve an increasing number of...
Requirements from new types of applications call for new database system solutions. Computational science applications performing distributed computations on grid networks with requirements for efficient storage and query solutions are now emerging. For this purpose we have developed DASCOSA-DB, a P2P-based distributed database system, which in addition to providing location-transparent storage and...
Traditional databases manage only deterministic information, but now many applications that use databases involve uncertain data. For example, it is infeasible for a sensor database to contain only the exact value of each sensor at all points in time. The uncertainty is inherent in these systems due to measurement and sampling errors, and resource limitations. This paper aims at the query processing...
We have developed a system to process database queries over composed data providing Web services. The queries are transformed into execution plans containing an operator that invokes any Web service for given arguments. A common pattern in these query execution plans is that the output of one Web service call is the input for another, etc. The challenge addressed in this paper is to develop methods...
There is a growing realization that uncertain information is a first-class citizen in modern database management. As such, we need techniques to correctly and efficiently process uncertain data in database systems. In particular, data reduction techniques that can produce concise, accurate synopses of large probabilistic relations are crucial. Similar to their deterministic relation counterparts,...
Extracting information and insights from large databases is a time-consuming activity and has received considerable research attention recently. In this demo, we present DynaCet, a domain-independent system that provides effective minimum-effort-based dynamic faceted search solutions over enterprise databases. At every step, DynaCet suggests facets depending on the user response in the previous step...
When merging data from various sources, it is often the case that small variations in data format and interpretation cause traditional functional dependencies (FDs) to be violated, without there being an intrinsic violation of semantics. Examples include differing address formats, or different reported latitude/longitudes for a given address. In this paper, we define metric functional dependencies,...
Particle simulation has become an important research tool in many scientific and engineering fields. Data generated by such simulations pose great challenges for database storage and query processing. One of the queries against particle simulation data, the spatial distance histogram (SDH) query, is the building block of many high-level analytics, and requires quadratic time to compute using a straightforward...
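The quadratic cost mentioned in the abstract above can be illustrated with a minimal brute-force sketch. It assumes that an SDH is a histogram of all pairwise point distances with equal-width buckets; the function name, the `bucket_width` and `num_buckets` parameters, and the choice to clamp overflowing distances into the last bucket are illustrative assumptions, not details taken from the paper.

```python
import math
from itertools import combinations


def spatial_distance_histogram(points, bucket_width, num_buckets):
    """Brute-force SDH sketch: visits every pair of points, so O(n^2) time."""
    hist = [0] * num_buckets
    for p, q in combinations(points, 2):
        d = math.dist(p, q)  # Euclidean distance (Python 3.8+)
        # Assumption: distances beyond the last bucket are clamped into it.
        idx = min(int(d // bucket_width), num_buckets - 1)
        hist[idx] += 1
    return hist


# Hypothetical 2D particle positions, for illustration only.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (3.0, 4.0)]
hist = spatial_distance_histogram(points, bucket_width=1.0, num_buckets=6)
print(hist)
```

With n points the loop runs n(n-1)/2 times, which is the cost that faster SDH algorithms aim to avoid.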
A previously proposed keyword search paradigm produces, as a query result, a ranked list of object summaries (OSs); each OS summarizes all data held in a relational database about a particular data subject (DS). This paper further investigates the ranking of OSs and their tuples so as to facilitate (1) the top-k ranking of OSs and also (2) the generation of partial size-l OSs (i.e. comprised of the l...
Database developers today use data access APIs such as ADO.NET to execute SQL queries from their applications. These applications often have security problems such as SQL injection vulnerabilities and performance problems such as poorly written SQL queries. However, today's compilers have little or no understanding of data access APIs or DBMS, and hence the above problems can go undetected until much...
Immortal DB is a transaction time database system that is built into a commercial database system rather than being layered on top. This enables it to have performance that is very close to the performance of an unversioned current time database system. Achieving such competitive performance is essential for wide acceptance of this temporal functionality. In this paper we describe further performance...
Web monitoring 2.0 supports the complex information needs of clients who probe multiple information sources and generate mashups by integrating across these volatile streams. A proxy that aims at satisfying multiple customized client profiles will face a scalability challenge in trying to maximize the number of clients served while at the same time fully satisfying complex client needs. In this paper,...
Inspired by the great success of information retrieval (IR) style keyword search on the Web, keyword search on XML has emerged recently. The differences between text databases and XML databases result in three new challenges: (1) Identify the user search intention, i.e. identify the XML node types that the user wants to search for and search via. (2) Resolve keyword ambiguity problems: a keyword can appear...
In database systems that support fine-grained access controls, each user has access rights that determine which tuples are accessible and which are inaccessible. Queries are answered as if the inaccessible tuples are not present in the database. Thus, users with different access rights may get different answers to a given query. To process queries efficiently in the presence of fine-grained access...
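The semantics described in the abstract above, where a query is answered as if inaccessible tuples were absent, can be sketched as a filter applied before query evaluation. This is a naive illustration only: the `answer_query` helper, the per-user predicates, and the sample data are all hypothetical, and the paper's own efficient processing technique is not shown here.

```python
def answer_query(tuples, is_accessible, query):
    """Evaluate `query` over only the tuples the user may access."""
    visible = [t for t in tuples if is_accessible(t)]
    return query(visible)


# Hypothetical relation: (employee, department, salary).
employees = [
    ("ann", "sales", 50),
    ("bob", "sales", 60),
    ("eve", "hr", 70),
]


def count_all(rows):
    # The same query is issued by both users below.
    return len(rows)


# Assumed access rights: one user sees only the sales department,
# the other sees every tuple.
sales_only = answer_query(employees, lambda t: t[1] == "sales", count_all)
full_access = answer_query(employees, lambda t: True, count_all)
print(sales_only, full_access)
```

The two users receive different answers to the same query, which is the behavior the abstract describes.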
We introduce a framework for reordering join pipelines at runtime in a database system. This framework incorporates novel techniques for simulating the execution of a join pipeline using random samples and statistical summaries. Our simulation techniques provide accurate runtime cardinality estimates along all alternative execution paths of a join pipeline. These estimates are then utilized to compare...
Auditing the changes to a database is critical for identifying malicious behavior, maintaining data quality, and improving system performance. But an accurate audit log is a historical record of the past that can also pose a serious threat to privacy. Policies which limit data retention conflict with the goal of accurate auditing, and data owners have to carefully balance the need for policy compliance...
Information systems are subject to perpetual evolution, which is particularly pressing in Web information systems due to their distributed and often collaborative nature. Such a continuous adaptation process comes at a very high cost, because of the intrinsic complexity of the task and the serious ramifications of such changes for database-centric information system software. Therefore, there...
In the database outsourcing paradigm, a data owner (DO) delegates its DBMS administration to a specialized service provider (SP) that receives and processes queries from clients. The traditional outsourcing model (TOM) requires that the DO and the SP maintain authenticated data structures to enable authentication of query results. In this paper, we present SAE, a novel outsourcing model that separates...