Uncertain data are inherent in many applications such as environmental surveillance and quantitative economics research. As an important problem in many applications, the KNN query has been extensively investigated in the literature. In this paper, we study the problem of processing rank-based KNN queries against uncertain data. Besides applying the expected rank semantics to compute KNN, we also introduce...
In this paper, an efficient algorithm, U-CPSQ, is used to handle continuous probabilistic skyline queries. The main idea is as follows. First, according to the new probabilistic dominance relation defined in this paper, we can compute the skyline probability of any point and obtain the initial p-skyline. Second, two types of events affecting the p-skyline are defined, by tracking...
An RFIDSLT (RFIDStreamLineageTracing) method, which can trace the lineage of events and query results over sliding windows of RFID data streams, is proposed in this paper. It constructs a lineage-tracing algorithm that adapts to varying instances based on the transacted information of RFID data streams, and provides online continuous lineage queries for RFID data streams. RFIDSLT is composed of confirming the length of...
Aggregation is among the core functionalities of OLAP systems. Frequently, such queries are issued in decision support systems to identify interesting groups of data. In conventional settings, the queries take a long time to compute (hours!) and produce massive result-sets at varying degrees of aggregation. Providing real time analysis results to Web users can enhance the utility of sites dealing...
To improve search precision, a model of a personalized meta-search engine is presented. In this model, a query result ranking algorithm called Weighted-Position & Abstract Ranking based-on User Profile (WPARUP) is proposed, which integrates three factors: global similarity between the query result's abstract and the query, the position information of query results, the relevance between member...
Virtually all query optimization methods in data stream management systems (DSMS) require a means of estimating the number of distinct values of an attribute in a data stream. Accurate assessment of the number of distinct values can be crucial for selecting a good query plan. Due to data streams' continuous, real-time and unbounded characteristics, data streams may not be stored in limited memory an...
Document clustering has been widely used in information retrieval systems to improve the efficiency and effectiveness of ranked output systems using the cluster hypothesis. This hypothesis states that relevant documents tend to be more similar to each other than to non-relevant documents, and therefore tend to appear in the same clusters. So far, the effectiveness of the cluster hypothesis...
Coping with today's information flood requires efficient means of personalization. Therefore, XML query processing over large volumes of data needs to make the most of already spent processing time by caching common (sub)expressions for reuse. This is especially promising for the new paradigm of personalized preference queries. Here a sequence of possible query relaxations is determined a priori by the...
Skyline queries have received a lot of attention due to their intuitive query formulation. Following the concept of Pareto optimality, all 'best' database items satisfying different aspects of the query are returned to the user. However, this often results in huge result set sizes. In everyday life users face the same problem. But here, when confronted with a too large variety of choices...
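The Pareto-optimality idea behind a skyline query can be sketched as follows. This is a minimal, quadratic-time illustration and not the method of the listed paper: it assumes that smaller values are better in every dimension, and the names `dominates`, `skyline`, and the `hotels` data (price, distance) are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is no worse than b in every dimension and
    strictly better in at least one (assumption: smaller is better)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def skyline(points: List[Point]) -> List[Point]:
    """Keep every point that no other point dominates (naive O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical data: (price, distance to the beach); both should be small.
hotels = [(50.0, 8.0), (80.0, 2.0), (60.0, 9.0), (120.0, 1.0), (90.0, 3.0)]
print(skyline(hotels))
```

In this example (60.0, 9.0) is dominated by (50.0, 8.0) and (90.0, 3.0) by (80.0, 2.0), so three points remain. With more dimensions fewer points dominate each other, which is one way to see why skyline result sets tend to grow large.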
Based on semantic query expansion, we propose a novel approach to re-rank query results that enables users to retrieve the required documents quickly. Our technique is different from others in that (1) ontology entries are added to the query by disambiguating word senses; (2) we propose a new algorithm to mine frequent patterns, which is used to construct semantic user focus; (3) it utilizes...
There are a large number of indirect schema mappings between peers in the network. To improve the efficiency of data exchange and queries, indirect mappings need to be composed. We define the combination operations of schema elements in indirect mappings and give the expression of indirect mappings. We propose a strategy, named schema element back, to solve the problem of indirect mapping composition,...
Summary form only given. Consider a universe of items, each of which is associated with a weight, and a database consisting of subsets of these items. Given a query set, a weighted set similarity query identifies either (i) all sets in the database whose normalized similarity to the query set is above a pre-specified threshold, or (ii) the sets in the database with the k highest similarity values...
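The two query variants described above can be sketched in a few lines. The abstract does not say which normalized similarity is used, so this sketch assumes weighted Jaccard (weight of the intersection divided by weight of the union); the function names, the weights, and the tiny database are hypothetical, and the threshold test uses `>=` as a further assumption.

```python
import heapq
from typing import Dict, FrozenSet, List

ItemSet = FrozenSet[str]


def weighted_jaccard(a: ItemSet, b: ItemSet, weight: Dict[str, float]) -> float:
    """Assumed measure: total weight of shared items over total weight of all items."""
    union = a | b
    if not union:
        return 0.0
    return sum(weight[i] for i in a & b) / sum(weight[i] for i in union)


def threshold_query(db: List[ItemSet], query: ItemSet,
                    weight: Dict[str, float], tau: float) -> List[ItemSet]:
    """Variant (i): all sets whose similarity to the query reaches tau."""
    return [s for s in db if weighted_jaccard(s, query, weight) >= tau]


def top_k_query(db: List[ItemSet], query: ItemSet,
                weight: Dict[str, float], k: int) -> List[ItemSet]:
    """Variant (ii): the k sets with the highest similarity to the query."""
    return heapq.nlargest(k, db, key=lambda s: weighted_jaccard(s, query, weight))


# Hypothetical item weights and database of item sets.
weights = {"a": 3.0, "b": 1.0, "c": 1.0, "d": 2.0}
db = [frozenset("ab"), frozenset("cd"), frozenset("abd")]
query = frozenset("ad")

print(threshold_query(db, query, weights, 0.5))
print(top_k_query(db, query, weights, 1))
```

Both functions scan the whole database, which is only practical for small collections; an index would be needed to avoid the linear scan.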
Data access latency can be reduced for databases by using caching. Semantic caching enhances the performance of normal caching by locally answering partially overlapping queries. Query processing (generation of probe and remainder queries from the incoming queries) and cache management need to be addressed in their totality to really enjoy these benefits. That is, there is a need for correct, complete...
We consider the problem of unfair use of distributed information systems such as P2P networks. A fair user states a limited number of queries or requests not only to a single node but also to the system as a whole. A user is considered to be unfair if he floods the system, i.e. states queries to a substantial fraction of the nodes of the system. Such a node is called a heavy hitter. We design a randomized...
With enterprises' need for decision-support information and the rapid development of computer technologies, data warehouse technology has emerged. The data warehouse is a repository of information collected from multiple, possibly heterogeneous, autonomous, distributed databases. The information stored at the data warehouse is in the form of views referred to as materialized views. The design of data...
Existing semantic composition methods mainly rely on ontology reasoning to support automated service composition. However, in reality, ontologies are generally unavailable or ontology reasoning is time-consuming; thus existing semantic composition methods are becoming impractical in the general service integration field. To address this problem, in this paper, we present an innovative composition...
Research on cross-language information retrieval (CLIR) increasingly concentrates on selecting candidate translations of the keywords in the query. The accuracy of translation has a direct impact on precision and recall. This thesis presents three methods based on HowNet to resolve query translation ambiguity in CLIR. The first is based on semantic relation, and it uses semantic relation...
With tremendous and ever-growing amounts of electronic documents from the World Wide Web and digital libraries, it becomes more and more difficult to get the information that people really want. To simplify the search process, people use clustering methods to browse through search results. However, traditional Chinese information clustering techniques are inadequate since they don't generate clusters...
There are several pieces of information that can be utilized in order to improve the efficiency of similarity searches on high-dimensional data. The most commonly used information is the distribution of the data itself but the use of dimensional choice based on the information in the query as well as the parameters of the distribution can provide an effective improvement in the query processing speed...
Preference queries are crucial for various applications (e.g. digital libraries) as they allow users to discover and order data of interest in a personalized way. In this paper, we define preferences as preorders over relational attributes and their respective domains. Then, we rely on appropriate linearizations to provide a natural semantics for the block sequence answering a preference query. Moreover,...