The code behind dynamic webpages often includes calls to database libraries, with queries formed using a combination of static text and values computed at runtime. In this paper, we describe our work on a program analysis for extracting models of database queries that can compactly represent all queries that could be used in a specific database library call. We also describe our work on parsing partial...
This paper explores the causality and responsibility problem (CRP) for the non-answers to probabilistic reverse skyline queries (PRSQ). Towards this, we propose an efficient algorithm called CP to compute the causality and responsibility for the non-answers to PRSQ. CP first finds candidate causes, and then, it performs verification to obtain actual causes with their responsibilities, during which...
Frequent pattern mining discovers associations among different items in large sets of data. In many real-world applications, the presence of an object or a characteristic cannot always be stated with certainty. Instead, it is better expressed as a probability, and such data is called uncertain data. Mining frequent patterns from uncertain data is challenging due to the presence of existential...
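As an illustration of item presence expressed as a probability, the sketch below computes the expected support of an itemset. This is a common formulation in the uncertain-data literature and is not confirmed by the truncated abstract as the paper's method. It assumes that item probabilities within a transaction are independent; the function name `expected_support` and the sample data are hypothetical.

```python
from math import prod
from typing import Dict, Iterable, List


def expected_support(itemset: Iterable[str],
                     transactions: List[Dict[str, float]]) -> float:
    """Sum, over all transactions, the probability that every item in
    `itemset` is present (assumes independent item probabilities)."""
    items = list(itemset)
    # An item missing from a transaction has probability 0.0 of being present.
    return sum(prod(t.get(item, 0.0) for item in items) for t in transactions)


# Hypothetical uncertain database: each transaction maps item -> probability.
transactions = [
    {"a": 0.9, "b": 0.5},
    {"a": 0.4},
    {"a": 1.0, "b": 0.2},
]

support_ab = expected_support(["a", "b"], transactions)
support_a = expected_support(["a"], transactions)
```

An itemset would then be called frequent if its expected support reaches a user-chosen threshold, again under the assumptions stated above.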
Classification is the process of finding a model or function that describes and distinguishes data classes or concepts, so that the model can be used to predict the class of objects whose class label is unknown. The goal of classification is to accurately predict the target class for each case in the data. In a sequence database, in which each sequence is a list of...
To retrain an existing multilayer perceptron (MLP) on-line using newly observed data, it is necessary to incorporate the new information while preserving the performance of the network. This is known as the “plasticity-stability” problem. For this purpose, we proposed an algorithm for on-line training with guide data (OLTA-GD). OLTA-GD is good for implementation in portable/wearable computing devices...
Skyline queries are currently the most notable type of multi-criteria search algorithm. A skyline query returns all of the data points in a given dataset that are not dominated by other data points. However, this type of query is limited by the fact that the number of results cannot be controlled. In some cases, this can result in an excessive number of results, whereas other cases result in an...
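The dominance definition above can be sketched in a few lines. This is a naive quadratic illustration of a plain skyline query, not the method proposed in the paper. It assumes that smaller values are better in every dimension; the names `dominates` and `skyline` and the hotel data are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, ...]


def dominates(a: Point, b: Point) -> bool:
    """a dominates b if a is no worse in every dimension and strictly
    better in at least one (assumption: smaller is better)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def skyline(points: List[Point]) -> List[Point]:
    """Return the points that no other point dominates (O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical hotels as (price, distance to the beach in km).
hotels = [(50, 8), (60, 3), (80, 1), (70, 5), (90, 2)]
result = skyline(hotels)
```

Here (70, 5) is dominated by (60, 3) and (90, 2) by (80, 1), so three hotels remain. The size of the result depends entirely on the data, which is the limitation the abstract points out.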
This paper presents the architecture and modeling approach of a Matlab-based toolbox for developing and testing home energy management (HEM) algorithms under a number of typical operation conditions. This toolbox serves as a developer platform that includes a graphical user interface, a model database, a computational engine, and an input-output database. The model database consists of home appliance...
Data mining is the process of automatically searching large stores of data to discover patterns and trends that go beyond simple analysis. During clustering, dissimilar data objects may end up in the same cluster. This paper introduces a technique that makes the patterns more accurate and helps to obtain a more accurate analysis of data...
Graph databases have emerged as an alternative to traditional relational databases because a graph is a flexible and self-explaining structure that can cope with any kind of complex structure. Recently, graph databases have been used extensively to represent multi-linked data in particular, such as web data, RDF data, social networks, chemical structures, genes, network structures, publication links, and many more. This research...
The sky is one of the most significant subjects commonly seen in outdoor photos. We propose a highly efficient sky detection algorithm. First, we detect a rough sky-ground boundary. Then, we calculate the parameters related to the appearance of the sky. Finally, we use these parameters to construct a hybrid probability model that indicates how likely a pixel is to belong to the sky. Moreover, an image processing...
Efficiency and effectiveness are two key factors to evaluate a human segmentation algorithm for real vision applications. However, most existing algorithms only focus on one of them. That is, fast and accurate human segmentation is not yet well addressed. In this paper, we propose a super-fast and highly accurate human segmentation method with very deep convolutional neural networks. We also provide...
With large companies and corporations becoming increasingly responsible for data collection, a growing number of scientists have in recent years proposed using a variety of algorithms and different theories to solve the database problem. Even though existing solutions are effective in many cases, many problems remain to be solved during database integration. Entity resolution (ER) is a crucial...
IoT/Bigdata has been a hot research topic all over the world in recent years and is expected to change the world greatly in the near future. Compared with the data in traditional websites, Bigdata from IoT devices has four big V-features, i.e., volume, velocity, variety, and veracity. Due to these four features, it is hard to provide timely services to users by data analysis, especially with the great...
Cloud mining is an incorporation of two robust technologies, viz. data mining and cloud storage and computing. Data mining has a vast scope in various fields, but when it is applied on a cloud platform, whose purpose is to provide seamless service, it adds value to the concept. The general idea is to research and test association mining algorithms and also to build a practical application which can...
Overlapped fingerprints are commonly encountered in latent fingerprints lifted from crime scenes. Such overlapped fingerprints can hardly be processed by state-of-the-art fingerprint matchers. Several methods have been proposed to separate the overlapped fingerprints. However, these methods neither provide robust separation results nor generalize to most overlapped fingerprints. In this...
Deduplication is a commonly used technique on disk-based storage pools. However, deduplication has not been used for tape-based pools: tape characteristics, such as high mount and seek times, combined with the data fragmentation resulting from deduplication, create a toxic combination that leads to unacceptably high retrieval times. This work proposes DedupT, a system that efficiently supports deduplication...
Analytics applications for reporting and human interaction with big data rely upon scalable frameworks for data ingest, storage, and computation. Batch processing of analytic workloads increases latency of results and can perform redundant computation. In real-world applications, new data points are continuously arriving and a suite of algorithms must be updated to reflect the changes. Reducing the...
We assume a database of items in which each item is described by a set of attributes, some of which could be multi-valued. We refer to each of the distinct attribute values as a feature. We also assume that we have information about the interactions (such as visits or likes) between a set of users and those items. In our paper, we would like to rank the features of an item using user-item interactions...
Manifold learning (ML) is a known non-linear technique for representing high dimensional data. Despite the potential power of ML techniques, they fail to represent unseen test data accurately. To better model the geometric structure of manifolds, Manifold Alignment (MA) techniques have been proposed recently, where the majority of these algorithms rely on point correspondences between two manifolds...
Cloud computing offers a possible solution to the problems caused by massive amounts of data. As an open source cloud computing platform, Hadoop has been widely used commercially. The MapReduce model is one of the important parts of Hadoop, and it can support parallel computing and schedule tasks automatically. Because of these, it can improve the efficiency of the configuration while...