The research discusses how to describe data quality and what should be taken into account when developing a universal data quality management solution. The proposed approach is to create quality specifications for each kind of data object and to make them executable. The specification can be executed step-by-step according to business process descriptions, ensuring the gradual accumulation...
The paper presents the results of research on the practical implementation of the DBSCAN clustering algorithm within the framework of the objective clustering inductive technology. The experimental data were the Aggregation and Compound datasets of the School of Computing of the University of Eastern Finland and the gene expression sequences of lung cancer patients from the ArrayExpress database. The architecture of...
This paper describes our recent effort to develop a cognitive application to transform invoice processing in the P2P (Procure to Pay) process in finance operations. The context of this problem is service providers who offer invoice processing services to their global clients. Due to the language barrier, the service provider usually maintains both offshore and near-shore teams, where the offshore...
NoSQL datastores allow the efficient management of high volumes of heterogeneous and unstructured data, meeting the requirements of a variety of today's ICT applications. However, most of these systems poorly support data security, and recent surveys show that their simplistic support for data protection is considered a reason not to use them. In recent years, Attribute Based Access Control (ABAC)...
International development banks provide low-interest loans to developing countries in an effort to stimulate social and economic development. These loans support key infrastructure projects including the building of roads, schools, and hospitals. However, despite the best efforts of development banks, these loan funds are often lost to fraud, corruption, and collusion. In an effort to sanction and...
Mathematical Models Database (MMD) is an online repository of mathematical models that can be easily used for research purposes, in teaching, or in performing comparative tests. All the data from the database is available in a single free-of-charge, freely accessible Internet service, which in its final form is expected to offer several ready-to-use models, either purely mathematical or physics-based. Apart...
Conversion of rainrate statistics from different integration times to 1 min is studied using 6 years (2007–2011, 2013) of disdrometric rainrate data over a tropical coastal station, Thiruvananthapuram, and an inland station, Gadanki, India. In this study, we used two established empirical models, namely Chebil...
Database-centric applications (DCAs) usually contain a large number of tables, attributes, and constraints describing the underlying data model. Understanding how database tables and attributes are used in the source code along with the constraints related to these usages is an important component of DCA maintenance. However, documenting database-related operations and their constraints in the source...
In this paper, we report on an evaluation of four representative Big Data management systems (BDMSs): MongoDB, Hive, AsterixDB, and a commercial parallel shared-nothing relational database system. In terms of features, all offer to store and manage large volumes of data, and all provide some degree of query processing capabilities on top of such data. Our evaluation is based on a micro-benchmark...
Collaborative e-Science applications often need to manage large numbers of user identities, profiles, and groups. However, developing and maintaining such capabilities is often challenging given the plethora of security protocols available and requirements for scalable, robust, and highly available implementations. Globus Nexus is a professionally hosted Platform-as-a-Service that provides these capabilities...
The high-volume, low-latency world of network traffic presents significant obstacles for complex analysis techniques. The unique challenge of adapting powerful but high-latency models to real-time network streams is the basis of our cyber security project. In this paper we discuss our use of NoSQL databases in a framework that enables the application of computationally expensive models against a real-time...
Large-scale data processing systems frequently require users to make timely and high-value business decisions based upon information that is received from a variety of heterogeneous sources. Such heterogeneity is especially true of service-oriented systems, which are often dynamic in nature and composed of multiple interacting services. However, in order to establish user trust in such systems, there...
This paper is an analysis of adaptation techniques for French acoustic models (hidden Markov models). The LVCSR engine Julius, the Hidden Markov Model Toolkit (HTK) and the K-Fold CV technique are used together to build three different adaptation methods: Maximum Likelihood a priori (ML), Maximum Likelihood Linear Regression (MLLR) and Maximum a posteriori (MAP). Experimental results by means of word...
As large-scale RDF (Resource Description Framework) data emerges, managing it is becoming much more important. Most of the existing methods are complex and inefficient. In order to easily manipulate the database, we propose a new RDFS (RDF Schema) storage strategy based on a relational database. In this paper, we distinguish between schema information and instance data, and handle independent RDF schema...
Discovering the relationships among data resources in a dataspace is an important issue, which is the basis for creating indexes, browsing, searching, querying, lineage, and other services. However, current research mostly assumes that the relationships among data resources have already been obtained, which limits its applicability to some degree. In order to solve this problem we propose an approach to...
In this paper, we propose a new ant-based clustering algorithm. The algorithm takes inspiration from the sound communication properties of real ants. Artificial ants communicate directly with each other in order to merge similar groups of objects. The proposed algorithm was tested and evaluated. The obtained results are very encouraging in comparison with the well-known k-means and some ant-based clustering...
As knowledge management requirements grow, enterprises are becoming increasingly aware of the significance of interlinking business information across structured and semi-structured data sources. This problem has become more important with the growing amount of semi-structured data often found in XML repositories, web logs, biological databases, etc. Effectively creating links between semi-structured...
Today's mobile devices have inherited many of the characteristics of desktop computing -- including the assumptions that the user's full attention and dexterity can be focused on the interface. Unfortunately, on-the-go users are impaired by their mobility and often find desktop-style Windows Icon Menu and Pointer (WIMP) interfaces difficult, if not impossible, to use while performing their primary...
In this paper, we propose a new SystemC-based fault injection technique that has improved fault representation in visible and on-the-fly data and signal registers. The technique is minimally intrusive, since it only requires replacing the original data or signal types with fault injection enabler types. We compare the proposed simulation technique with recently reported SystemC-based techniques and show...
Recently, dataspace has been proposed as a new architecture in the evolution of information integration. However, how to fulfill the vision of dataspace, i.e. pay-as-you-go integration, is still an open issue, and the first challenge is data modeling. In this paper, we present a flexible data model called the triple model, which is similar to RDF but more suitable for dataspace applications. Information...