Over time, an organization working with geospatial data accumulates large volumes of data in both vector and raster formats. This data is the result of coordinated processes within the organization and of external sources such as other collaborating organizations, projects and agencies, crowdsourcing efforts, etc. The massive amount of data accumulated as a result and the recent developments...
Recently, NoSQL databases have become widely used in mobile and web applications. There are several types of NoSQL databases, such as columnar, key-value, and graph databases, and finally the document store database, which is efficient and supports more dynamic queries than a conventional RDBMS. This paper proposes an automatic method to map data from a relational database to a document store database...
XML has evolved from a data representation format into a structured document format. It has also grown in size, from a small number of files of several bytes to gigabytes or terabytes spread over millions of files. XML databases have been developed to store XML documents efficiently, with XQuery as the language for querying them. However, they have difficulties in dealing with a huge number of...
Data warehouses can be extremely large and resource demanding, which is not always affordable in a local environment. Hence, to deal with the large amounts of data held in data warehouses, cloud warehousing seems to be the solution. On the other hand, many enterprises use data warehouses for data analysis and use XML to deal with semi-structured data but also to take advantage of the web...
The volume of XML data is tremendous in many areas, especially in data logging and scientific fields. XPath querying is the core operation of XML processing. It is a challenge to query massive XML data stored in a distributed manner. In this paper, we present an efficient distributed XPath query processing method using MapReduce, which simultaneously processes queries over a massive volume of XML data. We first...
NoSQL Database Management Systems (DBMS) have increased in prominence and market share in the last few years. IBM Cloudant is a well-known enterprise NoSQL DBMS that is offered both in the cloud and as an on-premise standalone product (i.e. Cloudant Local). Cloudant has a large number of customers such as Samsung, Adobe, DHL, etc. In this paper, we will demonstrate how Cloudant can...
The Online Metadata Editor (OME) is a web-based tool to help document scientific data in a well-structured, popular scientific metadata format. In this paper, we will discuss the newest tool that Oak Ridge National Laboratory (ORNL) has developed to generate, edit, and manage metadata and how it is helping data-intensive science centers and projects, such as the U.S. Department of Energy's Next Generation...
With the rise of big data, attention has turned to how it is stored. In this paper, first, we design four different storage solutions on the Hadoop platform, using HBase, MySQL, XML and plain text, for specific data storage. Data is stored in the Hadoop Distributed File System except in the MySQL storage solution, in which data is stored directly in the Linux file system. Second, for each solution, we write specific...
We describe here an agent-based Distributed Analytical Search (DAS) tool to search and query distributed “big data” sources regardless of the data's location, content or format. DAS semantically analyzes natural language queries from a web-based user interface. It automatically translates the query into a set of sub-queries by deploying a combination of planning and traditional database query optimization...
For the manufacturing industry, seamless workflows along the whole product lifecycle are crucial for sustainable success. Although integrating all the systems in use into a single supersystem is not reasonable, the integration of data is essential. This contribution introduces a model-based concept to integrate data elements of distributed data systems and sources into one virtual database. Specific view...
Timely access to quality data and linkage of data beyond disciplinary boundaries is essential for the marine research community. Therefore the national “Marine Network for Integrated Data Access” is establishing the “Data Portal German Marine Research” to facilitate seamless access to marine data and services and to promote the exchange and dissemination of marine data interlinked with corresponding...
As the amount of data being exchanged over the network increases, algorithms originally implemented to run on a single machine have been redesigned to work in a distributed manner, with a processing platform that splits tasks among machines and cores. Brand new frameworks have emerged for the analysis of unbounded streams of data, aiming at processing data and retrieving information in near real time...
The twig query is considered the core query pattern in most XML query languages. As XML documents grow larger, a single site can no longer handle such data volumes in terms of storage capacity and computing power. Partitioning the large data and processing queries in a distributed, parallel fashion is an efficient and effective approach. This paper proposes the Twig MRR algorithm for evaluating XML twig queries over large XML data...
The web warehouse has overcome the geographical dependencies that limit the data warehouse. With the advent of the web warehouse, an organization's decision makers can now retrieve decision-related knowledge through the internet. When data is fetched from a web warehouse located on a web server, data security, integrity and confidentiality problems arise. To overcome security threats and availability issues...
Hadoop, as open source software that implements the MapReduce framework, is an ideal solution for speeding up parallel XML query processing. We propose a distributed caching architecture for Hadoop clusters, called switch-SSD, which caches XML query results en route in the network switching nodes. Switch-SSD extends the limited memory space of OpenFlow switches with SSD for caching XML query results in the...
Electronic Health Record (EHR) systems are widely considered a crucial tool for excellence in patient care, especially in the context of chronic diseases. Nevertheless, patients often do not have full control over their clinical data, which are generated by different health centers. Moreover, collecting, storing and providing clinical data are intensive tasks for health structures, which frequently...
At present, the power system is built on top of a series of auxiliary systems, for example communication systems, monitoring systems, marketing systems and so on. All of these systems work on the basis of shared power system data, which are defined using the Common Information Model (CIM). For various reasons, errors may exist in the data. Therefore, verification technologies have been developed. So...
In the Internet of Things (IoT) era, we need to face increased masses of cross-domain data stored in different formats (relational, XML, JSON, or textual) and data streams (produced by sensors) that can be highly or loosely structured and need to be integrated for analysis. Recently, many NoSQL systems (e.g., MongoDB, Cassandra, HBase) have emerged to cope with the scalability issues of current...
The computational science community is approaching petascale-level simulations that will produce massive amounts of data. While the computational power of supercomputers keeps increasing, I/O systems have not kept pace, resulting in a significant performance bottleneck. We propose a solution, VisDSI, to address the problem by 1) using traditional high performance clusters with disks directly...