The Mid-Atlantic Regional Association Coastal Ocean Observing System (MARACOOS) is one of the eleven Regional Associations (RAs) comprising the coastal network of the U.S. Integrated Ocean Observing System (US IOOS). MARACOOS involves participants from academia, government, the private sector, and non-profit entities, and covers the ocean and estuaries from Cape Cod, MA to Cape Hatteras, NC. The high...
In the big data era, the information about the same object collected from multiple sources is inevitably conflicting. The task of identifying true information (i.e., the truths) among conflicting data is referred to as truth discovery, which incorporates the estimation of source reliability degrees into the aggregation of multi-source data. However, in many real-world applications, large-scale data...
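The idea of folding source reliability into the aggregation can be illustrated with a minimal sketch. It assumes numeric claims and uses a generic iterative scheme (weighted averaging, then re-weighting sources by their share of the total squared error). It is not the method of the paper summarized above, whose details are elided; the function name `discover_truths` and the sample data are hypothetical.

```python
import math


def discover_truths(claims, iterations=10):
    """Iteratively estimate truths and source weights.

    claims: {source: {object: numeric value}}; needs at least two sources.
    Illustrative only: a generic weighted-aggregation loop, not a specific
    published algorithm.
    """
    sources = list(claims)
    objects = sorted({obj for values in claims.values() for obj in values})
    weights = {s: 1.0 for s in sources}  # start by trusting all sources equally
    truths = {}
    for _ in range(iterations):
        # Step 1: the truth estimate is the weighted mean of the claims.
        for obj in objects:
            voters = [s for s in sources if obj in claims[s]]
            total_weight = sum(weights[s] for s in voters)
            truths[obj] = sum(weights[s] * claims[s][obj] for s in voters) / total_weight
        # Step 2: sources that deviate less from the estimates get more weight.
        errors = {
            s: sum((v - truths[obj]) ** 2 for obj, v in claims[s].items()) + 1e-9
            for s in sources
        }
        total_error = sum(errors.values())
        weights = {
            s: max(-math.log(errors[s] / total_error), 1e-12) for s in sources
        }
    return truths, weights


# Hypothetical data: A and B roughly agree, C is an outlier.
claims = {
    "A": {"x": 10.0, "y": 20.0},
    "B": {"x": 10.2, "y": 19.8},
    "C": {"x": 15.0, "y": 30.0},
}
truths, weights = discover_truths(claims)
```

In this toy run the outlier source C ends up with the smallest weight, so the estimates settle near the values reported by A and B rather than at the plain average.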
By analyzing the characteristics of data processing in cloud computing environments, the author proposes a distributed optimized storage model based on hash distribution. Targeting the distribution of massive amounts of data, the model comprises a distributed data backup protocol based on a master copy and a Paxos distributed system protocol design on the basis of the...
In the era of big data processing, it is desirable to manage large volumes of data with high scalability, confidentiality protection, and flexible types of search queries. In this paper, we propose a design to store encrypted data on a cluster of distributed servers while supporting secure and authorized Boolean queries. In particular, the data owner encrypts the database with encrypted searchable...
Blockchain technology is gaining momentum because of its possible application to systems other than cryptocurrency. Indeed, blockchain, as a decentralized system based on a distributed digital ledger, can be used to securely manage any kind of asset, yielding a system that is independent of any authorizing entity. In this paper, we briefly present blockchain and our work in...
Data assurance and resilience are crucial security issues in cloud-based IoT applications. With the widespread adoption of drones in IoT scenarios such as warfare, agriculture and delivery, effective solutions to protect data integrity and communications between drones and the control system have been in urgent demand to prevent potential vulnerabilities that may cause heavy losses. To secure drone...
Today, most data is held in digital form and stored in databases, which may therefore contain highly sensitive private, financial, and administrative information. Because this information is a valuable asset to its users, it is also a prime target for adversaries, so protecting it against all potential adversarial actions is essential. These days the...
Real-world stream data with skewed distribution raises unique challenges for distributed stream processing systems. Existing stream workload partitioning schemes usually use a “one size fits all” design, which leverages either a shuffle grouping or a key grouping strategy for partitioning the stream workloads among multiple processing units, leading to notable problems of unsatisfactory system throughput...
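The trade-off between the two strategies named above can be sketched in a few lines. This is a simplified illustration, not the scheme from the paper: round-robin assignment stands in for shuffle grouping, a CRC32 hash of the key stands in for key grouping, and the stream contents and worker count are made up.

```python
import itertools
import zlib
from collections import Counter


def key_grouping(key: str, n_workers: int) -> int:
    """Send every tuple with the same key to the same worker (hash partitioning)."""
    return zlib.crc32(key.encode("utf-8")) % n_workers


def make_shuffle_grouping(n_workers: int):
    """Return an assigner that ignores the key and spreads tuples round-robin."""
    counter = itertools.count()

    def assign(_key: str) -> int:
        return next(counter) % n_workers

    return assign


# Hypothetical skewed stream: one hot key accounts for 80 of 100 tuples.
stream = ["hot"] * 80 + [f"k{i}" for i in range(20)]
n_workers = 4

key_load = Counter(key_grouping(k, n_workers) for k in stream)
shuffle = make_shuffle_grouping(n_workers)
shuffle_load = Counter(shuffle(k) for k in stream)
```

Under key grouping, the worker that owns the hot key receives at least 80 of the 100 tuples, while shuffle grouping gives each of the four workers exactly 25. The price of shuffle grouping is that tuples with the same key land on different workers, so any per-key state has to be merged afterwards.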
Recently, due to the rapid development of information and communication technologies, data are being created and consumed at an avalanche-like rate. Distributed computing creates the preconditions for analyzing and processing such Big Data by distributing the computations among a number of compute nodes. In this work, performance of distributed computing environments on the basis of Hadoop and Spark frameworks is...
Service elasticity, the ability to rapidly expand or shrink service processing capacity on demand, has become a first-class property in the domain of infrastructure services. Scalable NoSQL data stores are the de-facto choice of applications aiming for scalable, highly available data persistence. The elasticity of such data stores is still challenging, due to the complexity and performance impact...
We present EclipseMR, a novel MapReduce framework prototype that efficiently utilizes a large distributed memory in cluster environments. EclipseMR consists of double-layered consistent hash rings - a decentralized DHT-based file system and an in-memory key-value store that employs consistent hashing. The in-memory key-value store in EclipseMR is designed not only to cache local data but also remote...
Causal consistency is an intermediate consistency model that can be achieved together with high availability and high-performance requirements even in presence of network partitions. In the context of partitioned data stores, it has been shown that implicit dependency tracking using clocks is more efficient than explicit dependency tracking by sending dependency check messages. Existing clock-based...
It is common for real-world applications to analyze big graphs using distributed graph processing systems. Popular in-memory systems require an enormous amount of resources to handle big graphs. While several out-of-core approaches have been proposed for processing big graphs on disk, the high disk I/O overhead could significantly reduce performance. In this paper, we propose GraphH to enable high-performance...
Social media networks as well as online graph analytics operate on large-scale graphs with millions of vertices, even billions in some cases. Low-latency access is essential, but caching suffers from the mostly irregular access patterns of these application domains. Hence, distributed in-memory systems have been proposed that keep all data in memory at all times. But the sheer amount of small data...
This article considers problems of distributed database structure development connected with presenting similar data in different nodes. It suggests using the results of parsing users' SQL queries to optimize the DB structure in order to increase local requests and synchronization speed.
Master-Master replication enables us to replicate data through distributed writes and distributed reads. This is an advantage over Master-Slave replication. This paper addresses two problems with master-master replication. The first is that it is loosely consistent by nature, and the second is that updates must first be agreed upon before they can be committed. In this paper,...
With the rapid development of Internet technologies such as cloud computing and big data, the scales of distributed information systems in big companies have grown to enormous sizes. Automatic detection and diagnosis of system faults in the large-scale information systems is complicated and important in both practice and research. In this paper, we propose a Graph-based Fault Diagnosis approach in...
With the wide application of distributed systems, sharing session information among servers is becoming increasingly important. Meanwhile, multiple clients also have to tackle the problem of session information sharing. The traditional session method shares information via replication. But, to a high degree, it has become the bottleneck of browse speed...
Consistent hashing is used for distributing the data uniformly over a given set of servers in a topology. However, uniform distribution of the data over a given set of servers does not guarantee a uniform distribution of the workload associated with the data over the set of servers. When the workload is skewed over a small subset of data items the traditional re-partitioning approach used for handling...
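The distinction drawn above between uniform data placement and uniform workload can be made concrete with a small sketch. It assumes a textbook consistent hash ring with virtual nodes and MD5 as the hash function; the class name, server names, and request mix are hypothetical and do not come from the paper.

```python
import bisect
import hashlib
from collections import Counter


def ring_hash(key: str) -> int:
    """Map a string to a point on the ring (MD5 is stable across runs)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class ConsistentHashRing:
    """Minimal consistent hash ring with virtual nodes (illustrative only)."""

    def __init__(self, servers, vnodes=100):
        # Each server is placed on the ring at `vnodes` pseudo-random points.
        self._ring = sorted(
            (ring_hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def lookup(self, key: str) -> str:
        # The owner is the first virtual node clockwise from the key's hash.
        idx = bisect.bisect_right(self._points, ring_hash(key)) % len(self._points)
        return self._ring[idx][1]


ring = ConsistentHashRing(["s1", "s2", "s3"])
keys = [f"item-{i}" for i in range(3000)]

# Data placement: how many distinct items each server stores.
data_per_server = Counter(ring.lookup(k) for k in keys)

# Skewed workload: one hot item receives 9000 requests; the remaining
# 1000 requests go to one item each.
requests = ["item-0"] * 9000 + keys[:1000]
load_per_server = Counter(ring.lookup(k) for k in requests)
```

The items are spread over the three servers by the ring, yet whichever server owns `item-0` handles at least 9000 of the 10000 requests. Placement alone therefore does not balance a workload that concentrates on a few items.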
To address the problems of heterogeneous data types and the large amount of computation involved in decision making for big data, this paper proposes an optimized distributed OLAP system for big data. The system provides data acquisition for different data sources, and supports two types of OLAP engines, Impala and Kylin. First of all, the architecture of the system is proposed, consisting of four modules,...