The computing and storage requirements of High Energy Physics experiments are growing rapidly with the expanding scale of experiments, and the forthcoming completion of the Chinese Spallation Neutron Source (CSNS) places higher demands on the computing system. The new computing pattern, cloud computing, makes IT resource configuration flexible and management centralized. So from the research...
With the integration of physical space and cyberspace, distributing large-scale data to diverse, geographically distributed terminals has become a huge challenge. When the data size exceeds what traditional technologies can process, how to ensure user quality of service and efficient use of system resources becomes an important concern, with the...
The Patent Cloud Platform is under construction in Chongqing. As its most important part, the Patent Data Resource Service Center is urgently needed. Tens of millions of patent records from seven countries and two organizations will be shared with the public. Because of the heterogeneity of patent data from different data sources, some essential pre-processing methods should be carried out to...
Hakka culture is an important part of southern Chinese culture. Hakka culture data collected through digital and information technology has many characteristics, such as diverse data types and formats, unstructured content, and huge volume, which make it difficult to manage and use. NoSQL technology offers high availability and high scalability, which provides new methods for the storage and...
Although most online social networks rely on a centralized infrastructure, several proposals of Distributed Online Social Networks (DOSNs) have been recently presented. Since in DOSNs user profiles are stored on the peers of the users belonging to the network, one of the main challenges comes from guaranteeing the profile availability when the owner of the data is not online. In this paper, we propose...
The last decade has seen significant growth in data generated by billions of people connected to the Internet. Recent prognoses about Big Data, the Internet of Things, and Cloud Computing show a growing demand for efficient processing of huge amounts of data under strict time limits. Optimal data distribution on Shared-Nothing (SN) architectures is a major issue. SDDS (Scalable Distributed Data...
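The shared-nothing setting the abstract refers to means each node owns its own disjoint slice of the data. A minimal sketch of the idea, using static modulo hash partitioning (a toy illustration only; real SDDS schemes such as LH* grow the bucket space dynamically, which this sketch does not attempt):

```python
import hashlib

def partition(key: str, num_nodes: int) -> int:
    """Map a record key to one of num_nodes independent storage nodes."""
    digest = hashlib.sha1(key.encode()).hexdigest()
    return int(digest, 16) % num_nodes

# Distribute 1000 records across 4 shared-nothing nodes.
nodes = [[] for _ in range(4)]
for record_id in range(1000):
    nodes[partition(f"record-{record_id}", 4)].append(record_id)
# Each node holds its own disjoint partition; no shared storage is needed,
# and lookups go straight to the owning node via the same hash.
```

The record-key format and node count here are placeholders; the point is only that a deterministic hash lets any client locate a record's node without a central directory.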
Due to its special characteristics, offering Hadoop-as-a-Service (HaaS) in a virtualized environment is more challenging than providing any other kinds of cloud services. This paper describes those challenges and presents solutions to them, ranging from an experimental project to open source and commercial enterprise systems to public cloud providers.
Erasure coding has been increasingly replacing replication in distributed storage systems, thanks to its lower storage overhead with the same level of failure tolerance. However, with lower storage overhead, the reconstruction overhead of erasure codes can increase significantly as well. Under the ever-changing workload, in which the data access can be highly skewed, it is difficult to achieve a well...
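The trade-off the abstract describes, lower storage overhead but costlier reconstruction, can be seen even in the simplest erasure code. A minimal sketch (a single-parity XOR code, not the paper's scheme): k data blocks plus one parity block tolerate one erasure at (k+1)/k storage overhead, but rebuilding a lost block requires reading all k survivors, whereas replication reads just one copy.

```python
def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            out[i] ^= b
    return bytes(out)

def encode(data_blocks):
    """Return the data blocks plus one XOR parity block (tolerates 1 erasure)."""
    return list(data_blocks) + [xor_blocks(data_blocks)]

def reconstruct(stripe, lost_index):
    """Rebuild the block at lost_index by XOR-ing the k surviving blocks."""
    survivors = [b for i, b in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)

k = 4
data = [bytes([i] * 8) for i in range(k)]
stripe = encode(data)                       # k + 1 = 5 blocks stored
rebuilt = reconstruct(stripe, lost_index=2)
assert rebuilt == data[2]
# Storage overhead: (k + 1) / k = 1.25x versus 3x for triple replication,
# but reconstruction had to read k = 4 blocks instead of 1.
```

Production systems use Reed-Solomon codes tolerating multiple erasures, but the read amplification during reconstruction, which the abstract highlights, is the same in kind.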
It is a challenge to improve the utilization of physical resources when allocating resources for virtual data centers (VDCs). In this work we propose two reliability-aware VDC embedding methods, NMP and NMPD. To reduce embedding cost, the proposed methods split VMs into clusters based on topological potential and modularity and assign each cluster to a server. Via simulations, we show that NMP can accept...
The process of knowledge discovery applied to distributed databases involves finding useful knowledge by mining data sets stored in real implementations of distributed databases. A distributed database is a software system that allows a multitude of applications to access data stored in local or remote databases. In this scenario, data distribution is achieved through the process of...
This paper investigates the hierarchical deployment and over-provisioning of energy storage devices (ESDs) in data centers by (i) adopting a realistic power delivery architecture (from Intel) for the centralized ESD structure as the starting point, and (ii) presenting a novel and realistic power delivery architecture that borrows the best features of the centralized ESD structure from Intel and the distributed single-level...
Linked data mining has become one of the key questions in high-performance graph mining in recent years. However, existing Resource Description Framework (RDF) database engines are not scalable and are less reliable in heterogeneous clouds. In this paper we describe the design and implementation of Acacia-RDF, a scalable distributed RDF graph database engine developed with the X10 programming...
MapReduce is a popular computing model for parallel processing of large-scale datasets, which can vary from gigabytes to terabytes and petabytes. Though Hadoop MapReduce normally uses the Hadoop Distributed File System (HDFS) as its local file system, it can be configured to use a remote file system. This raises an interesting question: for a given application, which is the best running platform among...
This paper analyzes the shortcomings of the existing Memcached caching strategy. An improved scheme, DCSAE (Data Cached Strategy on Asynchronously Updating for Eventual Consistency), which uses asynchronous updates based on eventual-consistency theory, is proposed. DCSAE generates virtual nodes and assigns them to real nodes according to the weight of each node. The weight is calculated in...
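Assigning virtual nodes to real nodes in proportion to a weight, as this abstract describes, resembles weighted consistent hashing. A minimal sketch under that assumption (the node names, weights, and vnode count are placeholders, not values from the paper):

```python
import bisect
import hashlib

def _hash(key: str) -> int:
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class WeightedRing:
    """Consistent-hash ring where each real node gets virtual nodes in
    proportion to its weight, so heavier nodes receive more keys."""

    def __init__(self, weights, vnodes_per_unit=100):
        self._ring = []  # sorted list of (hash, real_node) pairs
        for node, weight in weights.items():
            for i in range(int(weight * vnodes_per_unit)):
                self._ring.append((_hash(f"{node}#vnode{i}"), node))
        self._ring.sort()
        self._hashes = [h for h, _ in self._ring]

    def lookup(self, key: str) -> str:
        """Return the real node owning key: first virtual node clockwise."""
        idx = bisect.bisect(self._hashes, _hash(key)) % len(self._ring)
        return self._ring[idx][1]

ring = WeightedRing({"cacheA": 1.0, "cacheB": 2.0, "cacheC": 1.0})
counts = {}
for n in range(10000):
    node = ring.lookup(f"key-{n}")
    counts[node] = counts.get(node, 0) + 1
# cacheB holds half the virtual nodes, so it should own roughly half the keys.
```

Because keys map to virtual nodes rather than directly to servers, changing one node's weight (or removing a node) only remaps the keys on its own virtual nodes, which is what makes this structure attractive for a cache tier.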
In this paper, we introduce Mayflower, a new distributed filesystem that is co-designed from the ground up to work together with a network control plane. In addition to the standard distributed filesystem components, Mayflower has a flow monitor and manager running alongside a software-defined networking controller. This tight coupling with the network controller enables Mayflower to make intelligent...
Large-scale Internet applications serve end users from servers that may be located at geographically distributed data centers. Users may require different delay constraints for different services. To meet these service delay requirements, the data centers must provide enough server resources, which incurs a large electricity and dollar cost. In this paper, we tackle...
This paper describes the unique features and functionalities of Doctor Locator, an online system for locating doctors. It can be used to find all the necessary data regarding a doctor in Bangladesh. It also explains why this system, which is hosted online and is also available as a smartphone application for both Android and iOS platforms, is different from all the other existing systems...
Cloud platforms offer computing, storage, and other related resources to cloud consumers in the form of Virtual Machines (VMs), and allow VMs to scale according to workload characteristics. In particular, with cloud computing, service providers no longer need to maintain a large number of expensive physical machines, which can significantly reduce cost. However, it is still a challenge for service...
The world contributes to a drastic increase in data every day. Scientific applications, weather forecasting, research, hospitals, and military services are a few of the major contributors. As the amount of data increases, providing efficient, easy-to-use solutions has become one of the main issues for these types of computations. The best solution to this issue is the use of Distributed...
With enterprises collecting feedback down to every possible detail, data repositories are being flooded with information. In order to extract valuable information, these data should be processed using sophisticated statistical analysis. Traditional analytical tools, existing statistical software, and data management systems find it challenging to perform deep analysis on large data libraries...