High performance computing (HPC) is the aggregation of computational power to increase the ability to process large problems in science, engineering, and business. HPC on the cloud allows on-demand HPC tasks to be performed by high performance clusters in a cloud environment. The connection structure of the nodes in HPC clusters should provide fast internode communication. It is important that scalability...
In this paper we evaluate and compare two representative and popular distributed processing engines for large-scale big data analytics: Spark and the graph-based engine GraphLab. We design a benchmark suite including representative algorithms and datasets to compare the performance of the computing engines in terms of running time, memory and CPU usage, and network and I/O overhead. The benchmark...
Based on an investigation of periodic shrew distributed DoS attacks hidden among the flows of enormous numbers of normal end-users in cloud computing, this paper proposes a new method that takes frequency-domain characteristics from the autocorrelation sequence of network flow as the clustering feature to group end-user flow data with the BIRCH algorithm, and re-merges these clustering results into new groups by overcoming the deficiency...
We live in an era of information explosion, in which we are constantly exposed to huge amounts of information. There is therefore an urgent need to pick out useful and interesting information quickly. Recommendation systems emerged to solve this problem. Among the existing recommendation algorithms, the item-based collaborative filtering recommendation algorithm...
Computation of maximal exact matches (MEMs) is an important problem in comparing genomic sequences. Optimal sequential algorithms for computing MEMs have already been introduced and integrated into a number of software tools. To cope with large data and exploit new computing paradigms like cloud computing, it is important to develop efficient and ready-to-use solutions running on distributed parallel...
In the cloud manufacturing environment, there is a huge number of cloud services, which are dynamic and constantly changing, so managing them is very difficult. This paper presents a hypergraph clustering-based method for managing cloud manufacturing services. A clustering-based Cloud Manufacturing Service Management Model is presented, which contains three layers: manufacturing resources, cloud...
Massive amounts of genomics data are being produced nowadays by Next Generation Sequencing machines. The suffix array is currently the best choice for indexing genomics data, because of its efficiency and large number of applications. In this paper, we address the problem of constructing the suffix array on a computer cluster in the cloud. We present a solution that automates the establishment of a computer...
Frequent pattern mining has a critical role in mining associations, sequential patterns, correlations, causality, episodes, multidimensional patterns, emerging patterns, and many other significant data mining tasks. With the exponential growth of available data, most of the traditional frequent pattern mining algorithms become ineffective due to either huge resource requirements or large communications...
To address the problem of locating the central control site of a computer cluster so as to reduce communication cost in a cloud computing setting, an improved RPCL (Rival Penalized Competitive Learning) algorithm is designed by replacing the distance function and improving the sample update operator of the standard RPCL algorithm. Through theoretical derivation,...
This article first introduces the core architecture and operational mechanism of cloud computing and the Hadoop platform, then puts forward the technical architecture of a data mining platform based on Hadoop. After a thorough study of the MapReduce programming pattern, the HSPRINT decision tree algorithm is implemented. Finally, the effectiveness of the algorithm is verified through experiments.
With the continuous advance of the smart grid, power enterprises have accumulated massive data. To keep this data from becoming a data grave, data mining algorithms are used to mine it. However, traditional mining algorithms hit a performance bottleneck when handling massive data. The k-means algorithm based on cloud computing is implemented in...
With the rise of cloud computing, research on data mining clustering algorithms based on cloud computing has become a focus for scholars both at home and abroad. Aiming at the large-scale data clustering problem, this article uses cloud computing technology to carry out in-depth research based on the Hadoop cloud computing platform and parallel K-means...
Cloud storage has become increasingly popular due to its convenience, cost-effectiveness and scalability. It provides the basis for a slate of file hosting services, which offer users the ability to synchronize their files between the servers and their devices. Naive file synchronization, however, requires the whole file to be transmitted to all other locations (servers, devices) whenever the file...
Security is an important issue for building and sustaining trust relationships in cloud computing and in the use of web-based applications. Consequently, intrusion detectors that adopt allowable and disallowable concepts are used in network forensics. The disallowable policy enforcers alert on events that are known to be bad while the allowable policy enforcers monitor events that deviate from known...
To address the problem that TB-level mass data, distributed around the Earth and accessed via the Internet, lacks parallel processing patterns, we focus on research into a parallel computing architecture: a virtual cluster based on cloud computing. We also study a parallel data mining algorithm and demonstrate its effectiveness on this platform.