With the expansion of the Semantic Web, there is an increasing amount of heterogeneous data, such as RDF. Today, most solutions for handling RDF are centralized and have limited scalability. In this paper, we propose a method to solve this problem by storing and querying RDF data on HBase and Hadoop. A storage model classified by class and predicate is used in HBase to support...
As a data integration technology, ontology has been widely used in knowledge systems and domain knowledge representation. Ontology construction and fusion technology for large-scale data is very important. In this paper, we propose a parallel ontology construction and fusion approach adapted to the MapReduce framework, based on traditional ontology technology and MapReduce. The method separates...
Processing and mining information in large-scale graph data has proven to be challenging. The bulk synchronous parallel (BSP) computing model is suitable for this task. In this paper, we implement the multi-level step-wise partitioning (MSP) algorithm in the BSP programming model, replacing the original graph partitioning method. The results on both experimental data and real-world data proved this...
The rapid development of Internet technology has brought the problem of information overload, and recommendation algorithms have been put forward and are considered the most effective way to solve it. Most traditional research on recommendation algorithms focuses on accuracy and diversity. However, in practical engineering applications, massive data processing will be the most serious...
The MapReduce framework has been employed in many papers to process large-scale graphs. In this paper, we propose a multi-source message passing model that achieves multi-source traversal of a graph in one iterative process, which greatly improves the parallel efficiency of graph algorithms involving multi-source traversal, a pattern that occurs in many complex graph algorithms. As the model can traverse the...
Since the concept of Content-Based Image Retrieval was born, it has been a hot topic in computer vision and pattern recognition. In the information age, with the rapid growth of data, how to retrieve images quickly and accurately from vast amounts of image data has become a problem. This paper proposes a method of applying the Map/Reduce model in Cloud Computing to Content...
Many parallel computational models have been employed in many papers to process large-scale graphs. In this paper, we propose a message passing model, Router, which can be invoked by most current parallel computational models to process large graphs. The model is good at solving the multi-source traversal problem, which often occurs in many complex graph algorithms. As the model can traverse...
SQL has been widely used as a database language in modern society. Its function focuses mainly on data processing, which can be used in data mining. Due to the rapid growth of data, large-scale data processing is becoming a focal point of information technology. Although we can still use SQL, where to store the data and how to retrieve it efficiently and cost-effectively can be a tricky problem...
The rapid growth of data promotes the development of parallel computing. MapReduce, a simplified programming model for distributed parallel computing, is becoming more and more popular. In this paper, we design and implement a parallel statistical algorithm based on Hadoop's MapReduce model. The algorithm, which is used to grasp the overall characteristics of massive data, involves the...
Data analysis has been widely used in enterprises for its high efficiency and accuracy, especially in the telecommunications industry, for tasks such as User Behavior Analysis and Customer Churn Prediction. However, with the exponential growth of data, traditional data analysis tools cannot handle such large-scale datasets. Furthermore, as business gets more and more complicated, there is much more...
The connected components of an undirected graph play an important part in graph theory. It is straightforward to compute the connected components of a graph in linear time using either breadth-first search or depth-first search. However, when confronted with large-scale data, both algorithms are hard to execute. In this paper, we introduce a recently proposed community detection technique...
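For reference, the single-machine baseline mentioned in this abstract can be sketched as follows. This is a minimal illustration of the standard breadth-first search approach, not the method proposed in the paper (which is truncated here); the function name `connected_components` and the adjacency-dictionary input format are assumptions made for this example.

```python
from collections import deque


def connected_components(adjacency):
    """Return the connected components of an undirected graph.

    `adjacency` is a hypothetical input format: a dict mapping each node
    to an iterable of its neighbours, with every edge listed in both
    directions. Each node and edge is visited once, so the running time
    is linear in the size of the graph.
    """
    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        # Start a new breadth-first search from an unvisited node.
        seen.add(start)
        component = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    component.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return components
```

The whole graph and the `seen` set must fit in the memory of one machine, which is the limitation the abstract points to for large-scale data.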
Structure mining plays an important part in research in biology, physics, the Internet, and telecommunications within the recently emerging field of network science. As a main task in this area, the problem of maximal clique enumeration has attracted much interest and has been studied in various ways in prior work. However, most of these works rely mainly on single-chip computational capacity and have been constrained...
The continued exponential growth in both the volume and the complexity of information is giving rise to a new challenge for the specific requirements of analysts, researchers and intelligence providers. In this paper, to move scientific activity toward practice, we elaborate a prototype of our system under construction, CosDic, for knowledge discovery from extremely large-scale datasets...