In this study, we apply clustering algorithms to mine land price data in Taichung City over the past ten years. For big data analysis, we combine Hadoop HDFS and MapReduce with the R language and visualize the results on Google Maps. We also compare the performance of the K-means and Fuzzy C-means clustering algorithms, executed in the Hadoop cloud and on a stand-alone PC. The experimental results show...
Clustering is one of the fundamental data mining procedures. Bisecting K-means (BKM) clustering has been shown to have higher computing efficiency and better clustering quality than the basic Lloyd version of K-means clustering. Elkan's method of exploiting the triangle inequality significantly reduces distance calculations, and is applicable to each K-means iteration without affecting...
Clustering is useful for discovering underlying groups and identifying interesting patterns in scientific data and engineering systems. Affinity propagation (AP) is an effective clustering algorithm which has been successfully applied to broad areas of computer science. To generate high quality clusters, AP iteratively performs information propagation on the full similarity matrix and requires excessive...
Recently, social event recommendation, which recommends a list of upcoming events to a user, has attracted considerable research interest. In this paper, we first construct a heterogeneous graph to express the interactions among different types of entities in an event-based social network. Based on the constructed graph, we propose a novel recommendation algorithm called reverse random walk with restart...
Graph structures are often used for representing data objects and the links between them in large datasets. Knowledge extraction from these data relies on finding the connected components within these graphs. Given a large graph G = (V, E), where V is the set of vertices and E is the set of edges, the problem is to find the connected components efficiently. The problem of finding the connected components...
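The connected-components problem stated in the abstract above can be illustrated with a minimal single-machine sketch. This is not the method of the paper, whose details are truncated here; it is a standard union-find (disjoint-set) approach, and the function name `connected_components` and its arguments are hypothetical. It assumes every edge endpoint appears in the vertex set V.

```python
from typing import Dict, Hashable, Iterable, List, Set, Tuple


def connected_components(
    vertices: Iterable[Hashable],
    edges: Iterable[Tuple[Hashable, Hashable]],
) -> List[Set[Hashable]]:
    """Return the connected components of an undirected graph G = (V, E).

    Illustrative union-find sketch; assumes every edge endpoint is in V.
    """
    # Each vertex starts as the root of its own one-element set.
    parent: Dict[Hashable, Hashable] = {v: v for v in vertices}

    def find(v: Hashable) -> Hashable:
        # Walk up to the root, halving the path along the way.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    # Merge the two sets joined by each edge.
    for u, w in edges:
        root_u, root_w = find(u), find(w)
        if root_u != root_w:
            parent[root_u] = root_w

    # Group vertices by the root of the set they ended up in.
    groups: Dict[Hashable, Set[Hashable]] = {}
    for v in parent:
        groups.setdefault(find(v), set()).add(v)
    return list(groups.values())


components = connected_components([1, 2, 3, 4, 5], [(1, 2), (2, 3), (4, 5)])
```

With the sample input, vertices 1, 2 and 3 end up in one component and vertices 4 and 5 in another. A distributed setting such as the one the abstract targets would need a different formulation; this sketch only makes the problem definition concrete.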
Distributed computations on graphs gained importance with the emergence of large graphs, e.g., in the web or social networks. Frameworks like Hadoop, Giraph and Spark are used for their processing. Yet, they require advanced programming techniques to minimize skew and data shuffling. Declarative, query-like, but at the same time efficient solutions like Pig for general purpose analytics are lacking...
With the help of the Internet, Massive Open Online Courses (MOOCs) are recognized as a new way to take courses via the web instead of in traditional classrooms. MOOCs can overcome many limits of traditional courses, such as distance, time, and number of participants. At the same time, they bring some new issues, such as a high dropout ratio. Nowadays, a growing number of MOOC courses are available and even more common people...
In the Web of data, entities are described by interlinked data rather than documents on the Web. In this work, we focus on entity resolution in the Web of data, i.e., identifying descriptions that refer to the same real-world entity. To reduce the required number of pairwise comparisons, methods for entity resolution perform blocking as a pre-processing step. A blocking technique places similar entity...
The MapReduce paradigm has become ubiquitous within Big Data Analytics. Within this field, Social Networks are an important application area, as they rely on the large-scale analysis of graphs. To enable the scalability of Social Networks, we consider the application of MapReduce design patterns for the determination of graph-based metrics. Specifically, we detail the application of a MapReduce-based...