This research aimed to apply statistical methods able to group surveyed individuals into subgroups with similar technology skills. The sample consisted of teachers who responded to a semi-structured online survey on their experience with, and level of knowledge of, technological tools that support education. The data collected were organized into tables...
In this paper, the implementation of the K-means clustering algorithm on a Hadoop cluster with FPGA-based hardware accelerators is presented. The proposed design follows the MapReduce programming model and uses the Hadoop Distributed File System (HDFS) to store large datasets. The proposed FPGA-based hardware accelerator for speeding up the K-means clustering algorithm is implemented on Xilinx VC707 evaluation...
Provides an abstract for each of the four plenary presentations and a brief professional biography of each presenter. The complete presentations were not made available for publication as part of the conference proceedings.
Recently, due to the popularity of Web 2.0, considerable attention has been paid to opinion leader discovery in social networks. By identifying opinion leaders, companies can influence sales and governments can guide public opinion. Additionally, detecting influential comments helps in understanding the source and trend of public opinion formation. However, mining opinion...
Recently, information technology has been used extensively for a wide range of applications, from e-commerce solutions to various web-based information systems. This usage has led to the development of large textual databases. Most of this information is stored as unstructured text. The resulting volume of data has led to the need for its systematic clustering for easy data retrieval organization...
Many colleges have accumulated a large amount of information, such as achievement data and consumption records. Based on this information, we attempt to identify student groups from various aspects. Given this, we can acquire the characteristics of students in different groups. In this way, the college can have a better understanding of students to accomplish the reasonable management...
One of the most important machine learning techniques is the clustering of data into different clusters or categories. Several decent algorithms and techniques exist for clustering small to medium scale data. In the era of Big Data, with applications being large-scale and data-intensive in nature, there is a significant increase in the volume, variety and velocity of data...
Historically, eye tracking systems have been a very useful tool for finding salient regions in interfaces that naturally attract the visual attention of users. Scan paths are created as the eye moves from one salient region to another. Research has shown that a relationship exists between scan path direction and cognitive load when navigating a user interface. The analysis of scan paths during interface...
In a previous paper [1] we introduced an optimized version of the K-Means Algorithm. Unlike the standard version of the K-Means algorithm that iteratively traverses the entire data set in order to decide to which cluster the data items belong, the proposed optimization relies on the observation that after performing only a few iterations the centroids get very close to their final position causing...
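For orientation, the standard K-Means loop that the abstract contrasts against can be sketched as follows. This is only a minimal illustration of the baseline, in which every iteration traverses the entire data set; it is not the optimized version proposed in the paper. The function name `kmeans`, its parameters, and the random initialization are assumptions made for this sketch.

```python
import random


def kmeans(points, k, iterations=10, seed=0):
    """Baseline K-Means sketch: every iteration scans the whole data set.

    `points` is a list of equal-length numeric tuples. Names and defaults
    are illustrative assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # pick k distinct points as initial centroids
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: visit every point and attach it to the nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])),
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j, members in enumerate(clusters):
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # centroids stopped moving
        centroids = new_centroids
    return centroids, clusters


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (10, 10), (10, 11)]
    final_centroids, _ = kmeans(data, k=2)
    print(sorted(final_centroids))
```

The full scan inside each iteration is the cost that the abstract's optimization targets, based on the observation that the centroids move very little after the first few iterations.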
Searching is a prime operation in computer science, and numerous methods have been devised to make it efficient. Hashing is one such searching technique, with the objective of limiting the search complexity to O(1), i.e. finding the desired item in one attempt. But achieving a complexity of O(1) is quite difficult or usually not possible. This happens because there is no perfect mapping function for insertion...
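To illustrate why a lookup can take more than one attempt, here is a minimal sketch of a hash table with separate chaining. Separate chaining is used purely as a generic illustration and is not the method proposed in the paper; the class `ChainedHashTable` and its methods are hypothetical names introduced for this example.

```python
class ChainedHashTable:
    """Tiny hash table with separate chaining (illustrative sketch only)."""

    def __init__(self, buckets=8):
        self._buckets = [[] for _ in range(buckets)]

    def _index(self, key):
        # The mapping function: different keys can land in the same bucket.
        return hash(key) % len(self._buckets)

    def put(self, key, value):
        bucket = self._buckets[self._index(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite an existing entry
                return
        bucket.append((key, value))

    def get(self, key):
        """Return (value, probes), where probes counts the entries inspected."""
        probes = 0
        for existing_key, value in self._buckets[self._index(key)]:
            probes += 1
            if existing_key == key:
                return value, probes
        raise KeyError(key)


if __name__ == "__main__":
    table = ChainedHashTable(buckets=4)
    # In CPython small non-negative ints hash to themselves,
    # so 1, 5 and 9 all map to bucket 1.
    for key, value in [(1, "a"), (5, "b"), (9, "c")]:
        table.put(key, value)
    print(table.get(1))
    print(table.get(9))
```

When keys collide, the number of probes grows with the length of the chain, so the one-attempt O(1) lookup holds only when the mapping function spreads the keys evenly.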
Active semi-supervised learning can play an important role in classification scenarios in which labeled data are difficult to obtain, while unlabeled data can be easily acquired. This paper focuses on an active semi-supervised algorithm that can be driven by multiple clustering hierarchies. If there are one or more hierarchies that can reasonably align clusters with class labels, then a few queries...
Clustering is among the most common data mining techniques, and fuzzy clustering can model the world even more realistically and more precisely. One of the most favored fuzzy clustering methods is the Fuzzy C-Means (FCM) algorithm, which is actually identical to the (original) K-Means clustering algorithm with a fuzzy flavor. However, there are some issues with the fuzzy clustering methods;...
Nowadays many crimes are related to the financial domain, so forensic analysis of the associated documents is required. Due to digitization, investigating many documents is faster. If an analyst examines the documents manually, it is a time-consuming and tedious task, so we follow an approach that applies a clustering algorithm to documents for forensic analysis of seized systems, which will help the...
Community structure is a common feature in real-world networks. Overlapping community detection is an important method for analyzing the topological structure and function of a network. Most algorithms are based on the network structure, without considering the node attributes. In this paper, we propose an overlapping community detection algorithm based on node convergence degree which combines the network topology...
Data mining has gained much importance in research these days. It is well suited to analyzing data from any field and providing decision-oriented output. Data are now generated and stored at high speed. Non-stationary systems play a central role in providing such data. The availability of such data creates scope for analysis by researchers. Such data which are continuous, unbounded,...
This paper proposes an approach using a MapReduce-based Rocchio relevance feedback algorithm, which improves on the traditional Rocchio algorithm within the MapReduce paradigm, to solve the problem of massive information filtering. Traditional text classification algorithms have a vital impact on information filtering.
A virtual learning community is a network-based learning environment and a new type of learning organization. However, the teaching data in a virtual learning community are often messy and fragmentary, and their value is often difficult to detect. Reasonable use of data mining techniques to process the data gives us an analysis that achieves twice the result with half the effort...
Image matching is one of the active areas of research in today's world. There exist many ways in which two images can be matched. When the image is very large, matching incurs more computational time and cost. In this paper, an efficient algorithm to match large images is proposed. This algorithm works by generating key features using the SIFT algorithm. These SIFT features are then clustered using...
With the amount of data increasing rapidly, how to improve the scalability of nonlinear clustering has become a very crucial and challenging problem. In this paper, we design an efficient parallel nonlinear clustering algorithm by using a four-stage MapReduce framework. In our approach, we need to compute two quantities based on distance matrices, which, however, are difficult to compute in a MapReduce...
In today's digital world, digital data is coming in and going out faster than ever before. This data is of no use until we extract some useful content from it. But it is impractical and inefficient to use traditional database management techniques on big data. That is why big data technologies like Hadoop have come into existence. Hadoop is an open source framework, which can be used to process...