The existing approach of modeling sensor movement data as pairwise connections in networks implicitly assumes the Markov property and loses higher-order movement patterns. While the higher-order network (HON) captures higher-order movement patterns, there has not yet been a visualization tool tailored for HON. Based on our prior work, in this demo we present HoNVis, a comprehensive visualization and interactive...
The Internet is a rich source of information, but it consists of miscellaneous data fragments. Analytical “surfing” on the World Wide Web opens extraordinary prospects for enterprises to attain and maintain competitive advantage, and for scientific research. Exploratory OLAP is one of the main topics in the analytical processing of heterogeneous data sources. The author proposes an original approach to exploratory...
In community question answering sites, pairs of questions and their high-quality answers (such as best answers selected by askers) can be valuable knowledge available to others. However, many questions receive multiple answers, yet askers do not label any of them as the accepted or best one, even when some replies answer their questions. To solve this problem, high-quality answer prediction or best...
In recent years, higher education has been gaining importance for graduate students seeking successful careers. Academic organizations therefore give the utmost importance to academic quality in order to build their students' careers. Faculty performance plays a vital role in academic institutions. In this paper, the performance of faculty members is evaluated on the basis of different parameters are taken...
Because of the huge increase in data size, it has become difficult to perform efficient analysis using traditional techniques. Big data poses many challenges because of its characteristics, such as volume, velocity, variety, variability, value and complexity. Today there is not only a necessity for efficient data mining techniques to process large volumes of data but in addition...
Logs, which record runtime information of modern systems, are widely utilized by developers (and operators) in system development and maintenance. Due to the ever-increasing size of logs, data mining models are often adopted to help developers extract system behavior information. However, before feeding logs into data mining models, logs need to be parsed by a log parser because of their unstructured...
Class evolution, the phenomenon of class emergence and disappearance, is an important research topic for data stream mining. All previous studies implicitly regard class evolution as a transient change, which is not true for many real-world problems. This paper concerns the scenario where classes emerge or disappear gradually. A class-based ensemble approach, namely Class-Based ensemble for Class...
Data streams are significantly influenced by changes in the underlying notion, termed concept drift. Knowledge discovery from data streams under notion adaptation is significant for achieving conventional learning of streaming data. The concept drift for conventional learning of streaming data can be done under a set of notions that can be either static or dynamic. Due to the large...
Data mining techniques are employed for efficient, real-time analysis of weather and climate data. The main goal of climate studies is that users (e.g., farmers, scientists, and decision and policy makers) from different sectors (e.g., agriculture, science, aerospace) understand the importance of various changes in weather and climate parameters like rainfall, humidity,...
In the source retrieval step of plagiarism detection, the rationale for keyword extraction is to select only those phrases or words that maximize the chance of retrieving source documents matching the suspicious document. TF-IDF (term frequency-inverse document frequency), weighted TF-IDF (the weighted term frequency-inverse document frequency, namely, the TF-IDF of a term with a different...
This paper discusses the implementation of a decision support system for the prediction of asthma in a group of children with related medical factors. The system uses survey data gathered as part of the ISAAC Phase One Study through questionnaires completed by adolescents at school and by the parents of the children at home. The model is tested on cross-sectional study data...
Data reduction is the process of reducing the volume of datasets and is used in almost all real-time applications. Although several techniques are available, many researchers have used K-Means clustering to reduce datasets. In this paper, three different methods were used to replace missing values with mean, median and a predicted score; the cleaned datasets were reduced using K-Means clustering...
Many bio-medical databases, such as cohort study data, suffer from potential errors caused by human factors like mistyping or overlooking some fields. It is crucial to detect such errors at the data entry stage using techniques like outlier detection. Because such data lie in high-dimensional space and contain many null values, i.e., missing values, most conventional outlier detections are not a...
Clickstream data is one of the most important sources of information on website usage and customer behavior in banks' e-services. A number of web usage mining scenarios are possible depending on the available information. While simple traffic analysis based on clickstream data may easily be performed to improve e-banking services, banks need data mining techniques to substantially improve...
Data mining techniques are now widely applied in telecommunications, finance, the Internet, industry, agriculture, education, software engineering, etc., and data mining courses are therefore continuously offered at undergraduate and graduate levels all over the world to provide strong academic support. This paper describes the data warehouse & data mining course for computer science...
With the advent of the Web and various specialized digital libraries, the automatic extraction of useful information from text has become an increasingly important research topic in data mining. In this paper we present a new MH-based algorithm that extracts the topics expressed in large text document collections and also models how the authors of documents use those topics. The methodology is illustrated...
The continued exponential growth in the volume of literature data poses a new challenge to bibliographic analysis services: traditional features such as keyword search, author search and statistics services cannot satisfy researchers' need for in-depth analysis. Community analysis in social networks is becoming a hot topic in many domains and disciplines such as sociology,...
Clustering is considered the most important unsupervised learning problem. It aims to find some structure in a collection of unlabeled data. Dealing with a large quantity of data items can be problematic because of time complexity. On the other hand, high-dimensional data, e.g. time series data, is a challenging area in data clustering. Novel algorithms need to be robust, scalable, efficient...
Considering only the freshness or total click count of a theme may lead to unreasonable theme updates. To make theme updates on the homepage more reasonable, this paper proposes a novel model based on theme interestingness of browser group (TIBG). TIBG can be used to calculate a theme's real-time popularity, which tracks information related to theme browsers' actual interest. Firstly,...
The ability to predict students' academic performance is very important in educational institutions. Recently, some researchers have proposed data mining techniques for higher education. In this paper, we compare two data mining techniques: an artificial neural network (ANN) and the combination of clustering and decision tree classification techniques for predicting and classifying...