Many real-world datasets suffer from the problem of missing values. Imputation, which replaces missing values with plausible values, is a major method for classification with data containing missing values. However, powerful imputation methods, including multiple imputation, are usually computationally intensive when estimating missing values in unseen incomplete instances. Rule-based classification algorithms...
Based on an analysis of the disadvantages of traditional KNN, a lazy learner that classifies data directly by majority vote among the classes of the K nearest neighbors, a new sigmoid-weighted classification algorithm, WKS (Weighted KNN Based On Sigmoid), is proposed. WKS provides a new method for learning and training, since each training instance di ∊ D contributes to the correct classification...
It is a simple task for humans to visually identify objects. However, computer-based image recognition remains challenging. In this paper we describe an approach to image recognition with a specific focus on the automated recognition of plants and flowers. The approach utilizes deep learning capabilities and, unlike other approaches that focus on static images for feature classification, we utilize...
The paper examines the behavior of Decision Tree (DT) algorithms on a large database with many cases and many attributes: Forest Covertype (FC) from the UCI Knowledge Discovery in Databases Archive. The classification experiments considered 22 splitting criteria and two pruning methods, whose performance is presented in terms of classification error rate on test data, data...
Heart failure is one of the major causes of death in the world. The disease directly affects the heart's pumping function. Because of this disturbance, nutrients and oxygen are not properly circulated and distributed. The New York Heart Association has classified this disease into four different classes based on patient symptoms. In this paper, we use a data mining technique, more precisely a sequential...
In this paper, we investigate whether a viewer's behavior changes before, during, and after watching a TV program. Are there any behaviors specific to each particular phase of viewing? We propose a flexible and nonintrusive method based on the use of three categories of everyday connected objects (i.e. smartphone, smartwatch, and remote control). Data were collected during participants'...
Recently, multi-label classification has gained prime importance among classification problems. The applications of classification problems have increased so rapidly that the need for efficient and accurate classifiers has become a vital requirement in the area of data mining. The multi-label classification problem is distinguished from single-label classification because of the capability to handle...
Graduate employability is an increasingly major concern for academic institutions, and assessing student employability provides a way of linking student skills with employers' business requirements. Enhancing the methods used to assess employability can improve students' understanding of companies and help them find a suitable employer. Enhanced employability prediction for students can therefore help them...
Tagging provides a convenient means of assigning identification tokens to research papers, which facilitates the recommendation, search, and disposition of research papers. This paper contributes a document-centered approach to the auto-tagging of research papers. The auto-tagging method mainly comprises two processes: classification and tag selection. The classification process involves automatic...
This paper presents a novel approach to semi-supervised classification of remote sensing imagery using a {K-Means+(GMM-EM)} clustering cascade, followed by the selection of a number of clustered pixels to be added to the training set according to their GMM responsibilities. The proposed method has the following steps: (a) clustering of the multispectral pixels using the cascade composed of K-means...
A decision tree is an important classification technique in data mining. Decision trees have proved to be valuable tools for the classification, description, and generalization of data. J48 is a decision tree algorithm used to create a classification model. J48 is an open-source Java implementation of the C4.5 algorithm in the Weka data mining tool. In this paper, we present...
Image classification mainly uses a classifier to classify extracted image features. In traditional image feature extraction, it is difficult to set appropriate feature patterns for complex images. In addition, the training algorithm of the classifier affects the accuracy of image classification. In order to solve these problems, the combination of deep belief networks and...
The purpose of data mining is to explore, find and hence analyze relevant data from a massive data source using various technical means. This paper introduces the development of data mining to date, its functions, tasks and algorithms, as well as the process of data mining. The application and problems of data mining are also presented and finally the potential future development of data mining technology...
This paper explores the potential of Machine Learning (ML) and Artificial Intelligence (AI) to leverage the Internet of Things (IoT) and Big Data in the development of personalised services in Smart Cities. We do this by studying the performance of four well-known ML classification algorithms (Bayes Network (BN), Naïve Bayesian (NB), J48, and Nearest Neighbour (NN)) in correlating the effects of weather...
The use of data mining techniques is rapidly increasing in educational research. Educational data mining aims to discover hidden knowledge and patterns about student performance. This paper proposes a student performance prediction model that applies two classification algorithms, KNN and Naïve Bayes, to an educational data set of secondary schools, collected from the Ministry of Education in Gaza...
Opinion mining is an interesting area of research that summarizes customer reviews of a product or service and expresses whether the opinions are positive or negative. Various methods have been proposed as classifiers for opinion mining, such as Naïve Bayesian and support vector machine classifiers; these methods classify opinions without giving the reasons why an opinion instance is classified to...
We present the key steps in the development of a dynamogram classification algorithm. These are data processing, feature generation and selection, construction of a neural network classifier, and evaluation of its performance. To assess whether complex defects (subclasses) can be singled out, we analyzed the structure of the input pattern sample with the aid of clustering algorithms...
In this paper, we present an application designed to analyze news articles from Romanian mass media and extract opinions about political entities relevant to the major political stage. The application was created with the desire to study media polarization around important political events, such as legislative or presidential elections. The application uses different crawlers to extract the data from...
Extracting opinion words and product features is an important task in many sentiment analysis applications. An opinion lexicon also plays a very important role because it is useful for a wide range of tasks. Although several opinion lexicons are available, it is hard to maintain a universal opinion lexicon that covers all domains. So, it is necessary to expand a known opinion lexicon that are...
Education can be used as a tool to face many problems and overcome many hurdles in life. The knowledge obtained from education helps to enhance opportunities in one's career development. Educational Data Mining is widely used to extract useful information from the knowledge obtained. Educational data mining applies different data mining tools and techniques to analyze...