Recently, many industries and companies have been developing machine learning algorithms and services and publishing them on the internet. However, because most people who want to use machine learning services to analyze data are familiar with sheet data rather than programming languages, it is difficult for them to use services written in a programming language. For this reason, we developed a...
The paper examines the behavior of Decision Tree (DT) algorithms on a large database with many cases and many attributes: Forest Covertype (FC) from the UCI Knowledge Discovery in Databases Archive. The classification experiments took into account 22 splitting criteria and two pruning methods, whose performance is presented in terms of classification error rate on test data, data...
Sequential pattern mining is a data mining technique that aims to extract and analyze frequent subsequences from sequences of events or items under time constraints. Sequence data mining was introduced in 1995 with the well-known Apriori algorithm. The algorithm studied transactions over time in order to extract frequent patterns from the sequences of products related to a customer. Later, this...
Feature subset selection, along with the parameters of the classifier, significantly influences classification accuracy. To ensure optimal classification performance, the artificial bee colony (ABC) algorithm is proposed to simultaneously optimize the feature subset and the parameters of support vector machines (SVM). Meanwhile, to improve the optimization performance of the ABC algorithm,...
Recommendation systems have received much attention from researchers and programmers as a way to deal with information overload. Collaborative filtering is the most commonly used algorithm. To enhance its performance, Matrix Factorization was introduced as a basis for collaborative filtering. This paper elaborates on the collaborative filtering algorithm based on Matrix Factorization and gives...
There has been a rapid rise in the number of users getting connected online via social networking sites. To communicate with other users and share their thoughts and opinions, online users tend to use text in the form of blogs, posts, tweets, messages, reviews, comments, etc. Thus, there are immense possibilities, complemented by a wide gamut of research, in the field of Opinion Mining or Sentiment...
This paper examined students' history of accessing the university Learning Management System (LMS) data. Classification techniques were used to build an educational model based on Knowledge Discovery in Databases (KDD) to predict learners' behavior. It identified the most valuable influencer of the learners' outcomes; it generated prediction models using the J48 decision tree algorithm...
In this paper, we propose a new algorithm based on independent keypoint databases for indoor place recognition. By analogy with set operations, a new kind of operation on keypoint sets is defined to describe the process of independent keypoint database establishment and place classification. To obtain the databases, keypoints are first extracted from sample images whose classes are known, and...
Leptospirosis is a disease that mainly affects low-income populations, with an incidence of 500,000 cases per year worldwide [1]. Its symptoms are often confused with those of other febrile syndromes, such as dengue, influenza and viral hepatitis. Improved diagnosis of patients with leptospirosis is very important for health professionals, epidemiological surveillance and primarily for rapid evaluation...
In this work, the popular classification technique, the k-Nearest Neighbour (kNN) algorithm, is integrated with Ant Colony Optimization (ACO) to predict the likelihood of getting heart disease. The analysis has been performed in two phases. In the first phase, kNN classification is used to classify the test data. In the second phase, ACO is used to initialize the population...
In most state-of-the-art non-distortion-specific no-reference image quality assessment (NDS NR-IQA) methods, the image quality is predicted by training a regression model based on examples of distorted images and their corresponding human subjective scores. However, one drawback of these approaches is the fact that they require a training phase of the regression parameters. In this paper, a non-parametric...
Dropout rates for students in correspondence and open courses are on the increase. There is a need to analyze the factors causing this increase. Discovering hidden knowledge in educational data systems through the effective use of data mining technology to analyze the factors affecting student dropout can lead to better academic planning and management to reduce student dropout from...
Data stored in educational databases is increasing day by day. Data mining algorithms can be used to find hidden patterns in student databases. These patterns can be used to assess the academic performance of students. The main aim of this study was to determine the factors that influence students' performance. This paper proposes the Generalized Sequential Pattern mining algorithm for finding frequent...
A waybill is a document that accompanies the freight during transportation. The document contains essential information such as the origin and destination of the freight, the actors involved, and the type of freight being transported. We believe the information from a waybill, when presented in an electronic format, can be utilized for building knowledge about the freight movement. The knowledge may be...
Medical databases contain a massive volume of clinical data, which could provide valuable information regarding diagnosis, prognosis and treatment plans when mining algorithms are used in an appropriate manner. The irrelevant, redundant and incomplete data in medical databases make the extraction of useful patterns a difficult process. Feature selection, a robust data preprocessing method, selects attributes...
Data classification and prediction are among the prime tasks in data mining. They continue to play a vital role in computer science and the data processing field. Clustering and classification in data mining are used in various domains to give meaning to the available data and to produce useful predictions that can be applied to some of the crucial problem areas of the real world...
Mining is a process of searching data in a huge database to infer useful information and deduce relationships and patterns. Though we can predict certain patterns from our database manually, as soon as the size of the data increases (into terabytes), it becomes difficult and tedious to deduce the important information from a huge database (or data warehouse). Various data-mining algorithms exist to...
Knowledge discovery is an important tool for the intelligent business to transform data into useful information that will increase business revenue. Data mining techniques support automatic exploration of data, attempt to classify the patterns and trends in data, and also infer decision rules from those patterns. Classification of a dataset is an important function of mining which is a supervised...
In recent years, data mining technology has been more and more widely used with the rapid development of network technology and database techniques. Moreover, data mining technology has been a research focus of experts and scholars in various fields, especially as a hot spot of artificial intelligence. The application functions of data mining technology are rich, including: classification analysis,...
Monitoring the ECG (Electrocardiogram) of patients with cardiac disorders is of paramount importance, since they may undergo cardiac arrhythmias without even noticing them. The majority of existing devices for this purpose are only capable of recording the ECG of patients, which is analyzed later by cardiologists. This paper presents a new system developed to continuously monitor the ECG of patients and analyze...