Researchers in higher education are beginning to explore the potential of data mining for analyzing data in order to provide quality service and meet the needs of their graduates. Thus, educational data mining emerges as one of the tools for studying academic data to identify patterns and support decision making affecting education. This paper predicts the employability of IT graduates using nine variables...
Data mining is now one of the most active fields of research. Extracting nuggets of information is becoming crucial, and one of its important techniques is classification, which helps to group data into predefined classes. Various classification techniques exist, which classify the data using different algorithms. Each algorithm has its own area of best and worst performance. This paper...
Significant knowledge is discovered from large amounts of data by applying techniques in the knowledge management process; these techniques are known as data mining techniques. For a specific domain, a form of knowledge discovery called data mining is necessary for solving problems. The classes of unknown data are detected by the technique called classification. Neural networks, rule...
The aim of this study is to compare some classification techniques used to predict student performance. It helps to identify slow learners in the semester exams who are likely to perform poorly, so that their skills can be improved early enough to achieve the goal in the end-semester exam. The task can be processed based on several attributes to predict the performance of the student activity respectively...
Data mining approaches have been used for business purposes since their inception; however, at present they are used successfully in new and emerging areas like education systems. The Government of Bangladesh emphasizes the need to improve the education system. In this research, we use data mining approaches to predict students' final outcome, i.e., final grade in a particular course, by overcoming the problem...
Violations by listed companies in disclosing accounting information seriously mislead ordinary investors and bring them huge losses. Therefore, it is particularly necessary to analyze and identify the violations of listed companies using scientific and effective methods to avoid investment risks in advance. In this paper, we first use the t-statistic to select eight useful and characteristic...
Interwell connectivity of an injection-production system is an important kind of information for reservoir performance analysis. It is highly significant for researching the distribution of remaining oil and adjusting the oilfield development plan. In order to change the status quo of inferring interwell connectivity in the No. 1 oil production plant of Daqing Oilfield, an automatic identification method based...
Data mining is an area of computer science with huge prospects; it is the process of discovering or extracting information from large databases or datasets. There are many different areas under data mining, and one of them is classification, or supervised learning. Classification can also be implemented through a number of different approaches or algorithms. We have conducted the comparison...
The C4.5 algorithm can produce an overgrown decision tree that overfits the training data while training the model. To overcome these disadvantages, this paper proposes a post-pruning decision tree algorithm based on Bayesian theory, in which each branch of the decision tree generated by the C4.5 algorithm is validated by Bayes' theorem, and then those branches that do not meet the conditions...
Time series shapelets are small, local time series subsequences which are in some sense maximally representative of a class. E. Keogh uses the distance to the shapelet to classify objects. Even though shapelet classification can be interpretable and more accurate than many state-of-the-art classifiers, shapelets have one main limitation: the shapelet classification training process is offline,...
In order to compare the classification accuracies and performance differences between traditional and probability-based decision tree classifiers, and to understand the algorithms that aim to improve the construction efficiency of probability-based decision trees, mentioned in "Decision Trees for Uncertain Data", this paper tested several algorithms, named AVG, UDT, UDT-BP, UDT-LP,...
Whether a node split is good or bad depends on the impurity measure. We propose a new decision tree feature selection strategy based on maximum similarity, called fsms. First, the dataset is split into subsets according to each attribute value and the sum of the average similarities of the subsets is calculated; then the attribute with the maximum similarity is selected as the best splitting attribute...
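The splitting strategy outlined in the abstract above could be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract is truncated and does not define its similarity measure, so the pairwise label-match similarity used here, along with the names `average_similarity` and `best_split_attribute` and the toy data, are hypothetical and not taken from the paper.

```python
from collections import defaultdict
from itertools import combinations


def average_similarity(labels):
    """Mean pairwise similarity within one subset.

    ASSUMPTION: two records count as similar (1.0) when their class labels
    match and dissimilar (0.0) otherwise. The paper's actual measure is not
    given in the abstract.
    """
    if len(labels) < 2:
        return 1.0  # a single record is treated as fully homogeneous
    pairs = list(combinations(labels, 2))
    return sum(1.0 for a, b in pairs if a == b) / len(pairs)


def best_split_attribute(rows, attributes, target):
    """Return the attribute with the largest summed average similarity,
    together with the score of every candidate attribute."""
    scores = {}
    for attr in attributes:
        # Partition the class labels by the value each row has for `attr`.
        subsets = defaultdict(list)
        for row in rows:
            subsets[row[attr]].append(row[target])
        # Sum the average similarity of each subset.
        scores[attr] = sum(average_similarity(s) for s in subsets.values())
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical toy data: "outlook" separates the classes, "windy" does not.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "yes"},
]
best, scores = best_split_attribute(rows, ["outlook", "windy"], "play")
print(best, scores)
```

In this sketch, summing per-subset averages tends to favor attributes with many distinct values, since every singleton subset scores 1.0. Whether and how the paper corrects for this is not visible in the truncated abstract.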
The paper aims to develop predictive models for dengue outbreak detection using multiple rule-based classifiers. The rule-based classifiers used are the Decision Tree, Rough Set Classifier, Naive Bayes, and Associative Classifier. Dengue fever (DF) and dengue hemorrhagic fever (DHF) have continuously been a public health issue in Malaysia and a growing pandemic, as reported by World...
Within the complex and competitive semiconductor manufacturing industry, lot cycle time (CT) remains one of the key performance indicators. Its reduction is of strategic importance as it contributes to cost decreasing, time-to-market shortening, faster fault detection, achieving throughput targets, and improving production-resource scheduling. To reduce CT, we suggest and investigate a data-driven...
Credit scoring has been regarded as a critical topic, and its related departments make efforts to collect huge amounts of data to avoid wrong decisions. An effective classification model will objectively help managers instead of relying on intuitive experience. This study proposes five approaches combined with the back-propagation neural network (BPN) classifier for feature selection that retains sufficient...
The rise of computer violence, such as Distributed Denial of Service (DDoS), web vandalism, and cyber bullying, is becoming a more serious issue when such acts are politically motivated and intentionally conducted to generate fear in society. These kinds of activity are categorized as cyber terrorism. As the number of such cases increases, the availability of information regarding these actions is required...
The data generated within the construction industry has become increasingly overwhelming. Data mining technology presents an opportunity to increase significantly the rate at which the volumes of data generated through the maintenance process can be turned into useful information. This can be done using classification algorithms to discover patterns and correlations within a large volume of data....
This paper compares the performance of several classifiers provided in WEKA, such as Bayes, decision tree and classification rules, in classifying students' learning styles. Students' preferences and behavior while using an e-learning system have been observed and analyzed, and twenty attributes have been selected to map into the Felder-Silverman learning style model. There are four learning dimensions in Felder...
RBF networks are good at prediction tasks in data mining, and the k-means clustering algorithm is one of the most widely used clustering algorithms for the basis functions of RBF networks. The k-means clustering algorithm needs the number of clusters for initialization, and the accuracy of RBF networks changes depending on the number of clusters. But we cannot resort to increasing the number of clusters in the RBF...
Individual credit risk evaluation is an important and challenging data mining problem in the financial analysis domain. This paper compares the effectiveness of four data mining algorithms - logistic regression (LR), decision tree (C4.5), support vector machine (SVM) and neural networks (NN) - by applying them to two credit data sets. Experimental results show that the LR and SVM algorithms produced the best...