Ordinal classification is a form of multi-class classification where there is an inherent ordering between the classes, but not a meaningful numeric difference between them. Although conventional methods, designed for nominal classes or regression problems, can be used to solve the ordinal data problem, there are benefits in developing models specific to this kind of data. This paper introduces a...
Gene selection is a very important problem in the classification of serious diseases in clinical information systems. A limitation of existing gene selection methods is that they may result in gene sets with some redundancy and yield an unnecessarily large number of candidate genes for classification analysis. In the current work, a hybrid approach is presented in order to classify diseases, such as colon...
Analysis and classification of county-level basic public services play an important role in county economic growth and in the healthy development of urbanization in China. Because county-level basic public services data are large-scale and imbalanced, this paper presents a support vector machine model to classify the county level of basic public services. The method...
Classification is a widely researched area in the machine learning and fuzzy communities with several approaches proposed by both communities. Some of the most relevant rule-based approaches from the machine learning community might include decision trees and rule inducers. The fuzzy community has also proposed many rule-based approaches, such as fuzzy decision trees and genetic fuzzy systems. This...
Data mining, or knowledge discovery, is seen by modern business as an increasingly important tool for transforming data into an informational advantage. Mining is the process of finding correlations among dozens of fields in large relational databases and extracting useful information that can be used to increase revenue, cut costs, or both. Classification is a supervised machine learning procedure and an...
A census can provide the fundamental population data of the whole nation. Census data are rich in hidden information that can be used for the investigation of national conditions and national power. Data mining aims to extract implicit, previously unknown, and potentially useful knowledge from voluminous, incomplete, fuzzy, stochastic data. Using data mining on census data can make full...
In this paper, we propose a tree-structured multi-class classifier to identify annotations and overlapping text in machine-printed documents. Each node of the tree-structured classifier is a binary weak learner. Unlike a normal decision tree (DT), which only considers a subset of the training data at each node and is susceptible to over-fitting, we boost the tree using all training data at each node with...
The decision tree algorithm is a hot topic in the field of data mining and is commonly used to build classifiers and prediction models. It is widely applied in practice. This paper describes decision tree technology and its development process, focuses on typical decision tree algorithms, analyzes their advantages and disadvantages, compares several algorithms, and finally discusses the...
Classifying maintenance requests is one of the processes in a large software system that supports maintainers in doing their daily maintenance tasks more effectively. Categorizing these maintenance requests is an essential requirement in managing maintenance requests for software maintainers, and it requires great effort, as does determining the classification. Hence, this paper presents the framework from...
Classification is one of the most efficient data mining techniques in machine learning. In classification, decision trees can handle high-dimensional data. However, decision trees yield poor performance in medical health care. In this paper, we therefore investigate the use of the Receiver Operating Characteristic (ROC) curve for the evaluation of machine learning algorithms. In particular, we investigate the use...
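As an illustration of the ROC-based evaluation this abstract mentions, the following is a minimal sketch of how an ROC curve and its area (AUC) can be computed from classifier scores. The function names `roc_points` and `auc` and the sample data are hypothetical and are not taken from the paper; the sketch assumes binary labels (1 for positive, 0 for negative) with at least one example of each class.

```python
def roc_points(labels, scores):
    """Return the ROC curve as a list of (false positive rate, true positive rate).

    Assumes labels are 0/1 and both classes are present (hypothetical helper).
    """
    pos = sum(labels)
    neg = len(labels) - pos
    # Rank examples from the highest score to the lowest.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Examples with tied scores cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve, computed with the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


if __name__ == "__main__":
    labels = [1, 1, 0, 1, 0, 0]              # made-up ground truth
    scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]  # made-up classifier scores
    print(auc(roc_points(labels, scores)))
```

An AUC of 1.0 corresponds to a ranking that places every positive above every negative, while 0.5 corresponds to a ranking no better than chance.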
Credit scoring using predictive models can help in assessing creditworthiness during the credit evaluation process. The objective of credit scoring models is to assign a credit risk score to determine whether a customer is likely to default on a financial obligation. Construction of credit scoring models requires data mining techniques. Using historical data on payments, demographic characteristics...
This research paper uses association rules and classification techniques to extract undiscovered information about diabetes. The previous phase of this research included preliminary results on some undiscovered decision factors and side effects of diabetes, obtained by considering a data set of type 1 and type 2 diabetes patients. Advanced and reliable data mining techniques are used throughout this research to the...
In this paper, we propose a fuzzy clustering decision tree (FCDT) for classification problems with a large number of classes and continuous attributes. A hierarchical clustering concept is introduced to achieve a finer fuzzy partition. The proposed clustering algorithm splits the data set into leaf clusters using splitting attributes based on a separation matrix and fuzzy rules. The leaf clusters...
Classifying and promoting the level of city scientific and technological progress plays a central role in spurring city income growth and reducing poverty. Based on the availability of Chinese city data, this paper builds an evaluation index system for the level of city scientific and technological progress. Because the city scientific and technological progress data are large-scale and imbalanced, this...
Classification is one of the tasks in data mining. Nowadays, many classification techniques are used to solve classification problems, such as neural networks, genetic algorithms, Bayesian methods, and others. In this article, we present a study on how talent management can be implemented using decision tree induction techniques. By using this approach, talent performance can be predicted...
Monitoring, assessment and prediction of environmental risks that chemicals pose demand rapid and accurate diagnostic assays. One important goal of microarray experiments is to discover novel biomarkers for toxicity evaluation. A variety of toxicological effects have been associated with explosive compounds 2,4,6-trinitrotoluene (TNT) and 1,3,5-trinitro-1,3,5-triazacyclohexane (RDX). Here we developed...
The SPRINT algorithm describes a distributed way to construct a decision tree for classification in large data sets. It can be applied to in-network classification tree construction. The costly data transfer of sensor data to the sink can be avoided while execution time is still acceptable. The SPRINT algorithm and its extensions are introduced. Furthermore, different scenarios that implement classification...
Michie et al. show in [1] that decision trees perform better than twenty other classification algorithms in classifying binary data. In this paper we further investigate this hypothesis by comparing decision trees with a fuzzy set-based classifier and naive Bayes on real and artificial datasets.
High-dimensional data is a difficult case for most subspace-based classification methods because of the large number of combinations of dimensions that have discriminatory power. This is because there is an exponential number of combinations of dimensions that could decide the correct class instance, and this combination could vary with data locality and test instance. Therefore, most summarized...
Tree induction is one of the most effective and widely used models in classification. Unfortunately, decision trees such as C4.5 have been found to provide poor probability estimates. Through empirical studies, Provost and Domingos found that probability estimation trees (PETs) give fairly good probability estimates. However, unlike in normal decision trees, pruning reduces the performance...
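To make the idea of a probability estimate at a tree leaf concrete, here is a minimal sketch that compares the raw class frequency at a leaf with a Laplace-corrected estimate. The abstract does not state which estimator the paper uses; the Laplace correction is an assumption chosen here because it is a common smoothing technique for leaf estimates. The function name `leaf_probabilities` and the counts are hypothetical.

```python
def leaf_probabilities(class_counts, laplace=True):
    """Estimate class probabilities from the training counts at one leaf.

    class_counts maps each class label to the number of training examples
    of that class that reached the leaf (hypothetical helper, not from the paper).
    """
    total = sum(class_counts.values())
    num_classes = len(class_counts)
    if laplace:
        # Laplace correction (an assumption of this sketch): add one
        # pseudo-count per class so small leaves avoid estimates of 0 or 1.
        return {c: (n + 1) / (total + num_classes)
                for c, n in class_counts.items()}
    # Raw relative frequency, which is extreme for small, pure leaves.
    return {c: n / total for c, n in class_counts.items()}


if __name__ == "__main__":
    counts = {"pos": 3, "neg": 0}  # made-up leaf with three positive examples
    print(leaf_probabilities(counts, laplace=False))
    print(leaf_probabilities(counts, laplace=True))
```

With only three examples at the leaf, the raw frequency reports certainty for the positive class, while the smoothed estimate stays away from the extremes.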