A primary task of privacy-preserving data mining in a distributed environment is to protect privacy while still acquiring accurate data relations. This paper shows how two parties can build a decision tree collaboratively without revealing private data when the datasets are vertically distributed, including a PPC4.5 algorithm for privacy preservation via C4.5 over vertically distributed datasets...
Privacy-preserving classification has been one of the hottest topics in data mining in recent years. This paper studies how to identify suspicious financial transactions using a privacy-preserving classification algorithm. The study is carried out on data held by several parties, using the Scalar Product Protocol to preserve privacy. The experimental results show that...
This research proposes the design of a fault pattern analysis algorithm based on the C4.5 decision tree technique. We study the actual data collected from a disk drive manufacturing company. Our work emphasizes the HGA manufacturing data. However, the data from the Wafer and the Slider processes are also explored as they may affect the yield of the HGA production. In our algorithm, the data is first...
This paper focuses on implementing the ID3 algorithm to build decision trees of forestry resource classification rules. The ID3 algorithm is used to analyze the correlations among forestry species, altitude, origin, forest group and rows by establishing a decision tree model, providing a reference for related decision support. It is shown that this method has good application prospects in forest...
Wireless distributed sensor systems will enable the reliable monitoring of a variety of environments for both civil and military applications. The data model generated by a sensor network is the data stream. Because of the rapid arrival rate and huge size of the data set in the stream model, novel one-pass algorithms are devised to support data aggregation on demand. In this paper, we focus on data aggregation,...
With the rapid development of the economy, population growth and lagging water conservancy, a series of civic ecological environment problems have arisen, such as water shortage, water pollution and ecological deterioration, which affect the sustainable development of the economy. This paper studies the evaluation of water security. An improved classification algorithm based on attribute importance is provided...
Syndrome differentiation is an important topic in traditional Chinese medicine (TCM). The decision tree, one of the established data mining algorithms, is a method for inducing rules from data. In this paper, a decision tree is applied to extract syndrome differentiation rules from 293 cases related to liver and kidney yin deficiency, damp-heat smoldering, and stasis and heat smoldering syndrome. Thus the decision...
Recognizing and analyzing change is an important human virtue because it enables us to anticipate future scenarios and thus allows us to act pro-actively. One approach to understanding change within a domain is to analyze how models and patterns evolve. Knowing how a model changes over time suggests the question: Can we use this knowledge to learn a model in anticipation, such that it better reflects...
As an important step of knowledge discovery in databases, data mining is the process of distilling critical, non-obvious, potentially useful information or knowledge from large amounts of data, and it has been applied in many fields. In this paper, we introduce a new data mining method that can be applied to performance evaluation in human resource management.
This paper presents a methodology for analog designers to maintain their insights into the relationship among performance specifications, topology choice, and sizing variables, despite those insights being constantly challenged by changing process nodes and new specs. The methodology is to take a data-mining perspective on a Pareto Optimal Set of sized analog circuit topologies, and then to perform: extraction...
This work focuses on applying data mining techniques to a practical cross-selling problem posed by the PAKDD'07 Data Mining Competition. First, we build a decision tree from high-dimensional data to identify the most important features according to information gain. Some comprehensible business insights are also gained from it. Second, a novel re-sampling technique is proposed to resolve...
With the continuous rise of real-estate prices and surging demand from residents, loan default risk has gradually increased as individual housing loans have grown over the years. Efficient measurement and management systems for credit risk in individual loans should be established urgently. Such systems need a knowledge-based decision methodology to be implemented. The decision...
First, we show that the C4.5 algorithm inherits all the advantages of the ID3 algorithm while overcoming defects of ID3 such as being unable to deal directly with continuous attributes and tending to favor attributes with many values when information gain is used to select the test attribute; we introduce the basic idea, problem-solving process, theoretical basis and classification...
This paper studies in detail data mining algorithms based on the classification model, especially the extraction of classification rules based on rough sets and on decision tree construction. An improved decision tree algorithm based on rough sets is given. The rough-set-based decision tree technique is used in the field of customer value management, for measuring customer value and segmenting customers...
This paper focuses on handling continuous attributes when mining data streams with concept drift. The data stream is an incremental, online and real-time model. Domingos and Hulten have presented a one-pass algorithm. Their system VFDT uses the Hoeffding inequality to achieve a probabilistic bound on the accuracy of the tree constructed. VFDT's extended version CVFDT handles concept drift efficiently. In...
This paper uses decision tree induction to delete irrelevant attributes (or dimensions) and reduce the data volume. The decision tree algorithm, originally used for classification, is applied to dimensionality reduction and may achieve higher accuracy, because it optimizes the computation of the algorithm's measure when deleting attributes. Finally, an insurance example is used to...
Signature-based anti-viruses are very accurate, but are limited in detecting new malicious code. Dozens of new malicious codes are created every day, and the rate is expected to increase in coming years. To extend the generalization to detect unknown malicious code, heuristic methods are used; however, these are not successful enough. Recently, classification algorithms were used successfully for...
To improve the quality of engineering project grading, the basic processes of data mining are introduced. Taking engineering project grading as the background, the implementation phases, such as business understanding, data understanding, data preparation, modeling, evaluation and deployment, are studied in detail. During modeling, the decision tree is adopted as analyzing...
To realize appearance quality inspection and control of electronic parts, a processor based on intelligent automatic knowledge extraction and intelligent system modeling is presented. In the processor, the wavelet-fuzzy technique and the neural network technique are combined. The fuzzy wavelet is used to extract image features, and the wavelet function is used as the fuzzy membership function...
Power quality disturbance identification is an important procedure for improving power quality, and its online application has practical value. An efficient method for power quality disturbance identification is presented in this paper. Wavelet decomposition is used to extract the features of various disturbances, and a decision tree from data mining is used to identify the disturbances. For online...
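Several of the abstracts listed above build on ID3 and C4.5, which select split attributes by information gain and by gain ratio respectively. As a rough illustration only (it is not taken from any of the listed papers, and the function names and toy data are hypothetical), the following Python sketch computes both measures for categorical attributes and shows why gain ratio penalizes an attribute with many distinct values:

```python
import math
from collections import Counter, defaultdict


def entropy(items):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(items)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(items).values())


def information_gain(values, labels):
    """ID3-style criterion: entropy reduction from splitting on an attribute."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[value].append(label)
    total = len(labels)
    # Weighted entropy of the class labels within each attribute value.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def gain_ratio(values, labels):
    """C4.5-style criterion: information gain divided by split information."""
    split_info = entropy(values)  # entropy of the attribute's own values
    if split_info == 0:
        return 0.0  # attribute has a single value, so it cannot split
    return information_gain(values, labels) / split_info


# Toy data (hypothetical): a useful binary attribute vs. a unique-ID attribute.
labels = ["yes", "yes", "no", "no"]
binary_attr = ["x", "x", "y", "y"]
id_attr = ["1", "2", "3", "4"]

print(information_gain(binary_attr, labels), gain_ratio(binary_attr, labels))
print(information_gain(id_attr, labels), gain_ratio(id_attr, labels))
```

On this toy data both attributes separate the classes perfectly, so information gain cannot tell them apart, whereas gain ratio ranks the binary attribute above the unique-ID attribute.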