We present an optimization technique for general object detection and an algorithm for training decision trees. By delaying the calculation of the features as late as possible we drastically reduce the execution time. At detection we alternate between evaluating the necessary features and eliminating candidates. This enables us to have both a rich pool of features and a powerful classifier while keeping...
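The idea of alternating between feature evaluation and candidate elimination can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the staged structure, the function name `cascade_detect`, and the `(feature_fn, threshold)` stage format are assumptions made for illustration. It only shows that a feature is computed solely for candidates that survived every earlier stage.

```python
def cascade_detect(candidates, stages):
    """Filter candidates stage by stage, computing each feature lazily.

    `stages` is a list of (feature_fn, threshold) pairs (a hypothetical
    format). A candidate survives a stage when feature_fn(candidate) is
    at least the threshold. Returns the survivors and the number of
    feature evaluations that were actually performed.
    """
    survivors = list(candidates)
    evaluations = 0
    for feature_fn, threshold in stages:
        kept = []
        for candidate in survivors:
            evaluations += 1  # the feature is computed only for live candidates
            if feature_fn(candidate) >= threshold:
                kept.append(candidate)
        survivors = kept
    return survivors, evaluations


# Toy usage: integers stand in for image windows.
toy_stages = [(lambda x: x, 5), (lambda x: x % 2, 1)]
toy_survivors, toy_evaluations = cascade_detect(range(10), toy_stages)
```

In this toy run the second feature is evaluated for only five of the ten candidates, whereas computing every feature up front would cost twenty evaluations.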
Accurate identification of helicopter flight actions is the basis for guiding pilot training. Targeting the accuracy of flight action recognition, the paper proposes a new decision-tree-based support vector machine method for identifying multiple helicopter flight actions. The tree structure of the decision tree is used to solve the multi-class problem of...
The decision tree (DT) is one of the most popular supervised machine learning algorithms; it is also the easiest to understand. Finding an optimal decision tree for a given data set is a harder task, however, and the use of multiple performance metrics adds complexity to the problem of selecting the most appropriate DT.
Spam emails are a major threat that negatively impacts email users. Spam wastes time and business financial resources, consumes network bandwidth, and slows down email servers. In addition, it provides a medium for distributing malicious code, and there is currently no single solution to this problem. The Bag of Words (BoW) word content feature extraction method is well established for classifying spam...
An optimized classification algorithm for MRI images of premature brain injury is introduced. To address the shortcomings of the classical ID3 algorithm in dealing with the continuous attributes of medical images, the new algorithm selects the testing feature by comparing the information gain ratio and adds methods for filling null values. It then discretizes the continuous attributes by...
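As a rough illustration of selecting a test feature by information gain ratio, the sketch below computes the standard gain ratio (information gain divided by split information) for one discrete attribute. It is a minimal sketch, not the paper's implementation; the function names `entropy` and `gain_ratio` are assumptions, and continuous attributes are assumed to have been discretized already.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(items)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(items).values())


def gain_ratio(values, labels):
    """Information gain of splitting `labels` by `values`, divided by
    the split information (the entropy of the attribute itself)."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted entropy of the labels after the split.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(values)
    # An attribute with a single value carries no split information.
    return info_gain / split_info if split_info > 0 else 0.0


# Toy usage: the first attribute separates the classes, the second does not.
toy_labels = [0, 0, 1, 1]
ratio_informative = gain_ratio(["a", "a", "b", "b"], toy_labels)
ratio_useless = gain_ratio(["a", "b", "a", "b"], toy_labels)
```

A tree builder would evaluate `gain_ratio` for every candidate attribute at a node and split on the one with the highest value.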
Multi-scale classification is an important tool for urban image classification, because the objects in an urban scene may have very different spatial scales. In the technical literature, this idea is usually implemented either by using the same classifier (or ensemble of classifiers) at multiple resolutions, or by working on sets of features at multiple scales. Although this may be a very efficient approach for peculiar...
Education is one of the primary requirements for leading a good life. In India, a large section of the population is still not educated, which leaves them lagging behind. For the overall development of our country, citizens have to be educated and consequently employed. This paper analyses the most important factor that will result in an improved education level in our country, using data mining...
SMOTE (Synthetic Minority Over-sampling Technique) is a commonly used over-sampling technique for mitigating the imbalanced dataset problem. Traditionally, SMOTE has two key parameters: one controls the amount of over-sampling, and the other specifies the size of the nearest-neighbor area. These two parameters are chosen arbitrarily by the user, so there are no universally best default values. In...
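The two parameters mentioned above can be seen in a minimal pure-Python sketch of SMOTE-style interpolation. This is a simplified illustration, not a reference implementation: the function name `smote` is an assumption, the amount of over-sampling is given as a plain count `n_synthetic` rather than the per-sample percentage used in the original formulation, and `k` is the number of nearest minority neighbors considered.

```python
import random


def smote(minority, n_synthetic, k, seed=0):
    """Create `n_synthetic` points by interpolating between a random
    minority sample and one of its `k` nearest minority neighbors.

    `minority` is a list of equal-length numeric tuples with at least
    two entries.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        others = [p for j, p in enumerate(minority) if j != i]
        # Sort the remaining samples by squared Euclidean distance.
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbor = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbor
        synthetic.append(tuple(a + gap * (b - a)
                               for a, b in zip(base, neighbor)))
    return synthetic


# Toy usage: four minority samples on the corners of the unit square.
toy_minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
toy_synthetic = smote(toy_minority, n_synthetic=5, k=2)
```

Because each synthetic point lies on a segment between two existing minority samples, a larger `k` spreads new points over a wider neighborhood, while `n_synthetic` determines how strongly the class balance shifts.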
In multiobjective evolutionary algorithms, most selection operators are based on the objective values or the approximated objective values. It is arguable that selection in evolutionary algorithms is a classification problem in nature, i.e., selection amounts to classifying the selected solutions into one class and the unselected ones into another class. Following this idea, we propose a classification...
We introduce a multiple instance learning algorithm based on randomized decision trees. Our model extends an existing algorithm by Blockeel et al. [2] in several ways: 1) We learn a random forest instead of a single tree. 2) We construct the trees by splits based on non-linear boundaries on multiple features at a time. 3) We learn an optimal way of combining the decisions of multiple trees under...
Malicious users of the internet can launch Denial of Service (DoS) and Distributed Denial of Service (DDoS) attacks with the intent of reducing the throughput of a network to almost nothing. As the types and number of internet users increase, the need for an effective Intrusion Detection System (IDS) to detect these attacks also increases. Different techniques such as data mining and pattern...
Natural language dialogue is an important component of interaction between ordinary users and complex computer applications. Short Text Semantic Similarity algorithms have been developed to improve the efficiency of producing sophisticated dialogue systems. Such algorithms are currently unable to discriminate between different dialogue acts (assertions, questions, instructions etc.), requiring the...
In order for Qualitative Spatial Reasoning applications to be both useful and usable, the information feedback loop between the computational engine and the user must be as seamless as possible. Inherently, computational geometry can be quite expensive, and every effort must be made to avoid inefficient or unnecessary calculations. Within the field of Region Connection Calculi, the 9-Intersection...
This work proposes a wafer probe parametric test set optimization method for predicting dies which are likely to fail in the field based on known in-field or final test fails. Large volumes of wafer probe data across 5 lots and hundreds of parametric measurements are optimized to find test sets that help predict actually observed test escapes and final test failures. Simple rules are generated to...
Ordinal classification is a form of multi-class classification where there is an inherent ordering between the classes, but not a meaningful numeric difference between them. Although conventional methods, designed for nominal classes or regression problems, can be used to solve the ordinal data problem, there are benefits in developing models specific to this kind of data. This paper introduces a...
Rule-based classifiers have been successfully applied in data mining applications. In this paper, we propose a novel rule generator classifier called CORER (Colonial competitive Rule-based classifier) to improve the accuracy of data classification. The proposed classifier is based on CCA (Colonial Competitive Algorithm), a recently developed evolutionary optimization algorithm. In order to...
Inspired originally by the Learnable Evolution Model (LEM), we investigate LEM(ID3), a hybrid of evolutionary search with ID3 decision tree learning. LEM(ID3) involves interleaved periods of learning and evolution, adopting the decision tree construction algorithm ID3 as the learning method, and a steady-state EA as the evolution component. In the learning periods, ID3 is used to infer rules that attempt...
Data mining is an important process, with applications found in many business, science and industrial problems. While a wide variety of algorithms have already been proposed in the literature for classification tasks in large data sets, and the majority of them have been proven to be very effective, not all of them are flexible and easily extensible. In this paper, we introduce two new approaches...
Finding the optimal parameters for anionic co-doped titanium dioxide (TiO2) is an important task when preparing the compound for either photocatalytic or mechanical properties. This work proposes a neural network-based system to optimize the process parameters of the deposition of TiCxOyNz films. The proposed system comprises three stages, which are data processing, parameter training...
Ensembles have proven to be a successful approach for enhancing the performance of single classifiers. Two key factors directly influence the performance of an ensemble: the accuracy of each member and the diversity among the members. Many approaches have been used in the literature to create this diversity. In this paper we add a novel approach, in which classifier type...