Decision Tree induction is a commonly used classification algorithm. One of the important problems is how to use records with unknown values from training as well as testing data. Many approaches have been proposed to address the impact of unknown values in training data on prediction accuracy. However, very few techniques address the problem in testing data. In our earlier work, we discussed...
The Decision Tree (DT) is one of the most popular supervised Machine Learning algorithms; it is also the easiest to understand. But finding an optimal decision tree for a given dataset is a hard task, and the use of multiple performance metrics adds complexity to the problem of selecting the most appropriate DT.
For mobile data security detection, a text classification model can be deployed in the application layer to detect malicious attacks. Since the traditional C4.5 decision tree does not consider interactions between attributes during attribute selection, an improved C4.5 decision tree model based on the AdaBoost algorithm is put forward. The problem in measuring the...
Accurate, high-speed pedestrian detection is always of great interest, especially for practical applications. Detectors usually follow the feature selection paradigm and need to first construct rich and diverse features. In particular, current state-of-the-art methods generate more feature channels by convolving the basic feature channels with filter banks, which significantly improves accuracy. In...
This paper proposes a novel learning-based image super-resolution method based on a weighted random forest model (SWRF). The proposed method uses the LR-HR training data to train a random forest model. The underlying idea of this approach is to use several decision trees to classify the training data based on a simple splitting threshold value at each class. A linear regression model is learnt to map the relationship...
This paper proposes a hybrid approach integrating Decision Trees (DT) and Artificial Neural Networks (ANN) for energy price classification in a deregulated electricity market. The proposed model does not aim to predict future values of energy prices, but to classify and explain the negative Locational Marginal Prices (LMPs) that are observed in the grid. The negative LMPs are grouped by the K-means technique...
In this work, we propose a concept for an emergency detection algorithm for a healthcare robot that adopts a discriminative restricted Boltzmann machine for anomaly detection. We adopt anomaly detection rather than simple emergency case classification because it is hard to collect real emergency data to train an effective classifier. The conventional anomaly detection method uses a decision tree to analyze...
The experimental results show that the decision tree algorithm outperforms the other classifiers. The decision tree algorithm creates a predictive model that predicts the state of the affected tissue by learning simple decision rules during training.
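As a rough illustration of what "learning simple decision rules" can mean, the sketch below fits a one-level decision tree (a decision stump) on a single numeric feature by minimizing weighted Gini impurity. It is not the method of the paper summarized above: the function names, the choice of Gini impurity, and the toy measurements and labels are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty or pure list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def majority(labels):
    """Most frequent label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


def fit_stump(values, labels):
    """Learn one rule 'value <= threshold' that minimizes weighted Gini impurity.

    Returns (threshold, label_if_below_or_equal, label_if_above).
    """
    pairs = sorted(zip(values, labels))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between two equal values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for x, y in pairs if x <= threshold]
        right = [y for x, y in pairs if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if best is None or score < best[0]:
            best = (score, threshold, majority(left), majority(right))
    if best is None:
        raise ValueError("need at least two distinct feature values")
    return best[1:]


def predict(stump, value):
    """Apply the learned rule to a single feature value."""
    threshold, left_label, right_label = stump
    return left_label if value <= threshold else right_label


# Hypothetical toy data: one measurement per sample and its tissue state.
measurements = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
states = ["healthy", "healthy", "healthy", "affected", "affected", "affected"]
stump = fit_stump(measurements, states)
```

A full decision tree applies the same split search recursively to each side of the threshold and over every feature; the stump only shows a single learned rule.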
Every day, huge amounts of information are transferred from one network to another, and this information may be exposed to attacks. Information and information systems should be protected from unauthorized users. Providing and maintaining the confidentiality and integrity of information is a very tedious job, so intrusion detection plays a very important role. Although various methods are used to protect...
Rapid urbanization has generated a large number of construction land problems in China, such as the idleness and illegal use of land. Whereas most methods have focused on the discovery of idle construction land by spatial overlay analysis, far less attention has been paid to the prediction of the idle construction land in advance. In this paper, a new method based on the Gradient Boosting Machine...
The recent computing trend is producing tons of data every minute, and the share of imbalanced data is quite high in real-life data sets. In practical data mining, an imbalanced data set is prone to mislead a data mining model, so the data set needs pre-processing before mining. This work focuses on some practical data mining techniques and produces a valid evaluation...
The major disadvantage of the Support Vector Machine (SVM) lies in its training phase, which requires solving a quadratic programming problem and makes computation very costly. With the integration of LiDAR data and high spatial resolution orthophotos, more input data layers are available for object-based Support Vector Machine classification. Initially, confusion among classes arises because of the presence...
A random forest (RF) is a kind of ensemble machine learning algorithm used for classification and regression. It consists of multiple decision trees that are built from randomly sampled data. The RF offers simple, fast learning and identification compared with other machine learning algorithms. It is widely applied in various recognition systems. Since it is necessary to...
The ability of an intrusion detection system (IDS) to accurately detect potential attacks is crucial in protecting network resources and data from the attack's destructive effects. Among many techniques available for incorporation into IDS to improve its accuracy, classification algorithms have been demonstrated to produce impressive and efficient results in detecting IPv4-based attacks but have not...
We aim to study the modeling limitations of the commonly employed boosted decision trees classifier. Inspired by the success of large, data-hungry visual recognition models (e.g. deep convolutional neural networks), this paper focuses on the relationship between modeling capacity of the weak learners, dataset size, and dataset properties. A set of novel experiments on the Caltech Pedestrian Detection...
The Islamic State of Iraq and Syria (ISIS) is an extremist militant group in the Middle East known to employ social media for propaganda and recruiting purposes. In particular, the social media website Twitter is well known to be exploited by ISIS supporters. To this end, we devise an effective and scalable classification scheme to filter out ISIS propaganda accounts from the rest of the Twitter accounts...
kNN (k nearest neighbors) is widely adopted because of its simplicity. However, its shortcomings cannot be neglected, especially its time complexity. Consequently, a great number of approaches have emerged over the decades to cope with this issue, with a tradeoff in classification performance. In this paper, a novel improved kNN algorithm is proposed with a better performance than traditional...
HEVC (H.265) has brought significant improvements in coding efficiency. However, the reduction in bitrate comes with an increase in computational complexity. This paper presents a data mining approach to reduce the complexity of inter partition modes in HEVC. Determining the CU-splitting in inter partition modes requires substantial resources, so the goal of the work is to terminate...
Because video streaming is the current "killer" application, and in order to stay competitive, telecommunication service providers need to be able to answer a fundamental question: to what extent is the available network infrastructure able to successfully provide users with a satisfactory experience when running video streaming applications? Answering this question is far from trivial...
In analyzing streaming data in which the underlying data distribution may change or the concept of interest may drift over time, the ability of a classifier to adapt to drifted concepts is very important to maintaining the prediction performance. However, the true class labels of data samples are often available only after some period of time or they are obtained by experts' efforts. In this paper,...