Machine learning and data mining techniques have been widely used in recent years to improve network intrusion detection. These techniques make it possible to automate anomaly detection in network traffic. One of the major problems that researchers face is the lack of published data available for research purposes. The KDD'99 dataset was used by researchers for over a decade even though...
With the continued explosion of digitized data, data mining and data collection have become more prevalent. With this growth, we have also seen increased concern over data privacy and intellectual property. Within this environment, an important question has emerged: Can machine learning and data mining techniques be leveraged without compromising privacy? This paper revisits the concepts and techniques...
Categorical data exist in many domains, such as text data, gene sequences, or data from the Census Bureau. While such data are easy for humans to interpret, they cannot be used directly by many classification methods, such as support vector machines, which require the underlying data to be represented in a numerical format. To date, most existing learning methods convert categorical data into...
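As a rough illustration of the conversion step this abstract refers to, the following is a minimal sketch of one-hot encoding, a common way to turn categorical values into numeric vectors. It is not necessarily the approach studied in the paper (the abstract is truncated before naming one), and the function name `one_hot_encode` and the sample values are hypothetical.

```python
def one_hot_encode(values):
    """Map each categorical value to a 0/1 vector with one position per category.

    Illustrative helper only; not taken from the paper.
    """
    # Sort the distinct categories so the column order is deterministic.
    categories = sorted(set(values))
    index = {cat: i for i, cat in enumerate(categories)}

    encoded = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1  # Mark the column belonging to this value's category.
        encoded.append(row)
    return categories, encoded


# Hypothetical sample data: four observations of a "color" attribute.
categories, encoded = one_hot_encode(["red", "green", "red", "blue"])
```

Each row of `encoded` can then be passed to a numeric classifier; the column order is given by `categories`.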
Injection molded part quality can be improved by precise process adjustment, which could rely on in-situ measurements of part quality. Geometrical and appearance (visual and sensory) quality requirements are increasing. However, direct measurement is often not feasible industrially. Therefore, process control must rely on a prediction of part quality attributes. This study compares prediction performances...
This paper proposes a novel learning-based image Super-Resolution via a Randomized Multi-split Forests model (SRRMF). The proposed method uses the LR-HR training patch pairs to model the nonlinear patch manifold as pairs of linear subspaces. The key idea of this approach is to use several decision trees to randomly split the training data into different classes. A linear regression model is learnt...
Opinion mining is an interesting area of research that summarizes customer reviews of a product or service and expresses whether the opinions are positive or negative. Various methods have been proposed as classifiers for opinion mining, such as Naïve Bayes and support vector machines. These methods classify opinions without giving the reasons why the instance opinion is classified to...
Recently, traditional video surveillance systems for crowd scenes have been deployed in various application areas, such as health monitoring and security. Monitoring crowds and identifying their behaviors is one of the most interesting applications of visual surveillance, as it is very difficult for human experts to assess crowds. In this paper, we present inter-group and intra-group properties of crowd...
Tor is an anonymous communication system that can protect our privacy, but it also provides a haven for criminals to avoid network tracing. Therefore, anonymous traffic analysis and classification is an important part of maintaining network security. Existing Tor traffic classification methods require a large amount of labeled data, and the classification accuracy rate is not satisfactory for practical...
Predicting the gap between taxi demand and supply in taxi booking apps is completely new and important but challenging. However, manually mining gap rules for different conditions may become impractical because taxi data are massive and sparse. Existing works consider only demand or only supply, use only a few simple features, are verified on little data, and do not predict the gap value. Meanwhile, none...
The incidence of hypertension associated with pregnancy contributes significantly to increased maternal and fetal deaths during pregnancy and childbirth. Due to its high incidence rate and several complications, this disorder has been the subject of numerous investigations in an attempt to determine how to prevent it and improve its treatment. In this context, this paper uses a data...
We consider an attention-based model that recognizes objects via a sequence of glimpses, and analyze the variation in classification accuracy with the number of glimpses. The problem of object recognition is formulated as a partially observable Markov decision process where the environment is partially observable and glimpses are actions. We show that voting from random attentional policies provides...
Aero-engine fault diagnosis plays a crucial role in safe operation and cost-effective maintenance. Early detection and isolation of component faults before aero-engine failure are of utmost importance. This paper applies various classification methods, including Support Vector Machine (SVM), Decision Tree (DT), K-Nearest Neighbors (K-NN) and Linear Discriminant Analysis (LDA), to aero-engine...
At present, several studies describe the relevance of the human factor in air transport, with a main focus on pilots and flight safety. Such studies use monitoring of physiological functions. There are many physiological parameters and methods for assessing them; however, they are mostly based on principles originating from clinical practice. Yet, sensitivity and specificity of these...
Regression-based tasks have been the forerunner regarding the application of machine learning tools in the context of data mining. Problems related to price and stock prediction, sales estimation, and weather forecasting, to name a few, are commonly used as benchmarks for the comparison of regression techniques. Neural Networks, Decision Trees and Support Vector Machines are the most widely...
The Internet of things (IoT) has emerged in numerous domains for collecting and exchanging large datasets in order to ensure continuous monitoring and real-time decision-making. IoT incorporates sensors for carrying out raw data acquisition, while data processing and analysis tasks are addressed by high performance computational facilities, such as cloud-based infrastructures (remote processing approach)...
Transfer learning has attracted more and more attention, and many scholars have proposed useful strategies. Boosting is the main strategy for transfer learning. In boosting, resampling is preferred over reweighting, and it can be applied to any base learner. In this paper, we propose a weighted-resampling method for transfer learning, called TrResampling. Firstly, resampling is applied to the data...
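To make the resampling-versus-reweighting distinction in this abstract concrete, here is a minimal sketch of weighted resampling: instead of handing instance weights to the base learner, a new training set is drawn with replacement, with each instance picked in proportion to its weight. This is a generic illustration, not the TrResampling algorithm itself (the abstract is truncated before describing it); the function name `weighted_resample`, its parameters, and the sample data are assumptions.

```python
import random


def weighted_resample(samples, weights, size=None, seed=None):
    """Draw a bootstrap sample where each item is picked in proportion to its weight.

    Hypothetical helper for illustration; not the paper's TrResampling method.
    """
    rng = random.Random(seed)  # Local generator so results are reproducible per seed.
    k = len(samples) if size is None else size
    # random.Random.choices samples with replacement according to relative weights.
    return rng.choices(samples, weights=weights, k=k)


# Hypothetical data: instance "b" carries all the weight, so only "b" is drawn.
resampled = weighted_resample(["a", "b", "c"], weights=[0.0, 1.0, 0.0], size=5, seed=0)
```

Because the weights only influence which instances are drawn, the resampled set can be fed to any base learner, including ones that do not accept instance weights.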
In this work, a novel classifier that can make binary classifications by supervised learning is introduced. The proposed classifier generates a finite state machine which is derived from the dataset used for training. The states of this machine show the likelihood that the visiting samples belong to one of the classes concerned. The learning process is realized by recording the states...
The thyroid gland influences the metabolic processes of the human body because it produces hormones. Hyperthyroidism is caused by an increase in the production of thyroid hormones. In this paper, a methodology using an online ensemble of decision trees to detect thyroid-related diseases is proposed. The aim of this work is to improve the diagnostic accuracy of thyroid disease. Initially, feature...
The importance of learning features automatically is growing exponentially as the volume of data and the number of systems using pattern recognition techniques continue to increase. In this paper, arousal recognition from multi-channel EEG signals was conducted using human-crafted statistical features and learned features from 32 different EEG source channels. We have obtained 98.99%...
The biggest concern in networking is security. Intrusion detection aims to uncover the tricks and tools of attackers. Data mining techniques automatically learn patterns in the tuples, and intelligent decisions are made. Supervised learning methods find attacks based on previous knowledge, and unknown attacks are detected using unsupervised learning. DoS, Probe and Normal data are correctly detected by maximum Data...