Every day, huge amounts of information are transferred from one network to another, and this information may be exposed to attacks. Information and information systems should be protected from unauthorized users. Providing and maintaining the confidentiality and integrity of information is a very tedious job, so intrusion detection plays a very important role. Although various methods are used to protect...
Real time data analysis in data streams is a highly challenging area in big data. The surge in big data techniques has recently attracted considerable interest to the detection of significant changes or anomalies in data streams. There is a variety of literature across a number of fields relevant to anomaly detection. The growing number of techniques, from seemingly disconnected areas, prevents a...
Nowadays, botnets are one of the biggest challenges in cyber security. Various detection mechanisms have been proposed. In particular, research communities use machine learning algorithms as the major tool to detect botnets because of their advantages. The popular model is the combination of unsupervised learning to categorize network traffic into groups with similar features, and apply classification...
Network intrusion detection systems need to detect abnormal behaviour in network data as soon as possible and with as little user intervention as possible. In this paper, we describe a semi-supervised network anomaly detection system. Our system uses online clustering to summarize the available network data. Clusters are represented using extended cluster features that comprise not only features...
Network intrusion detection aims to uncover unauthorized access to computer networks. Anomaly intrusion detection uses unsupervised learning to detect attacks based on profiles of normal user behaviors. If the system is being used differently, it triggers an alarm. Current methods of intrusion detection are unable to produce alerts without a high number of false positives. The proposed research will...
The aim of this paper is to propose a method named CLUSS (CLUstering and SMOTE Sampling) that can improve prediction performance on the multiclass imbalanced problem with students' performance data. Firstly, the clustering approach is used to create a new subset from all majority classes. The new subsets consist of groups of majority-class instances which have different characteristics. Secondly,...
Traditional Network Intrusion Detection Systems (NIDSs) rely on either specialized signatures of previously seen attacks, or on expensive and difficult to produce labeled traffic datasets for profiling and training. Both approaches share a common downside: they require the knowledge provided by an external agent, either in terms of signatures or as normal-operation profiles. In this paper we describe...
This paper proposes a hybrid approach to breast cancer diagnosis based on decision trees and clustering. Our proposed approach not only distinguishes malignant from benign cases, but also provides a refined treatment of the latter. An experimental study on the Wisconsin Breast Cancer Database provides a thorough analysis of the induced results and shows that we can enhance the classification...
Data streams are continuous and unbounded, usually arrive at high speed, and have a data distribution that often changes with time. They raise different issues such as memory, time, and the data processing model. Data streams need special handling because of their changing nature, and a data stream may be labeled or unlabeled. Classification is supervised, so it can only handle labeled data. Thus, in this...
Violations by listed companies in disclosing accounting information seriously mislead ordinary investors and bring them huge losses. Therefore, it is particularly necessary to analyze and identify the violations of listed companies using scientific and effective methods, so that investment risks can be avoided in advance. In this paper, we first use the t-statistic to select eight useful and characteristic...
Identifying encrypted application traffic is an important issue for many network tasks including quality of service, firewall enforcement and security. This paper presents a machine learning based approach to identify high level application behavior in a given traffic trace using a holistic approach without looking into the content or without checking a static attribute. We demonstrate the effectiveness...
One of the methods most commonly used for learning and classification is the decision tree. The greatest advantage that decision trees offer is that, unlike classical trees, they provide support for handling uncertain data sets. The paper introduces a new algorithm for building fuzzy decision trees and also offers some comparative results, taking other methods into account. We will present...
Videology is an online video advertising, optimization, and yield management solutions provider whose business strategy is to purchase online video advertising inventory from content providers and deliver ads to their clients using behavioral and demographic targeting to maximize value. The Capstone team used a diverse set of analytical tools, including Time Series models, Principal Component analysis,...
This paper presents an automatic image annotation approach that integrates the random forest classifier with a particle swarm optimization algorithm for weighting class scores. The proposed hybrid approach refines the output of multi-class classification, which uses a random forest classifier to automatically label images with a number of words. Each input image is segmented using...
We present a method to detect human body parts in depth images that is based on an active learning strategy. Our aim is to build an accurate classifier using a reduced number of labeled samples in order to minimize the training computational cost as well as the image labeling cost. The active learning strategy is based on exploiting the training data distribution by sampling from a cluster-based representation...
In this paper we propose a new hyperrectangle based learning method called Large Margin Rectangle Learning (LMRL). The goal of LMRL is to combine the interpretability of decision trees and other rectangle based learning models with the accuracy gain enabled by the large margin principle known from support vector machines. LMRL consists of two basic steps: a supervised clustering step to create an...
In recent years, forests of decision trees have attracted increasing interest from the machine learning community, since they allow the decisions of a set of decision trees to be aggregated into one robust answer. However, this approach suffers from two well-known limitations: first, its performance depends on the number of trees, and thus finding the right size and how to aggregate decisions could be...
Fraud is increasing with the extensive use of the internet and the growth of online transactions. More advanced solutions are needed to protect financial service companies and credit card holders from constantly evolving online fraud attacks. The main objective of this paper is to construct an efficient fraud detection system that adapts to behavior changes by combining classification and...
Discovery of interesting rules describing the behavioural patterns of smokers' quitting intentions is an important task in determining an effective tobacco control strategy. In this paper, we investigate a compact and simplified rule discovery process for predicting smokers' quitting behaviour that can provide feedback to build a scientific, evidence-based adaptive tobacco control policy...
This paper presents a novel approach to knowledge extraction from large-scale datasets using a neural network when applied to the real-world problem of payment card fraud detection. Fraud is a serious and long term threat to a peaceful and democratic society. We present SOAR (Sparse Oracle-based Adaptive Rule) extraction, a practical approach to process large datasets and extract key generalizing...