The widely known classifier chains method for multi-label classification, which is based on the binary relevance (BR) method, overcomes the disadvantages of BR and achieves higher predictive performance, but still retains important advantages of BR, most importantly low time complexity. Nevertheless, despite its advantages, it is clear that a randomly arranged chain can be poorly ordered. We overcome...
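The chaining idea described above can be sketched as follows. This is a minimal illustration, not the method proposed in the paper: the 1-nearest-neighbour base learner and the names `fit_nearest_neighbor`, `fit_chain`, and `predict_chain` are assumptions made for the example. Each binary model in the chain sees the original features plus the labels that come earlier in the chain, which is why the order of the chain matters.

```python
from typing import Callable, List, Sequence

Predictor = Callable[[Sequence[float]], int]


def fit_nearest_neighbor(X: List[List[float]], y: List[int]) -> Predictor:
    """Hypothetical base learner: copy the label of the closest training row."""
    def predict(x: Sequence[float]) -> int:
        closest = min(
            range(len(X)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(X[i], x)),
        )
        return y[closest]
    return predict


def fit_chain(X: List[List[float]], Y: List[List[int]], order: List[int]) -> List[Predictor]:
    """Train one binary model per label, following the given chain order."""
    models = []
    for pos, label in enumerate(order):
        # Augment the features with the true values of all earlier labels.
        X_aug = [list(x) + [Y[i][prev] for prev in order[:pos]] for i, x in enumerate(X)]
        y = [row[label] for row in Y]
        models.append(fit_nearest_neighbor(X_aug, y))
    return models


def predict_chain(models: List[Predictor], order: List[int], x: Sequence[float]) -> List[int]:
    """Predict labels one by one, feeding earlier predictions down the chain."""
    predicted = {}
    for pos, label in enumerate(order):
        x_aug = list(x) + [predicted[prev] for prev in order[:pos]]
        predicted[label] = models[pos](x_aug)
    # Return predictions in label-index order (assumes order is a permutation).
    return [predicted[label] for label in sorted(predicted)]


# Toy data: one feature, two labels.
X = [[0.0], [1.0], [2.0], [3.0]]
Y = [[0, 0], [0, 1], [1, 1], [1, 0]]
order = [1, 0]  # label 1 is predicted first, then label 0
models = fit_chain(X, Y, order)
print(predict_chain(models, order, [2.0]))
```

Binary relevance corresponds to dropping the augmentation step, so that every model sees only the original features.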
Pattern classification or clustering plays an important role in a wide variety of applications in areas such as psychology and other social sciences, biology and medical sciences, pattern recognition, and data mining. Many algorithms for supervised or unsupervised classification have been developed so far in order to achieve high classification accuracy at lower computational cost. However,...
Stack Overflow (SO) is a question-and-answer (Q&A) web platform on software development that is gaining in popularity. With increasing popularity often comes a very unwelcome side effect: a decrease in the average quality of a post. To keep Q&A websites like SO useful, it is vital that this side effect is countered. Previous research proved to be reasonably successful in using properties...
This paper describes our efforts to apply various advanced supervised machine learning and natural language processing techniques, including Binomial Logistic Regression, Support Vector Machines, Neural Networks, Ensemble Techniques, and Latent Dirichlet Allocation (LDA), to the problem of detecting fraud in financial reporting documents available from the United States Securities and Exchange Commission...
Data inaccuracy is an important problem in wireless sensor networks, since the accuracy is affected by harsh environments and malicious nodes. The reason for this data inaccuracy is the improper identification of outliers. To detect exact outliers in the wireless sensor networks, we propose the relative correlation based clustering (RCC) technique with high data accuracy and low computational overhead...
Research on feature selection techniques for identifying informative genes from high-dimensional microarray datasets has received considerable attention. Numerous researchers have proposed optimized solutions that apply computational tools to reduce noise and redundancy in the dataset and to enhance the accuracy and generalization of the classification model. High-dimensional microarray gene...
For classification of high-dimensional data, feature selection is the most important step for obtaining an optimal result with respect to the processing power required and the time taken. Feature selection is a method by which the most relevant features are selected from a set of features containing redundant and irrelevant ones, thereby reducing the load on the classification algorithm. This paper proposes...
Gene selection is one of the important research issues in the classification of gene expression data. Current methods try to reduce the number of genes by means of statistical calculations and have used semantic similarity under the gene ontology. In this article a technique has been presented based on which in addition to considering biological relation among genes, redundant genes by means of hierarchical clustering...
Existing imputation algorithms for incomplete decision systems are mostly non-incremental and rarely consider the different contributions of different features to the decision label. Therefore, in order to make the most of the information hidden in existing data and preserve the original distribution characteristics, in this paper we propose a new incremental imputation algorithm based on attribute significance...
In this paper, we propose a multi-stage feature selection algorithm, which focuses on the reduction of redundant features and the improvement of classification performance using feature ranking (FR), correlation analysis (CA) and chaotic binary particle swarm optimization (CBPSO). In the first stage, with the purpose of selecting the most effective features for classification, FR is introduced to...
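The first two stages named above, feature ranking followed by correlation analysis, can be sketched as a simple filter. This is an illustration under stated assumptions, not the paper's algorithm: the ranking score (absolute Pearson correlation with the label), the parameters `top_k` and `max_corr`, and the function names are all hypothetical, and the CBPSO stage is not shown.

```python
from math import sqrt
from typing import Dict, List


def pearson(a: List[float], b: List[float]) -> float:
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b) if var_a and var_b else 0.0


def rank_then_prune(columns: Dict[str, List[float]], labels: List[float],
                    top_k: int, max_corr: float) -> List[str]:
    # Stage 1 (feature ranking): keep the top_k features by |corr(feature, label)|.
    ranked = sorted(
        columns,
        key=lambda name: abs(pearson(columns[name], labels)),
        reverse=True,
    )[:top_k]
    # Stage 2 (correlation analysis): greedily drop any feature that is
    # strongly correlated with a feature that has already been kept.
    kept: List[str] = []
    for name in ranked:
        if all(abs(pearson(columns[name], columns[k])) < max_corr for k in kept):
            kept.append(name)
    return kept


# Toy data: f2 is nearly a copy of f1, f3 is unrelated to the label.
columns = {
    "f1": [0.0, 1.0, 2.0, 3.0],
    "f2": [0.0, 1.0, 2.0, 4.0],
    "f3": [1.0, 0.0, 1.0, 0.0],
}
labels = [0.0, 0.0, 1.0, 1.0]
print(rank_then_prune(columns, labels, top_k=3, max_corr=0.9))
```

In this toy example `f2` is removed as redundant with `f1`; a search procedure such as CBPSO would then operate on the reduced feature set.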
In this paper, a novel naïve Bayesian classifier based on hybrid-weight feature attributes ("NBCHWFA" for short) is proposed. NBCHWFA assigns a hybrid weight to each feature attribute by merging the effectiveness of the feature attribute for classification and the dependence between the feature attribute and the class attribute. In order to demonstrate the feasibility and effectiveness of the proposed...
System security has become a significant issue in many organizations. Attacks such as DoS, U2R, R2L, and probing pose a serious threat to the proper operation of internet services as well as host systems. In recent years, intrusion detection systems have been designed to keep intruders out of host as well as network systems. Existing host-based intrusion detection systems detect the...
In this paper, we study the neural network ensemble (NNE) classifier with regularized negative correlation learning (RNCL) and its application to pattern classification. In the RNCL algorithm, the regularization parameter is used to control the trade-off between mean square error and regularization, and to improve the ensemble's generalization ability. We propose an automatic RNCL algorithm based on gradient...
The flooding-based DoS attack is one of the most dangerous attacks in computer networks. Maximizing the accuracy of flooding-based DoS attack detection is a main concern of many researchers, and many of them focus on increasing detection effectiveness by reducing the number of features. However, few research studies have concentrated on investigating the correlation between features...
Approaches to the imbalanced classification problem usually focus on rebalancing the class sizes, neglecting the effect of hidden structure within the majority class. The aim of this paper is to highlight the effect of sub-clusters within the majority class on detecting minority class instances, and to handle imbalanced classification by learning the structure in the data. We propose a decomposition based...
Activity monitoring of workers in installations such as industrial sites, underground tunnels, sewerage lines, and remote field deployments is a daunting task. Due to the lack of communication systems and scarce energy resources, these scenarios pose great challenges in developing a monitoring system for workers. The design of an activity recognition system for workers, using a single tri-axial accelerometer...
Data for today's real-time applications are stored in relational databases. In the conventional approach to data mining, several relations are often joined into a single relation using foreign key links, which is known as flattening. Flattening may cause problems such as long processing time, data redundancy, and statistical skew in the data. Hence, how to mine data directly on numerous relations becomes a critical issue...
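The redundancy and statistical skew attributed to flattening can be seen in a small sketch. The `customers` and `orders` relations and the `flatten` helper below are hypothetical examples, not data or code from the paper: joining on the foreign key repeats each customer once per order, so a statistic computed on the flat table is weighted by the number of orders.

```python
# Two hypothetical relations linked by the foreign key "customer_id".
customers = {
    1: {"age": 30},
    2: {"age": 60},
}
orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 11, "customer_id": 1},
    {"order_id": 12, "customer_id": 1},
    {"order_id": 13, "customer_id": 2},
]


def flatten(customers, orders):
    """Join orders with customers into one relation (one row per order)."""
    return [{**order, **customers[order["customer_id"]]} for order in orders]


flat = flatten(customers, orders)

# Redundancy: customer 1's age is now stored three times.
copies_of_customer_1 = sum(1 for row in flat if row["customer_id"] == 1)

# Skew: the mean age over the flat table differs from the mean over customers.
true_mean_age = sum(c["age"] for c in customers.values()) / len(customers)
flat_mean_age = sum(row["age"] for row in flat) / len(flat)

print(copies_of_customer_1, true_mean_age, flat_mean_age)
```

Mining directly on the separate relations avoids this duplication, at the cost of algorithms that must follow the foreign key links themselves.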
In this paper, an ensemble learning classification algorithm based on a novel feature selection method is proposed. The feature selection method takes full account of the discrimination and class information of each feature by calculating scores. Specifically, the scores are fused to obtain a weight for each feature. We select the significant features according to the weights. The result of feature...
Accuracy and quality are the best measures for evaluating a recommender system. This paper proposes a collaborative filtering recommendation algorithm based on computing the semantic similarity of items in order to improve the accuracy of item similarity. The experimental results show that the optimized algorithm can give better predictions by increasing accuracy and reducing the cold-start problem of...
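How an item similarity feeds into a collaborative filtering prediction can be sketched as below. This is a generic item-based weighted average, not the paper's algorithm: the excerpt does not say how semantic similarity is computed, so the `SIMILARITY` table, the item names, and the function names are placeholders invented for the example.

```python
from typing import Callable, Dict, Optional

# Hypothetical, hand-written similarity scores between pairs of items.
SIMILARITY = {
    frozenset(["camera", "lens"]): 0.75,
    frozenset(["camera", "novel"]): 0.25,
}


def table_similarity(item_a: str, item_b: str) -> float:
    """Look up a symmetric similarity score; unknown pairs score 0.0."""
    return SIMILARITY.get(frozenset([item_a, item_b]), 0.0)


def predict_rating(user_ratings: Dict[str, float], target: str,
                   similarity: Callable[[str, str], float]) -> Optional[float]:
    """Similarity-weighted average of the user's ratings for other items."""
    weighted_sum = 0.0
    weight_total = 0.0
    for item, rating in user_ratings.items():
        weight = similarity(target, item)
        if weight > 0:
            weighted_sum += weight * rating
            weight_total += weight
    # No similar rated item: no prediction can be made.
    return weighted_sum / weight_total if weight_total else None


ratings = {"lens": 4.0, "novel": 2.0}
print(predict_rating(ratings, "camera", table_similarity))
```

Because the similarity here does not depend on other users' ratings of the target item, a new item with no ratings can still receive a prediction, which is one way a content-derived similarity can ease the cold-start problem.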
This article investigates a novel machine learning approach applying consensus clustering in conjunction with classification for the data mining of very large and highly dimensional ECG data sets. To obtain robust and stable clusterings, consensus functions can be applied for clustering ensembles combining a multitude of independent initial clusterings. Direct applications of consensus functions to...
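One common consensus function is the co-association matrix, sketched below as an illustration. Whether the article uses this particular function is not stated in the excerpt; the function names and the `threshold` parameter are assumptions. The matrix records how often two points share a cluster across the initial clusterings, and points that co-occur in more than a given fraction of them are merged.

```python
from typing import List


def co_association(clusterings: List[List[int]]) -> List[List[float]]:
    """Fraction of clusterings in which points i and j share a cluster."""
    n = len(clusterings[0])
    m = len(clusterings)
    return [
        [sum(c[i] == c[j] for c in clusterings) / m for j in range(n)]
        for i in range(n)
    ]


def consensus_labels(clusterings: List[List[int]], threshold: float = 0.5) -> List[int]:
    """Group points linked by co-association above the threshold."""
    co = co_association(clusterings)
    n = len(co)
    labels = [-1] * n
    next_label = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        # Depth-first search over the thresholded co-association graph.
        labels[start] = next_label
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and co[i][j] > threshold:
                    labels[j] = next_label
                    stack.append(j)
        next_label += 1
    return labels


# Three initial clusterings of four points; cluster ids are arbitrary per run.
clusterings = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
]
print(consensus_labels(clusterings))
```

The matrix needs memory quadratic in the number of points, which is one reason applying such a function directly to very large data sets can be impractical.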