A new method for feature selection based on improved maximal relevance and minimal redundancy (mRMR) is proposed in this paper. To describe the influence of added features on the correlation between the candidate feature subset and the decision, the standard mRMR is improved by introducing the calculation of the parameter Sig ≥ (a, B, D). The value of Sig ≥ (a, B, D) is used to determine whether a...
To address the low accuracy of traditional methods in recognizing rolling bearing faults and to reduce the time needed to build a training model, this paper puts forward a fault recognition method based on wavelet packet transformation and PCA-PSO-MSVM. First, we extract the energy values after the various fault signals have been transformed by wavelet packets, and...
Border Gateway Protocol (BGP) anomalies affect network operations and, hence, their detection is of interest to researchers and practitioners. Various machine learning techniques have been applied for detection of such anomalies. In this paper, we first employ the minimum Redundancy Maximum Relevance (mRMR) feature selection algorithms to extract the most relevant features used for classifying BGP...
The cheapest form of communication in the world today is email, and its simplicity makes it vulnerable to many threats. One of the most important threats to email is spam: unsolicited email, normally with advertising content, sent out as a mass mailing. Malicious spam is spam with malicious content in the form of harmful attachments or links to phishing websites. In the case of educational institutes,...
Peer-to-peer (P2P) traffic identification is a hot topic in P2P traffic management. P2P traffic identification based on the support vector machine (SVM) is one of the most commonly used methods. However, the performance of an SVM is mainly affected by its parameters and the features used. The traditional approach treats SVM parameter optimization and feature selection as separate problems; it is...
Wireless capsule endoscopy (WCE) plays a significant role in the non-invasive small intestine screening for obscure gastrointestinal bleeding detection. However, the task of reviewing 60,000 frames to detect the bleeding encumbers the clinician, leading to visual fatigue and false diagnosis. In this paper, we propose a color feature based bleeding detection system with feature selection using a modified...
The kidney plays an important role in the human body. It maintains homeostasis and removes harmful substances by producing and excreting urine. Renal cell carcinoma, especially clear cell renal cell carcinoma (ccRCC), is the most common type of kidney cancer and accounts for 2∼3% of human malignancies. Early diagnosis and accurate classification of ccRCC is an important factor to decrease the mortality...
Building on traditional topic-based text classification methods, machine learning was applied to the coarse-grained sentiment classification of reviews. Sentiment classification involves many problems. In this paper, the sentiment Vector Space Model (s-VSM) was used for text representation to address data sparseness. In addition, the critical issues of the sentiment classification,...
The Beck Depression Inventory (BDI), a self-report questionnaire consisting of 21 question items, has been the most extensively used instrument for depression assessment. The problem of interest here is to identify a subset of questions in the BDI that are most predictive of depression and can reveal gender differences between depression profiles. We investigate feature selection techniques to select a subset...
In this paper, we propose a hybrid classification model that combines a correlation-based filter feature selection algorithm with a support vector machine classifier. In this method, features are ordered according to their absolute correlation value with respect to the class attribute. The top K features are then selected from the ordered list of features to form a reduced dataset. The classification accuracy...
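The ranking step described in this abstract can be illustrated with a minimal sketch. It assumes Pearson correlation as the correlation measure and numeric class labels; the abstract names neither, and the function names `pearson`, `top_k_by_abs_correlation`, and `reduce_dataset` are hypothetical. It is not the authors' implementation, and the SVM stage is omitted.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant column carries no information about the class.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def top_k_by_abs_correlation(rows, labels, k):
    """Return the indices of the k features with the largest |correlation|."""
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        scores.append((abs(pearson(column, labels)), j))
    # Sort by descending score; break ties by feature index for determinism.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [j for _, j in scores[:k]]


def reduce_dataset(rows, selected):
    """Keep only the selected feature columns, in ranked order."""
    return [[row[j] for j in selected] for row in rows]


if __name__ == "__main__":
    # Toy data (assumed): feature 1 is constant, feature 2 mirrors the label.
    rows = [[1, 5, 1], [2, 5, 1], [3, 5, 0], [4, 5, 0]]
    labels = [0, 0, 1, 1]
    selected = top_k_by_abs_correlation(rows, labels, k=2)
    print(selected)
    print(reduce_dataset(rows, selected))
```

The reduced dataset would then be passed to whichever classifier is used; K is a tuning parameter that the abstract does not specify.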
In this paper, support vector machine (SVM) and mixed gravitational search algorithm (MGSA) are utilized to detect the breast cancer tumors in mammography images. Sech template matching method is used to segment images and extract the regions of interest (ROIs). Gray-level co-occurrence matrix (GLCM) is used to extract features. The mixed GSA is used for optimization of the classifier parameters and...
Traditionally in Web crawling, the required features are extracted from the whole contents of HTML pages. However, the position at which a word is located within the HTML tags indicates its importance in the web page. This research proposes two ideas concerning the feature selection stage for HTML web pages. The first idea reduces the features by simply extracting them from the important tags in an HTML...
Electronic mail is one of today's most important ways to communicate and transfer information. Because of its fast delivery and ease of access, it is used in almost every aspect of communication in work and life. However, the increase in email users has resulted in a dramatic increase in spam emails during the past few years. In this paper, we propose an email-filtering approach that is based on supervised...
The explosive growth in the number of webpages on the Web has created problems in the search process. One of these problems is that general-purpose search engines often return too many irrelevant results when users are searching for specific information on a given topic. Another problem is the massive increase in the number of pages to be indexed by Web search systems. In this research, two steps...
This paper presents a process of feature selection, and classification algorithm evaluation for a continuous sleep monitoring system, using a tri-axial accelerometer attached to the subject's chest. Two feature selection algorithms, i.e., Relief-F and support vector machine recursive feature elimination (SVM-RFE), and seven classification algorithms, i.e., Bayesian network, naive Bayesian network,...
In this study, we conducted experiments on emotion classification of Indonesian Twitter text. To conduct these experiments, we built a corpus of 7622 labeled Twitter texts taken from 69 Twitter accounts and manually labeled by 5 native speakers. We used 6 basic emotion labels (angry, disgust, fear, joy, sad, surprise) and added one label for a neutral emotion class. Here, we compared...
Student performance classification is a challenging task for teachers and stakeholders seeking better academic planning and management. Data mining can be used to extract knowledge from student data to improve the performance of the classification model. Before applying a classification model, a feature selection method is proposed in the data preprocessing stage to find the most significant and intrinsic features...
Text feature selection is the key technology in text classification and text information retrieval. The feature selection method - information gain - has extensive application in text categorization. This paper theoretically analyzed the deficiency of information gain in feature selection methods, and then introduced two improvement factors which were LDFWF (Limiting Document Frequency's Word Frequency)...
The feature selection algorithm has a great influence on the accuracy of text categorization. The traditional information gain (IG) feature selection algorithm usually selects features that rarely appear in the specified categories but frequently appear in other categories. To overcome this drawback, on the basis of in-depth analysis of the related algorithms, an improved IG feature selection method...
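For reference, the traditional information gain score that this abstract and the preceding one set out to improve can be sketched as follows. This is the textbook definition, IG(t) = H(C) - [P(t) H(C|t) + P(not t) H(C|not t)], under the assumption that each document is represented as a set of tokens. The function names and toy data are hypothetical, and the improvement factors proposed in these papers are not reproduced here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(docs, labels, term):
    """Classical IG of a term, based on its presence or absence per document.

    docs   -- list of token sets, one per document (assumed representation)
    labels -- list of class labels aligned with docs
    """
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = (len(with_term) / n) * entropy(with_term) \
        + (len(without_term) / n) * entropy(without_term)
    return entropy(labels) - conditional


if __name__ == "__main__":
    # Toy corpus (assumed): "goal" separates the classes, "match" does not.
    docs = [{"goal", "match"}, {"goal", "team"},
            {"vote", "party"}, {"vote", "match"}]
    labels = ["sport", "sport", "politics", "politics"]
    for term in ("goal", "match"):
        print(term, information_gain(docs, labels, term))
```

Because the score depends only on presence or absence, a term that is absent from one category and frequent elsewhere can still score highly, which is the behaviour the abstract above describes as a drawback.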
In the world we live in, people from different professions are at increased risk for depressive symptoms and posttraumatic stress disorder (PTSD) due to hard work or extreme environmental conditions. Accurate diagnosis and determining the causes are very important to solve these kinds of psychological problems. Machine learning (ML) techniques are gaining popularity in neuroscience due to their...