Many different techniques have been employed to analyze spam emails. The paper explores two main semantic methods: Bayesian algorithms and Support Vector Machines (SVM). It also introduces more recent spam filters. They all utilize semantic analysis information to determine whether a message is spam.
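As a rough illustration of the Bayesian approach mentioned in this abstract, the following is a minimal sketch of a multinomial Naive Bayes spam classifier with Laplace smoothing. It is not the paper's method: the function names, the whitespace tokenizer, and the toy training data are all assumptions made for the example.

```python
import math
from collections import Counter

def train_naive_bayes(messages):
    """Count words per class. `messages` is a list of (text, label) pairs,
    where label is "spam" or "ham" (hypothetical label names)."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    doc_counts = Counter()
    for text, label in messages:
        doc_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, doc_counts, vocab

def classify(text, model):
    """Return the label with the highest log-posterior score."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in ("spam", "ham"):
        # Log prior: fraction of training messages with this label.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # ignore words never seen in training
            # Laplace (add-one) smoothed log likelihood of the word.
            count = word_counts[label][word]
            score += math.log((count + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

# Toy data, invented for illustration only.
TOY_DATA = [
    ("win cash prize now", "spam"),
    ("cheap pills win big", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday at noon", "ham"),
]
model = train_naive_bayes(TOY_DATA)
```

With this toy model, `classify("win a cash prize", model)` scores the message under both classes and returns the more probable label. A real filter would need a far larger corpus and a better tokenizer.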
The growing number of email users has led to a dramatic increase in spam email. Fortunately, several approaches can automatically detect and remove most of these messages; the best-known ones are based on Bayesian decision theory and Support Vector Machines. However, there are several forms of Naive Bayes filters, something the anti-spam literature does not always acknowledge...
In this paper we compare four machine learning techniques for blog comment spam filtering: Naïve Bayes, k-nearest neighbor, neural networks and support vector machines. For this comparative study we used a blog comment corpus that has been affected by spam, which serves as our case study. We classify the comments of this blog comments corpus, which...
Phishing continues to be one of the most severe attacks, causing both financial institutions and customers huge monetary losses. Nowadays mobile devices are widely used to access the Internet and therefore to access financial and confidential data. However, unlike PCs and wired devices, such devices lack basic defensive applications to protect against various types of attacks. In consequence, phishing...
Spam sender detection based on email subject data is a complex large-scale text mining task. The dataset consists of email subject lines and the corresponding IP address of the email sender. A fast and accurate classifier is desirable in such an application. In this research, a highly scalable SVM modeling method, named Granular SVM with Random granulation (GSVM-RAND), is designed. GSVM-RAND applies...
In this paper, we report our work on spam filtering with three novel Bayesian classification methods: aggregating one-dependence estimators (AODE), hidden Naive Bayes (HNB), and locally weighted learning with Naive Bayes (LWNB). Four traditional classifiers, namely Naive Bayes, k-nearest neighbor (kNN), support vector machine (SVM), and C4.5, are also evaluated for comparison. Four feature selection methods:...
With the rapid development of the Internet, the growing volume of spam mail has become harmful to users. Content-based spam filtering technologies have so far been the mainstream anti-spam methods. Support vector machine (SVM), Bayes, windows and KNN are among the best of these technologies, and each has its own advantages and disadvantages. The common shortcoming of content-based methods...
In this research, we propose a two-stage method for spam classification that combines a naive Bayesian classifier (NBC) and a support vector machine (SVM). The NBC applies Bayesian theory for classification and combines the conditional probability with the feature count as input data for the SVM, which uses a Gaussian radial basis function kernel for further classification. The classification features...
This paper presents an anti-spam filtering approach based on support vector machine (SVM). First, we adopt a tri-gram language model to perform word segmentation on Chinese email. To overcome the sparse data problem, the absolute discount smoothing algorithm is applied. Second, the different factoid words are identified by an automaton, so as to acquire the approximate syntactic...
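The absolute discount smoothing mentioned in this abstract can be sketched as follows. This is a simplified illustration, not the paper's implementation: it uses a bigram model rather than the tri-gram model the abstract names, interpolates with a plain unigram distribution, and assumes a discount of 0.75. The function and variable names are hypothetical.

```python
from collections import Counter, defaultdict

def build_bigram_model(tokens, discount=0.75):
    """Return (prob, vocab), where prob(w, h) estimates P(w | h) using
    interpolated absolute discounting over bigram counts."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    followers = defaultdict(set)   # distinct words seen after each history
    history_counts = Counter()     # total bigram count for each history
    for (h, w), c in bigrams.items():
        followers[h].add(w)
        history_counts[h] += c
    total = len(tokens)

    def prob(w, h):
        p_uni = unigrams[w] / total
        c_h = history_counts[h]
        if c_h == 0:
            return p_uni  # unseen history: fall back to the unigram estimate
        # Subtract a fixed discount from every seen bigram count...
        discounted = max(bigrams[(h, w)] - discount, 0.0) / c_h
        # ...and give the freed probability mass to the lower-order model.
        backoff_mass = discount * len(followers[h]) / c_h
        return discounted + backoff_mass * p_uni

    return prob, set(unigrams)

# Tiny invented token sequence, for illustration only.
prob, vocab = build_bigram_model("a b a c a b".split())
```

In this toy sequence the bigram `("a", "a")` never occurs, yet `prob("a", "a")` is nonzero because the discounted mass is redistributed through the unigram distribution, which is how smoothing addresses the sparse data problem.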
Email communication has become widespread, but the exponential increase in spam (unsolicited email) and the growing volume of email can make using email for communication tedious and time consuming. This paper reviews recent approaches to filter out spam email, to classify email into a hierarchy of folders, and to automatically determine the tasks required in response to an email message.