At present, shallow characteristics are usually used to represent the distributed features of text in Chinese spam classification, which leads to inexact text vector representations and low classification performance. A novel Chinese spam classification method based on weighted distributed features is proposed by combining the features of the TF-IDF weighting algorithm with the distributed text-based...
Insider threat is a significant security risk for information systems, and detecting it is a major concern for information system organizers. Recent work has mainly focused on single-pattern analysis of user behavior in a single domain, which is not suitable for analyzing user behavior patterns in multi-domain scenarios. However, the fusion of multi-domain irrelevant features may...
Phishing is a criminal scheme to steal a user's personal data and other credentials. It is a fraud that acquires a victim's confidential information, such as passwords, bank account details, credit card numbers, and financial usernames and passwords, which the attacker can later misuse. We aim to use fundamental visual features of a web page's appearance as the basis of detecting page similarities...
The advent of social media, email services and other internet facilities has proved helpful for a wide range of users. But some users are interested in finding loopholes in such web-based services to hinder the normal activities of common users. Spam emails are one of the most disturbing activities in social networks. In this context there is a need for efficient spam filters and most of...
Information overload caused by e-mail is a known issue in practice and academia. There is a lot of research on information overload and on e-mail use and misuse. However, studies on the phenomenon of e-mail overload, describing how e-mail as a technology contributes to information overload, are scarce or fragmented. We therefore investigate how e-mails lead to causes of information overload and are creating...
Spam is the most dangerous threat to email systems today. Spam is any unwanted and harmful mail, and separating it from normal mail is essential. This paper surveys different spam filtering techniques, Support Vector Machine (SVM) training problems, and the need to introduce Hadoop MapReduce to train SVMs. Techniques to separate spam mails are word based, content based, machine learning based and hybrid...
In this paper, we consider the task of automatic handwritten mail classification and we investigate the relation between the transcription rate and the classification rate. Several configurations of a multi-word handwriting recognizer using different language models are tested and their word recognition rates on the documents to be classified are reported. For the document classification task, we...
In order to filter spam e-mails effectively, this paper introduces a neural network-based spam filtering method and describes text feature extraction, text vector generation, and the building and testing of the neural network classifier model. Experiments verify that the network's classification results differ considerably depending on the learning and training functions chosen.
Previous models and methods mostly adopted static measures, so their filters need to be updated and maintained frequently; they cannot adapt to dynamic spam and lack self-adaptation. In this paper, an immunological approach to filtering junk e-mail is built. The experiment shows that this model can efficiently raise both the recall ratio and precision ratio, and enhance the ability of self-adaptation...
To meet the need for e-mail filtering that improves efficiency and accuracy in e-mail mining, topic detection, and many other specific applications, an approach based on feature analysis and text classification is proposed, drawing on traditional spam filtering methods. Utilizing some structural features which are very likely to identify an irrelevant e-mail, such as group sending, embedded...
Acquiring knowledge by learning automatically from data, through a process of inference, model fitting, or learning from examples, is rarely applied in the field of email management. When an artificial system can perform "intelligent" tasks similar to those performed by the human brain, and such a system is applied to email classification, it will be extremely intelligent. Using...
Anomaly detection involves identifying observations that deviate from the normal behavior of a system. One of the ways to achieve this is by identifying the phenomena that characterize "normal" observations. Subsequently, based on the characteristics of data learned from the "normal" observations, new observations are classified as being either "normal" or not. Most state-of-the-art...
Most content-based spam filters are rule based or trained off-line. Handling new spam tactics is difficult and prone to a high misclassification rate. This paper proposes an incremental adaptive spam mail filter using Naïve Bayesian classification, which gives good performance, simplicity and adaptability. We model an incremental scheme that receives a stream of emails, and applies the concept...
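As an illustration of the incremental idea in the abstract above, the following is a minimal sketch of a naive Bayes filter whose counts are updated one message at a time. It is not the scheme from the paper, whose details are truncated here; the class name `IncrementalNaiveBayes`, the whitespace tokenizer, and the smoothing constant `alpha` are assumptions made for this example.

```python
import math
from collections import Counter


class IncrementalNaiveBayes:
    """Multinomial naive Bayes updated message by message (illustrative sketch)."""

    LABELS = ("spam", "ham")

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing constant (assumed value)
        self.doc_counts = Counter()  # messages seen per label
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.vocab = set()

    def update(self, text, label):
        """Fold one labelled message into the counts; no retraining pass is needed."""
        tokens = text.lower().split()
        self.doc_counts[label] += 1
        self.word_counts[label].update(tokens)
        self.vocab.update(tokens)

    def log_score(self, text, label):
        """Smoothed log prior plus smoothed log likelihood of each token."""
        total_docs = sum(self.doc_counts.values())
        score = math.log(
            (self.doc_counts[label] + self.alpha)
            / (total_docs + len(self.LABELS) * self.alpha)
        )
        total_words = sum(self.word_counts[label].values())
        # The extra +1 reserves probability mass for tokens never seen in training.
        denom = total_words + self.alpha * (len(self.vocab) + 1)
        for token in text.lower().split():
            score += math.log((self.word_counts[label][token] + self.alpha) / denom)
        return score

    def predict(self, text):
        return max(self.LABELS, key=lambda label: self.log_score(text, label))


nb = IncrementalNaiveBayes()
nb.update("win free money now", "spam")
nb.update("meeting agenda for monday", "ham")
print(nb.predict("free money"))
print(nb.predict("monday meeting"))
```

Because the model stores only counts, each newly labelled message from the stream can be absorbed with a single `update` call.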
The widespread use of information technology has led to the emergence of large-scale email networks in cyberspace. But traditional anti-spam solutions are mostly static methods, and means of adaptive, real-time mail analysis are seldom considered. Inspired by the theory of artificial immune systems (AIS), a novel distributed anti-spam model that leverages e-mail...
Collocations are frequent bigrams with semantic meaning and grammatical function. Adjacent and long-distance collocations are extracted as features for a Bayesian classifier in spam filtering. Compared to the common unigram feature, the collocation-based classifier shows improvement in all the evaluation metrics. The influence of mail header information is studied for the classifier, which shows a...
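To make the notion of adjacent and long-distance bigram features concrete, here is a small sketch that extracts ordered word pairs within a configurable gap and keeps the frequent ones. It uses raw frequency only and does not model the semantic or grammatical criteria mentioned in the abstract; the function names and the `max_gap` and `min_count` parameters are hypothetical choices for this example.

```python
from collections import Counter


def bigram_features(tokens, max_gap=0):
    """Return ordered word pairs separated by at most max_gap intervening tokens.

    max_gap=0 yields adjacent bigrams only; larger values add long-distance pairs.
    """
    pairs = []
    for i, left in enumerate(tokens):
        for j in range(i + 1, min(i + 2 + max_gap, len(tokens))):
            pairs.append((left, tokens[j]))
    return pairs


def frequent_collocations(documents, max_gap=0, min_count=2):
    """Count pairs over all documents and keep those seen at least min_count times."""
    counts = Counter()
    for doc in documents:
        counts.update(bigram_features(doc.lower().split(), max_gap))
    return {pair: n for pair, n in counts.items() if n >= min_count}


mails = ["click here now", "please click here"]
print(frequent_collocations(mails))
print(bigram_features(["a", "b", "c"], max_gap=1))
```

The retained pairs could then be used in place of, or alongside, unigram tokens as features for a Bayesian classifier.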
There is a lack of a general mail filtering system that is both compatible with several content-based filtering methods and able to learn automatically through the mail client. To address this, this paper proposes GALMFS (general automatic learning mail filtering system). GALMFS realizes a general design of a mail filtering system through separating content-based filtering methods from...
Content-based spam filtering is one of the important topics in Internet security research, and Bayesian classification has shown good performance in anti-spam work. An improved spam filtering method based on Bayesian filtering is proposed in this paper. The experimental results show that the new method improves spam recall and spam precision.
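For readers unfamiliar with the two metrics named above, the sketch below computes spam precision (the share of messages flagged as spam that really are spam) and spam recall (the share of real spam that was flagged). The function name and the label strings are assumptions for this example, and the data are made up purely for illustration.

```python
def spam_precision_recall(true_labels, predicted_labels, positive="spam"):
    """Return (precision, recall) for the positive class; 0.0 when undefined."""
    tp = fp = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if pred == positive and truth == positive:
            tp += 1  # spam correctly flagged
        elif pred == positive:
            fp += 1  # ham wrongly flagged as spam
        elif truth == positive:
            fn += 1  # spam that slipped through
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


truth = ["spam", "spam", "ham", "ham"]
guess = ["spam", "ham", "ham", "ham"]
print(spam_precision_recall(truth, guess))  # (1.0, 0.5)
```

In this toy case nothing was wrongly flagged, so precision is 1.0, but one of the two spam messages was missed, so recall is 0.5.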
By feeding personal e-mails into the training set, personalized content-based spam filters are believed to classify e-mails with higher accuracy. However, filters trained on both spam mails and personal mails may have difficulty classifying e-mails that share characteristics of both spam and ham. In this paper, we propose a two-tier approach of using two filters trained only with either personal...
Traditional anti-spam techniques such as blacklists and whitelists cannot meet the needs of spam filtering today. Some machine learning techniques have become very popular in spam filter research, and the support vector machine is one of the best classification methods. But these techniques are usually applied to spam identification based only on the textual content of the mail body, seldom discussing...
This paper introduces an algorithm based on the vector space model (VSM) and a statistical decision tree (SDT) to recognize illegal e-mails. The vector space model is simple and easy to operate. First, the VSM filters some specific words which are often used in illegal e-mails. Then, the SDT judges illegal e-mails by semantic analysis. After the two steps, the illegal e-mails can also be...