In complex pattern classification problems, the reliability of a classifier's output may differ for patterns located in different regions of the data set. To improve classification accuracy efficiently, we propose a new method that corrects the original classifier output using local knowledge of the classifier's performance in different regions. The training data set can be divided...
Embedding malicious URLs in e-mails is one of the most common web threats facing the internet community today. Malicious URLs have been widely used to mount various cyber-attacks such as spear phishing, pharming, phishing and malware. By falsely claiming to be a trustworthy entity, attackers lure users into clicking on these compromised links and divulging vital information such as usernames, passwords, or credit...
The World Wide Web supports a wide range of criminal activities such as spam-advertised e-commerce, financial fraud and malware dissemination. Although the precise motivations behind these schemes may differ, the common denominator lies in the fact that unsuspecting users visit their sites. These visits can be driven by email, web search results or links from other web pages. In all cases, however,...
In this paper, we develop and implement a new email spam filtering system that leverages coupled text similarity analysis of user preferences and a virtual meta-layer user-based email network. We take social networks or campus LAN networks as the spam social network scenario. Few current practices exploit social networking initiatives to assist in spam filtering. A social network essentially has a large...
Statistical and machine learning methods have been proposed to predict hard drive failure based on SMART attributes, and many achieve good performance. However, these models only predict that a drive will fail; they give no good indication of when it will fail. To address this, we propose a new notion of a drive's health degree based on the remaining working time of a hard drive before actual failure...
Email is a fast and cheap medium for sending and receiving information, but spam is becoming a nuisance for such communication. Good spam filtering requires not only high accuracy but also a low false positive rate. This paper presents a classifier-combination approach with a committee selection mechanism, where the main objective is to combine individual decisions...
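As a rough illustration of combining individual decisions through a committee, the sketch below picks the best classifiers by validation accuracy and flags a message as spam only when enough committee members agree. The truncated abstract does not describe the paper's actual selection mechanism, so the ranking rule, the function names (`select_committee`, `committee_is_spam`) and the `min_spam_votes` threshold are all assumptions made for this example.

```python
def select_committee(val_accuracy, k):
    """Pick the k classifiers with the highest validation accuracy.

    Assumption: ranking by accuracy is only a stand-in for the paper's
    committee selection mechanism, which the abstract does not specify.
    """
    ranked = sorted(val_accuracy, key=val_accuracy.get, reverse=True)
    return ranked[:k]


def committee_is_spam(votes, committee, min_spam_votes):
    """Flag a message as spam only if enough committee members say so.

    votes maps classifier name -> True (spam) / False (legitimate).
    A higher min_spam_votes makes a spam verdict harder to reach, which
    trades some missed spam for fewer false positives.
    """
    spam_votes = sum(1 for name in committee if votes[name])
    return spam_votes >= min_spam_votes


# Hypothetical validation accuracies and per-message votes.
val_accuracy = {"nb": 0.95, "svm": 0.97, "knn": 0.90, "tree": 0.93}
committee = select_committee(val_accuracy, k=3)
votes = {"nb": True, "svm": False, "knn": True, "tree": True}

strict = committee_is_spam(votes, committee, min_spam_votes=3)
lenient = committee_is_spam(votes, committee, min_spam_votes=2)
```

Requiring all three members to agree leaves this message unflagged, while a two-vote threshold flags it; the threshold is the knob that controls the false positive rate in this sketch.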
In many document classification problems, sets of people will be associated with the document. These sets might include document authors, or people who have read the document, or the sender of an electronic message, or the recipients of the message, or those carbon copied, or those blind carbon copied. It is obvious that these sets of people can constitute important information that can help to classify...
The growth in the number of email users has resulted in a dramatic increase in spam emails. Fortunately, several approaches can automatically detect and remove most of these messages; the best known are based on Bayesian decision theory and Support Vector Machines. However, there are several forms of Naive Bayes filters, something the anti-spam literature does not always acknowledge...
In this paper, we introduce a classification approach to identify the definitions of all terms in an aviation professional corpus. The aviation-domain corpora are first segmented with the LTP platform from HIT. Then four feature selection methods and two classifiers are applied to extract definitions. First of all, we summarize the correct proportion of feature subset used in classification of term definitions,...
Phishing continues to be one of the most severe attacks, causing huge monetary losses for both financial institutions and customers. Mobile devices are now widely used to access the Internet and, therefore, financial and confidential data. However, unlike PCs and wired devices, such devices lack basic defensive applications to protect against various types of attacks. As a consequence, phishing...
Spam sender detection based on email subject data is a complex large-scale text mining task. The dataset consists of email subject lines and the corresponding IP address of the email sender. A fast and accurate classifier is desirable in such an application. In this research, a highly scalable SVM modeling method, named Granular SVM with Random granulation (GSVM-RAND), is designed. GSVM-RAND applies...
In this paper, we report our work on spam filtering with three novel Bayesian classification methods: aggregating one-dependence estimators (AODE), hidden Naive Bayes (HNB), and locally weighted learning with Naive Bayes (LWNB). Four traditional classifiers, Naive Bayes, k nearest neighbor (kNN), support vector machine (SVM), and C4.5, are also evaluated for comparison. Four feature selection methods:...
After analyzing and comparing the problems of the existing one-versus-one (OVO) and one-versus-rest (OVR) decomposition methods for multi-class support vector machines (SVM), we present a novel strategy based on posterior probability to reconstruct a multi-class classifier from binary SVM-based classifiers. The new reconstruction strategy can increase recognition accuracy and resolve the unclassifiable...
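To make the idea of recombining binary classifiers by posterior probability more concrete, here is a minimal sketch for the OVO case. It is not the paper's reconstruction strategy, which the truncated abstract does not spell out; it simply contrasts hard voting, which can tie, with summing pairwise posterior estimates. The function names and the probability values are hypothetical.

```python
def hard_vote_winners(pairwise, classes):
    """Classic OVO voting: each binary classifier casts one vote.

    pairwise[(i, j)] is an estimate of P(class i | x, class in {i, j}).
    Returns every class that reaches the maximum vote count, so a list
    with more than one entry means the sample is unclassifiable by votes.
    """
    votes = {c: 0 for c in classes}
    for (i, j), p in pairwise.items():
        votes[i if p > 0.5 else j] += 1
    top = max(votes.values())
    return [c for c in classes if votes[c] == top]


def soft_vote_winner(pairwise, classes):
    """Sum the pairwise posterior estimates instead of counting votes."""
    scores = {c: 0.0 for c in classes}
    for (i, j), p in pairwise.items():
        scores[i] += p
        scores[j] += 1.0 - p
    return max(scores, key=scores.get)


# Hypothetical posteriors forming a cycle: a beats b, b beats c, c beats a.
classes = ["a", "b", "c"]
pairwise = {("a", "b"): 0.9, ("b", "c"): 0.6, ("a", "c"): 0.4}

tied = hard_vote_winners(pairwise, classes)
winner = soft_vote_winner(pairwise, classes)
```

In this example every class collects exactly one hard vote, so voting alone cannot decide, while the summed posteriors favor class `a` because its win over `b` is much more confident than the other two results.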