Despite widespread use of commercial anti-virus products, the number of malicious files detected on home and corporate computers continues to increase at a significant rate. Recently, anti-virus companies have started investing in machine learning solutions to augment signatures manually designed by analysts. A malicious file's determination is often represented as a hierarchical structure consisting...
Malicious software, or malware, continues to be a problem for computer users, corporations, and governments. Previous research [1] has explored training file-based malware classifiers using a two-stage approach. In the first stage, a malware language model is used to learn the feature representation, which is then input to a second-stage malware classifier. In Pascanu et al. [1], the language model...
Targeted attacks are a significant problem for governmental agencies and corporations. We propose a MinHash-based targeted attack detection system which analyzes aggregated process creation events typically generated by human keyboard input. We start with a set of malicious process creation events, and their parameters, which are typically generated by an attacker remotely controlling computers on...
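As a rough illustration of the MinHash idea mentioned in this abstract, the sketch below estimates the Jaccard similarity between two sets of tokens by comparing fixed-length signatures. It is not the authors' system: the tokenization of process creation events, the function names (`minhash_signature`, `estimated_jaccard`), the choice of 64 hash functions, and the use of seeded BLAKE2b hashes are all assumptions made for the example.

```python
import hashlib


def _seeded_hash(token: str, seed: int) -> int:
    """Deterministic 64-bit hash of a token under a given seed (illustrative choice)."""
    data = f"{seed}:{token}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(tokens, num_hashes: int = 64):
    """Return the MinHash signature of a collection of tokens.

    For each of the num_hashes seeded hash functions, keep the minimum
    hash value over the distinct tokens.
    """
    token_set = set(tokens)
    if not token_set:
        raise ValueError("need at least one token")
    return [min(_seeded_hash(t, seed) for t in token_set)
            for seed in range(num_hashes)]


def estimated_jaccard(sig_a, sig_b) -> float:
    """Fraction of signature positions that agree; this estimates Jaccard similarity."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


# Hypothetical tokens taken from two process creation events.
event_a = ["net.exe", "user", "/domain"]
event_b = ["net.exe", "group", "/domain"]
similarity = estimated_jaccard(minhash_signature(event_a),
                               minhash_signature(event_b))
```

The estimate approaches the exact Jaccard similarity as the number of hash functions grows; a detection system would compare signatures of observed events against signatures of known malicious ones, but how that comparison is thresholded here is left open.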
In this paper, we propose a new maximum margin-based active learning algorithm for identifying incorrectly labeled training data. The algorithm combines a round-robin approach for investigating each class with a simple yet effective ranking metric called maximum negative margin (MNM). Samples are given to an expert for re-evaluation to determine if they are indeed mislabeled. We also propose using...
As web search providers seek to improve both relevance and response times, they are challenged by the ever-increasing tax of automated search query traffic. Third-party systems interact with search engines for a variety of reasons, such as monitoring a web site’s rank, augmenting online games, or possibly maliciously altering click-through rates. In this paper, we investigate automated traffic (sometimes...
Attackers often create systems that automatically rewrite and reorder their malware to avoid detection. Typical machine learning approaches, which learn a classifier based on a handcrafted feature vector, are not sufficiently robust to such reorderings. We propose a different approach, which, similar to natural language modeling, learns the language of malware spoken through the executed instructions...
Drive-by download attacks attempt to compromise a victim's computer through browser vulnerabilities. Often they are launched from Malware Distribution Networks (MDNs) consisting of landing pages to attract traffic, intermediate redirection servers, and exploit servers which attempt the compromise.
In this paper, we propose an image-based detection method, robust to evasion techniques, for identifying web-based scareware attacks. We evaluate the method on a large-scale data set and obtain an equal error rate of 0.018%. Conceptually, false positives may occur when a visual element, such as a red shield, is embedded in a benign page. We suggest including additional orthogonal features...
Automatically generated malware is a significant problem for computer users. Analysts are able to manually investigate a small number of unknown files, but the best large-scale defense for detecting malware is automated malware classification. Malware classifiers often use sparse binary features, and the number of potential features can be on the order of tens or hundreds of millions. Feature selection...
Most teleconferencing conversations are conducted in the presence of acoustic echoes. Typically, an adaptive filter is used to cancel the echo, with a control device called the doubletalk detector controlling the adaptation. We derive a novel test statistic for doubletalk detection based on the cross-correlation between the microphone signal and the cancellation error for the frequency domain...
A new method is developed for evaluating the error probability (P_e) for direct sequence, code division multiple access (DS/CDMA) wireless systems that includes the effects of shadowing and fading. The method is based on saddle point integration (SPI) of the test statistic's moment generating function (MGF) in the complex plane. The SPI method is applicable to both ideal and wireless channels...