The world wide web is filled with billions of images, and duplicates of images can frequently be found on many websites. These duplicates can be exact copies or differ slightly in their visual content. In this paper we provide a comparative study of how well content-based duplicate image detection methods are able to detect the duplicates of a query image. We conduct a survey to better understand in...
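The abstract does not list the specific detection methods compared; one common content-based baseline is average hashing, sketched below for illustration (images are modeled as 2-D grayscale arrays, and all names are illustrative assumptions, not the paper's method):

```python
# A minimal sketch of content-based near-duplicate detection via average
# hashing (aHash): hash each image to a bit string, then compare hashes
# by Hamming distance. Images are lists of equal-length pixel rows.

def average_hash(pixels):
    """Hash a grayscale image to a bit string: each bit records whether
    the corresponding pixel is above the mean intensity."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return ''.join('1' if p > mean else '0' for p in flat)

def hamming(h1, h2):
    """Number of differing bits between two equal-length hashes."""
    return sum(a != b for a, b in zip(h1, h2))

def is_near_duplicate(img_a, img_b, max_bits=3):
    """Two images count as near-duplicates if their hashes differ
    in at most `max_bits` bit positions."""
    return hamming(average_hash(img_a), average_hash(img_b)) <= max_bits
```

Because the hash only records each pixel's relation to the mean, a uniformly brightened copy produces the identical hash, which is why slightly altered duplicates are still caught.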
Identifying review manipulation has become one of the hot research issues in e-commerce, because more and more customers make their purchase decisions based on personal comments from virtual communities and e-business websites. Customers consider these personal reviews more reliable than existing internet advertisements. Consequently, some enterprises attempt to create fake personal comments...
This paper focuses on the technical factors of shopping mall websites in Korea and China. There are six technical factors in total, and we conducted an empirical study on how those six factors differ between the shopping mall websites of the two countries. Statistical data for the empirical study were collected from undergraduate students in Korea and China, and...
The exponential increase of bandwidth on the Internet has made online traffic classification a highly demanding task. All the operations in the classification process must be implemented efficiently in order to deal with an enormous amount of data. A key point in this process is the selection of a flow-termination criterion, a decision that has important consequences for several traffic classification techniques...
IP geolocation services play an increasingly important role in Web applications. Landmark-based IP geolocation can achieve better geolocation performance; however, the accuracy of the landmarks themselves remains an urgent issue. This paper proposes a landmark-filtering mechanism to guarantee the accuracy of landmarks. First, a Ping-Delay Coarse Filtering Landmark algorithm (PCFL) is proposed to...
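The abstract does not detail how PCFL filters landmarks; a hedged sketch of the general idea behind ping-delay filtering follows. The assumption here is a simple physical-feasibility check: a landmark whose claimed distance to the probe exceeds what its measured round-trip time allows is likely mislocated. All names and the propagation-speed constant are illustrative.

```python
# A hedged sketch of ping-delay landmark filtering (not the paper's PCFL):
# discard landmarks whose claimed location is physically inconsistent with
# their measured round-trip delay.

SPEED_IN_FIBER_KM_PER_MS = 100.0  # roughly 2/3 the speed of light in vacuum

def max_feasible_distance_km(rtt_ms):
    """Upper bound on one-way distance implied by a round-trip time,
    assuming the signal travels at fiber speed with no queuing delay."""
    return (rtt_ms / 2.0) * SPEED_IN_FIBER_KM_PER_MS

def filter_landmarks(landmarks):
    """Keep landmarks whose claimed distance from the probe fits their
    measured RTT. Each landmark is a dict with hypothetical keys
    `claimed_km` (great-circle distance to the claimed location) and
    `rtt_ms` (measured ping round-trip time)."""
    return [lm for lm in landmarks
            if lm["claimed_km"] <= max_feasible_distance_km(lm["rtt_ms"])]
```

Since queuing and routing only add delay, the bound can prove a landmark is too far away but never that it is exactly where it claims, which is why such a check works as a coarse first-stage filter.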
The future Internet scenario will involve a larger number of users and applications, which demand more resources from the communication infrastructure. Because of such demands, techniques for providing performance and scalability, such as Traffic Engineering (TE), will remain necessary even if the transmission rate is very high. Quality of Service is one of the solutions that can be used to improve...
Traffic classification plays an important role in many short- to medium-term network management tasks and in long-term network dimensioning and planning. In recent years a number of traffic classifiers have been proposed; in particular, classifiers based on machine learning techniques exhibit high levels of accuracy. However, in practice, even if classifiers can be accurately trained at a given time, their...
Recent years have seen extensive work on statistics-based network traffic classification using machine learning (ML) techniques. In the particular scenario of learning from unlabeled traffic data, some classic unsupervised clustering algorithms (e.g. K-Means and EM) have been applied, but the reported results are unsatisfactory due to low accuracy. This paper presents a novel approach for...
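The classic K-Means setup the abstract refers to can be sketched as follows: cluster flows by their statistical features (here, a hypothetical two-feature description such as mean packet size and flow duration). This is a generic illustration of the baseline, not the paper's improved approach.

```python
# A minimal K-Means sketch over 2-D per-flow feature vectors, using fixed
# initial centroids for determinism. `points` and `centroids` are lists
# of (x, y) tuples.

def kmeans(points, centroids, iters=10):
    """Cluster points around the given initial centroids.
    Returns the final centroids and each point's cluster index."""
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        assign = [min(range(len(centroids)),
                      key=lambda c: (p[0] - centroids[c][0]) ** 2 +
                                    (p[1] - centroids[c][1]) ** 2)
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = (sum(p[0] for p in members) / len(members),
                                sum(p[1] for p in members) / len(members))
    return centroids, assign
```

In the unlabeled-traffic scenario, each resulting cluster would then be mapped to an application class by inspecting a small labeled sample, which is exactly where the low-accuracy problem the abstract mentions tends to arise.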
The ability to accurately classify various types of Internet traffic within a network using NetFlow traces represents a major challenge, as no payload information is available with NetFlow. P2P applications represent a very large portion of Internet traffic and are becoming more difficult to classify, as some of these applications tend to use port masquerading techniques and encrypted payloads,...
Internet Threat Monitoring (ITM) systems have been widely deployed to detect and characterize dangerous global Internet threats such as botnet and malware propagation. Nonetheless, the effectiveness of ITM systems largely depends on the confidentiality of their monitor locations. In this paper, we investigate localization attacks that aim to identify ITM monitor locations and propose a formal model...
The focus of this paper is to characterize the behavior of large, evolving networks, in terms of central nodes to identify patterns that may be conducive to persistent threat structures over time and geo-spatial regions. We propose an approach to monitor central nodes to determine Consistency and Inconsistency (CoIn) in their availability across time periods. Our approach also identifies the time...
Many existing machine learning based traffic classifiers require the first five packets in traffic flows to perform traffic classification. In this work, we investigate the flexibility of using arbitrary sets of packets to train traffic classifiers. Such classifiers could be used as auxiliary classifiers that would function in cases where some packets in flows are unavailable, possibly due to packet...
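The flexibility the abstract describes, building features from an arbitrary set of packets rather than always the first five, can be sketched as below. The feature names and the dict-based interface are illustrative assumptions, not the paper's actual feature set.

```python
# A hedged sketch of window-based feature extraction for traffic
# classification: compute the same summary statistics over any available
# window of packets in a flow, not just the first five.

def window_features(packet_sizes, start, count):
    """Summary statistics over `count` packets beginning at index `start`
    of a flow's packet-size sequence. Returns None when those packets
    are unavailable (e.g. lost or not yet captured)."""
    window = packet_sizes[start:start + count]
    if not window:
        return None
    return {
        "n": len(window),
        "mean_size": sum(window) / len(window),
        "min_size": min(window),
        "max_size": max(window),
    }
```

An auxiliary classifier trained on features from, say, packets 3 through 5 could then handle flows whose opening packets were never observed, which is the use case the abstract motivates.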
Phishing is turning into a hotbed of vast fraud over the internet and is therefore one of the biggest challenges to internet security. Utilizing a centralized list of phishing websites is a common solution, as most browsers and commercial anti-phishing products rely on one. Nevertheless, this solution is helpless against zero-day phishing attacks. So, many research efforts study and suggest methods...
BitTorrent is both the dominant Peer-to-Peer (P2P) protocol for file-sharing and a nightmare for ISPs due to its network-agnostic nature. Many solutions exist to localize BitTorrent traffic, relying on cooperation between ISPs and the trackers. Recently, BitTorrent users have been abandoning the trackers in favor of Distributed Hash Tables (DHTs). Although DHTs are complex heterogeneous systems, DHT-based...
The ability to estimate the geographic position of a network host has a vast array of uses, and many measurement-based geolocation methods have been proposed. Unfortunately, comparing results across multiple studies is difficult. A key contributor to that difficulty is network geometry — the spatial arrangement of hosts and links. In this paper, we study the relationship between network geometry and...
Internet threat monitoring systems are studied and developed to comprehend malicious activities on the Internet. On the other hand, it is known that attackers have devised a technique that locates the sensors that constitute the monitoring system. This technique is called a localization attack against Internet threat monitors. If attackers can detect sensors, they can evade them when they...
This paper examines the processing of PII data in e-Government web services in developing nations, using a case study of some Nigerian embassies. It presents a secure framework intended for protecting Personally Identifiable Information (PII) data in e-Government web services in developing nations where such frameworks do not already exist. The framework is based on the OWASP ASVS security requirements for data...
We report on experiments that demonstrate the relevance of our AntiSocial Behavior (ASB) corpus as a machine learning resource to detect antisocial behavior from text. We first describe the corpus and then, by using the corpus for training machine learning algorithms, we build a set of binary classifiers. Experimental evaluations revealed that classifiers built based on the ASB corpus produce reliable...
The geographical location of Internet IP addresses is important for academic research, commercial and homeland security applications. While commercial databases claim to have a very high level of accuracy, the correctness of their databases is questionable. Academic tools, based on delay measurements, were shown to have a large range of error. We present a novel algorithm that crawls the Internet...
Random surfers spend very little time on a web page. If the most important web page content fails to attract their attention within that short time span, they will move away to some other page, defeating the purpose of the web page designer. In order to predict whether the contents of a web page will catch a random surfer's attention, we propose a machine learning based approach to classify web...