Reports from recent years show that web crawler (robot, bot) activity generates more than half of all web traffic on the Internet. Web robots can be benign (used, for example, by search engines) or malicious (used to bypass security solutions, scrape content, spam, or hack), but all of them consume Internet bandwidth and can damage businesses that rely on web traffic or content...
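A first-pass way to estimate the bot share of traffic is to label requests by user-agent signature. The sketch below is illustrative only: the patterns and log entries are made up, and real detectors combine such lists with behavioural features.

```python
import re

# Hypothetical user-agent signatures; real detectors combine curated
# lists with behavioural features, not user-agent strings alone.
KNOWN_BOT_PATTERNS = [
    re.compile(p, re.IGNORECASE)
    for p in (r"googlebot", r"bingbot", r"crawler", r"spider", r"scrapy")
]

def is_probable_bot(user_agent: str) -> bool:
    """Flag a request whose user-agent matches a known crawler signature."""
    return any(p.search(user_agent) for p in KNOWN_BOT_PATTERNS)

# Two made-up log entries: one search-engine bot, one ordinary browser.
log = [
    ("10.0.0.1", "Mozilla/5.0 (compatible; Googlebot/2.1)"),
    ("10.0.0.2", "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"),
]
bot_share = sum(is_probable_bot(ua) for _, ua in log) / len(log)
print(f"estimated bot share: {bot_share:.0%}")  # 50%
```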
The Locator/Identifier Separation Protocol (LISP) separates classical IP addresses into two categories: one for identifying terminals, the other for routing. To associate identifiers with locators, LISP needs a specific mechanism called a mapping system. This technology is still at an early stage, but two experimental platforms have already been deployed on the Internet: the LISP Beta Network and LISP-Lab...
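As a toy illustration of the split (not any deployed mapping system), the sketch below resolves a hypothetical endpoint identifier (EID) to its routing locators (RLOCs) with a plain in-memory table; real mapping systems are distributed databases.

```python
# Toy mapping table: an endpoint identifier (EID) names a host, routing
# locators (RLOCs) name its attachment points. Both addresses are
# documentation examples.
MAPPING_SYSTEM = {
    "2001:db8::10": ["192.0.2.1", "198.51.100.7"],
}

def map_request(eid: str) -> list[str]:
    """Resolve an EID to its candidate RLOCs, as a router would via a Map-Request."""
    return MAPPING_SYSTEM.get(eid, [])

print(map_request("2001:db8::10"))  # ['192.0.2.1', '198.51.100.7']
```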
Operation services are reusable and shareable units of configuration code executed by configuration management tools (CMTs), enabling continuous deployment and continuous delivery. With the prevalence of DevOps (Development and Operations), thousands of operation services have been developed for various software systems, and they are publicly available through the online repositories of popular CMTs...
Web crawlers have been misused for several malicious purposes, such as downloading server data without permission from the website administrator. In this paper, based on the observation that normal users and malicious crawlers exhibit different short-term and long-term download behaviors, we develop a new anti-crawler mechanism called PathMarker to detect and constrain persistent distributed crawlers...
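As a simpler stand-in for the underlying observation (not PathMarker itself, which marks URLs to track visiting paths), the hedged sketch below flags sessions that request unusually many distinct pages over a short window or over their whole lifetime. All thresholds are invented for illustration.

```python
from collections import defaultdict

# Invented thresholds: how many distinct URLs one session may fetch
# within a short window and in total before it looks automated. Real
# systems would tune these from labelled traffic.
SHORT_WINDOW_S = 60
MAX_DISTINCT_SHORT = 30
MAX_DISTINCT_TOTAL = 500

sessions: dict[str, list[tuple[float, str]]] = defaultdict(list)

def record(session_id: str, ts: float, url: str) -> bool:
    """Log one request; return True if the session now looks like a crawler."""
    history = sessions[session_id]
    history.append((ts, url))
    recent = {u for t, u in history if ts - t <= SHORT_WINDOW_S}
    total = {u for _, u in history}
    return len(recent) > MAX_DISTINCT_SHORT or len(total) > MAX_DISTINCT_TOTAL
```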
Nowadays, social media has blossomed into online social networks (e.g., Facebook [1], Google+ [2]) and video streaming sites (e.g., YouTube [3]), as well as a convergence between the two types of systems. More and more media content (video clips, images, etc.) is published and shared among users on social network sites, while video streaming systems are increasingly leveraging social...
The World Wide Web provides search results over a massive collection of data on the Internet. When obtaining search results for a particular query, the authenticity of those results becomes an important question, as different sources contribute data at every moment, and these data are stored on (un)trusted third-party servers. In this paper, we discuss the authenticated search results of some...
There has been an increase in the use of image processing for object recognition. However, traditional methods are not suitable for real-time systems because they cannot match human performance. Recently, deep learning with Convolutional Neural Networks (CNNs) has come to be known as a solution for image recognition. Indeed, deep learning has produced many excellent results in object recognition. However, it needs...
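For readers unfamiliar with the shape of such models, here is a minimal CNN classifier sketch in PyTorch; the input size (32x32 RGB), layer widths, and class count are assumptions, not taken from the paper.

```python
import torch
from torch import nn

# Assumed task shape: 32x32 RGB images, 10 classes; layer widths are
# illustrative only.
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),  # 3 -> 16 feature maps
    nn.ReLU(),
    nn.MaxPool2d(2),                             # 32x32 -> 16x16
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),                             # 16x16 -> 8x8
    nn.Flatten(),
    nn.Linear(32 * 8 * 8, 10),                   # class logits
)

logits = model(torch.randn(1, 3, 32, 32))  # one dummy image
print(logits.shape)  # torch.Size([1, 10])
```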
Cross-site scripting (also referred to as XSS) is a vulnerability that allows an attacker to send malicious code (usually in the form of JavaScript) to another user. XSS is one of the top 10 web application vulnerabilities. While a traditional cross-site scripting vulnerability exploits server-side code, DOM-based XSS is a type of vulnerability that affects the script code being executed in the...
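DOM-based XSS arises when attacker-controlled DOM values (e.g., location.hash) flow into dangerous sinks (e.g., document.write). The crude scanner below only pattern-matches a source and a sink in the same statement; real detectors perform taint analysis, and the flagged snippet is a made-up example.

```python
import re

# Taint sources an attacker controls and sinks that execute or render
# markup. Matching both on one line is only a heuristic; real detectors
# track tainted data flow through the whole script.
SOURCES = r"(?:location\.(?:hash|search|href)|document\.referrer)"
SINKS = r"(?:document\.write|\.innerHTML|eval)"
RISKY = re.compile(SINKS + r".*" + SOURCES)

# A made-up vulnerable statement: the URL fragment reaches document.write.
snippet = 'document.write("Welcome " + location.hash.slice(1));'
if RISKY.search(snippet):
    print("possible DOM-based XSS:", snippet)
```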
Advances in technology have made information easily available over the Internet; this information needs to be adequately protected, or it may be compromised in what is known as an information or security breach. Software known as a "web spider" is used to discover openly available web pages. A spider scans web pages and follows the links on those pages to retrieve information from the Web. Information confidentiality needs to be...
In this paper, we propose an SNS crawler engine for topic expansion. The Smart Broadcasting Platform uses only subtitles and scripts when extracting topics, but these do not contain enough words to adapt the platform to various domains. Therefore, more data sources need to be included to extract richer topics. We also introduce the system architecture of the SNS crawler engine, describe...
With the advent of Web 2.0 applications and the increasing number of browsers and platforms on which applications can be executed, cross-browser incompatibilities (XBIs) are becoming a serious problem for organizations developing web-based software. Although some techniques and tools have been proposed to identify XBIs, a number of false positives and false negatives still exist, as they cannot...
Provisioning cloud applications is usually a complex task, as it involves the deployment and configuration of several components (e.g., load balancer, application server, database) and cloud services (computing, storage, CDN, etc.), together known as application blueprints or topologies. The Topology and Orchestration Specification for Cloud Applications (TOSCA) is a recent standard that has focused on...
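TOSCA blueprints are normally written as typed node and relationship templates (in XML or YAML); purely to show the shape of such a topology, here is a hypothetical three-tier blueprint expressed as a plain Python structure.

```python
# A hypothetical three-tier blueprint as a graph of typed nodes and
# relationships; TOSCA itself serializes such topologies in XML/YAML.
blueprint = {
    "nodes": {
        "lb":  {"type": "LoadBalancer"},
        "app": {"type": "ApplicationServer", "properties": {"port": 8080}},
        "db":  {"type": "Database", "properties": {"engine": "mysql"}},
    },
    "relationships": [
        ("lb", "routes_to", "app"),    # lb depends on app
        ("app", "connects_to", "db"),  # app depends on db
    ],
}

# An orchestrator would deploy dependencies first: db, then app, then lb.
for src, rel, dst in blueprint["relationships"]:
    print(f"{src} --{rel}--> {dst}")
```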
As Internet use increases, so does the malicious activity that plagues it. In particular, drive-by download attacks have become a serious problem. As part of the exploit-as-a-service ecosystem for drive-by download attacks, malware download sites play a particularly important role for attackers. In this paper, we analyzed approximately 43,000 malware download URLs. Our measurement period spans over 1.5 years and...
The ever-growing number of cyber attacks launched from botnets has made them one of the biggest threats on the Internet. It is thus crucial to study and analyze botnets in order to take them down, and extensive monitoring is a prerequisite for preparing a takedown, e.g., via a sinkholing attack. However, every new monitoring mechanism developed for botnets is usually countered by the botmasters by...
Centralized crawlers are not adequate for spidering meaningful and relevant portions of the Web; a crawler with good scalability and load balancing can deliver better performance. As the Web grows, it becomes necessary to distribute the crawling process in order to download pages in less time and increase crawler coverage. In this paper, we present a client...
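One common distribution scheme (an illustration, not necessarily the architecture this paper presents) partitions the URL frontier by hashing each URL's host, so each worker owns whole sites and the load spreads evenly across the cluster.

```python
import hashlib
from urllib.parse import urlparse

NUM_WORKERS = 4  # illustrative cluster size

def assign_worker(url: str) -> int:
    """Route a URL to the worker that owns its host, keeping one crawler
    per site (politeness) while spreading hosts evenly."""
    host = urlparse(url).netloc
    digest = hashlib.md5(host.encode()).hexdigest()
    return int(digest, 16) % NUM_WORKERS

for url in ("http://example.com/a", "http://example.org/b"):
    print(url, "-> worker", assign_worker(url))
```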
SQL injection attack (SQLIA) is one of the most severe attacks that can be mounted against database-driven web applications. Attackers use SQLIA to gain unauthorized access to data and to perform unauthorized data modifications. To mitigate the devastating problem of SQLIA, researchers have proposed a variety of web penetration testing tools that automate SQL injection vulnerability assessment, which results in...
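To make the vulnerability class concrete (a generic illustration, not one of the surveyed tools), the sqlite3 sketch below contrasts a string-concatenated query, which a classic payload subverts, with a parameterized one that treats the same input as data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin')")

user_input = "' OR '1'='1"  # classic SQLIA payload

# Vulnerable: the payload rewrites the WHERE clause and leaks every row.
unsafe = f"SELECT * FROM users WHERE name = '{user_input}'"
print(conn.execute(unsafe).fetchall())  # [('alice', 'admin')]

# Safe: a parameterized query treats the payload as a literal string.
safe = "SELECT * FROM users WHERE name = ?"
print(conn.execute(safe, (user_input,)).fetchall())  # []
```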
During the past decade, the Web has become increasingly popular and thus more important for the delivery of content and services over the Internet. At the same time, the number of requested objects, their sizes, and the delivery mechanisms of popular websites have grown more complex. This in turn has various implications, including an impact on page loading time, which directly affects the experience of...
The Internet has always been growing with the content and information added by different types of users. Without proper storage and indexing, this content can easily be lost in the sea of information housed by the Internet. Hence, an automated program known as a web crawler is used to index the content added to the Internet. With proper configuration and settings, a web crawler can...
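As a minimal sketch of the idea (standard-library only; the start URL is a placeholder, and real crawlers add robots.txt checks, politeness delays, and persistent indexing), a breadth-first crawler can be written as:

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen

class LinkCollector(HTMLParser):
    """Collect href targets from anchor tags on one page."""
    def __init__(self):
        super().__init__()
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.links.extend(v for k, v in attrs if k == "href" and v)

def crawl(start_url: str, limit: int = 10) -> set[str]:
    """Fetch pages breadth-first, enqueueing unseen links, up to `limit` pages."""
    seen, frontier = set(), [start_url]
    while frontier and len(seen) < limit:
        url = frontier.pop(0)
        if url in seen:
            continue
        seen.add(url)
        try:
            html = urlopen(url, timeout=5).read().decode("utf-8", "replace")
        except OSError:
            continue  # skip unreachable pages
        parser = LinkCollector()
        parser.feed(html)
        frontier.extend(urljoin(url, link) for link in parser.links)
    return seen

print(crawl("https://example.com"))  # placeholder start URL
```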
Social Network Analysis (SNA) is a field of study that focuses on analyzing user profiles and participation on social network channels in order to model relationships between people and to predict certain behaviors or knowledge. To achieve their goals, researchers interested in SNA have to extract content and structure from the numerous social networks available today. Existing tools, which help...
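Once relationships have been extracted, modeling them as a directed graph makes such measures computable; in the sketch below the "follows" edge list is made up and networkx is an assumed dependency.

```python
import networkx as nx

# Made-up "follows" edges; real SNA work extracts them from platform
# APIs or crawled pages.
follows = [("ana", "bob"), ("bob", "carol"), ("ana", "carol"), ("dan", "carol")]
g = nx.DiGraph(follows)

print(nx.in_degree_centrality(g))  # who is followed the most
print(nx.pagerank(g))              # a standard influence score
```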
In this paper, we present our study on building a Web graph for the Malaysian Web based on a crawl of Malaysian Web sites. Given the constructed Web graph, interesting characteristics have been studied, such as the in-degree and out-degree distributions, the power-law distribution, the bow-tie structure, and the strongly-connected components (SCCs). Moreover, more important insights can be obtained by analyzing...
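The measures the abstract names are all standard graph computations; on a made-up five-page toy graph (networkx assumed installed), they look like:

```python
import networkx as nx

# Made-up five-page graph: nodes are pages, edges are hyperlinks.
web = nx.DiGraph([("a", "b"), ("b", "c"), ("c", "a"), ("c", "d"), ("e", "a")])

print("in-degrees: ", dict(web.in_degree()))
print("out-degrees:", dict(web.out_degree()))

# The largest strongly-connected component is the core of the classic
# bow-tie picture of the Web.
sccs = list(nx.strongly_connected_components(web))
print("largest SCC:", max(sccs, key=len))  # {'a', 'b', 'c'}
```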