Reports from recent years show that web crawler (robot, bot) activity generates more than half of all Internet traffic. Web robots can be benign (used, for example, by search engines) or malicious (used for bypassing security solutions, scraping, spamming, or hacking), but all of them consume Internet bandwidth and can cause damage to businesses that rely on web traffic or content....
Web data acquisition is the foundation of Web data mining. The web crawler is an important tool for Web data acquisition, but frequent changes to Web data structures, data sources, and distribution channels result in high costs for crawler development and maintenance. To solve this problem, this paper designs and implements an intelligent dynamic crawler, which stores the data...
Cross-site scripting (also referred to as XSS) is a vulnerability that allows an attacker to send malicious code (usually in the form of JavaScript) to another user. XSS is one of the top ten vulnerabilities in web applications. While a traditional cross-site scripting vulnerability exploits server-side code, DOM-based XSS is a type of vulnerability that affects the script code executed in the...
The contentious battle between web services and miscreants involved in blackhat search engine optimization and malicious advertisements has driven the underground to develop increasingly sophisticated techniques that hide the true nature of malicious sites. These web cloaking techniques hinder the effectiveness of security crawlers and potentially expose Internet users to harmful content. In this...
In a supply chain system, the distribution network, including its flow information, is usually a business secret, kept to ensure supply chain security and to hold on to a favorable position in commercial competition. As more and more organizations deploy tracking systems to assist users, most of them focus on business growth but neglect the protection of these secrets. This paper therefore...
This paper presents a dynamic detection method based on simulating browser behavior and designs a web crawler based on a headless browser, which can interpret JavaScript code and retrieve Ajax content to find hidden injection points in pages, taking full account of web pages containing complex scripts in the Web 2.0 environment. In implementation, this paper uses dynamic analysis in...
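The abstract above only sketches the approach. As a rough, hedged illustration of how a headless-browser crawler can surface injection points that a plain HTTP fetch would miss, the Python sketch below uses Selenium with headless Chrome (an assumed toolchain, not necessarily the authors'), lets JavaScript and Ajax content render, and lists form fields and URL parameters as candidate injection points; the target URL is a placeholder.

```python
# Illustrative sketch only: render a page in headless Chrome via Selenium and
# enumerate candidate injection points (form inputs, URL parameters).
# The URL and the choice of Selenium are assumptions, not the paper's code.
from urllib.parse import urlparse, parse_qs

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By

options = Options()
options.add_argument("--headless")       # run Chrome without a visible window
driver = webdriver.Chrome(options=options)

driver.get("https://example.com/page")   # hypothetical target
driver.implicitly_wait(5)                # give Ajax-loaded content time to appear

# Form fields produced by JavaScript are visible only after execution,
# which is why a plain HTTP fetch would miss them.
injection_points = []
for field in driver.find_elements(By.CSS_SELECTOR, "input, textarea, select"):
    injection_points.append(("form-field", field.get_attribute("name")))

for link in driver.find_elements(By.TAG_NAME, "a"):
    href = link.get_attribute("href") or ""
    for name in parse_qs(urlparse(href).query):
        injection_points.append(("url-param", name))

print(injection_points)
driver.quit()
```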
Information security is significant for state security, especially for intelligence services. This paper designs and implements an OSIA (open-source intelligence analysis) system based on cloud computing and a domestic platform. For the sake of the security and utility of OSIA, all of the middleware and the operating systems involved are compatible with domestic software. The OSIA system concentrates on analyzing open...
Web applications have become an important part of communication nowadays. As the popularity of web applications such as online transactions, net banking, and many more increases, the role of web security has increased as well. Web application vulnerabilities let attackers carry out malicious activities that range from gaining unauthorized access to stealing sensitive data. Past research has...
With the increase in awareness of secure programming, the number of vulnerabilities in software on a machine has decreased. Exploiting the few vulnerabilities that remain requires attackers to use their skills and effort to exploit various services. Firewalls, access control lists (ACLs), and intrusion detection and prevention systems deployed in an organization are able to block and...
A concise and practical introduction to Online Social Networks (OSN) and their application in law enforcement is given, including a brief survey of related work. Subsequently, a tool is introduced that can be used to search OSN in order to generate user profiles. Both its architecture and processing pipeline are described. This tool is meant as a flexible framework that supports manual foraging (and...
Botnets are a serious threat to Internet-based services and end users. The recent paradigm shift from centralized to more sophisticated Peer-to-Peer (P2P)-based botnets introduces new challenges for security researchers. Centralized botnets can be easily monitored and, once their command-and-control server is identified, easily taken down. However, P2P-based botnets are much more resilient against...
The goal of the SPaCIoS project is the validation and testing of security properties of services and web applications. It proposes a methodology and tool collection centered on models described in a dedicated specification language, supporting model inference, mutation-based testing, and model checking. The project has developed two approaches to reverse-engineer models from implementations. One...
Today, the web is all about dynamic content: information is created when it is needed, i.e., the resources are not readily available to users. How, then, is it possible for a web crawler to find a resource that is either protected by a session or hidden behind an authentication form? This query triggered a search for answers to several questions about web crawlers: what is a crawler? Why...
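As a hedged illustration of the session/authentication problem this abstract raises (not the paper's own solution), the sketch below shows how a crawler can reach a form-protected page by carrying a session cookie; the URLs and form field names are hypothetical placeholders.

```python
# Illustrative sketch: a crawler reaching a page hidden behind a login form
# by reusing the session cookie obtained at login. All URLs and field names
# are hypothetical placeholders, not taken from the paper.
import requests

LOGIN_URL = "https://example.com/login"        # assumed login endpoint
PROTECTED_URL = "https://example.com/reports"  # assumed session-protected page

with requests.Session() as session:
    # Submit the authentication form; the Session object stores the resulting
    # cookies so that later requests stay logged in.
    resp = session.post(LOGIN_URL, data={"username": "crawler", "password": "secret"})
    resp.raise_for_status()

    # With the session cookie attached, the protected resource becomes reachable.
    page = session.get(PROTECTED_URL)
    print(page.status_code, len(page.text))
```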
Web robots, also known as crawlers or spiders, are used by search engines, hackers, and spammers to gather information about web pages. Timely detection and prevention of unwanted crawlers increases the privacy and security of websites. In this paper, a novel method to identify web crawlers is proposed that prevents unwanted crawlers from accessing websites. This new method suggests a five-factor identification process...
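The abstract does not list the five factors, so the sketch below is only a generic illustration of request-log signals commonly used to flag crawlers; the specific checks and thresholds are assumptions and should not be read as the paper's method.

```python
# Generic illustration of request-log signals often used to flag crawlers.
# These checks are NOT the paper's five factors (the abstract does not list
# them); field names and thresholds are assumptions for the sketch.
def looks_like_crawler(visit: dict) -> bool:
    score = 0
    if any(tag in visit["user_agent"].lower() for tag in ("bot", "crawler", "spider")):
        score += 1                       # self-declared robot user agent
    if visit["fetched_robots_txt"]:
        score += 1                       # humans rarely request /robots.txt
    if visit["requests_per_minute"] > 60:
        score += 1                       # request rate beyond normal browsing
    if not visit["loaded_static_assets"]:
        score += 1                       # real browsers also fetch images/CSS
    if not visit["sent_referrer"]:
        score += 1                       # navigation normally carries a referrer
    return score >= 3                    # flag when most signals agree

print(looks_like_crawler({
    "user_agent": "MyScraper/1.0 (bot)",
    "requests_per_minute": 120,
    "fetched_robots_txt": True,
    "loaded_static_assets": False,
    "sent_referrer": False,
}))
```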
With the rapid development of microblogging technology, many interesting research issues around microblogs have attracted growing attention. Fetching data from microblogs is the groundwork for this research. In this paper we take Sina microblog (also called Weibo) as the crawling site, designing and implementing a highly efficient incremental microblog crawler based on the classic multi-producer and multi-consumer...
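As a minimal, hedged sketch of the classic multi-producer/multi-consumer pattern the abstract names (not the actual Weibo crawler), the Python code below shares a thread-safe queue between producers that enqueue only unseen post IDs, which is what makes the crawl incremental, and consumer threads that fetch them; the feed and the fetch step are placeholders.

```python
# Minimal producer/consumer crawler skeleton using a thread-safe queue.
# The feed and the "fetch" step are placeholders; the real crawler's
# Weibo API handling is not reproduced here.
import queue
import threading

task_queue = queue.Queue()
seen_ids = set()            # remembering fetched post IDs makes the crawl incremental
seen_lock = threading.Lock()

def producer(feed):
    """Push only new (unseen) post IDs from a feed onto the shared queue."""
    for post_id in feed:
        with seen_lock:
            if post_id in seen_ids:
                continue
            seen_ids.add(post_id)
        task_queue.put(post_id)

def consumer():
    """Fetch and store posts taken from the queue until a sentinel arrives."""
    while True:
        post_id = task_queue.get()
        if post_id is None:              # sentinel: no more work
            task_queue.task_done()
            break
        print(f"fetching post {post_id}")  # placeholder for an HTTP fetch + store
        task_queue.task_done()

consumers = [threading.Thread(target=consumer) for _ in range(3)]
for t in consumers:
    t.start()

producer(["1001", "1002", "1003", "1002"])   # the duplicate 1002 is skipped
for _ in consumers:
    task_queue.put(None)
for t in consumers:
    t.join()
```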
A number of network monitoring projects have been launched to cope with cyber threats on the Internet. In those projects, several types of sensors, such as black hole sensors, low- and high-interaction honeypots, and web crawlers, are deployed to analyze the characteristics of attackers from various perspectives. However, there are some problems with the deployment and operation of network monitoring systems,...
Crawling is a necessary step in testing web applications for security. An important concept that impacts the efficiency of crawling is state equivalence. This paper proposes two techniques to improve any state equivalence mechanism. The first technique detects parts of the pages that are unimportant for crawling. The second technique helps identify session parameters. We also present a summary...
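The abstract does not spell out the mechanism; a common baseline that such techniques improve on can be sketched as follows, assuming pages are normalized by stripping state-irrelevant content (timestamps, session identifiers) and then hashed, so that pages with equal hashes are treated as the same crawl state. The regular expressions below are illustrative assumptions, not the paper's techniques for learning which page parts are unimportant.

```python
# Hedged sketch of a baseline state-equivalence check: strip content that is
# irrelevant to application state (here, timestamps and a session parameter),
# then hash what remains. The regexes are assumptions for illustration only.
import hashlib
import re

def normalize(html: str) -> str:
    html = re.sub(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}", "", html)  # drop timestamps
    html = re.sub(r"(sessionid|jsessionid)=[\w-]+", r"\1=", html)    # drop session IDs
    return re.sub(r"\s+", " ", html).strip()

def state_id(html: str) -> str:
    return hashlib.sha256(normalize(html).encode("utf-8")).hexdigest()

page_a = '<a href="/cart?jsessionid=abc123">Cart</a> Generated 2024-05-01 10:00:00'
page_b = '<a href="/cart?jsessionid=xyz789">Cart</a> Generated 2024-05-02 09:30:00'
print(state_id(page_a) == state_id(page_b))   # True: treated as the same state
```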
Security has become a major concern while browsing, as the number of malicious sites keeps increasing while the cost of hosting a site decreases. Though most web servers use Secure Sockets Layer (SSL) over HTTP (Hypertext Transfer Protocol) to ensure trust between consumers and providers, SSL is vulnerable to the Man-in-the-Middle (MITM) attack, which is becoming very common these days. Phishing is another...
The increasing number of intrusions and data thefts on online systems is one of the triggers of the growing concern about security inside organizations. Nowadays, dynamic and extensible detection tools are required and critical to detect and diagnose vulnerabilities in Web systems. In this paper we present the development and evaluation of a vulnerability scanner for online systems. Unlike most existing...
User input validation is a technique to counter attacks on web applications. In typical client-server architectures, this validation is performed on the client side. This is insufficient, because hackers can bypass these checks and send malicious data directly to the server. User input validation thus has to be duplicated from the client side (HTML pages) to the server side (PHP, JSP, etc.). We present...
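As a hedged, framework-agnostic illustration of the duplication problem described above (the paper targets PHP/JSP back ends; Python is used here only for the sketch), the function below re-checks on the server the same constraints an HTML form might enforce in the browser; the particular rules are assumptions.

```python
# Hedged sketch: the constraints an HTML form enforces in the browser must be
# re-checked on the server, because a client can submit arbitrary data.
# The rules below are illustrative assumptions, not the paper's checks.
import re

EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def validate_registration(form: dict) -> list[str]:
    """Server-side re-validation of fields a browser form may already check."""
    errors = []
    if not re.fullmatch(r"[A-Za-z0-9_]{3,20}", form.get("username", "")):
        errors.append("username must be 3-20 word characters")
    if not EMAIL_RE.match(form.get("email", "")):
        errors.append("email is not well formed")
    if len(form.get("password", "")) < 8:
        errors.append("password must be at least 8 characters")
    return errors

# A request that bypassed the HTML form entirely is still rejected here.
print(validate_registration({"username": "x", "email": "not-an-email", "password": "123"}))
```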