Pharming attacks - a sophisticated version of phishing attacks - aim to steal users' credentials by redirecting them to a fraudulent website using DNS-based techniques. Pharming attacks can be performed on the client side or within the Internet, using complex and well-designed techniques that often make the attack imperceptible to the user. With the deployment of broadband connections for Internet access,...
This research focuses mainly on experiments with data-intensive Web sites. In the study of automatically generated wrappers for web page information extraction, the main task is the page tree matching algorithm: the DOM trees of two sample pages are matched and compared, first to discover the page selection mode and produce the primary template, and then self-correction of...
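As a rough illustration of what matching two page DOM trees can look like, the sketch below uses Simple Tree Matching, a well-known top-down, order-preserving tree matching technique. It is a stand-in chosen for illustration, not necessarily the algorithm used in the paper above, whose details are not given in the truncated abstract. The `Node` class, the `similarity` normalization, and the two sample pages are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for a DOM element: a tag name plus ordered children."""
    tag: str
    children: List["Node"] = field(default_factory=list)


def simple_tree_matching(a: Node, b: Node) -> int:
    """Count node pairs in a maximum top-down, order-preserving matching."""
    if a.tag != b.tag:
        return 0  # roots differ, so nothing below them may be matched
    m, n = len(a.children), len(b.children)
    # table[i][j]: best count using the first i children of a and first j of b
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            paired = table[i - 1][j - 1] + simple_tree_matching(
                a.children[i - 1], b.children[j - 1]
            )
            table[i][j] = max(table[i - 1][j], table[i][j - 1], paired)
    return table[m][n] + 1  # +1 for the matched roots


def size(node: Node) -> int:
    """Number of nodes in the tree rooted at node."""
    return 1 + sum(size(child) for child in node.children)


def similarity(a: Node, b: Node) -> float:
    """Match count normalized by the average tree size (an assumed convention)."""
    return simple_tree_matching(a, b) / ((size(a) + size(b)) / 2)


# Two hypothetical pages generated from the same template.
page_a = Node("html", [Node("body", [
    Node("div", [Node("h1"), Node("p")]),
    Node("ul", [Node("li"), Node("li")]),
])])
page_b = Node("html", [Node("body", [
    Node("div", [Node("h1"), Node("p"), Node("p")]),
    Node("ul", [Node("li")]),
])])

print(simple_tree_matching(page_a, page_b), similarity(page_a, page_b))
```

Nodes that match across both pages are candidates for the shared template, while unmatched nodes (here the extra `p` and `li`) point to optional or repeated data regions.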
This paper studies web wrapper generation for the web pages of forum, blog and news web sites. More and more web pages are dynamically generated using a common template populated with data from databases. This paper proposes a novel method that uses tree alignment and transfer learning to generate wrappers for this kind of web page. We present a new tree alignment algorithm to find...
To extract information automatically from semi-structured Web pages, this paper puts forward a method named IESS that discovers the record model based on the DOM and the maximal similar subtree, so that records are identified automatically and correctly even when records of the same type differ in their expression models. To test the performance of the method, a scientific literature statistical...
The Web has become one of the most important connections to various information resources. The most interesting challenge is how to extract important data from a large number of Web pages and transform it into more structured, standardized and semantic information, which can be queried and analyzed using mature techniques from databases, data warehouses and other fields. We design a wrapper generator...
Information extraction from Web sites is nowadays a relevant problem, usually performed by software modules called wrappers. This paper introduces the relevant information extraction technology and combines techniques for extracting the theme information and the contents of HTML pages. First, to remove noise, visual blocks are combined with vision-based DOM tree denoising methods to improve the efficiency...
Notice of Violation of IEEE Publication Principles: "A Foundation for Knowledge System with Application in Information Retrieval and Knowledge Acquisition," by Zhi Teng, Ye Liu, Fuji Ren, in the Proceedings of the International Conference on Natural Language Processing and Knowledge Engineering, 2008. NLP-KE '08, Oct. 2008. After careful and considered review of the content and authorship of...