keywords from the Web pages. The system first identifies the section of the Web page that contains the multimedia file to be extracted and then extracts it by using clustering techniques and other tools of statistical origin. Experimental results on real-world image sharing Web sites are presented and discussed in this paper
Content-based phishing detection extracts keywords from a target Web page, uses these keywords to retrieve the corresponding legitimate site, and detects phishing when the domain of the target page does not match that of the retrieved site. It often misidentifies a legitimate target site as a phishing site, however
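The pipeline described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the keyword extractor is a plain frequency count, the `search` callable stands in for whatever retrieval step is used, and the names `top_keywords`, `host`, and `looks_like_phishing` are hypothetical. Treating "nothing retrieved" as not phishing is also an assumption.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Small illustrative stopword list (assumption, not from the source).
STOPWORDS = {"the", "and", "for", "with", "your", "you", "are", "this", "that"}


def top_keywords(page_text, k=5):
    """Return the k most frequent non-stopword terms of the page text."""
    words = re.findall(r"[a-z]{3,}", page_text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return [w for w, _ in counts.most_common(k)]


def host(url):
    """Lower-cased hostname of a URL, without a leading 'www.'."""
    name = (urlparse(url).hostname or "").lower()
    return name[4:] if name.startswith("www.") else name


def looks_like_phishing(target_url, page_text, search):
    """Flag the target when its host differs from the retrieved site's host.

    `search` is a hypothetical callable: it takes a list of keywords and
    returns the URL of the best-matching legitimate site, or None.
    """
    retrieved_url = search(top_keywords(page_text))
    if retrieved_url is None:
        return False  # assumption: no retrieved site means no verdict
    return host(target_url) != host(retrieved_url)
```

A strict host comparison like this also shows where the false positives mentioned in the abstract can come from: a legitimate page is flagged whenever the retrieval step returns a different site.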
In this work, we compare various text-based pornographic Web filtering techniques. The techniques include blacklist and keyword blocking. The technique called SV is modified to extract a representative feature vector. Each test Web page's feature is extracted and gathered as a vector. The vector is then summarized
This paper explores a unique way in which the thinking algorithm adds an extra logical substrate to a Web search query using artificial intelligence. Instead of just going after keyword searching, the algorithm tries to assess the motives of the user behind entering a query. The algorithm tries to find the reasons as
appear on websites with other text contents which can deliver important information about the image semantics. Popular image search engines use text contents surrounding the image to generate annotation keywords. Also emphasized text contents like headlines are assumed to be important description providers. Otherwise we
extends the keyword-based VSM: it considers that keywords in a page carry different weights in different positions. Integrating the principles of PageRank, link analysis also considers whether the anchor text and the website of the Web page are relevant to the theme.
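As a rough illustration of position-dependent keyword weighting in a vector space model, the sketch below builds a term vector in which each occurrence counts more or less depending on where it appears. The position names, the weight values, and the function names are assumptions chosen for the example; the snippet does not state the actual weights or formula used.

```python
import math
import re
from collections import Counter

# Hypothetical per-position weights (assumption, not from the source).
POSITION_WEIGHTS = {"title": 3.0, "anchor": 2.0, "body": 1.0}


def weighted_term_vector(fields):
    """Build a term vector from a mapping of position name to text.

    Each occurrence of a term adds the weight of the position it occurs in;
    unknown positions fall back to a weight of 1.0.
    """
    vector = Counter()
    for position, text in fields.items():
        weight = POSITION_WEIGHTS.get(position, 1.0)
        for term in re.findall(r"[a-z0-9]+", text.lower()):
            vector[term] += weight
    return dict(vector)


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(value * b.get(term, 0.0) for term, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0
```

A page vector built this way can be compared against a topic vector with `cosine`, so that a keyword in the title contributes more to the similarity than the same keyword in the body.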
We put forward a new matching strategy with more elaborate and realistic concepts to improve on the earlier keyword-based matching strategy. In this paper, we set up a concept lattice for each Web page according to the natural semantic concepts extracted from each Web page. Such concept lattices are encoded by
retrieve the maximal set of relevant, high-quality pages. In our proposed approach, we calculate the score of an unvisited URL based on its anchor text relevancy and its description in the Google search engine, and calculate the similarity score of the description with topic keywords, cohesive text similarity with topic keywords and Relevancy
A topic correlation judgment algorithm based on weights and a threshold is proposed to address the problem that Web pages closely related to a given topic may be neglected, because they do not contain all of the keywords given by the user, when users search the Internet for the topic they want. The algorithm retrieves
-independent approach to extracting news stories from web pages is proposed, which is based on anchor text and is applicable to most websites. Experiments show our approach performs well and is better than another approach we have found. Second, a domain-based method of representing events is proposed in which hundreds of keywords
General-purpose search engines provide users with lists of retrieved documents in response to their queries. The common structure of list elements includes the title of a document, its URL, and a small snippet from the text. Snippets are evidence of occurrences of the query's keywords in the document. The length of each
As an ever-increasing amount of information on the Web today is available through search interfaces, users have to key in a set of keywords in order to access the pages from certain Web sites, which are often referred to as the hidden Web or the deep Web. Since there are no static links to the hidden Web pages, search