The Infona portal uses cookies, i.e. strings of text saved by a browser on the user's device. The portal can access those files and use them to remember the user's data, such as their chosen settings (screen view, interface language, etc.), or their login data. By using the Infona portal the user accepts automatic saving and using this information for portal operation purposes. More information on the subject can be found in the Privacy Policy and Terms of Service. By closing this window the user confirms that they have read the information on cookie usage, and they accept the privacy policy and the way cookies are used by the portal. You can change the cookie settings in your browser.
This paper presents a keyword extraction technique that can be used for tracking topics over time. In our work, keywords are a set of significant words in an article that gives high-level description of its contents to readers. Identifying keywords from a large amount of on-line news data is very useful in that it can
Content-based phishing detection extracts keywords from a target Web page, uses these keywords to retrieve the corresponding legitimate site, and detects phishing when the domain of the target page does not match that of the retrieved site. It often misidentifies a legitimate target site as a phishing site, however
The motivation behind sub-topic or topic specific keyword discovery through Web pages is helping a user, who is insufficient in knowledge and experience about a topic, to find important concepts without much effort. Intuitively, a Web user would start searching the Web via querying search engines, visiting some pages
agent that targets a particular topic and visits and gathers only relevant web pages. In this dissertation I had worked on design and working of web crawler that can be used for copyright infringement. We will take one seed URL as input and search with a keyword, the searching result is based on keyword and it will fetch
Applying automatic summarization to search engine can make it easier for users to find out the content of the Web page. In this paper, the results of search engine are analyzed. On the basis of query keywords expansion, we propose a new summary approach which calculates the sentence weight utilizing the information of
-independent approach of extracting news stories from web pages is proposed which is based on anchor text and is applicable to most websites. Experiments show our approach performs good and is better than another approach we have found. Second, a domain-based method of representing events is proposed in which hundreds of keywords
, current search engines do not allow users to explore these features when posing a query. Search engine queries are based almost exclusively on keywords. We believe that it is possible to improve user satisfaction if HTML tags and document metadata are available to users at query time. In this paper we present Xearch, a meta
In this paper, we propose a new method to select relevant images to the given keywords from the images gathered from the Web. Our novel method is based on the probabilistic latent semantic analysis (PLSA) model, which is a generative probabilistic topic model. Firstly, we gather images related to the given keywords
submission date, number of views, ranking position, description keywords, political inclination of the submitter, the political message in the video, and comments associated with the video, we construct a picture of how online video medium was used during the last congressional political campaign. Our analysis takes into
Set the date range to filter the displayed results. You can set a starting date, ending date or both. You can enter the dates manually or choose them from the calendar.