This paper presents a keyword extraction technique that can be used for tracking topics over time. In our work, keywords are a set of significant words in an article that give readers a high-level description of its contents. Identifying keywords from a large amount of online news data is very useful in that it can
Webpage keywords are widely used in personalized search and recommendation. The accuracy of extracted keywords strongly affects the quality of search and recommendation results, yet current keyword extraction methods do not achieve high accuracy. To make up for the shortcomings of present techniques, a new keyword weight adjusting
Content-based phishing detection extracts keywords from a target Web page, uses these keywords to retrieve the corresponding legitimate site, and detects phishing when the domain of the target page does not match that of the retrieved site. It often misidentifies a legitimate target site as a phishing site, however
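The snippet above describes a concrete pipeline: extract keywords from the target page, retrieve the corresponding legitimate site via those keywords, and flag phishing when the domains differ. A minimal sketch of the two testable steps follows; the term-frequency keyword extractor and the function names are hypothetical stand-ins, and the retrieval step (querying a search engine with the keywords) is left to the caller:

```python
import re
from collections import Counter
from urllib.parse import urlparse

def top_keywords(text, k=5):
    # Naive term-frequency keyword extraction (a stand-in for the
    # paper's unspecified extraction method).
    words = re.findall(r"[a-z]{4,}", text.lower())
    return [w for w, _ in Counter(words).most_common(k)]

def is_phishing(target_url, retrieved_url):
    # Flag phishing when the target page's host differs from the host
    # of the legitimate site retrieved for the same keywords.
    return urlparse(target_url).netloc != urlparse(retrieved_url).netloc
```

As the snippet notes, this check misfires when keyword retrieval returns the wrong site for a legitimate page, which is the failure mode the paper addresses.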
The motivation behind sub-topic or topic-specific keyword discovery through Web pages is to help a user who lacks knowledge and experience about a topic to find important concepts without much effort. Intuitively, a Web user would start searching the Web by querying search engines and visiting some pages
agent that targets a particular topic and visits and gathers only relevant web pages. In this dissertation I worked on the design and operation of a web crawler that can be used for detecting copyright infringement. We take one seed URL as input and search with a keyword; the search result is based on the keyword, and the crawler will fetch
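The seed-URL-plus-keyword crawl described above can be sketched as a focused breadth-first traversal that only expands links from pages matching the keyword. This is a minimal illustration, not the dissertation's implementation; the `fetch` callback (returning a page's text and out-links) is a hypothetical abstraction over real HTTP retrieval:

```python
from collections import deque

def focused_crawl(seed, keyword, fetch, max_pages=10):
    # Breadth-first crawl from `seed`, collecting pages whose text
    # contains `keyword`. Only relevant pages have their links expanded,
    # which is what makes the crawler topic-focused.
    seen, hits = {seed}, []
    queue = deque([seed])
    while queue and len(seen) <= max_pages:
        url = queue.popleft()
        text, links = fetch(url)
        if keyword.lower() in text.lower():
            hits.append(url)
            for link in links:
                if link not in seen:
                    seen.add(link)
                    queue.append(link)
    return hits
```

In practice `fetch` would issue an HTTP request and parse anchors out of the HTML; the in-memory version keeps the sketch self-contained.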
To cope with the problem of information overload, various information filtering systems have been proposed. However, most of them represent documents and users' interests as bags of words, so the intrinsic structures and semantic information in them are neglected; filtering has mainly depended on the matching of simple keywords or bags of words. In this paper, a multi-agent personalized...
Mobile web browsing means accessing the content of web pages using a mobile device. Internet search engines commonly use keyword searching, in which a rank is assigned to each page based on several features. But it is an arduous task for a user to type a keyword on such a small mobile screen
sending a simple keyword query to the hotspot (BlueInfo Pull). The BlueInfo hotspot requests the service from the origin server in the Internet and relays the response to the mobile device, possibly after adaptation for mobile viewing. The usability of BlueInfo Pull in comparison to a mobile phone browser is demonstrated
The traditional method of Web Service search is based on the UDDI registry. UDDI is very popular in the domain of Web Service publishing, but it is not well suited to Web Service discovery, as it relies on keyword matching. In this paper we propose a model to discover Web Services based on semantics and a search engine, and
desirable. In this paper, existing work is surveyed first; then our current technique for web information extraction is discussed in detail. In our approach, rules and patterns are extracted from sample pages through a training process with human involvement. We use both keywords and regular expressions to
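A rule that combines a keyword with a regular expression, as the snippet above describes, can be sketched as follows. The rule format here (keyword locates the region, regex captures the value after it) is a hypothetical illustration, not the paper's trained rule representation:

```python
import re

def extract_field(page_text, keyword, pattern):
    # Apply a keyword-plus-regex extraction rule: locate the keyword in
    # the page, then match the regex against the text that follows it.
    idx = page_text.find(keyword)
    if idx < 0:
        return None  # keyword anchor not present on this page
    m = re.search(pattern, page_text[idx + len(keyword):])
    return m.group(1) if m else None
```

For example, a price rule anchored on the keyword "Price" with the pattern `\$(\d+\.\d{2})` would pull "19.99" out of `<b>Price:</b> $19.99`.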
Applying automatic summarization to search engines can make it easier for users to grasp the content of a Web page. In this paper, search engine results are analyzed. On the basis of query keyword expansion, we propose a new summarization approach that calculates sentence weight using the information of
posts were collected from a selected hacker forum using a customized web crawler. Posts were analyzed with a part-of-speech tagger, which helped determine a list of keywords used to query the data. Next, a sentiment analysis tool scored these keywords, which were then analyzed to determine the effectiveness of this
-independent approach to extracting news stories from web pages is proposed which is based on anchor text and is applicable to most websites. Experiments show our approach performs well and outperforms another approach we found. Second, a domain-based method of representing events is proposed in which hundreds of keywords
, current search engines do not allow users to explore these features when posing a query. Search engine queries are based almost exclusively on keywords. We believe that it is possible to improve user satisfaction if HTML tags and document metadata are available to users at query time. In this paper we present Xearch, a meta
In this paper, we propose a new method to select images relevant to given keywords from images gathered from the Web. Our method is based on the probabilistic latent semantic analysis (PLSA) model, a generative probabilistic topic model. First, we gather images related to the given keywords
data-rich by keywords in the index path; generate an extraction rule and obtain a wrapper accordingly. The wrapper can automatically extract data from Websites in the same domain. It relies on the continuity, the structural similarity, and the location relations of the useful information in Web pages, not on the HTML
submission date, number of views, ranking position, description keywords, political inclination of the submitter, the political message in the video, and comments associated with the video, we construct a picture of how the online video medium was used during the last congressional political campaign. Our analysis takes into
-demand into the sentence importance. The user demand consists of the keywords that the user queried. The experimental results show that this method can improve the accuracy of information search.
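The query-keyword-based sentence weighting that the summarization snippets above describe can be sketched minimally: weight each sentence by the fraction of query keywords it contains, then keep the top-weighted sentences. The scoring formula here is a hypothetical stand-in for the papers' weighting schemes, which also incorporate keyword expansion:

```python
def sentence_weight(sentence, query_keywords):
    # Fraction of query keywords that appear in the sentence; a simple
    # stand-in for a full query-aware sentence weighting scheme.
    words = set(sentence.lower().split())
    kws = [k.lower() for k in query_keywords]
    return sum(k in words for k in kws) / len(kws)

def summarize(sentences, query_keywords, n=2):
    # Return the n sentences most relevant to the user's query keywords.
    ranked = sorted(sentences,
                    key=lambda s: sentence_weight(s, query_keywords),
                    reverse=True)
    return ranked[:n]
```

A real system would expand the query keywords (synonyms, related terms) before scoring and combine this weight with position and length features.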