This paper studies a keyword selection and analysis approach for SEO to address issues such as low efficiency, poor reliability, and unstable optimization in manual SEO processing. A keyword expansion method is proposed that reverse-engineers a search engine's related-search keywords to meet users' requirements
This paper presents a keyword extraction technique that can be used for tracking topics over time. In our work, keywords are a set of significant words in an article that give readers a high-level description of its contents. Identifying keywords from a large amount of online news data is very useful in that it can
keywords from the Web pages. The system first identifies the section of the Web page that contains the multimedia file to be extracted and then extracts it using clustering techniques and other statistical tools. Experimental results on real-world image-sharing Web sites are presented and discussed in this paper
Content-based phishing detection extracts keywords from a target Web page, uses these keywords to retrieve the corresponding legitimate site, and detects phishing when the domain of the target page does not match that of the retrieved site. It often misidentifies a legitimate target site as a phishing site, however
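The detection step described above reduces to a domain comparison between the target page and the site retrieved for its keywords. A minimal sketch, assuming keyword extraction and search-engine retrieval have already produced `retrieved_url` (function names are illustrative, not from the paper):

```python
from urllib.parse import urlparse

def extract_domain(url: str) -> str:
    """Return the host part of a URL, with a leading 'www.' removed (naive)."""
    return urlparse(url).netloc.lower().removeprefix("www.")

def is_phishing(target_url: str, retrieved_url: str) -> bool:
    """Flag the target as phishing when its domain differs from the
    domain of the legitimate site retrieved via keyword search."""
    return extract_domain(target_url) != extract_domain(retrieved_url)
```

Real deployments would compare registered domains (e.g. via a public-suffix list) rather than raw hostnames; this sketch shows only the decision rule.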
The motivation behind sub-topic or topic-specific keyword discovery through Web pages is to help a user who lacks knowledge and experience about a topic to find important concepts without much effort. Intuitively, a Web user would start searching the Web by querying search engines and visiting some pages
Search engine optimization (SEO) is a process of improving the prominence of a website. Following a reverse-engineering approach, in this paper we study and analyze the key influence factors in the process of web search. We first build a system to automatically crawl all factors of 200 thousand web pages. Then we perform a content analysis, including PageRank, URL, and HTML analysis, based on the top 20...
agent that targets a particular topic and visits and gathers only relevant web pages. In this dissertation I worked on the design and operation of a web crawler that can be used for detecting copyright infringement. We take one seed URL as input and search with a keyword; the search result is based on the keyword, and it will fetch
In the past few years, there has been an exponential increase in the amount of information available on the World Wide Web. This plethora of information can be extremely beneficial for users. However, the amount of human intervention that is currently required for this is inconvenient. Information extraction (IE) systems try to solve this problem by making the task as automatic as possible. Most of...
The Web represents one of the largest repositories of information ever compiled by mankind and as such search techniques are essential to navigating its depths and returning pertinent information. Typically the search techniques employed in search engines such as Google entail the use of keywords in which Web pages
appear on websites alongside other text content that can convey important information about the image semantics. Popular image search engines use the text content surrounding an image to generate annotation keywords. Emphasized text content such as headlines is also assumed to provide important descriptions. Otherwise we
of the HTML page, and the proposed algorithm is performed. A complete evaluation is performed, which indicates the effectiveness of our technique. The experimental results show improved precision and recall for the proposed algorithms with respect to keyword-based search. The algorithms are implemented in Java and its
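Precision and recall, the metrics reported above, are standard set-overlap measures over retrieved and relevant result sets. A generic sketch (not the paper's evaluation code):

```python
def precision_recall(retrieved: set, relevant: set) -> tuple:
    """Precision = |retrieved ∩ relevant| / |retrieved|;
       recall    = |retrieved ∩ relevant| / |relevant|."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```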
to determine the forms' relevance to the domain. In this work, the domain of scientific research publications has been considered. Experimental results show that the proposed approach outperforms keyword-based crawlers in terms of both relevancy and completeness.
Applying automatic summarization to a search engine can make it easier for users to discern the content of a Web page. In this paper, the results of a search engine are analyzed. On the basis of query keyword expansion, we propose a new summary approach that calculates the sentence weight utilizing the information of
-independent approach of extracting news stories from web pages is proposed, which is based on anchor text and is applicable to most websites. Experiments show our approach performs well and outperforms another approach we evaluated. Second, a domain-based method of representing events is proposed in which hundreds of keywords
, current search engines do not allow users to explore these features when posing a query. Search engine queries are based almost exclusively on keywords. We believe that it is possible to improve user satisfaction if HTML tags and document metadata are available to users at query time. In this paper we present Xearch, a meta
In this paper, an OntoCrawler based on an ontology-supported technique for webpage searching is proposed, in which the user need only enter some keywords and the system, supported by a domain ontology, actively compares and verifies those keywords so as to raise the precision rate of webpage searching. This
In this paper, we propose a new method to select images relevant to given keywords from images gathered from the Web. Our novel method is based on the probabilistic latent semantic analysis (PLSA) model, which is a generative probabilistic topic model. First, we gather images related to the given keywords
small number of HTML input elements extracted from user-input HTML forms and a few keywords. It utilizes a pre-query technique and a post-query technique in a hierarchical manner. Decision trees and multilayer artificial neural networks were used to obtain classification rates over 91% for classifying search forms and non
submission date, number of views, ranking position, description keywords, political inclination of the submitter, the political message in the video, and comments associated with the video, we construct a picture of how the online video medium was used during the last congressional political campaign. Our analysis takes into