In the last few years a number of search engines have been introduced which support the Arabic language. The late appearance of search engines for Arabic-language pages is explained by the fact that Arabic is a morphologically rich language, whereas languages such as English or French rely mainly on affixation. This research aims to develop an Arabic search engine and to...
Many interesting geospatial datasets are publicly accessible on web sites and other online repositories. However, the sheer number of datasets and locations, plus a lack of support for cross-repository search, makes it difficult for researchers to discover and integrate relevant data. We describe here early results from a system, Klimatic, that aims to overcome these barriers to discovery and use...
Efficient retrieval of the most relevant documents of interest from the Web is difficult due to the large amount of data in all sorts of formats. Studies have been conducted on techniques to improve the efficiency of information retrieval (IR) systems. To arrive at appropriate solutions in IR systems, machines need additional semantic information that makes a difference in understanding...
In this paper, we propose a framework for a focused Linked Data (LD) crawler based on context graphs. A focused crawler searches a specific subset of the web; in our case it targets interlinked RDF data stores. The proposed crawler constructs a set of context graphs for the given seed URIs by back-crawling the web, and classifiers are trained to detect and assign documents to different categories based...
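As a rough illustration of the fetch loop in a focused crawler like the one described above, here is a minimal Python sketch. The paper's context-graph classifiers are stood in for by a hypothetical keyword scorer (TOPIC_TERMS and relevance are illustrative names, not the authors'):

    # Minimal focused-crawler sketch: a priority queue keeps the most
    # promising URLs first; a scorer (here a hypothetical keyword check,
    # standing in for the paper's trained classifiers) defines "promising".
    import heapq
    import re
    import urllib.request
    from urllib.parse import urljoin

    TOPIC_TERMS = {"rdf", "linked data", "sparql"}   # assumed topic vocabulary

    def relevance(text):
        text = text.lower()
        return sum(term in text for term in TOPIC_TERMS) / len(TOPIC_TERMS)

    def crawl(seeds, max_pages=20):
        frontier = [(-1.0, url) for url in seeds]    # negate score: max-heap
        heapq.heapify(frontier)
        seen = set(seeds)
        while frontier and max_pages > 0:
            score, url = heapq.heappop(frontier)
            try:
                html = urllib.request.urlopen(url, timeout=5).read().decode("utf-8", "replace")
            except OSError:
                continue
            max_pages -= 1
            print(f"{-score:.2f}  {url}")
            page_score = relevance(html)             # outlinks inherit the parent's score
            for link in re.findall(r'href="(http[^"]+)"', html):
                link = urljoin(url, link)
                if link not in seen:
                    seen.add(link)
                    heapq.heappush(frontier, (-page_score, link))

    crawl(["https://www.w3.org/TR/rdf11-primer/"])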
The information world of the WWW has more than 3 billion HTML pages, and these web pages are accessed through search engines only. A search engine is a program that searches documents for a specified set of keywords and returns a list of documents where any or all of the specified keywords were found. As more information becomes available on the web, it becomes more difficult to provide effective search services...
Users of social media websites can now search, share, or just browse for information with ease. In the field of travel and tourism, users can easily find and share information about travel activities, destinations, and accommodation thanks to the plentiful sources of information, but there is also a drawback related to the quality of that information. Traditional search engines that use keywords to...
Owing to the dynamic nature of the web, it is difficult for a search engine to find the relevant documents to serve a user query. For this purpose, a search engine maintains an index of the downloaded documents stored in its local repository. Whenever a query arrives, the search engine searches the index in order to find the relevant matched results to be presented to the user. The quality of the matched result...
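As generic background for the index lookup this abstract describes (not the paper's own system): a minimal inverted index maps each term to a posting list of document ids, and a query is answered by intersecting the lists of its terms.

    # Minimal inverted-index sketch with AND query semantics.
    from collections import defaultdict

    docs = {
        1: "dynamic nature of the web",
        2: "search engine index of downloaded documents",
        3: "the web search engine repository",
    }

    index = defaultdict(set)          # term -> set of doc ids (posting list)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)

    def search(query):
        """Return the docs containing every query term."""
        postings = [index[t] for t in query.lower().split()]
        return set.intersection(*postings) if postings else set()

    print(search("web search"))       # -> {3}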
A search engine is a program that searches documents for specific keywords on the World Wide Web and returns a list of websites where the keywords are found. It has become a major tool for reaching a growing number of social, political, economic, educational, agricultural and cultural domains represented on the Web. The majority of search engines return various results, most of which are unnecessary...
In this work, we consider a set of networking applications which generate or process a continuous stream of data items, for example, a web cache which processes a stream of web objects. These applications often need to answer membership queries for duplicate detection on an unbounded set of data items. Two key challenges in answering such membership queries are the limited space to store the entire...
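The abstract stops before naming a data structure, but the classic space-bounded answer to such membership queries is a Bloom filter, sketched below. Note that a plain Bloom filter still assumes a bounded set, which is exactly the second challenge an unbounded stream raises.

    # Minimal Bloom filter sketch: a fixed-size bit array answers membership
    # queries with no false negatives and a tunable false-positive rate.
    import hashlib

    class BloomFilter:
        def __init__(self, m_bits=1024, k_hashes=4):
            self.m, self.k = m_bits, k_hashes
            self.bits = 0                     # Python int used as an m-bit array

        def _positions(self, item):
            # Derive k bit positions from salted hashes of the item.
            for i in range(self.k):
                digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
                yield int.from_bytes(digest[:8], "big") % self.m

        def add(self, item):
            for pos in self._positions(item):
                self.bits |= 1 << pos

        def __contains__(self, item):
            return all(self.bits & (1 << pos) for pos in self._positions(item))

    bf = BloomFilter()
    bf.add("http://example.com/object-1")
    print("http://example.com/object-1" in bf)    # True: no false negatives
    print("http://example.com/object-2" in bf)    # False with high probability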
Web service discovery on the web is not a trivial task, as the number of available web service descriptions continuously increases and global UDDI registries are no longer available. As discovery through conventional, general-purpose search engines does not yield satisfactory results, specialized search engines are a more promising alternative. This paper explores the design...
This study develops a web information retrieval system using fuzzy relations in the indexing and ranking portions of standard web retrieval methods. The full system was developed, including crawler, indexer, ranking portion, and user search structure. The BK-products of fuzzy relations with closure/interior properties are used to construct a fuzzy thesaurus and, further, to retrieve the relevant documents...
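For readers unfamiliar with BK-products: the Bandler-Kohout subproduct composes two fuzzy relations through a fuzzy implication, (R <| S)(x, z) = inf over y of I(R(x, y), S(y, z)), rather than the usual sup-t-norm composition. A minimal sketch follows; the Lukasiewicz implication used here is our assumption, since the abstract does not say which implication the system uses.

    def bk_subproduct(R, S):
        # Lukasiewicz implication I(a, b) = min(1, 1 - a + b) (assumed choice).
        I = lambda a, b: min(1.0, 1.0 - a + b)
        ny = len(S)                                   # shared middle dimension
        return [[min(I(R[x][y], S[y][z]) for y in range(ny))
                 for z in range(len(S[0]))]
                for x in range(len(R))]

    # Toy membership degrees: R is documents x terms, S is terms x concepts.
    R = [[0.9, 0.2],
         [0.4, 0.8]]
    S = [[1.0, 0.3],
         [0.5, 0.9]]
    print([[round(v, 2) for v in row] for row in bk_subproduct(R, S)])
    # -> [[1.0, 0.4], [0.7, 0.9]]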
In this article we describe the architecture of our Web-based platform for refining machine translations. The main idea is to use the Web as a database of phrases and to use this information to improve the quality of translations. The platform comprises three modules, namely crawling, indexing, and refining. This is ongoing work, and currently we are able to take an English phrase and...
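One plausible reading of "the Web as a database of phrases" is frequency-based refinement: among candidate translations of a phrase, prefer the one best attested in the crawled index. The sketch below is purely illustrative; corpus_index and its counts are invented, not the platform's data.

    # Illustrative phrase-frequency refinement. count_occurrences() stands in
    # for a real query against a crawled, indexed corpus.
    corpus_index = {                       # hypothetical phrase counts
        "strong coffee": 120_000,
        "powerful coffee": 900,
    }

    def count_occurrences(phrase):
        return corpus_index.get(phrase, 0)

    def refine(candidates):
        """Pick the candidate phrasing the corpus supports best."""
        return max(candidates, key=count_occurrences)

    print(refine(["powerful coffee", "strong coffee"]))   # -> "strong coffee"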
With the development of internet technology, the information on websites increases sharply. In view of users' great demand for intranet information retrieval and the inefficiency of the intranet retrieval service provided by web search engines, in this paper we study the architecture, key technologies and implementation of an intranet search engine system. We designed...
There are a huge number of Spatial Data Infrastructures (SDIs). This has several advantages, but it is really difficult for a user to know which spatial service could satisfy his/her needs. For this reason, the SDI community now demands an approach to integrate SDIs and relate them with semantic features. In order to contribute to a solution, in this paper we propose an approach to construct an ad-hoc...
A web crawler is a relatively simple automated program or script that methodically scans or “crawls” through Internet pages to retrieve information. Alternative names for a web crawler include web spider, web robot, bot, crawler, and automatic indexer. There are many different uses for a web crawler. Their primary purpose is to collect data so that when Internet surfers enter a search term...
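To make the definition concrete, here is a minimal breadth-first crawler sketch in Python. It omits the robots.txt handling, rate limiting, and politeness policies any real crawler needs; example.com is a placeholder start URL.

    # Minimal BFS web crawler: fetch a page, extract its links, queue the
    # ones not seen before, and repeat up to a page budget.
    from collections import deque
    from html.parser import HTMLParser
    from urllib.parse import urljoin
    import urllib.request

    class LinkParser(HTMLParser):
        def __init__(self):
            super().__init__()
            self.links = []
        def handle_starttag(self, tag, attrs):
            if tag == "a":
                self.links.extend(v for k, v in attrs if k == "href" and v)

    def crawl(start_url, max_pages=10):
        queue, seen = deque([start_url]), {start_url}
        while queue and max_pages > 0:
            url = queue.popleft()
            try:
                html = urllib.request.urlopen(url, timeout=5).read().decode("utf-8", "replace")
            except OSError:
                continue
            max_pages -= 1
            print("fetched:", url)
            parser = LinkParser()
            parser.feed(html)
            for link in parser.links:
                link = urljoin(url, link)
                if link.startswith("http") and link not in seen:
                    seen.add(link)
                    queue.append(link)

    crawl("https://example.com/")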
This paper presents a social computing tool that centers around social scientists. In the past years, we have worked with social scientists and cultural anthropologists and learned their ways of studying subjects in social media, their needs, and their interests. In the process, we have built a generic platform for collecting data in the blogosphere, tracking blogs of particular interest,...
Provisioning and maintenance of infrastructure for Web-based digital library search engines such as CiteSeerx present several challenges. CiteSeerx provides autonomous citation indexing, full-text indexing, and extensive document metadata from documents crawled from the web across computer and information sciences and related fields. Infrastructure virtualization and cloud computing are particularly...
Due to the dynamic nature of the Web, it is becoming harder to find relevant and recent information. More and more people today use focused crawlers to get information in their special fields. However, similarity computation based on text alone is insufficient, because a page consists of not only text but also multimedia content, such as images, audio, video and so on. In the field of focused...
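For concreteness, "similarity computation based on text" typically means something like cosine similarity between term-frequency vectors of the topic and the page, as in the generic sketch below (not the paper's method); by construction it ignores the images, audio, and video the abstract is concerned with.

    # Cosine similarity between bag-of-words term-frequency vectors.
    from collections import Counter
    import math

    def cosine_similarity(text_a, text_b):
        a, b = Counter(text_a.lower().split()), Counter(text_b.lower().split())
        dot = sum(a[t] * b[t] for t in a)
        norm = (math.sqrt(sum(v * v for v in a.values()))
                * math.sqrt(sum(v * v for v in b.values())))
        return dot / norm if norm else 0.0

    topic = "image retrieval multimedia search"
    page = "a search engine for multimedia image retrieval on the web"
    print(f"{cosine_similarity(topic, page):.3f}")    # -> 0.632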
Search engines are becoming more and more necessary and popular for surfing the Internet. However, how search engines like Google or Baidu work is unknown to many people. This paper, through research into the open-source search engine Nutch, introduces how a common search engine works. Using Nutch, a search engine for Guizhou Normal University's website is designed and...
Vertical search engines enable users to find information related to a certain topic. A local search engine is a vertical search engine whose topic revolves around a certain geographical area (such as a city, state, or country). In this paper we describe our experiences developing a crawler for a local search engine for the city of Bellingham, Washington, USA. We focus on the tasks of crawling...
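A core design choice in such a local crawler is the scope filter that keeps the frontier inside the target area. A hypothetical sketch follows; the host suffixes and keywords below are our illustrative guesses, not the authors' actual rules.

    # Illustrative scope filter for a local-search crawler: keep a URL only
    # if its host or path suggests the target area.
    from urllib.parse import urlparse

    LOCAL_HOST_SUFFIXES = (".cob.org", ".bellingham.org")   # assumed local domains
    LOCAL_KEYWORDS = ("bellingham", "whatcom")

    def in_scope(url):
        parsed = urlparse(url.lower())
        host = parsed.hostname or ""
        if host.endswith(LOCAL_HOST_SUFFIXES):
            return True
        return any(k in host + parsed.path for k in LOCAL_KEYWORDS)

    print(in_scope("https://www.cob.org/services"))             # True
    print(in_scope("https://news.example.com/bellingham-fair")) # True
    print(in_scope("https://example.com/seattle"))              # False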