In the mobile Internet era, users access interesting information in a continuous manner rather than as one-time results through search engines. The traditional link-based ranking algorithms typically return the relevant “popular” web pages. The current, most important web pages are ranked lower than these pages. Furthermore, most of the results are repeated when the user submits the same query days...
In this paper, we propose an SNS crawler engine for topic expansion. The Smart Broadcasting Platform uses only subtitles and scripts to extract topics. However, these do not contain enough words to adapt it to various domains. Therefore, it needs to include more data sources to extract richer topics. We also introduce the system architecture of the SNS crawler engine, describe...
Today, the web is used in virtually all spheres of daily life. The Internet is a worldwide storehouse of information. A search engine is an information retrieval system that stores information about many websites. The web index records the number of times a URL is accessed; it visits the URL to retrieve its metadata and reports page errors. The search engine is initially fed with URLs of various...
With the advancement of speech recognition technologies, voice interfaces are increasingly adopted on mobile platforms. While developing a general-purpose Automatic Speech Recognition (ASR) engine that can understand voice commands is important, the contexts in which people interact with their mobile devices change very rapidly. Due to the high processing complexity of the ASR engine,...
The objective of this paper was to provide an automatic engine to classify and locate information using natural language. The proposal integrates two algorithms that extract information from different repositories through their own open APIs and build a knowledge database with a natural language approach: a Bayesian algorithm to classify, and a second algorithm to clean the paper. Putting...
We present PharmaGuard, a novel system for the automatic discovery of illegal online pharmacies, aimed at assisting law-enforcement toward their early identification, blacklisting and shutdown. Given a previously labelled set of examples, the system is able to learn a profile of (illegal) pharmacies, and then exploit it to discover never-before-seen instances indexed by popular web search engines...
This article gives an overview of the currently available literature on web page ranking algorithms using machine learning. Web page ranking is a well-known approach to ranking the pages available on the web. It helps us understand how a search engine works and how a machine learns by itself while deciding which page is important to successfully fulfill the...
The world now runs on digital data. The largest and main collection of this digital data is the web. The size of the web is increasing around the clock. The principal problem is searching this huge database for specific information. Deciding whether a web page is relevant to a search topic is a dilemma [1]. There are many techniques to determine relevancy, but if we focus on the users'...
Search engines are vital in the current digital world. Given the huge amount of information on the internet, they are the tools that internet users rely on to search web pages for the required information. However, most search engines currently on the market are inadequate and do not fully serve the needs of internet users. This is because in most cases they give results...
As ever more data is collected in the business processes of large enterprises, decision makers need new business intelligence tools. Enterprise reporting tools have been available for some time; however, while reports aggregate business data, the large number of reports can become unmanageable. The European-funded project Questor aims to create a revolutionary product that will eliminate the complexities...
With the rapid development of the World Wide Web, search engines have become the main tool for people to get network information. However, search results are widely criticized for poor accuracy and redundancy. Since the advent of the semantic web, new search engines able to understand queries and documents have attracted more and more attention. This paper starts from...
A Web crawler is an important component of a Web search engine. It demands a large amount of hardware resources to crawl data from the rapidly growing and changing Web. The crawling process should be performed continuously to keep the data up to date. This paper develops a new approach to speed up the crawling process on a multi-core processor by utilizing the concept of virtualization. In this approach,...
Web services have become a primary mechanism for consuming resources available on the Internet. As more and more services are published on the Web, automated service discovery is critical to consumers to identify relevant and reliable services efficiently. In this paper, we enhance the Web Service Crawler Engine (WSCE) framework by introducing comparison measures to allow for more accurate identification,...
Over the past few years, Cloud computing has been receiving much attention as a new computing paradigm for providing flexible and on-demand infrastructures, platforms and software as services. In Cloud computing, challenges in searching cloud services need to be renewed due to a number of unique characteristics of cloud services such as the dynamic, diverse services offering at different levels, as...
With the rapid development of microblog technology, many interesting research issues on microblogs have attracted growing attention. Fetching data from microblogs is the groundwork for this research. In this paper we take Sina microblog (also called Weibo) as the crawling site and design and implement a highly efficient incremental microblog crawler based on the classic multi-producer, multi-consumer...
The widespread use of the Internet provides a good environment for e-commerce. Studies of e-commerce network characteristics often focus on Taobao. So far, research based on Taobao has addressed the credit rating system, marketing strategy, analysis of seller characteristics, and so on. The purpose of all these studies is to analyze online marketing transactions in e-commerce. In this paper,...
The semantic web search engine is the new generation of the conventional web search engine, bringing precise and meaningful information from the Internet. These new search engines answer user queries using Semantic Web Documents (SWDs) that are found in ontology databases. It is likely that a query may have more than one range in a domain. The semantic web search engines such as Hakia, Swoogle, and Watson...
This study develops a web information retrieval system using the fuzzy relations in the indexing and ranking portions of standard web retrieval methods. The system was developed, including crawler, indexer, ranking portion, and user search structure. The BK-products of fuzzy relations with closure/interior properties are used to construct a fuzzy thesaurus and further to retrieve the relevant documents...
This paper aims to implement a crawler engine or search engine using cloud computing infrastructure. This approach uses virtual machines on a cloud computing infrastructure to run crawler engine services and application servers. Based on our initial experiments, this research has successfully built a crawler engine that runs on a Virtual Machine (VM) of cloud computing infrastructure...
Designing and developing an effective web crawler is a challenging task in a large search engine. This paper proposes a component-based web crawler along with the indexer. The WebCrawler consists of crawler services and indexer services, realized as web services. Communication between the services is sent and received using XML, SOAP and WSDL. In the crawler service, the web pages are fetched...