Word-sense disambiguation is one of the key concepts in natural language processing. The main goal of a language is to present a specific concept to the audience, and this concept is extracted from the meaning of the words in that language. A system should therefore be able to identify the role and meaning of words in order to identify the concepts in texts properly. This issue becomes more problematic if there are words...
The main goal of focused web crawlers is to retrieve as many relevant pages as possible. However, most crawlers use the PageRank algorithm to order the pages in the crawler frontier. Since the PageRank algorithm suffers from the “rich get richer” phenomenon, focused crawlers often fail to retrieve hidden relevant pages. This paper presents a novel approach for retrieving the...
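To make the “rich get richer” effect concrete, the following is a minimal sketch of the standard PageRank iteration over a toy link graph (an illustration of the baseline the abstract critiques, not the paper's own method); heavily linked pages accumulate score regardless of topical relevance.

```python
# Minimal PageRank power iteration (illustrative sketch).
# links: dict mapping page -> list of pages it links to.
def pagerank(links, damping=0.85, iterations=50):
    pages = set(links) | {p for outs in links.values() for p in outs}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new = {p: (1.0 - damping) / n for p in pages}
        for page, outs in links.items():
            if outs:
                share = damping * rank[page] / len(outs)
                for target in outs:
                    new[target] += share
            else:  # dangling page: spread its rank uniformly
                for target in pages:
                    new[target] += damping * rank[page] / n
        rank = new
    return rank

graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(graph)
# "C" receives links from three pages, so it ends up ranked highest,
# even if some page with fewer in-links is more topically relevant.
```

A frontier ordered purely by such scores will keep surfacing already-popular pages, which is exactly why relevant but poorly linked ("hidden") pages are missed.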
The construction of a natural language semantic corpus is the key step in implementing information exchange in an intelligent cloud-computing environment. This paper makes a detailed analysis of semantic corpus construction technologies and proposes a new webpage de-duplication algorithm based on TF-IDF and word vector distance. Experimental results show the accuracy and efficiency of the proposed method. Our...
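As a rough illustration of the TF-IDF side of such a de-duplication scheme (a sketch only; the paper's actual algorithm also incorporates word vector distance), pages can be vectorized with TF-IDF weights and compared by cosine similarity, flagging pairs above a threshold as near-duplicates.

```python
# TF-IDF cosine similarity sketch for webpage de-duplication.
import math
from collections import Counter

def tfidf_vectors(docs):
    """docs: list of token lists -> list of {term: tf-idf weight} dicts."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (c / len(doc)) * math.log(n / df[t])
                        for t, c in tf.items()})
    return vectors

def cosine(u, v):
    dot = sum(u[t] * v.get(t, 0.0) for t in u)
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0

docs = [
    "cloud semantic corpus construction".split(),
    "cloud semantic corpus construction tools".split(),
    "completely unrelated cooking recipe".split(),
]
vecs = tfidf_vectors(docs)
# docs[0] and docs[1] share most terms, so their similarity is highest;
# a threshold on this score would mark them as near-duplicates.
```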
Many classification techniques can automatically summarize text into topics and accordingly identify topic terms from online reviews. Among these techniques, Latent Dirichlet Allocation (LDA) and Latent Semantic Analysis (LSA) are two of the most often employed approaches. LDA is a probabilistic generative model that projects a document into the topic space using the Dirichlet distribution, and each...
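The generative story behind LDA can be illustrated with a toy forward simulation: each document draws a topic distribution from a Dirichlet prior, then each word position draws a topic and a word from it. The topic word lists and hyperparameters below are invented placeholders, and a real LDA implementation would invert this process by inference.

```python
# Toy forward simulation of LDA's generative story (illustration only).
import random

def dirichlet(alphas, rng):
    """Sample from a Dirichlet distribution via normalized Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]

def generate_document(topic_words, alpha, length, rng):
    # Per-document topic mixture drawn from a symmetric Dirichlet prior.
    theta = dirichlet([alpha] * len(topic_words), rng)
    words = []
    for _ in range(length):
        # Pick a topic index according to theta, then a word from that topic.
        topic = rng.choices(range(len(topic_words)), weights=theta)[0]
        words.append(rng.choice(topic_words[topic]))
    return theta, words

rng = random.Random(0)
topics = [["price", "ship", "refund"], ["taste", "fresh", "flavor"]]
theta, doc = generate_document(topics, alpha=0.5, length=20, rng=rng)
# theta is a valid probability vector; every generated word belongs to
# one of the two hypothetical topics.
```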
Operation services are reusable and shareable units of configuration code executed by configuration management tools (CMTs) to achieve continuous deployment and continuous delivery. With the prevalence of DevOps (Development and Operations), thousands of operation services have been developed for various software systems, and they are publicly available through the online repositories of popular CMTs...
When manually testing web sites, humans can work from vague yet general instructions, such as "add the product to shopping cart and proceed to checkout". Can we teach a robot to follow such instructions as well? In this paper I present a novel model, called semantic usage patterns, which allows us to capture the general topics behind the individual steps of interactions. These models can be...
Sentiment analysis (SA) uses natural language processing (NLP) technology to analyze people's sentiment or opinion; it can treat the subjectivity, sentiment and opinion of text by calculation. In recent years, because a large number of opinions appear on the web in discussion forums, review sites, blogs and news sites, sentiment analysis has become an important research area. In this paper, we designed...
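A minimal example of treating sentiment "by calculation" is a lexicon-based scorer: count positive and negative cue words and normalize. The tiny lexicon below is an invented placeholder, not the system the abstract describes.

```python
# Lexicon-based sentiment scoring sketch (illustration only).
POSITIVE = {"good", "great", "excellent", "love"}
NEGATIVE = {"bad", "poor", "terrible", "hate"}

def sentiment_score(text):
    """Positive minus negative cue-word counts, normalized by length."""
    words = text.lower().split()
    pos = sum(1 for w in words if w in POSITIVE)
    neg = sum(1 for w in words if w in NEGATIVE)
    return (pos - neg) / len(words) if words else 0.0

review = "great phone but terrible battery and poor camera"
score = sentiment_score(review)
# One positive cue vs two negative cues -> an overall negative score.
```

Real systems refine this with negation handling, intensifiers, and learned classifiers, but the core idea of turning subjective text into a number is the same.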
To automatically test web applications, crawling-based techniques are usually adopted to mine the behavior models, explore the state spaces or detect violated invariants of the applications. However, their broad use is limited by the manual configuration required for input value selection, GUI state comparison and clickable detection. In existing crawlers, the configurations are usually string-matching...
Most prior work on automatic examination rating focuses on rating objective questions, because natural language processing techniques cannot provide sufficient support for rating subjective answers. This paper presents a novel solution to subjective answer rating that is based on semantic matching of keywords rather than strict text matching. It provides higher accuracy, better flexibility and applicability...
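To show how semantic keyword matching differs from strict text matching, here is a minimal sketch that matches reference keywords through synonym groups before scoring coverage. The synonym table and scoring rule are invented placeholders, not the paper's actual resources.

```python
# Synonym-aware keyword coverage scoring sketch (illustration only).
SYNONYMS = {  # hypothetical synonym groups
    "fast": {"fast", "quick", "rapid"},
    "big": {"big", "large", "huge"},
}

def canonical(word):
    """Map a word to its synonym-group key (or itself if ungrouped)."""
    for key, group in SYNONYMS.items():
        if word in group:
            return key
    return word

def keyword_score(reference_keywords, student_answer):
    """Fraction of reference keywords covered, matching via synonym groups."""
    answer_terms = {canonical(w) for w in student_answer.lower().split()}
    hits = sum(1 for kw in reference_keywords if canonical(kw) in answer_terms)
    return hits / len(reference_keywords)

score = keyword_score(["fast", "memory"], "the rapid algorithm uses little memory")
# "rapid" matches "fast" through its synonym group, so both keywords hit;
# strict text matching would have missed the first keyword entirely.
```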
Success of Meetup groups is of utmost importance for the members who organize them. Given the wide variety of such groups, a single metric may not be indicative of success across different groups; rather, the success measure should be specific to the interest of a group. In this paper, accounting for group diversity, we systematically define Meetup group success metrics and use them to generate labels...
Online Public Opinion Systems (OPOS) aim to collect, analyze, summarize and monitor massive public opinion on the Internet in real time. OPOS often also have the ability to identify key or sudden events and to notify the related people immediately for rapid responses to these events. As part of this endeavor, this paper introduces the architecture and techniques of an OPOS that...
Cloud computing technology is a new paradigm which provides Information Technology (IT) resources via the Internet. This new shift in the way that IT resources are offered to the user brings new challenges, such as cloud service discovery. Nowadays, cloud users are faced with a dilemma as they have an abundant choice of cloud services. Moreover, many cloud providers offer a range of services which...
Efficient retrieval of the most relevant records on a topic of interest from the web is difficult due to the vast amount of data in all kinds of formats. Studies have been conducted on methods to improve the efficiency of information retrieval (IR) systems. To arrive at appropriate solutions in IR systems, machines need extra semantic information that makes a difference in understanding...
Relevant information retrieval from the WWW mainly depends on the technique and efficiency of a crawler, so crawlers must be capable enough to understand the text and context of the links they are going to crawl. Anchor text contains very useful information about the target web page, and knowledge of the target web page's content helps crawlers decide their preferences of...
Today's web is human-readable, but its information cannot be easily processed by machines. At the same time, the enormous amount of data has made it increasingly difficult to find, access, present, and maintain relevant information. Present data retrieval techniques are based on full-text matching of keywords, which lacks semantic information. In this paper a semantic-based information...
There are about 3 billion indexed websites on the WWW. Not all websites belonging to a particular topic are indexed by a given search engine, say google.com; there are online platforms available where different users help a person asking for a URL (Uniform Resource Locator) containing topical information. To verify the authenticity and validity of the URL, an empirical methodology and its...
The information world of the WWW has more than 3 billion HTML pages, and these web pages are accessed through search engines only. A search engine is a program that searches documents for a specified set of keywords and returns a list of documents where any or all of the specified keywords were found. As more information becomes available on the web, it is more difficult to provide effective search services...
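The "any or all of the specified keywords" behavior described above is classically implemented with an inverted index; the following is a minimal sketch of that idea, not any particular engine's implementation.

```python
# Inverted-index keyword search sketch: "any" = union, "all" = intersection.
from collections import defaultdict

def build_index(docs):
    """docs: {doc_id: text} -> {term: set of doc_ids containing it}."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def search(index, keywords, mode="any"):
    """Return doc ids matching any (union) or all (intersection) keywords."""
    sets = [index.get(k.lower(), set()) for k in keywords]
    if not sets:
        return set()
    if mode == "all":
        return set.intersection(*sets)
    return set.union(*sets)

docs = {1: "semantic web search", 2: "focused web crawler", 3: "cooking"}
index = build_index(docs)
any_hits = search(index, ["web", "cooking"])        # docs with any keyword
all_hits = search(index, ["web", "search"], "all")  # docs with all keywords
```

Pure keyword lookup like this is exactly what the later abstracts criticize: it finds term occurrences, not meaning.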
Classifying web services through semantic service discovery, grouping similar services into feature categories, can improve discovery; however, improving the selection and matching process alone is not enough. Existing service discovery approaches often rely on keyword matching against published descriptions to find web services. In this paper we propose a framework for automatic service classification and categorization of web service...
The world now runs almost completely on digital data, and the largest and primary collection of this digital data is the web. The size of the web is increasing round the clock. The principal problem is searching this huge database for specific information. Stating whether a web page is relevant to a search topic is a dilemma [1]. There are many techniques to state relevancy, but if we focus on the users'...
Search engines are vital in the current digital world. Given the huge amount of information on the Internet, they are essential tools that Internet users rely on to search web pages for the required information. However, most of the search engines currently on the market are inadequate and thus do not completely serve the needs of Internet users, because in most cases they give results...