Measuring the similarity between strings plays an increasingly important role in many applications such as information retrieval, short answer grading, and conversational agent software. There has been much recent research interest in applying string similarity within Arabic language applications; however, the use of string similarity in Arabic poses substantial challenges, such as the complexity...
A huge volume of content is produced by multiple online sources every day. It is not possible for a user to go through all these articles and read about topics of interest. Moreover, professional articles, blogs, and forums often cover many topics in a single discussion. Automatic knowledge-based topic modeling is a recent approach in Natural Language Processing that extracts high-quality topics from a large...
Word Sense Disambiguation (WSD) is the task of automatically choosing the correct meaning of a word in a context. It is considered one of the most important and challenging problems in the field of computational linguistics and plays a crucial role in various natural language processing (NLP) applications. In this paper, we present an improved version of a recent...
In Natural Language Processing, Word Sense Disambiguation is defined as the task of assigning a suitable sense to words in a certain context. Word Sense Disambiguation plays an important role and is considered a core research problem in computational linguistics. In this research, we conduct an experiment with the Adapted Lesk Algorithm compared to the original Lesk Algorithm to improve the performance of...
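For readers unfamiliar with the Lesk family of algorithms mentioned above, the core idea can be sketched as picking the sense whose dictionary gloss shares the most words with the surrounding context. The sketch below is a simplified gloss-overlap variant, not the Adapted Lesk Algorithm evaluated in the paper; the sense inventory `SENSES`, the stopword list, and the function names are all hypothetical and exist only for illustration.

```python
# Toy sense inventory. The glosses are illustrative assumptions,
# not entries copied from WordNet or any other real resource.
SENSES = {
    "bank": {
        "bank_finance": "an institution that accepts deposits and lends money",
        "bank_river": "sloping land beside a river or other body of water",
    }
}

# Small hand-picked stopword list (an assumption for this sketch).
STOPWORDS = {"a", "an", "the", "of", "or", "and", "that", "to", "in", "on", "by", "at"}


def tokens(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return {w for w in text.lower().split() if w not in STOPWORDS}


def simplified_lesk(word, context, senses=SENSES):
    """Return the sense id whose gloss overlaps most with the context."""
    context_words = tokens(context) - {word}
    best_sense, best_overlap = None, -1
    for sense_id, gloss in senses[word].items():
        overlap = len(context_words & tokens(gloss))
        if overlap > best_overlap:
            best_sense, best_overlap = sense_id, overlap
    return best_sense
```

With this toy inventory, `simplified_lesk("bank", "she sat on the bank of the river watching the water")` selects the river sense because "river" and "water" appear in that gloss. Ties are broken by insertion order, which a real system would want to handle more carefully.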
With the rapid development of the Internet, how to extract personal relations from the Internet has become an important research topic in information extraction. However, current relation extraction research mainly focuses on English, and research on Chinese is scarcer. At the same time, there are two main problems in current personal relation extraction approaches: 1) it...
Nowadays, many devices and sensors we use in everyday life are connected to the internet. We call this the IoT (Internet of Things). With more things being used, it is getting more difficult for users to use them efficiently. Without overcoming this challenge, IoT cannot be vitalized. To solve this problem, many voice agent systems, including Apple Siri and Amazon Alexa, are extending their service domains...
Word Sense Disambiguation (WSD) is crucial, and its significance is prominent in every application of computational linguistics. WSD is a challenging problem in Natural Language Processing (NLP). Although many algorithms for WSD are available, little work has been carried out on choosing the optimal one. Three approaches are available for WSD, namely, Knowledge-based approach, Supervised...
Measuring semantic similarity between short texts is challenging because the meaning of short texts may vary dramatically even by a few words due to their limited lengths. In this paper, we propose a novel similarity measure for terms that allows better clustering performance than the state-of-the-art method. To achieve such performance, we incorporate knowledge-based and corpus-based term similarity...
Word sense disambiguation (WSD) is an essential task in computational linguistics for language understanding applications such as information retrieval, question answering, machine translation, text summarization etc. In this paper we propose an unsupervised WSD method for a Hindi sentence based on network agglomeration. First we create the sentence graph G for the given sentence. This sentence graph...
Twitter has become an invaluable source of information due to its dynamic nature, with more than 400 million tweets posted per day. Determining what an individual post is about can be a non-trivial task because of its high contextualization and informal nature. Named Entity Linking (NEL) is a subtask of information extraction that aims to ground entity mentions to their corresponding node in a Knowledge...
Word Sense Disambiguation (WSD) is a key factor in processing written and verbal natural language communication. It is a method of selecting the appropriate sense of an ambiguous word in the given context. This paper aims at determining the correct sense of a given ambiguous word in the Hindi language. A modified Lesk approach is used, which relies on the concept of a dynamic context window. Dynamic context...
This paper presents a knowledge representation framework for natural language understanding. Here we propose an automated knowledge acquisition mechanism that mirrors information extraction in human-human interaction. This framework utilizes knowledge based automatic role labeling and automatic concept learning together with a conceptual structure that captures intent and context. The resulting framework...
Preprocessing the input text is an essential component in a Natural Language Processing (NLP) system. We discuss the relevance of preprocessors in the context of a Machine Translation system that we developed based on AnglaBharati Technology. When translating text, we encounter special formats in the input, and getting their appropriate translation is a...
In this paper, an intelligent concept-based search engine is presented that can be used as a multilingual platform for different search queries. It also retrieves result pages that do not directly contain the keywords but contain synonyms or related words. In response to a query for the word “car”, it will also retrieve web pages that do not directly contain the word “car” but have the...
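The retrieval behaviour described above, matching pages on synonyms or related words as well as the literal keyword, can be illustrated with a minimal query-expansion sketch. This is not the paper's system: the `RELATED` table, the example related words, and the function names are hypothetical, and a real engine would draw related terms from a lexical resource and use a proper index.

```python
# Hypothetical related-word table; the entries are illustrative assumptions.
RELATED = {
    "car": {"automobile", "vehicle"},
}


def expand(query):
    """Return the query term together with its related words."""
    term = query.lower()
    return {term} | RELATED.get(term, set())


def search(query, pages):
    """Return sorted ids of pages containing the query or a related word.

    `pages` maps a page id to its plain-text content.
    """
    terms = expand(query)
    return sorted(
        page_id
        for page_id, text in pages.items()
        if terms & set(text.lower().split())
    )


PAGES = {
    "p1": "used car for sale",
    "p2": "automobile repair tips",
    "p3": "fresh bread recipes",
}

print(search("car", PAGES))
```

Here the page `p2` is returned for the query "car" even though it never contains that word, because "automobile" is listed as related in the toy table.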
Although many resources are used in Natural Language Processing, a specialized knowledge base for word sense disambiguation (WSD) is still lacking. By extracting knowledge from different resources and using it, we can improve the accuracy of word sense disambiguation. In this article, we use different existing methods to extract properties from The Grammatical Knowledge-base of Contemporary...
As a group of unknown words in Chinese information processing, the letter-word phrases used in Chinese texts cannot be identified correctly by existing segmentation software. Here, an auto-tagging system for letter-word phrases based on rules and statistical data is presented. First, the system scans the sentences to get letter strings, and then takes every letter string as an anchor and scans...
Knowledge capture is key in a business world where huge quantities of data are available via the Internet. Knowledge, as usable information, is a necessary element in the success of any organization. Online information in the form of academic papers on algorithms and tools for Thai word segmentation has grown recently and is distributed across various web sites; however, it has not...
This paper presents a system that combines two text mining techniques: information extraction and clustering. A rule-based approach is used to perform the information extraction task, based on the dependency relation between some intransitive verbs and prepositions. This relationship helps in extracting types of crime from documents within the crime domain. With regard to the clustering task, the...
As a kind of data model, a formal context must be extracted from actual data sources such as documents. For unstructured Chinese documents, the first question is how to represent the document. The vector space model (VSM), currently the dominant model of document representation, takes a single word as a feature item and therefore neglects the lexical semantic relationships between words...
In natural language, it is very common for one word to have several different meanings. A good solution to the word sense disambiguation (WSD) problem is fundamental to natural language processing. In this paper, based on the certainty of word sense in a given language context, a new ontology-based method is put forward to solve the WSD problem. In addition, a prototype is developed to solve the WSD of...