Semantic similarity of texts is an important area of Natural Language Processing, and there are several approaches to measuring it: statistical, WordNet-based, and hybrid. All of these approaches rely on a lexical knowledge source such as a corpus or a semantic network. WordNet is one of the most widely preferred and mature lexical knowledge bases. In this study, we have focused on measuring semantic...
Accurate recognition of the meaning of words in given document fragments has a long history in computational linguistics. Because of its importance in understanding natural language semantics, it has become one of the most challenging issues within this field. In this paper, we present an enhanced knowledge-based word sense disambiguation (WSD) algorithm that manipulates by calculating...
In this paper, a knowledge-based approach for Word Sense Disambiguation (WSD) in the Bengali language is presented. Bengali WordNet, developed at ISI Kolkata, has been used as a knowledge base, and the input data set is prepared from the Bengali Text Corpus developed in the TDIL (Technology Development for Indian Languages) project of the Government of India. The proposed approach resolves the exact...
Applying human cognition to a search engine for information retrieval is an emerging task with various implementations, one of which is semantic natural language understanding by the machine. Natural language processing systems will be constructed using inference engines along with a knowledge base (KB) to store rules and facts. High-Performance Linguistics (HPL) Scheme is...
Natural Language Processing (NLP) constitutes a fundamental module for a plethora of domains where unstructured text is a predominant source. Despite the keen interest of both industry and the research community in developing NLP tools, current industrial solutions still suffer from two main drawbacks. First, the architectures underlying existing systems do not satisfy critical requirements of large-scale...
Measuring the similarity between strings plays an increasingly important role in many applications such as information retrieval, short answer grading, and conversational agent software. There has been much recent research interest in applying string similarity within Arabic language applications; however, the use of string similarity in Arabic poses substantial challenges, such as the complexity...
This work introduces CONCEPTUM, an advanced knowledge discovery system for speed-reading natural language texts and allowing faster and more effective learning. CONCEPTUM offers a wide range of features, from language detection and conceptualization to semantic categorization, named entity recognition and automatic ontology building, effectively turning an unstructured textual source...
Text is the backbone of the web, and most information and human knowledge is represented in natural language. Every day, a vast amount of textual information is posted on web portals, wikis and news sites, which necessitates automated approaches to analyzing and understanding this content. In this paper, we present a case-based reasoning approach to transform natural language sentences into first order...
Word Sense Disambiguation (WSD) is the task of automatically choosing the correct meaning of a word in context. It is considered one of the most important and challenging problems in the field of computational linguistics and plays a crucial role in various natural language processing (NLP) applications. In this paper, we present an improved version of a recent...
Semantic relatedness measures play an important role in Natural Language Processing (NLP) tasks. Semantic relatedness can be measured using knowledge bases and current methods. Here, we implement a hybrid method for measuring semantic relatedness between a pair of words. The hybrid method is one of the most popular methods used to measure semantic relatedness. Hybrid method...
With the rapid development of the Internet, extracting personal relations from the Internet has become an important research topic in information extraction. However, current relation extraction research focuses mainly on English, and less research addresses Chinese. At the same time, there are two main problems in current personal relation extraction approaches: 1) it...
Measuring semantic similarity between short texts is challenging because the meaning of short texts may vary dramatically even by a few words due to their limited lengths. In this paper, we propose a novel similarity measure for terms that allows better clustering performance than the state-of-the-art method. To achieve such performance, we incorporate knowledge-based and corpus-based term similarity...
Twitter has become an invaluable source of information due to its dynamic nature, with more than 400 million tweets posted per day. Determining what an individual post is about can be a non-trivial task because of its high contextualization and informal nature. Named Entity Linking (NEL) is a subtask of information extraction that aims to ground entity mentions to their corresponding node in a Knowledge...
Word Sense Disambiguation (WSD) is a key factor in the natural language processing of written and verbal communication. It is a method of selecting the appropriate sense of an ambiguous word in the given context. This paper aims at determining the correct sense of a given ambiguous word in the Hindi language. A modified Lesk approach is used that relies on the concept of a dynamic context window. Dynamic context...
We outline the design of a visualizer, named Vishit, for texts in the Hindi language. Hindi is a lingua franca in many states of India where people speak different languages. The visualized text serves as a universal language where seamless communication is needed by many people who speak different languages and have different cultures. Vishit consists of the following three major processing...
This paper presents a knowledge representation framework for natural language understanding. Here we propose an automated knowledge acquisition mechanism that mirrors information extraction in human-human interaction. This framework utilizes knowledge-based automatic role labeling and automatic concept learning together with a conceptual structure that captures intent and context. The resulting framework...
A context-aware approach based on machine learning and lexical analysis identifies ambiguous terms and stores them in contextualized sentiment lexicons, which ground the terms to concepts corresponding to their polarity.
The quality of decisions made in business and government relates directly to the quality of the information used to formulate the decision. This information may be retrieved from an organization's knowledge base (Intranet) or from the World Wide Web. Information held on intelligence services' Intranets can be efficiently manipulated by technologies based upon either semantics such as ontologies, or statistics...
In this paper, an intelligent concept-based search engine is presented that can be used as a multilingual platform for different search queries. It also retrieves result pages that do not directly contain the keywords but contain synonyms or related words. In response to a query for the word “car”, it will also retrieve web pages that do not directly contain the word “car” but have the...
Standards and the need for standards, for example for annotation purposes, only emerge after a period of time. Before, people just did what they thought was right. This may have resulted in large amounts of data in a format that in the end did not turn out to be on speaking terms with the (new) standard. This format may even have become a de facto standard for a particular language or in a particular...