Graphical text representation methods such as Conceptual Graphs (CGs) attempt to capture the structure and semantics of documents. As such, they are the preferred text representation approach for a wide range of problems, namely in natural language processing, information retrieval and text mining. In a number of these applications, it is necessary to measure the dissimilarity (or similarity) between...
Latent semantic analysis (LSA) is a vector space technique for representing word meaning. Traditionally, LSA consists of two steps, the formation of a word by document matrix followed by singular value decomposition of that matrix. However, the formation of the matrix according to the dimensions of words and documents is somewhat arbitrary. This paper attempts to reconceptualize LSA in more general...
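The two traditional LSA steps named in this abstract (a word-by-document matrix followed by singular value decomposition) can be illustrated with a minimal sketch. It assumes NumPy is available, uses raw counts with whitespace tokenization, and the toy documents and the helper names `build_term_document_matrix` and `lsa_word_vectors` are hypothetical; it does not show the paper's generalized reformulation.

```python
import numpy as np

def build_term_document_matrix(docs):
    """Return (vocab, matrix) where matrix[i, j] counts word i in document j."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    matrix = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.lower().split():
            matrix[index[w], j] += 1.0
    return vocab, matrix

def lsa_word_vectors(matrix, k):
    """Keep the top-k singular directions and scale them by the singular values."""
    u, s, _vt = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k] * s[:k]

# Toy corpus (hypothetical data, for illustration only).
docs = ["cat sat mat", "dog sat log", "cat chased dog"]
vocab, X = build_term_document_matrix(docs)   # step 1: word-by-document matrix
word_vecs = lsa_word_vectors(X, k=2)          # step 2: truncated SVD
```

Each row of `word_vecs` is a low-dimensional vector for the corresponding entry of `vocab`; words can then be compared with, for example, cosine similarity.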
With the increased demand for English communication, various styles of learning support methods have been proposed and offered to Japanese learners. However, many learners still find it hard to read, write and speak in English. Regardless of language difference, understanding the other's intention and emotional status accurately and expressing what they think or feel to the others...
In this paper, we introduce a new semantic induction metric that can induce semantic classes from a set of domain-specific unannotated data. We emphasize co-occurrence probability rather than only the distances between word probability distributions. Whereas the traditional approach uses either the right or the left context to calculate similarity, we use both left and right information simultaneously...
Although the expansion of numbers to their full written-out forms is an inherently language-dependent feature, it can also be accomplished with language-independent code for multilingual TTS systems. In this paper we present a number expansion system that works with little data and language-independent code, while still being able to expand numerics in dozens of languages. The paper also describes a way to determine...
Named entity relations are a foundation of semantic networks, ontology and the semantic Web, and are widely used in information retrieval and machine translation, as well as in automatic question answering systems. Relation feature selection and extraction are two key issues. The location features possess excellent computability and operability, and the semantic features have strong intelligibility...
Sentence similarity computation plays a significant role in the field of Chinese language processing. The paper presents a new approach to calculating the semantic similarity of Chinese questions, which is divided into two steps: the first step is to disambiguate the word senses in the question, and the second step is to compute the question semantic similarity based on the word senses. This paper uses...
As software systems continue to grow and evolve, locating code for maintenance and reuse tasks becomes increasingly difficult. Existing static code search techniques using natural language queries provide little support to help developers determine whether search results are relevant, and few recommend alternative words to help developers reformulate poor queries. In this paper, we present a novel...
In this paper, I develop a new approach to Contradiction Detection (CD) that incorporates additional information from the context, with the aim of improving CD precision. My work focuses on using additional context from the sentences or pages to disambiguate the event/entity. I give a new method but the probability to do the ambiguity examination. What's more, I develop a CD system...
Phrasal verbs, which generally comprise a verb followed by a preposition, are a commonly occurring feature of English. Each phrasal verb acquires entirely different meanings in different contexts. Because their meanings are highly context dependent, phrasal verbs can be disambiguated only by a technique that uses semantic information pertaining to the context. This paper presented...
In this paper, we propose an unsupervised machine learning method to automatically construct a product hierarchical concept model based on the online reviews of this product. Our method starts by representing each candidate noun using a feature context vector, which is simply a vector of all its co-occurring neighbors excluding itself. We then applied bisection clustering to hierarchically cluster...
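The feature context vector described in this abstract (a vector of a noun's co-occurring neighbors, excluding the noun itself) can be sketched as follows. The window size, whitespace tokenization, sample reviews and the function name `context_vector` are assumptions made for illustration, since the abstract does not specify how neighbors are defined; the clustering step is not shown.

```python
from collections import Counter

def context_vector(target, sentences, window=2):
    """Count words co-occurring with `target` within `window` tokens.

    The target word itself is excluded, matching the abstract's description.
    The fixed-size window is an assumption of this sketch.
    """
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, tok in enumerate(tokens):
            if tok != target:
                continue
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if tokens[j] != target:
                    counts[tokens[j]] += 1
    return counts

# Hypothetical review snippets, for illustration only.
reviews = [
    "the battery life is great",
    "battery drains fast",
    "great screen and battery",
]
vec = context_vector("battery", reviews)
```

Vectors built this way for each candidate noun could then be passed to a clustering routine; a sparse `Counter` keeps the representation compact when the vocabulary is large.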
This paper presents a systematic study of disambiguating sentiment-ambiguous adjectives within context in real text, a task at the intersection of word sense disambiguation and sentiment analysis. We first address the issue of inter-annotator agreement on assigning semantic orientations to word occurrences in real text. Secondly, we demonstrate that co-occurring sentiment monosemous adjectives can...
Most existing corpus based relation extraction techniques focus on predefined relations. In this paper, a clustering based method is presented for domain relevant relation extraction including both relation type discovery and relation instance extraction. Given two raw corpora, one in the general domain, one in an application domain, domain specific verbs connecting different instances are extracted...
Toponym resolution involves two stages: toponym recognition and reference resolution. The present research proposes an integrative model for the resolution of Chinese toponym reference in discourse. The model employs the concept of cognitive salience, an inherent property of toponym references, to integrate different heuristic reference disambiguation methods into a coherent design. The heuristic...
Word sense disambiguation (WSD) is the process of identifying the proper meaning of words that may have multiple meanings. It is regarded as one of the most challenging problems in the field of natural language processing (NLP). The Nepali language also has words with multiple meanings, giving rise to the WSD problem in it. In this paper, we investigate the impact of NLP resources like morphology...
Semantic similarity is a fundamental concept that is widely researched and used in the field of natural language processing. By analyzing the definition of the concept in HowNet2008, this paper proposes a new method of semantic similarity calculation. The concepts are classified into three classes: simple concepts, complex concepts and combined concepts. For each class of concept, we design a different method...