A new method based on model selection is proposed for acoustic model training. The MPE-trained model and the MLE-trained model are used for model selection in the subsequent training. The selection criterion is based on the ratio of the inter-variance to the intra-variance of each model. In addition, we propose a clustering method for the model in order to get the accuracy information for the weight calculation...
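The variance-ratio criterion mentioned in this abstract can be illustrated with a minimal Python sketch. The abstract does not say what the variances are computed over or which direction of the ratio is preferred, so everything here is an assumption: each candidate model is represented by hypothetical groups of scalar values, `variance_ratio` and `select_model` are illustrative names, and preferring the larger ratio is a guess rather than the authors' stated rule.

```python
from statistics import fmean, pvariance


def variance_ratio(groups):
    """Ratio of inter-group variance to mean intra-group variance.

    `groups` is a list of lists of scalar values (an assumed data layout).
    """
    group_means = [fmean(g) for g in groups]
    inter = pvariance(group_means)               # spread between group means
    intra = fmean(pvariance(g) for g in groups)  # average spread inside groups
    return inter / intra


def select_model(candidates):
    """Pick the candidate name with the largest ratio (assumed direction)."""
    return max(candidates, key=lambda name: variance_ratio(candidates[name]))


# Hypothetical values for two candidate models, for illustration only.
candidates = {
    "mpe": [[1, 2, 3], [7, 8, 9]],
    "mle": [[1, 5, 9], [2, 6, 10]],
}
chosen = select_model(candidates)
```

In this toy data the first candidate has well-separated, tight groups, so it yields the larger ratio and is the one selected.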
When natural language questions are used to retrieve documents, query expansion is a key factor affecting retrieval performance. By analyzing traditional query expansion methods, this paper puts forward a query expansion method based on set theory for answer document retrieval. To verify the validity of the method, a similarity calculation method for question and candidate answer...
This paper presents a method for identifying sentiment and the sentimental agent based on a Chinese sentimental sentence dictionary. Our method can identify eight kinds of sentiment (joy, sorrow, love, disgust, surprise, anxiety, anger and hate) and the main sentimental agent. The sentimental sentence dictionary is composed of sentimental sentence patterns. And the sentiment of a candidate...
Recently, computer-based emotion recognition has attracted a great deal of attention from researchers because of its broad applications. Emotion estimation from textual input has also become an active area as natural language processing (NLP) technology develops. However, when it comes to negative sentences in Chinese, the original emotion estimation may be reversed, which makes obtaining correct recognition results...
This paper presents a two-step dependency parser that parses Chinese deterministically. By dividing a sentence into two parts and parsing them separately, error accumulation can be avoided effectively. Previous work on shift-reduce dependency parsers is less able to guarantee the greedy characteristic of deterministic parsing. This paper improves on a kind of deterministic dependency parsing method to weaken...
In this paper we present a Chinese query expansion model based on topic-relevant terms acquired automatically from the Google search engine. In contrast to earlier methods, our queries are expanded by adding the terms that are most relevant to the concept of the query, rather than selecting terms that are relevant to the query terms. First, we use automatically extracted short terms...
The number of biomedical abbreviations is growing very fast. Automatic construction of a biomedical abbreviation dictionary from text helps in understanding biomedical literature and in updating existing databases, ontologies, and dictionaries. This paper proposes a new method for automatically constructing a biomedical abbreviation dictionary from text by combining string matching algorithm...
Automatic essay scoring (AES) systems are a very important research tool for many educational studies. Many studies indicate that AES systems should be able to analyze the semantic characteristics of an essay and include more such features when scoring essays. This paper makes an assumption: some concepts that can be regarded as literary concepts would be used only by skillful writers. However, it is a difficult...
Word sense disambiguation (WSD) is the process of identifying the proper meaning of words that may have multiple meanings. It is regarded as one of the most challenging problems in the field of natural language processing (NLP). The Nepali language also has words with multiple meanings, which gives rise to the WSD problem in it. In this paper, we investigate the impact of NLP resources like morphology...
Semantic similarity is a fundamental concept that is widely researched and used in natural language processing. By analyzing the definition of concepts in HowNet2008, this paper proposes a new method of semantic similarity calculation. The concepts are classified into three classes: simple concepts, complex concepts and combined concepts. For different concepts, we design different methods...
This paper studies the word sense disambiguation of the English modal verb "may". Based on an analysis of the sense, category of modality and function of "may" in different contexts in the training corpus, a back-propagation neural network model for word sense disambiguation of "may" is established. It takes the mutual information of epistemic and non-epistemic "may"...
Research on cross-language information retrieval (CLIR) increasingly concentrates on selecting candidate translations of the keywords in the query. The accuracy of translation has a direct impact on precision and recall. This thesis presents three methods based on HowNet to resolve query translation ambiguity in CLIR. The first is based on semantic relation, and it uses semantic relation...
Speech recognition systems are usually trained on a very large number of transcribed utterances, and preparing the training data is time-consuming and costly. To reduce the number of training examples that must be labeled, active learning is used in acoustic modeling for speech recognition. This learning scheme iteratively inspects the unlabeled samples, selects the most informative samples corresponding...
This paper presents a novel extractive approach to multi-document summarization that uses geodesic distance to compute sentence similarity. Based on the geodesic distance between every pair of sentences, a text relationship map is constructed. Sentences with higher degree in the map are selected and grouped into clusters. Finally, sentences with the highest degree in each cluster are...
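The map-and-degree step described in this abstract can be sketched in Python under stated assumptions. The abstract does not say how the geodesic distances are computed or how the map's edges are decided, so this sketch takes a precomputed pairwise distance matrix as given and links two sentences when their distance is at or below a hypothetical `threshold`. The function names are illustrative, and the clustering step is not covered.

```python
def build_relationship_map(dist, threshold):
    """Link sentences i and j when dist[i][j] <= threshold (assumed rule).

    `dist` is a symmetric matrix of pairwise sentence distances.
    Returns a dict mapping each sentence index to its set of neighbours.
    """
    n = len(dist)
    return {
        i: {j for j in range(n) if j != i and dist[i][j] <= threshold}
        for i in range(n)
    }


def top_degree_sentences(graph, k):
    """Return the k sentence indices with the most neighbours.

    Ties are broken by the lower sentence index.
    """
    return sorted(graph, key=lambda i: (-len(graph[i]), i))[:k]


# Hypothetical distances between four sentences, for illustration only.
dist = [
    [0, 1, 1, 5],
    [1, 0, 2, 5],
    [1, 2, 0, 1],
    [5, 5, 1, 0],
]
graph = build_relationship_map(dist, threshold=1)
selected = top_degree_sentences(graph, k=2)
```

Sentences 0 and 2 each have two neighbours under this threshold, so they are the ones selected in the toy example.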
Sentence similarity computation plays an important role in question answering (QA) systems. Because one meaning can be expressed by many different questions, we present a new approach to question matching based on fuzzy sets. In this paper, we establish a library of standard questions. Each standard question is associated with a series of keywords. The main focus of this paper lies in the matching of standard...
This paper proposes a novel reordering model based on the reordering of source-language chunks. The model is used as a preprocessing step for phrase-based translation models and can be integrated well with them. At the same time, as a chunk-based model, it can take syntactic information into account during reordering without requiring a full parse of the source sentence. Two experiments...
This work proposes an approach to improving content selection in automatic text summarization using a probabilistic neural network (PNN). The approach is a trainable summarizer that takes into account several features, including sentence position, positive keyword, negative keyword, sentence centrality, sentence resemblance to the title, sentence inclusion of named entity,...
With the rapid development of text summarization, evaluation methods for automatic summarization systems are becoming more and more important in natural language processing, since they can greatly promote the development of text summarization. This paper analyzes the existing methods for automatic summarization evaluation, and introduces a new evaluation method based on HowNet. Initial tests have shown that...
This paper is an exploratory study on applying information extraction (IE) technology in the field of information content security, with a focus on semantic orientation recognition for short message service (SMS). An experimental SMS monitoring system is introduced, in which rule-based and hidden Markov model (HMM) based IE patterns are integrated to recognize the semantic orientation of...
In this paper, we focus on Chinese separable verb-object words in sentences. Using a search engine, we collect sentences containing the morphemes of thousands of separable words compiled by linguists. The sentences are then tagged manually, based on the results of word segmentation and part-of-speech tagging, for whether separable words are used. According to typical...