Although many resources are used in natural language processing, specialized knowledge bases for word sense disambiguation (WSD) are still in short supply. By extracting knowledge from different resources and combining it, we can improve the accuracy of word sense disambiguation. In this article, we use different existing methods to extract properties from The Grammatical Knowledge-base of Contemporary...
The most fundamental step in semantic information processing (SIP) is to construct a knowledge base (KB) at the human level, that is, one that matches the general understanding and conception of human knowledge. WordNet was built to be the most systematic such resource and as close to the human level as possible, and it is actively applied in various works. In one of our previous studies, we found that a semantic gap exists between...
This paper presents an outline of our work to develop a word sense disambiguation system in Malayalam. Word sense disambiguation (WSD) is a linguistically based mechanism for automatically identifying the correct sense of a word in context. WSD is a long-standing problem in computational linguistics. A particular word may have different meanings in different contexts. For human beings, it is easy...
The Analysis System of Sentence Category (ASSC) analyzes the semantic structure of sentences based on the theory of Hierarchical Network of Concept (HNC). As a technology for Chinese language understanding and processing, ASSC can be widely used in all kinds of language information processing systems. This paper studies the sentence testing capabilities of ASSC, gives the specific data representing...
Transfer grammar is an integral component of a rule-based machine translation system. In this paper, we describe a subset of the transfer grammar developed for a Tamil-to-Hindi machine translation system, i.e., the transfer of nominal constructions from Tamil to Hindi. Nominal constructions in Tamil, which is an agglutinative language, take multiple suffixes which may be case markers or other suffixes...
Conversion from another language to a native language is in high demand due to the increasing use of web-based applications. First, a sentence in a native language is converted to Universal Networking Language (UNL) expressions, and the UNL expressions can then be converted to any native language. UNL systems have already been developed for most languages, but there are no algorithms to...
The Universal Networking Language (UNL) is a worldwide generalized form of human interactive language on a machine-independent digital platform for defining, recapitulating, amending, storing and disseminating knowledge or information among people of different affiliations. The theoretical and applied research associated with this interdisciplinary endeavor facilitates a number of practical applications...
The expression of lexical semantics is a crucial factor for natural semantic processing. This paper proposes a new theoretical model for constructing a lexical semantic knowledge base. According to this theory, semantic genes are the carriers of lexical meanings. They can be inherited by hyponyms from hypernyms, and they may mutate during inheritance. By heredity, recombination and variation...
Similarity between two sentences can be determined by comparing either their commonalities or their differences. Commonalities, which reflect similarity judgment, connect the two sentences, while differences, which reflect dissimilarity judgment, represent the unique way of self-identification. Although both are essential in determining sentence similarity, the existing methods only...
At present, many language resources can be used in natural language processing (NLP), including corpora, dictionaries, databases and so on. However, because of the large number of language resources and the lack of national standards, their markup, formats and other aspects differ. This creates gaps between language resources and causes inconvenience...
Nowadays, the knowledge base of a question answering system usually stores question-answer pairs or text content extracted from web pages. But in recent decades, in order to improve work efficiency, many enterprises have invested heavily in ERP (Enterprise Resource Planning) and MIS (Management Information System). These systems have accumulated a great deal of useful business knowledge, of...
In research on rule-based identification of event description units, the formulation of rules becomes a core issue. This thesis selects 3751 items from the TCT corpus for the study, summarizes 9 predicate identification rules and 14 phrase merging rules, and quantifies the importance of the rules by their contribution. Furthermore, the relationship between the binding rules is studied by...
In symbolic natural language processing (NLP) systems, several linguistic phenomena, for instance the thematic role relationships between sentence constituents, such as AGENT and PATIENT, can be accounted for by employing a rule-based grammar. Another approach to NLP concerns the use of the connectionist model, which has the benefits of learning, generalization and fault tolerance, among...
This paper reports our work on Chinese semantic role labeling, which takes advantage of hierarchical semantic knowledge from a common sense knowledge base named HowNet. On the one hand, the words in lexical features such as predicate and head word are generalized with their hypernyms in HowNet. On the other hand, the hypernym-hyponym relation between sememes is used to capture the semantic similarity...
Machine translation, a part of computational linguistics, belongs to natural language processing (NLP) and is a hot topic in the computing community. The gap between the linguist and the computer programmer gives rise to many problems, such as lexical ambiguity, syntactic and structural ambiguity, polysemy, induction, discourse, anaphoric ambiguity and different shades of meaning. Mostly English-to-Urdu...
Remarkable performance has been reported for recognizing single object classes. Scalability to large numbers of classes, however, remains an important challenge for today's recognition methods. Several authors have promoted knowledge transfer between classes as a key ingredient to address this challenge. However, in previous work the decision of which knowledge to transfer has required either manual supervision...
In this paper, we introduce a specific document searching system for e-libraries. The searching system is based on the Vietnamese Language Query Processing (VLQP) framework. The VLQP framework is built to support the implementation of Vietnamese language query processing components for intelligent searching systems in the library field. At present, VLQP's features are still limited, but this framework may...
We present a method for improving existing statistical machine translation methods using a knowledge base compiled from a bilingual corpus as well as sequence alignment and pattern matching techniques from the areas of machine learning and bioinformatics. An alignment algorithm identifies similar sentences, which are then used to construct a better word order for the translation. Our preliminary test...
As a kind of data model, a formal context must be extracted from actual data sources such as documents. For unstructured Chinese documents, the first question is how to represent the document. The vector space model (VSM), currently the dominant model of document representation, takes a single word as a feature item and therefore neglects the lexical semantic relationship between words...
Cross-document coreference resolution plays an important part in the field of natural language processing (NLP). It captures the ability to gather documents for information about a certain entity. Most previous algorithms identify the underlying entity of a given document depending on the original text, which is unreliable if the original text contains multiple parts of different themes. In this paper,...