Translingual Text Mining (TTM) is an innovative natural language processing technology for building multilingual parallel corpora, machine translation, contextual knowledge acquisition, information extraction, query profiling, language modeling, contextual word sensing, creating feature test sets, and a variety of other purposes. The Keynote Lecture will discuss opportunities and challenges...
We design a hierarchical collocation extraction tool according to the three-layered linguistic properties of collocation. Based on the structured definitions of collocation, the extraction goes through three phases: i) extracting peripheral collocations in the frequency layer from dependency triples, ii) extracting semi-peripheral collocations in the syntactic layer by association measures (AMs),...
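The syntactic-layer phase above scores candidates with association measures (AMs). As a minimal sketch of one common AM, pointwise mutual information over dependency pairs can be computed as below; the toy corpus and the choice of PMI are illustrative assumptions, not details from the paper.

```python
import math
from collections import Counter

def pmi_scores(pairs):
    """Score (head, dependent) pairs by pointwise mutual information:
    log2( P(h, d) / (P(h) * P(d)) ), estimated from raw counts."""
    pair_counts = Counter(pairs)
    head_counts = Counter(h for h, _ in pairs)
    dep_counts = Counter(d for _, d in pairs)
    n = len(pairs)
    scores = {}
    for (h, d), c in pair_counts.items():
        p_pair = c / n
        p_head = head_counts[h] / n
        p_dep = dep_counts[d] / n
        scores[(h, d)] = math.log2(p_pair / (p_head * p_dep))
    return scores

# Tiny illustrative set of dependency triples reduced to (verb, object).
pairs = [("make", "decision"), ("make", "decision"), ("make", "tea"),
         ("take", "decision"), ("drink", "tea"), ("drink", "tea")]
scores = pmi_scores(pairs)
```

Pairs whose words co-occur more often than chance predicts (here "drink tea") score higher than incidental combinations (here "make tea"), which is the filtering effect an AM-based extraction layer relies on.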
Even the most cutting-edge communication technologies, such as satellite navigation for orbit positioning, pedestrian movement recognition systems based on inertial sensors, and 5G systems, not to mention medical devices for coordinating the function of human organs, could not have been invented without language processing technologies serving as an information channel between humans and communication systems. Regardless...
Natural language processing requires extensive analysis of, and information about, words and sentence segments. Almost all NLP applications, such as machine translation, information extraction, automatic summarization, question answering, natural language generation, etc., require successful identification and resolution of anaphora. Information about words, obtained using a POS tagger, a parser and other...
As the fear of semantics declines, let's formalize the unformalizable! The address focuses on the feasibility and necessity of directly accessing the comprehensive meaning of natural language texts, data, images, etc., to emulate human understanding by computer. It is based on the premise that, without such understanding, no real-life application can reach the precision that human users...
A context-aware approach based on machine learning and lexical analysis identifies ambiguous terms and stores them in contextualized sentiment lexicons, which ground the terms to concepts corresponding to their polarity.
Starting from the definition of treebanks and considering that treebanks are theory dependent, we propose an annotation scheme for Romanian using several approaches ranging from phrase structure to dependency grammars and property grammars. The annotation has its starting point in a generative grammar study of the Romanian AP and validates the data of the linguistic study using an annotation scheme...
This paper presents the Interactive Arabic Dictionary (IAD) developed at the Higher Institute for Applied Sciences and Technology (HIAST). IAD is an interactive web application based on the “Al-Wasseet” dictionary. It provides the different meanings of words with examples and multimedia illustrations. IAD also presents other related information, such as associated words, semantic domains, expressions,...
This paper deals with the importance of the Natural Language Toolkit (NLTK) for a course in Computational Linguistics and for scientific research in the field of natural language processing. Peculiarities of the Python programming language, used by the Natural Language Toolkit, are described. The specific experience of studying the Natural Language Toolkit in a Computational Linguistics course is considered.
This paper deals with the implementation of semantic analysis within the linguistic theory of Functional Discourse Grammar as part of natural language analysis. The algorithm of the semantic analyzer and the tools for its implementation are reviewed. A specific example of an analyzed linguistic unit is considered.
Multiword expressions (MWEs) are important for practical applications such as machine translation (henceforth, MT), multilingual information retrieval, data mining and other natural language processing tasks. A method combining a similarity measure and a statistical tool is proposed for automatically extracting English MWEs from a corpus of Chinese government white papers and work reports from 1991 to...
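One common statistical tool for this kind of MWE extraction is a bigram association test combined with a frequency cutoff. The sketch below uses the t-score statistic over adjacent token pairs; the thresholds and the toy corpus are illustrative assumptions, not the paper's actual method.

```python
import math
from collections import Counter

def extract_mwes(tokens, min_freq=2, min_t=1.0):
    """Return adjacent bigrams that pass a frequency cutoff and a
    t-score test: t = (f(xy) - f(x)f(y)/N) / sqrt(f(xy))."""
    bigrams = list(zip(tokens, tokens[1:]))
    bigram_counts = Counter(bigrams)
    unigram_counts = Counter(tokens)
    n = len(tokens)
    candidates = []
    for (a, b), f in bigram_counts.items():
        if f < min_freq:
            continue
        expected = unigram_counts[a] * unigram_counts[b] / n
        t = (f - expected) / math.sqrt(f)
        if t >= min_t:
            candidates.append((a, b))
    return candidates

tokens = ("climate change policy and climate change mitigation "
          "require climate change research").split()
mwes = extract_mwes(tokens)
```

In this toy corpus only "climate change" recurs often enough, and more often than chance, to survive both filters; a similarity measure would then refine such statistical candidates further.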
This paper describes an approach to Vietnamese text summarization that concentrates on the discourse structure of the text. Based on the characteristics of Vietnamese, we propose rules for segmenting text into elementary discourse units (edus) and for recognizing discourse relations between textual spans. The score of an edu is computed from the discourse tree. The edus with the highest scores are chosen...
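A common way to score edus from a discourse tree is nucleus promotion: an edu keeps its parent's score along nucleus edges and is penalized along satellite edges. The tree encoding and the unit penalty below are illustrative assumptions, not the paper's exact scoring function.

```python
def score_edus(node, score=0):
    """Score leaves of a nucleus/satellite discourse tree.

    A leaf is an edu string; an internal node is a pair
    (nucleus_index, children). Nucleus children inherit the parent's
    score; satellite children are penalized by 1 per level."""
    if isinstance(node, str):
        return {node: score}
    nucleus_index, children = node
    scores = {}
    for i, child in enumerate(children):
        child_score = score if i == nucleus_index else score - 1
        scores.update(score_edus(child, child_score))
    return scores

# Toy tree: the main claim sits on the all-nucleus path to the root.
tree = (0, [(1, ["Background edu.", "Main claim edu."]),
            "Supporting detail edu."])
scores = score_edus(tree)
```

Sorting edus by this score and keeping the top-ranked ones yields the extractive summary: the edu reachable from the root through nuclei only ("Main claim edu.") outranks the satellites.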
Selectional Preferences (SPs) in verb-object (VO) constructions have been widely used in NLP applications such as WSD and metaphor comprehension. To estimate the number of verbs that have strong SPs, 38,119 VO types of 1,462 verbs are extracted from "Modern Chinese Cihai" and tagged in the HowNet sense inventory with an automatic tagging algorithm. The statistics indicate that only about 50% of verbs...
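One simple way to quantify how strongly a verb selects its objects, sketched under assumptions not taken from the paper, is the entropy of the verb's object distribution: a verb with a strong SP concentrates its objects on few types (low entropy), a promiscuous verb spreads them out (high entropy).

```python
import math
from collections import Counter, defaultdict

def sp_strength(vo_pairs):
    """Map each verb to the Shannon entropy (in bits) of its object
    distribution; lower entropy = stronger selectional preference."""
    objects_by_verb = defaultdict(Counter)
    for verb, obj in vo_pairs:
        objects_by_verb[verb][obj] += 1
    entropy = {}
    for verb, objs in objects_by_verb.items():
        total = sum(objs.values())
        entropy[verb] = -sum((c / total) * math.log2(c / total)
                             for c in objs.values())
    return entropy

# Toy VO tokens: "wear" always takes the same object, "make" varies.
vo = [("wear", "clothes"), ("wear", "clothes"), ("wear", "clothes"),
      ("make", "decision"), ("make", "tea"), ("make", "noise")]
strength = sp_strength(vo)
```

Thresholding such an entropy score over sense-tagged VO types is one way the "strong SP" verbs could be counted.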
In this paper we analyse one of the most challenging problems in natural language processing: domain adaptation in sentiment classification. In particular, we look for generic features by making use of linguistic patterns as an alternative to the commonly used feature vectors based on n-grams. The experimentation conducted shows how sentiment classification is highly sensitive to the domain from which...
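The contrast between the two feature families can be sketched as follows: bag-of-n-grams features are lexical and thus domain-bound, while a pattern over POS tags abstracts away the concrete words and may transfer across domains. The tag set and the single NEG+ADJ pattern here are illustrative assumptions, not the paper's pattern inventory.

```python
from collections import Counter

NEGATORS = {"not", "never", "no"}

def ngram_features(tokens, n=2):
    """Standard bag-of-n-grams features (bigrams by default):
    purely lexical, hence sensitive to the training domain."""
    return Counter(zip(*[tokens[i:] for i in range(n)]))

def pattern_features(tagged):
    """Toy 'linguistic pattern' features over (word, POS-tag) pairs:
    count a negator followed by any adjective. The adjective itself
    is abstracted away, so the feature is domain-generic."""
    feats = Counter()
    for (w1, _), (_, t2) in zip(tagged, tagged[1:]):
        if w1.lower() in NEGATORS and t2 == "ADJ":
            feats["NEG+ADJ"] += 1
    return feats

tagged = [("not", "PART"), ("good", "ADJ"), ("camera", "NOUN")]
lexical = ngram_features([w for w, _ in tagged])
generic = pattern_features(tagged)
```

The bigram ("not", "good") only fires on reviews containing that exact wording, whereas NEG+ADJ also fires on "never reliable" or "no decent", which is the kind of domain robustness the paper's generic features aim for.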
The aim of this paper is to present a model based on a fuzzy linguistic approach for evaluating the quality of digital libraries. The quality of a digital library is evaluated using users' perceptions of the quality of the digital services provided through its Web site. We adopt fuzzy linguistic modeling to represent users' perceptions and apply automatic tools of fuzzy computing with words...
Previous studies have shown that the linguistic information contained in source code is valuable and can help improve program comprehension. The proposed research focuses on improving the quality of source code by studying common negative practices with respect to linguistic information. The definition of so-called linguistic antipatterns is expected to increase awareness...
Mistakes are generally considered a negative phenomenon. There is, however, a positive face of mistakes, essential in learning, in teaching, in scientific research and in any creative work. We discuss the reasons for this, give several examples, and show how various historical failures in one respect became important successes in another. In the second part of this paper, three types...
The task of analytic-synthetic processing of textual information cannot be solved without deep knowledge of the structure of language, i.e. without a deep linguistic processor that handles the morphology and syntax as well as the semantics of the language. At the semantic analysis stage, methods and algorithms from category theory and predicate algebra are suggested for defining the connections between...
In the last decades the computational linguistics community has developed important and widely used lexical resources. Although they are very popular among the Natural Language Processing (NLP) community, they do not address two important characteristics of language. The first is that the meaning of a word in a language is a collective effort defined by the people who use the language. The second...