In this work we perform morphological analysis of both the Cyrillic Mongolian script and the Traditional Mongolian script, and define a computational method for suffix inflection in accordance with orthographic rules. Mongolian is an agglutinative language. Its word-formation and inflection rules are based on attaching grammatical suffixes to the word stem. But the Cyrillic and Traditional...
This paper presents a system that combines NLP and hand-written rules for enhancing the text of authentic Web pages based on the needs of a specific language learner. It uses the Stanford CoreNLP system to process texts, and applies hand-written rules for retrieving language information that is relevant according to a given Common European Framework of Reference for Languages (CEFR) level. After the...
This survey focuses mainly on developments in machine translation for Indian languages. It sheds light on the rule-based, empirical, and hybrid approaches to machine translation. Every approach has its own advantages and disadvantages. Machine Translation (MT) is the process of translating text from one language to another. Due to rapid globalisation...
This study describes the construction of the TOCFL (Test Of Chinese as a Foreign Language) learner corpus, including the collection and grammatical error annotation of 2,837 essays written by Chinese language learners originating from a total of 46 different mother-tongue languages. We propose hierarchical tagging sets to manually annotate grammatical errors, resulting in 33,835 inappropriate usages...
Translingual Text Mining (TTM) is an innovative natural language processing technology for building multilingual parallel corpora, processing machine translation, contextual knowledge acquisition, information extraction, query profiling, language modeling, contextual word sensing, creating feature test sets, and a variety of other purposes. The Keynote Lecture will discuss opportunities and challenges...
Transliteration forms an essential part of transcription, which converts text from one writing system to another. The need to translate data has grown as social media brings the world closer together. Machine transliteration has emerged as a part of information retrieval and machine translation projects to translate named entities, that are not registered in the dictionary,...
We design a hierarchical collocation extraction tool according to the three-layered linguistic properties of collocation. Based on the structured definitions of collocation, the extraction goes through three phases: i) extracting peripheral collocations in the frequency layer from dependency triples, ii) extracting semi-peripheral collocations in the syntactic layer by association measures (AMs),...
Even the most cutting-edge communication-mediated technologies, such as satellite navigation for orbit positioning, pedestrian movement recognition systems based on inertial sensors, and 5G systems, let alone medical devices for coordinating the functions of human organs, would not have been invented without language processing technologies as an information source between humans and communication systems. Regardless...
Word inflection is an important area of computational linguistics for agglutinative languages. This paper provides an overview of the two main algorithms for learning inflection rules. The TASR and OSTIA methods are implemented and analyzed with real-life data from the Hungarian language. The main novelty of the research work is the development of a robust method to generate training...
Information Extraction (IE) is one of the most important Natural Language Processing (NLP) applications; it extracts information such as Named Entities (NE) and collocations of terms from a corpus. A collocation is a sequence of terms that co-occur in the corpus. Arabic Information Extraction faces many problems because of the complexity and ambiguity of Arabic grammar. In general,...
In this study, we offer some suggestions for curriculum content design in teaching Mandarin Chinese as a second language, based on our experimental work and with the aid of computational linguistics.
Natural language processing requires extensive analysis of, and information about, words and sentence segments. Almost all NLP applications, such as machine translation, information extraction, automatic summarization, question answering, and natural language generation, require successful identification and resolution of anaphora. Information about words obtained using a POS tagger, parser and other...
Assigning the appropriate grammatical category to a word in a given context is a very important step in major areas of natural language processing. A limited number of part-of-speech (POS) taggers currently exist for Arabic. These taggers mainly adopt tagsets that were developed for languages such as English. In this paper we present an effort to propose revised categories for Arabic POS tags that would...
Traditional mobile application development is difficult and costly, with cumbersome management and other issues. Based on computing technology combined with the Android platform, this paper presents the design of an online translation system for applied English and computational linguistics, improving on traditional applied-English application development for computer linguistics unified based on CCS...
In this paper, some results on the detection of variation in annotation in parsed corpora, or treebanks, are presented. Treebanks are generally built using both automatic tools (i.e., taggers and parsers) and human intervention. In this process, inconsistencies (and, thus, variation) in the annotation arise, caused by a number of factors, for instance, disagreement in interpretation, incomplete...
Sentiment analysis is one of the significant issues in natural language processing, computational linguistics, and text mining. It has also become a potential research area in bibliographic search and opinion mining, which is our main focus in this paper. Sentiment analysis of citations on schema-based research contents, such as scientific articles and reports, may not only make an appropriate...
As the fear of semantics is declining, let's formalize the unformalizable! The address focuses on the feasibility and necessity of directly accessing the comprehensive meaning of natural language texts, data, images, etc., to emulate human understanding by the computer. It is based on the premise that, without such understanding, no real-life application can reach the precision that human users...
The aim of this work is to construct a predictive algorithm based on linguistic computation to predict Post-Traumatic Stress Disorder (PTSD) and comorbid conditions. Most contemporary work in psychometrics concerns itself with the construction and validation of instruments in an attempt to measure personality, attitudes, sentiment, and beliefs. Measurement of these phenomena is difficult and largely...
We propose a language-independent approach to cleaning up word alignment errors in an aligned parallel corpus that are caused by the unsupervised word-alignment process. In such an aligned corpus, we evaluate the alignment patterns of one-to-many discontinuous words using statistical measures of collocation. The alignment of discontinuous words without strong collocation tendencies will be taken as errors...