Large-alphabet languages such as Chinese present different problems for language modelling than small-alphabet languages such as English. In this paper, we describe adaptive models of Chinese text based on the Prediction by Partial Matching (PPM) text compression scheme, which learns the language as the text is processed sequentially. We describe several character-based, word-based and part-of-speech...
Natural Language Processing (NLP) is one of the most important research areas in the study of human language. For every language, a spell checker is an essential component of many common desktop applications, machine translation systems and office automation systems. In Myanmar, the Myanmar language is used as the official language. Myanmar pronunciation and orthography differ because...
Computer-aided Communicative Language Teaching is now widely applied in Chinese universities. This paper aims to demonstrate its effectiveness and viability through case studies based on classroom observations, diary entries and interviews.
We propose a method to improve the traditional character-based PPM text compression algorithm for natural languages. Treating a text file as a sequence of alternating words and non-words, the basic idea of our algorithm is to encode non-words and prefixes of words using character-based context models, and to encode suffixes of words using dictionary models. By using dictionary models, the algorithm can encode...
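The view of a text as alternating words and non-words, mentioned in the abstract above, can be illustrated with a minimal sketch. The function name `split_words_nonwords` is hypothetical, and treating a "word" as a run of ASCII letters is an assumption made only for this example; the paper's actual tokenisation and its context and dictionary models are not shown in the abstract and are not reproduced here.

```python
import re

def split_words_nonwords(text):
    """Split text into maximal runs of letters (words) and of everything else (non-words).

    Assumption for illustration: a word is a run of ASCII letters.
    Because each run is maximal, words and non-words strictly alternate.
    """
    return re.findall(r"[A-Za-z]+|[^A-Za-z]+", text)

tokens = split_words_nonwords("Hello, world!")
print(tokens)
```

Joining the tokens back together reproduces the input exactly, so the split itself loses no information; a compressor could then route each kind of token to a different model.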
Concordancing is a technique which analyzes text corpora to show how any given word or phrase in the text is used in the immediate contexts in which it appears. The main focus of this technique consists in discovering patterns and rules of authentic language use through analysis of actual usage, and generating theories of what does not account for the probable choices that speakers actually make. In...
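As a rough illustration of the concordancing idea described above, the following sketch lists each occurrence of a keyword together with a fixed window of surrounding text (a keyword-in-context display). The function name `concordance` and the `width` parameter are assumptions introduced for this example, not taken from the paper.

```python
import re

def concordance(text, keyword, width=30):
    """Return one keyword-in-context line per whole-word occurrence of `keyword`.

    `width` is the number of characters of context kept on each side
    (a hypothetical parameter chosen for this sketch).
    """
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so the keywords line up in a column.
        lines.append(f"{left:>{width}} [{match.group(0)}] {right}")
    return lines

sample = "The cat sat on the mat. Another cat ran past the first cat."
for line in concordance(sample, "cat", width=12):
    print(line)
```

Aligning the keyword in a single column is the conventional layout, since it makes recurring patterns to the left and right of the word easy to scan.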
Pattern-based writing technology is widely used in machine writing systems. In this technology, there are two core issues concerning chunks, the writing material within a pattern: one is chunk retrieval, the other is chunk packing. In this study, we focus on the problem of chunk retrieval and packing for a given style, topic and content of practical writing. We propose the concept of a chunk, and discuss the...
The Almost Instantaneous VF code, proposed by Yamamoto and Yokoo in 2001, is a variable-length-to-fixed-length code that uses a set of parse trees and achieves a good compression ratio. However, it needs much more time and space for both encoding and decoding than an ordinary VF code does. In this paper, we prove that we can multiplex the set of parse trees into a compact single tree and simulate...
Many extant natural language watermarking techniques demand deep structure analysis, and so suffer in reliability. We propose a scheme for natural language watermarking which embeds watermark bits into the pragmatic features of text by rewriting sentences. In contrast, we eschew syntactic and semantic analysis. We make use of transformation templates and our templates based on pragmatics rule...