Search results

Items from 1 to 9 out of 9 results

chapter

Automatic Synonym and Phrase Replacement Show Promise for Style Transformation

Foaad Khosmood, R Levinson

2010 Ninth International Conference on Machine Learning and Applications > 958 - 961

2010 Ninth International Conference on Machine Learning and Applications (ICMLA 2010)

Style transformation refers to the process by which a piece of text written in a certain style of writing is transformed into another text exhibiting a distinctly different style of writing without significant change to the meaning of individual sentences. In this paper we continue investigation into the linguistic style transformation problem and demonstrate current achievements in transformation...

chapter

SCUT-COUCH Textline_NU: An Unconstrained Online Handwritten Chinese Text Lines Dataset

Hanyu Yan, Lianwen Jin, C Viard-Gaudin, H Mouchere

2010 12th International Conference on Frontiers in Handwriting Recognition > 581 - 586

2010 12th International Conference on Frontiers in Handwriting Recognition (ICFHR 2010)

An unconstrained online handwritten Chinese text lines dataset, SCUT-COUCH Textline_NU, a subset of SCUT-COUCH [1] [2], is built to facilitate research on unconstrained online Chinese text recognition. Texts for hand copying are sampled from the China Daily corpus in a stratified random manner. The current version of SCUT-COUCH Textline_NU has 8,809 text lines (4,813 lines are collected by touch...

chapter

Statistically Exploring Semantic Accessibility Scale Based on English, French and Japanese Corpora: A Comparative Perspective

Ping-fang Yu, Jia-Li Du

2010 Third International Symposium on Information Processing > 203 - 207

2010 Third International Symposium on Information Processing (ISIP 2010)

Inter-language studies of the textual semantic accessibility scale (SAS) are a new branch of computational linguistics, and the present paper statistically probes the SASes in English, French and Japanese literary works sampled from the corresponding corpora. Firstly, six control groups are formed by the equidistant texts extracted every 10 pages, 5 pages, 4 pages, 3 pages, 2...

chapter

Exploit Kashida Adding to Arabic e-Text for High Capacity Steganography

A. Al-Nazer, A. Gutub

2009 Third International Conference on Network and System Security > 447 - 451

2009 Third International Conference on Network and System Security (NSS)

Steganography is the ability to hide information in cover media such as text and pictures. An improved approach is proposed to embed a secret into Arabic text cover media using Kashida, an Arabic extension character. The proposed approach maximizes the use of Kashida to hide more information, represented in binary bits, in Arabic text cover media. A stego system has been developed based on this...
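The general idea of hiding bits with Kashida can be illustrated with a minimal sketch. It is not the scheme from the paper, whose details are cut off in the abstract above. It assumes one optional Kashida (U+0640) per joinable letter pair, a cover text that contains no Kashida already, and a simplified, hand-written set of non-joining letters (`NON_JOINING`); the helper names are hypothetical.

```python
KASHIDA = "\u0640"  # Arabic tatweel, the extension character

# Simplifying assumption: letters that do not join to the following letter.
NON_JOINING = set("اأإآدذرزوؤةىء")


def is_arabic_letter(ch):
    # Basic Arabic letter range, excluding the Kashida itself.
    return "\u0621" <= ch <= "\u064A" and ch != KASHIDA


def is_slot(ch, nxt):
    # A Kashida may be placed between ch and nxt only if the two letters join.
    return (is_arabic_letter(ch) and ch not in NON_JOINING
            and is_arabic_letter(nxt) and nxt != "ء")


def embed(cover, bits):
    """Insert a Kashida in a slot for bit '1', leave the slot empty for bit '0'."""
    out, k = [], 0
    for i, ch in enumerate(cover):
        out.append(ch)
        nxt = cover[i + 1] if i + 1 < len(cover) else ""
        if k < len(bits) and is_slot(ch, nxt):
            if bits[k] == "1":
                out.append(KASHIDA)
            k += 1
    if k < len(bits):
        raise ValueError("cover text too short for payload")
    return "".join(out)


def extract(stego, n_bits):
    """Read n_bits back by checking each slot for a Kashida."""
    bits, i = [], 0
    while i < len(stego) - 1 and len(bits) < n_bits:
        ch, nxt = stego[i], stego[i + 1]
        if nxt == KASHIDA:
            after = stego[i + 2] if i + 2 < len(stego) else ""
            if is_slot(ch, after):
                bits.append("1")
            i += 2
        else:
            if is_slot(ch, nxt):
                bits.append("0")
            i += 1
    return "".join(bits)


cover = "كتب محمد"
stego = embed(cover, "10110")
recovered = extract(stego, 5)
```

With this sketch the capacity is one bit per joinable letter pair, and the receiver has to know the payload length in advance.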

chapter

Stylistic document retrieval for Turkish

D. Zamalieva, F. Kalaycilar, A. Kale, S. Pehlivan, et al.

2009 24th International Symposium on Computer and Information Sciences > 663 - 667

2009 24th International Symposium on Computer and Information Sciences (ISCIS)

In information retrieval (IR) systems, there are a query and a collection of documents compared with this query and ranked according to a particular similarity measure. Since texts with the same content can be written by different authors, the writing styles of the documents change as well accordingly. This observation brings the idea of investigating text by means of style. In this paper, we analyze...

chapter

Efficient Generation of Comprehensive Database for Online Arabic Script Recognition

R. Saabni, J. El-Sana

2009 10th International Conference on Document Analysis and Recognition > 1231 - 1235

2009 10th International Conference on Document Analysis and Recognition (ICDAR)

The difficulties in segmenting cursive words into individual characters have shifted the focus of handwriting recognition research from segmentation-based approaches to segmentation-free (holistic) methods. However, maintaining and training a large number of prototypes (models) that represent the words in the dictionary makes the training process extremely expensive and difficult in computing resources...

article

Distributional Features for Text Categorization

Xiao-Bing Xue, Zhi-Hua Zhou

IEEE Transactions on Knowledge and Data Engineering > 2009 > 21 > 3 > 428 - 442

Text categorization is the task of assigning predefined categories to natural language text. With the widely used 'bag of words' representation, previous research usually assigns each word a value indicating whether the word appears in the document concerned or how frequently it appears. Although these values are useful for text categorization, they have not fully expressed the abundant information...
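The two conventional 'bag of words' values mentioned in this abstract, presence and frequency, can be sketched as follows. This is only an illustration of the baseline representation, not of the distributional features the article proposes; the function names, the whitespace tokenization and the toy vocabulary are assumptions.

```python
from collections import Counter


def binary_features(doc, vocab):
    # 1 if the word appears in the document, 0 otherwise.
    present = set(doc.lower().split())
    return [1 if word in present else 0 for word in vocab]


def frequency_features(doc, vocab):
    # Number of times each vocabulary word appears in the document.
    counts = Counter(doc.lower().split())
    return [counts[word] for word in vocab]


vocab = ["text", "category", "word"]
doc = "Text categorization assigns a category to text word by word"
presence = binary_features(doc, vocab)
frequency = frequency_features(doc, vocab)
```

Both vectors ignore where in the document a word occurs, which is the kind of information such values leave out.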

chapter

Integrated system for Japanese word processing

R. Mladenov, M. Karova

2008 Conference on Human System Interactions > 44 - 47

2008 Conference on Human System Interactions

Typing Japanese texts with computers is not as straightforward as typing Western ones. East-Asian languages use very large sets of symbols which are called ideograms or hieroglyphs. Typing words which consist of thousands of symbols is a process which must pass some procedures in which Latin characters are converted into hieroglyphs. This application is designed to assist typing Japanese texts with...

chapter

A novel technique for words reordering based on N-grams

T. Athanaselis, S. Bakamidis, I. Dologlou

2007 9th International Symposium on Signal Processing and Its Applications > 1 - 4

2007 9th International Symposium on Signal Processing and Its Applications (ISSPA)

This paper presents an approach for repairing word order errors in English text by reordering words in a sentence and choosing the version that maximizes the number of trigram hits according to a language model. The novelty of this method concerns the use of an efficient confusion matrix technique for reordering the words. For further reducing the number of permutations the use of unigrams' probability...
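The core selection rule in this abstract, choosing the word order with the most trigram hits, can be sketched in a few lines. This sketch uses brute-force enumeration of all permutations rather than the confusion matrix technique the paper describes, and it stands in for the language model with a plain set of trigrams from a toy corpus; all names and data below are assumptions.

```python
from itertools import permutations


def trigrams(words):
    # All consecutive word triples in the sequence.
    return [tuple(words[i:i + 3]) for i in range(len(words) - 2)]


def trigram_hits(words, known_trigrams):
    # Count how many trigrams of the candidate ordering are known.
    return sum(1 for t in trigrams(words) if t in known_trigrams)


def best_reordering(words, known_trigrams):
    # Brute force: only practical for short sentences (n! candidates).
    best = max(permutations(words),
               key=lambda p: trigram_hits(p, known_trigrams))
    return list(best)


corpus = ["the cat sat on the mat", "the dog sat on the rug"]
known = {t for sentence in corpus for t in trigrams(sentence.split())}
repaired = best_reordering("cat the on sat mat the".split(), known)
```

Because the number of permutations grows factorially with sentence length, any practical system needs a way to prune candidates, which is the role the abstract assigns to its reordering technique.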

Filter options

Keywords:
TEXT ANALYSIS
NATURAL LANGUAGES

INFONA - science communication portal
