Search results

Items from 1 to 19 out of 19 results

chapter

Disambiguated linear word translation in medium European languages

Marton Makrai

2015 6th IEEE International Conference on Cognitive Infocommunications (CogInfoCom) > 355 - 356

An earlier paper used triangulated word translations as seed in linear translation between medium European languages. The present work improves upon it by handling word ambiguity both in the main (i.e. source and target) languages and in the pivot by training a multi-prototype vector space model in the former, filtering triangles based on scores computed by a linear model trained with direct (non-triangulated)...

chapter

Tamil to Malayalam Transliteration

Kavitha Raju, Sreerekha T. V., Vidya P. V., Rajeev R. R., et al.

2015 Fifth International Conference on Advances in Computing and Communications (ICACC) > 12 - 15

2015 Fifth International Conference on Advances in Computing & Communications (ICACC)

Transliteration forms an essential part of transcription which converts text from one writing system to another. The need for translating data has become larger than before as the world is getting together through social media. Machine transliteration has emerged as a part of information retrieval and machine translation projects to translate named entities, that are not registered in the dictionary,...

chapter

Word Sense Disambiguation Based on Feature Ranking Graph

Yeqing Li, Xiaoyu Qiu

2015 IEEE 29th International Conference on Advanced Information Networking and Applications Workshops > 209 - 212

2015 IEEE 29th International Conference on Advanced Information Networking and Applications Workshops (WAINA)

Research in the field of WSD has been conducted in computational linguistics as a specific task for many years. Language and context features have been shown to be very helpful for the task of word sense disambiguation. In this paper, we investigate the effectiveness of the graph-based ranking method on features from limited language data of word sense disambiguation. Contrary to existing method,...

chapter

Joint layer based deep learning framework for bilingual machine transliteration

Sanjanaashree P, Anand Kumar M

2014 International Conference on Advances in Computing, Communications and Informatics (ICACCI) > 1737 - 1743

Between the growth of Internet or World Wide Web (WWW) and the emersion of the social networking site like Friendster, Myspace etc., information society started facing exhilarating challenges in language technology applications such as Machine Translation (MT) and Information Retrieval (IR). Nevertheless, there were researchers working in Machine Translation that deal with real time information for...

chapter

To filter discontinuous word alignment for statistical machine translation

Chenchen Ding, Mikio Yamamoto

2014 International Conference on Audio, Language and Image Processing > 449 - 453

2014 International Conference on Audio, Language and Image Processing (ICALIP)

We propose a language-independent approach to clean up word alignment errors in an aligned parallel corpus, which are caused by the unsupervised word-align process. In such an aligned corpus, we evaluate the alignment patterns of one-to-many discontinuous words by statistical measures of collocation. The alignment of discontinuous words without strong collocation tendencies will be taken as errors...

chapter

Improving Word Alignment for Statistical Machine Translation Based on Constraints

Le Quang-Hung, Le Anh-Cuong

2012 International Conference on Asian Language Processing > 113 - 116

2012 International Conference on Asian Language Processing (IALP)

Word alignment is an important and fundamental task for building a statistical machine translation (SMT) system. However, obtaining word-level alignments in parallel corpora with high accuracy is still a challenge. In this paper, we propose a new method, which is based on constraint approach, to improve the quality of word alignment. Our experiments show that using constraints for the parameter estimation...

chapter

Malayalam word sense disambiguation

R P Haroon

2010 IEEE International Conference on Computational Intelligence and Computing Research > 1 - 4

2010 IEEE International Conference on Computational Intelligence and Computing Research (ICCIC 2010)

This paper presents an outline of our work to develop a word sense disambiguation system in Malayalam. Word sense disambiguation (WSD) is a linguistically based mechanism for automatically defining the correct sense of a word in the context. WSD is a long standing problem in computational linguistics. A particular word may have different meanings in different contexts. For human beings, it is easy...

chapter

English-Hindi Automatic Word Alignment with Scarce Resources

Eknath Venkataramani, Deepa Gupta

2010 International Conference on Asian Language Processing > 253 - 256

2010 International Conference on Asian Language Processing (IALP 2010)

Many automatic word alignment techniques have been so far developed in Natural Language Processing (NLP). However, word alignment between English and Hindi has not progressed much due to two main reasons viz. complex structure of the participating languages and the scarcity of Hindi-language resources. This paper provides a corpus-augmented method of word alignment in which these limitations have...

chapter

Use of a visual word dictionary for topic discovery in images

K Kandasamy, R Rodrigo

2010 Fifth International Conference on Information and Automation for Sustainability > 510 - 515

2010 5th International Conference on Information and Automation for Sustainability (ICIAfS)

The bag of visual words model has seen immense success in addressing the problem of image classification. Algorithms using this model generate the codebook of visual words by vector quantizing the features (such as SIFT) of the images to be classified. However, a codebook so formed tends to get biased by the nature of the dataset. In this paper we propose an alternative method to create the codebook...

chapter

Modeling Syllable-Based Pronunciation Variation for Accented Mandarin Speech Recognition

Shilei Zhang, Qin Shi, Yong Qin

2010 20th International Conference on Pattern Recognition > 1606 - 1609

2010 20th International Conference on Pattern Recognition (ICPR 2010)

Pronunciation variation is a natural and inevitable phenomenon in an accented Mandarin speech recognition application. In this paper, we integrate knowledge-based and data-driven approaches together for syllable-based pronunciation variation modeling to improve the performance of Mandarin speech recognition system for speakers with Southern accent. First, we generate the syllable-based pronunciation...

chapter

Discovery of patterns in LZ-78 text discrimination

M Malyutov, G Cunningham

2010 IEEE Region 8 International Conference on Computational Technologies in Electrical and Electronics Engineering (SIBIRCON) > 28 - 33

2010 IEEE Region 8 International Conference on "Computational Technologies in Electrical and Electronics Engineering" (SIBIRCON 2010)

We continue studying a new context-free computationally simple stylometry-based text homogeneity test: the sliced conditional compression complexity (sCCC or simply CCC) of literary texts introduced and inspired by the incomputable Kolmogorov conditional complexity. Other stylometry tools can occasionally almost coincide for different authors. Our CCC-attributor is asymptotically strictly minimal...

chapter

A Model of Chinese Word Sense Disambiguation Based on Combining Rule and Statistics Method

Yangsen Zhang, Haiyan Kang

2010 Second International Workshop on Education Technology and Computer Science > 2 > 230 - 234

2010 2nd International Workshop on Education Technology and Computer Science (ETCS)

For the existing disadvantage of Word Sense Disambiguation (WSD) research methods, we have analyzed the computability and computational complexity of knowledge Dictionaries with different structure, and chosen "The Grammatical knowledge-base of Contemporary Chinese" and "the Semantic Knowledge-base of Contemporary Chinese" which written by Institute of Computational Linguistics of Peking University,...

chapter

The study of Chinese dictionary mechanism based on the suboptimal search tree

Zhiqiang Ma, Yila Su, Yao Ma

2010 The 2nd International Conference on Computer and Automation Engineering (ICCAE) > 3 > 65 - 69

2nd International Conference on Computer and Automation Engineering (ICCAE 2010)

The speed of dictionary query affects not only the speed of segmentation, but also the wide use of the segmentation system in the mass calculation. According to the different occurrence frequency of words in the text, the dictionary mechanism of the suboptimal search tree is designed so that the comparison times is reduced in the process of segmentation and the speed of segmentation is improved. Finally,...

chapter

Experiences with developing language processing tools and corpora for Amharic

B Gamback, L Asker

2010 IST-Africa > 1 - 8

A major bottleneck for promoting use of computers and the Internet is that many languages lack access to basic tools that would make it possible for people to access ICT in their own language. The paper describes the development of a set of such resources for the processing of Amharic, the working language of the Ethiopian government. The primary goal was to investigate techniques and methods that can...

chapter

Computing Word Similarity on Large-Scale Corpus

Tao Xu, Weiguang Qu, Xuri Tang, Dexin Ding, et al.

2009 Fourth International Conference on Innovative Computing, Information and Control (ICICIC) > 1076 - 1079

2009 Fourth International Conference on Innovative Computing, Information and Control (ICICIC 2009)

This paper proposes a novel approach for word similarity computation based on word sense vectors. The word sense vector is built using HIT-IR Tongyici Cilin (extended) for concept generalization and is further modified by the use of relative and absolute frequency filters. Experiments show that the approach not only overcomes the problem of similarity computation of unseen words but also yields a...

chapter

A method of Part-Of-Speech guessing of Chinese Unknown Words based on combined features

Hai-Jun Zhang, Shu-Min Shi, Chong Feng, He-Yan Huang

2009 International Conference on Machine Learning and Cybernetics > 1 > 328 - 332

2009 Eighth International Conference on Machine Learning and Cybernetics (ICMLC)

Part-of-speech (POS) guessing of unknown words is an essential phase in the process of unknown words identification. This paper applies combined features (namely, both external and internal features) in POS guessing of Chinese unknown words, under conditional random field model (CRF). For acquiring high-precision of POS guessing, this paper puts forward a method of integrating Chinese radical, as...

chapter

Chinese Unknown Word Recognition Using Improved Conditional Random Fields

Yisu Xu, Xuan Wang, Buzhou Tang, Xiaolong Wang

2008 Eighth International Conference on Intelligent Systems Design and Applications > 2 > 363 - 367

Unknown word recognition is a very important problem in natural language processing. It has a great influence on the performance of dictionary construction and word segmentation. This paper introduces two methods to improve the effect of Chinese unknown word recognition by using Conditional Random Fields: the rough label of the characters and the N-best listing. The CRF with the two methods proposed...

chapter

Word Sense Disambiguation Based on Vicarious Words

Zhimao Lu, DongMei Fan, Rubo Zhang

2008 Fourth International Conference on Natural Computation > 6 > 101 - 105

2008 Fourth International Conference on Natural Computation (ICNC)

This paper presents the concept of vicarious words and develops a new unsupervised Chinese word sense disambiguation method. This method, after statistical learning from the vicarious words, realizes unsupervised word sense disambiguation by calculating mutual information to measure the degree of collocation information between the ambiguous words and their context. In our experiment, we test ten...

chapter

Acquiring ISA Relations from Chinese Free Text Based on Multiple Patterns

Lei Liu, Sen Zhang, Lu Hong Diao, Shu Ying Yan, et al.

2008 Fifth International Conference on Fuzzy Systems and Knowledge Discovery > 4 > 160 - 164

2008 Fifth International Conference on Fuzzy Systems and Knowledge Discovery (FSKD)

Automatic acquisition of ISA relations is a basic problem in knowledge acquisition from text. We present a method that acquires and verifies ISA relations from large Chinese free text. It initially discovers a set of sentences using Chinese lexicosyntactic patterns. Then we combine outside layer removal and inside layer gathering for acquiring concepts of constituting ISA relation. Finally, ISA relations...

Filter options

Keywords:
DICTIONARIES
COMPUTATIONAL LINGUISTICS
