Search results

Items from 1 to 5 out of 5 results

chapter

Improved Chinese-Japanese phrase-based MT quality using an extended quasi-parallel corpus

Hao Wang, Wei Yang, Yves Lepage

2014 IEEE International Conference on Progress in Informatics and Computing > 6 - 10

2014 International Conference on Progress in Informatics and Computing (PIC)

State-of-the-art phrase-based machine translation (MT) systems usually require large parallel corpora for training. The quality and quantity of the training data directly influence the performance of such translation systems. The lack of open-source bilingual corpora for a particular language pair results in lower translation scores reported for that pair. This is...

chapter

Adaptive named entity recognition based on conditional random fields with automatic updated dynamic gazetteers

Xixin Wu, Zhiyong Wu, Jia Jia, Lianhong Cai

2012 8th International Symposium on Chinese Spoken Language Processing > 363 - 367

2012 8th International Symposium on Chinese Spoken Language Processing (ISCSLP 2012)

This paper presents a hybrid model that combines conditional random fields (CRFs) with dynamic gazetteers (DGs) for Chinese named entity recognition (NER). Gazetteers were widely used in previous NER work, but they were all static and could not adapt to new domains or new out-of-vocabulary named entities (OOVNEs). In this work, we build and maintain...

chapter

English-Hindi Automatic Word Alignment with Scarce Resources

Eknath Venkataramani, Deepa Gupta

2010 International Conference on Asian Language Processing > 253 - 256

2010 International Conference on Asian Language Processing (IALP 2010)

Many automatic word alignment techniques have been developed in Natural Language Processing (NLP). However, word alignment between English and Hindi has made little progress for two main reasons: the complex structure of the two languages and the scarcity of Hindi-language resources. This paper provides a corpus-augmented method of word alignment in which these limitations have...

chapter

A Hybrid Oriya Named Entity Recognition System: Integrating HMM with MaxEnt

S. Biswas, S. Mohanty, S.P. Mishra

2009 Second International Conference on Emerging Trends in Engineering & Technology > 639 - 643

2009 2nd International Conference on Emerging Trends in Engineering and Technology (ICETET 2009)

This paper describes a hybrid system that applies a maximum entropy (MaxEnt) model together with a hidden Markov model (HMM) and some linguistic rules to recognize named entities in the Oriya language. The main advantage of our system is that it uses the HMM and MaxEnt models successively, along with some manually developed linguistic rules. First we use MaxEnt to identify named entities in the Oriya corpus, then tagging...

chapter

Chinese Unknown Word Recognition Using Improved Conditional Random Fields

Yisu Xu, Xuan Wang, Buzhou Tang, Xiaolong Wang

2008 Eighth International Conference on Intelligent Systems Design and Applications > 2 > 363 - 367

2008 Eighth International Conference on Intelligent Systems Design and Applications

Unknown word recognition is a very important problem in natural language processing. It has a great influence on the performance of dictionary construction and word segmentation. This paper introduces two methods to improve Chinese unknown word recognition using Conditional Random Fields: rough labeling of the characters and N-best listing. The CRF with the two methods proposed...

Filter options

Keywords:
TRAINING DATA
HIDDEN MARKOV MODELS
COMPUTATIONAL LINGUISTICS

INFONA - science communication portal