Extracting medical knowledge from structured data mining of many medical records and from unstructured data mining of natural-language source text on the Internet will become increasingly important for clinical decision support. Output from these sources can be transformed into large numbers of elements of knowledge in a Knowledge Representation Store (KRS), here using the notation and to some extent...
The Sejong Electronic (machine-readable) Dictionary, developed by the 21st century Sejong Plan, contains systematically organized information on Korean words. It helps to solve the problems encountered in the electronic formatting of a still-commonly-used hard-copy dictionary. The Sejong Electronic Dictionary, however, has a limitation relating to sentence structure and selection-restricted nouns...
Emails have become increasingly popular and are now an indispensable tool for communication and document exchange. Because of their convenience, people use emails every day at work, at school, and for personal matters. Consequently, the number of emails people receive daily keeps increasing, causing them to spend more time organizing them. People often need to classify and move email into...
This paper proposes a method for exploring technical phrase frames by extracting word n-grams that match our information needs and interests from research paper titles. Technical phrase frames, the outcome of our method, are phrases with wildcards that may be substituted for any technical term. Our method first extracts word trigrams from research paper titles and constructs a co-occurrence...
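The trigram-extraction step described in this abstract can be sketched as follows. This is a toy illustration, not the paper's implementation: the framing scheme shown here (replacing the middle word of each trigram with a wildcard) and all function names are assumptions, since the abstract is truncated before the method's details.

```python
from collections import Counter

def trigrams(title):
    """Extract word trigrams from a research paper title."""
    words = title.lower().split()
    return [tuple(words[i:i + 3]) for i in range(len(words) - 2)]

def phrase_frames(titles, wildcard="*"):
    """Turn title trigrams into phrase frames by replacing the middle
    word with a wildcard slot, accumulating counts across titles."""
    counts = Counter(t for title in titles for t in trigrams(title))
    frames = Counter()
    for (w1, w2, w3), c in counts.items():
        frames[(w1, wildcard, w3)] += c
    return frames

titles = [
    "a neural approach to machine translation",
    "a statistical approach to machine translation",
]
frames = phrase_frames(titles)
```

Here the frame `("approach", "*", "machine")` accumulates evidence from both titles even though the middle words differ, which is the intuition behind wildcard phrase frames.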
It is necessary for a researcher to know the historical transitions of researchers and research topics. Although Web search engines can be used to obtain such information, collecting it across a long time period is difficult and laborious. Thus, we propose a method for automatically extracting historical transitions of researchers and research topics by using co-occurrence information...
This study examines how the Latent Dirichlet Allocation (LDA) model combined with natural language processing techniques can be used to identify hot topics from free-text customer reviews. To verify the validity of the proposed approach, 21 580 restaurant reviews are collected. Each review is viewed as a probabilistic mixture of latent topics and each topic is treated as a probability distribution...
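The generative view in this abstract (each review a mixture of latent topics, each topic a distribution over words) is standard LDA, and can be illustrated with a minimal collapsed Gibbs sampler. This is a stdlib-only sketch, not the study's pipeline; the hyperparameters, review texts, and function name are all assumptions for illustration.

```python
import random

def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Minimal collapsed Gibbs sampler for LDA over tokenized docs.
    Returns per-document topic counts, per-topic word counts, vocab."""
    rng = random.Random(seed)
    vocab = sorted({w for d in docs for w in d})
    V = len(vocab)
    widx = {w: i for i, w in enumerate(vocab)}
    ndk = [[0] * n_topics for _ in docs]      # doc-topic counts
    nkw = [[0] * V for _ in range(n_topics)]  # topic-word counts
    nk = [0] * n_topics                       # topic totals
    z = []                                    # topic assignment per token
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            t = rng.randrange(n_topics)
            zs.append(t)
            ndk[d][t] += 1; nkw[t][widx[w]] += 1; nk[t] += 1
        z.append(zs)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                t, wi = z[d][i], widx[w]
                ndk[d][t] -= 1; nkw[t][wi] -= 1; nk[t] -= 1
                # full conditional: p(k) ∝ (n_dk + α)(n_kw + β)/(n_k + Vβ)
                weights = [(ndk[d][k] + alpha) * (nkw[k][wi] + beta)
                           / (nk[k] + V * beta) for k in range(n_topics)]
                r = rng.random() * sum(weights)
                for k, wgt in enumerate(weights):
                    r -= wgt
                    if r <= 0:
                        t = k
                        break
                z[d][i] = t
                ndk[d][t] += 1; nkw[t][wi] += 1; nk[t] += 1
    return ndk, nkw, vocab

reviews = [["pizza", "tasty", "pizza", "cheap"],
           ["service", "slow", "service", "rude"],
           ["pizza", "tasty", "service", "slow"]]
doc_topics, topic_words, vocab = lda_gibbs(reviews, n_topics=2)
```

With real review corpora one would normally use an optimized library implementation rather than plain Gibbs sampling in Python, but the sampler above makes the "review = mixture of topics" model concrete.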
A number of Learning Management Systems (LMSs) exist on the market today. A subset of an LMS is the component in which student assessment is managed. In some forms of assessment, such as open questions, the LMS is incapable of evaluating the students' responses and therefore human intervention is necessary. In order to assess at higher levels of Bloom's (1956) taxonomy, it is necessary to include open-style...
Keyphrase extraction is a fundamental research task in natural language processing and text mining. A limitation of previous keyphrase-extraction methods based on semantic analysis is that acquiring the semantic features within phrases is restricted by the constructed thesaurus and by the language. An approach to acquiring the semantic features within phrases from a single document is proposed...
Multi-document summarization is an important research area of NLP. Most methods or techniques of multi-document summarization either consider the document collection as single-topic or treat every sentence as single-topic, but lack a systematic analysis of the subtopic semantics hidden inside the documents. This paper presents a Subtopic-based Multi-documents Summarization (SubTMS) method...
Automatic proofreading of Chinese text opens up broad possibilities for the application of natural language processing. Based on the distribution of single characters after word segmentation in Chinese text, the characteristics of typical errors, and a character trigram model, this paper presents an effective automatic text-proofreading algorithm. Experiments show that our method achieves better precision and...
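The character-trigram idea can be sketched with a simple zero-count heuristic: flag any position whose character trigram never occurred in the training corpus. This is an assumption-laden toy (English strings for readability, unsmoothed counts); the paper's actual model would use a smoothed trigram probability over segmented Chinese text.

```python
from collections import Counter

def train_char_trigrams(corpus):
    """Count character trigrams over a training corpus, padding each
    sentence with boundary symbols so edge characters get trigrams."""
    tri = Counter()
    for sent in corpus:
        s = "^^" + sent + "$"
        for i in range(len(s) - 2):
            tri[s[i:i + 3]] += 1
    return tri

def flag_suspects(sentence, tri):
    """Flag positions whose character trigram has zero training count --
    candidate typo locations under this crude heuristic."""
    s = "^^" + sentence + "$"
    return [(i, s[i + 2]) for i in range(len(s) - 2) if tri[s[i:i + 3]] == 0]

model = train_char_trigrams(["hello world", "hello there"])
suspects = flag_suspects("hellp world", model)
```

The typo `hellp` produces unseen trigrams such as `llp`, so its position is flagged, while a sentence drawn from the training data yields no suspects.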
The hierarchical phrase-based translation model has proven to be a simple and powerful machine translation model. However, due to computational complexity constraints, the extraction and use of hierarchical rules are usually restricted under certain limits, and these limits can have a negative impact on the performance of the translation model, especially for reordering. This paper presents...
This paper presents a simple method to extract compounds using statistical collocations and POS bigram probabilities without a POS tagger. Statistical collocation was used to determine the strength of word co-occurrences. Probabilities of POS sequences were used to adjust the strength of collocation within a possible compound. These probabilities were estimated from compounds found in the dictionary....
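The two-step scoring this abstract describes (collocation strength, then a POS-sequence adjustment) can be sketched as below. The excerpt does not name the paper's collocation statistic, so pointwise mutual information is used here as a stand-in; the counts and POS probabilities are invented for illustration.

```python
import math
from collections import Counter

def pmi(pair, bigrams, unigrams, total):
    """Pointwise mutual information -- one common collocation-strength
    statistic (a stand-in; the paper's exact measure is not given)."""
    w1, w2 = pair
    return math.log2((bigrams[pair] / total)
                     / ((unigrams[w1] / total) * (unigrams[w2] / total)))

def compound_score(pair, pos_pair, bigrams, unigrams, total, pos_probs):
    """Adjust collocation strength by the probability of the POS
    sequence, as estimated from compounds found in a dictionary."""
    return pmi(pair, bigrams, unigrams, total) * pos_probs.get(pos_pair, 0.0)

unigrams = Counter({"data": 10, "mining": 10, "the": 40})
bigrams = Counter({("data", "mining"): 5, ("the", "data"): 2})
total = 100
pos_probs = {("NN", "NN"): 0.6, ("DT", "NN"): 0.05}  # assumed estimates
score = compound_score(("data", "mining"), ("NN", "NN"),
                       bigrams, unigrams, total, pos_probs)
```

A noun-noun pair like "data mining" scores high on both factors, while "the data" is penalized both by weak collocation and by the low probability that a determiner-noun sequence forms a compound.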
Translating ancient Chinese poems is valuable but difficult. Automatically choosing English rhymes when translating ancient Chinese poems would assist translators. This paper extracts three important factors that influence English rhymes, presents a set of statistical models based on these factors, and then trains these models and acquires their parameters, which at last are used to recommend...
This paper describes a novel approach to linguistic processing for robots through integration of a motion language module and a natural language module. The motion language module represents associations between symbolized motion patterns and words. The natural language module models sentences. The motion language module and the natural language module are graphically integrated. The integration...
Word segmentation is one of the most important tasks in NLP. For Vietnamese, with its own linguistic features, this task faces particular challenges, especially in determining word boundaries. To tackle Vietnamese word segmentation, in this paper we propose the WS4VN system, which uses a new approach based on the maximum matching algorithm combined with stochastic models using part-of-speech information...
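The maximum matching baseline underlying this abstract can be shown in a few lines: Vietnamese words span one or more whitespace-separated syllables, and greedy forward matching takes the longest dictionary entry at each position. This is a generic sketch of the algorithm, not the WS4VN system; the toy lexicon and the fallback-to-one-syllable rule are assumptions.

```python
def max_match(sentence, dictionary, max_syllables=4):
    """Greedy forward maximum matching over whitespace-separated
    syllables: take the longest dictionary entry at each position,
    falling back to a single syllable when nothing matches."""
    syllables = sentence.split()
    tokens, i = [], 0
    while i < len(syllables):
        for n in range(min(max_syllables, len(syllables) - i), 0, -1):
            cand = " ".join(syllables[i:i + n])
            if n == 1 or cand in dictionary:
                tokens.append(cand)
                i += n
                break
    return tokens

lexicon = {"học sinh", "đi học"}  # toy dictionary of multi-syllable words
segmented = max_match("học sinh đi học", lexicon)
```

On this input the segmenter returns the two dictionary words rather than four isolated syllables; resolving cases where greedy matching is ambiguous is exactly where the stochastic POS-based models the abstract mentions would come in.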
Syntactic parsing is a central problem and a challenge in the field of natural language processing. It attracts many studies, and consequently effective parsers exist for several popular languages such as English and Chinese. For Vietnamese parsing, there have been only a few studies focusing on this problem; these studies do not apply modern techniques, and no popular parser has been released...
Research on machine translation has a long history, and many methods and techniques have been proposed and developed. However, low translation quality is still a major problem and many related problems remain unresolved. Super-function-based machine translation was proposed to perform translation without going through the syntactic and semantic analysis that many machine translation systems usually do...
This paper presents an exponential language model (ELM) for modeling and managing knowledge elements. The model has been developed based on the minimum sample risk (MSR) algorithm, which is a discriminative training method. ELM uses features to capture global, domain, or sentential language phenomena that are composed of named entities, part-of-speech strings, personal usage words, positions of words, sentence...
Enhancing Arabic tagging is of great importance in many NLP applications. This paper presents a simple comparison tool that compares two powerful tagging systems for Arabic. The first is the ASVM Tagger by Diab M. et al.; the second is the RDI Arab Tagger, which relies on simple but powerful long n-gram probability estimation plus an A* search algorithm for disambiguation. This comparison is done to superimpose...
In this paper, we introduce a new semantic induction metric that can induce semantic classes from a set of domain-specific unannotated data. We emphasize the co-occurrence probability rather than mere distances between word probability distributions. Compared to the traditional approach of using only the right or left context to calculate similarity, we use both left and right information simultaneously...
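The "both left and right context" idea can be made concrete by keeping left and right neighbours as distinct features in each word's co-occurrence vector and comparing words by cosine similarity. This is a generic distributional-similarity sketch under assumed toy data, not the paper's metric, which the truncated abstract does not fully specify.

```python
import math
from collections import Counter, defaultdict

def context_vectors(sentences):
    """Build a co-occurrence vector per word that keeps left and
    right neighbours as distinct features (L: and R: prefixes)."""
    vecs = defaultdict(Counter)
    for sent in sentences:
        for i, w in enumerate(sent):
            if i > 0:
                vecs[w]["L:" + sent[i - 1]] += 1
            if i < len(sent) - 1:
                vecs[w]["R:" + sent[i + 1]] += 1
    return vecs

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in set(a) & set(b))
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

sents = [["book", "a", "flight", "to", "boston", "today"],
         ["book", "a", "flight", "to", "denver", "today"]]
vecs = context_vectors(sents)
sim = cosine(vecs["boston"], vecs["denver"])
```

Because "boston" and "denver" share both their left context ("to") and their right context ("today"), they come out maximally similar, suggesting they belong to one induced semantic class (city names).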