Search results

Items from 1 to 20 out of 35 results

chapter

Web-based keyword adapted Language Modeling for Keyword Spotting

Wenzhu Shen, Ji Wu, Wei Li

2010 7th International Symposium on Chinese Spoken Language Processing > 251 - 255

7th International Symposium on Chinese Spoken Language Processing (ISCSLP 2010)

Language Model (LM) constitutes one of the key components in Keyword Spotting (KWS). The rapid development of the World Wide Web (WWW) makes it an extremely large and valuable data source for LM training, but it is not optimal to use the raw transcripts from WWW due to the mismatch of content between the web corpus

chapter

Keyword Spotting from Online Chinese Handwritten Documents Using One-vs-All Trained Character Classifier

Heng Zhang, Da-Han Wang, Cheng-Lin Liu

2010 12th International Conference on Frontiers in Handwriting Recognition > 271 - 276

2010 12th International Conference on Frontiers in Handwriting Recognition (ICFHR 2010)

This paper presents a text query-based method for keyword spotting from online Chinese handwritten documents. The similarity between a text word and handwriting is obtained by combining the character similiarity scores given by a character classifier. To overcome the ambiguity of character segmentation, multiple

chapter

LDA-based keyword selection in text categorization

S. Tasci, T. Gungor

2009 24th International Symposium on Computer and Information Sciences > 230 - 235

2009 24th International Symposium on Computer and Information Sciences (ISCIS)

topic analysis of LDA for feature selection and compare it with the classical feature selection metrics in text categorization. For the experiments, we use SVM as the classifier and tf*idf weighting for weighting the terms. We observed that almost in all metrics, information gain performs best at all keyword numbers while

chapter

Keyword Spotting in Online Handwritten Documents Containing Text and Non-text Using BLSTM Neural Networks

Emanuel Indermuhle, Volkmar Frinken, Andreas Fischer, Horst Bunke

2011 International Conference on Document Analysis and Recognition > 73 - 77

2011 International Conference on Document Analysis and Recognition (ICDAR)

Spotting keywords in handwritten documents without transcription is a valuable method as it allows one to search, index, and classify such documents. In this paper we show that keyword spotting based on bi-directional Long Short-Term Memory (BLSTM) recurrent neural nets can successfully be applied on online

chapter

A corpus-based approach for keyword identification using supervised learning techniques

J. TeCho, C. Nattee, T. Theeramunkong

2008 5th International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology > 1 > 33 - 36

2008 5th International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology (ECTI-CON)

This paper presents a corpus-based approach for extracting keywords from a text written in a language that has no word boundary. Based on the concept of Thai character cluster, a Thai running text is preliminarily segmented into a sequence of inseparable units, called TCCs. To enable the handling of a large-scaled

chapter

To Determine the Weight in a Weighted Sum Method for Domain-Specific Keyword Extraction

Wenshuo Liu, Wenxin Li

2009 International Conference on Computer Engineering and Technology > 1 > 11 - 15

2009 International Conference on Computer Engineering and Technology (ICCET 2009)

Keyword extraction has been a very traditional topic in Natural Language Processing. However, most methods have been too complicated and slow to be applied in real applications, for example in web-based system. This paper proposes an approach which will complete some preparing works focusing on exploring the

chapter

Adapting BLSTM Neural Network Based Keyword Spotting Trained on Modern Data to Historical Documents

V Frinken, A Fischer, H Bunke, R Manmatha

2010 12th International Conference on Frontiers in Handwriting Recognition > 352 - 357

2010 12th International Conference on Frontiers in Handwriting Recognition (ICFHR 2010)

Being able to search for words or phrases in historic handwritten documents is of paramount importance when preserving cultural heritage. Storing scanned pages of written text can save the information from degradation, but it does not make the textual information readily available. Automatic keyword spotting systems

chapter

Keyword-Labeled Classification with Auxiliary Unlabeled Documents

Congle Zhang, Dikan Xing, Ke Zhou

2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology > 3 > 463 - 466

2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology

To reduce the human effort in labeling the training set for document classification, some learning algorithms ask users to give the representative keywords for each class rather than any labeled documents. The key challenge in such \emph {keyword-labeled classification} is how to learn the high quality classifier with

chapter

An improved method of keywords extraction based on short technology text

Jun Wang, Lei Li, Fuji Ren

Proceedings of the 6th International Conference on Natural Language Processing and Knowledge Engineering(NLPKE-2010) > 1 - 6

2010 International Conference on Natural Language Processing and Knowledge Engineering (NLP-KE 2010)

Keywords are the critical resources of information management and retrieval, automatic text classification and clustering. The keywords extraction plays an important role in the process of constructing structured text. Current algorithms of keywords extraction have matured in some ways. However the errors of word

chapter

An evaluation of existing and new feature selection metrics in text categorization

S. Tasci, T. Gungor

2008 23rd International Symposium on Computer and Information Sciences > 1 - 6

2008 23rd International Symposium on Computer and Information Sciences

metrics used in text categorization by using local and global policies. For the experiments, we use three datasets which vary in size, complexity and skewness. We use SVM as the classifier and tf-idf weighting for term weighting. We observed that almost in all metrics, local policy outperforms when the number of keywords is

chapter

Probabilistic neural network based text summarization

M. Abdel Fattah, F. Ren

2008 International Conference on Natural Language Processing and Knowledge Engineering > 1 - 6

2008 International Conference on Natural Language Processing and Knowledge Engineering (NLP-KE)

This work proposes an approach to address the problem of improving content selection in automatic text summarization by using probabilistic neural network (PNN). This approach is a trainable summarizer, which takes into account several features, including sentence position, positive keyword, negative keyword, sentence

chapter

Document classification efficiency of phrase-based techniques

N. Kapalavayi, S.N.J. Murthy, Gongzhu Hu

2009 IEEE/ACS International Conference on Computer Systems and Applications > 174 - 178

2009 7th IEEE/ACS International Conference on Computer Systems and Applications (AICCSA-2009)

Due to the exponential growth of available text documents in digital form, it is of great importance to develop techniques for automatic document classification based on the textual contents. Earlier document classification techniques have used keyword-based features and related statistics to achieve good results when

chapter

An Improved Method for Ranking of Search Results Based on User Interest

Hong-Rong Yang, Ming Xu, Ning Zheng

2008 IFIP International Conference on Network and Parallel Computing > 132 - 138

2008 IFIP International Conference on Network and Parallel Computing

huge irrelevant search hits. In this paper, we propose an improved method for ranking of search results to reduce human efforts on locating interesting hits. The search results are re-ranked using adaptive user interest hierarchies (AUIH), which considers both investigator-defined keywords and user interest learnt from

chapter

Automatic image annotation using group sparsity

Shaoting Zhang, Junzhou Huang, Yuchi Huang, Yang Yu, more

2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition > 3312 - 3319

2010 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Automatically assigning relevant text keywords to images is an important problem. Many algorithms have been proposed in the past decade and achieved good performance. Efforts have focused upon model representations of keywords, but properties of features have not been well investigated. In most cases, a group of

chapter

A New Text Feature Conversion Method for Text Classification

Minghan Hu, Ying Liu, Lei Wang, Debin Ren

2009 Sixth International Conference on Fuzzy Systems and Knowledge Discovery > 1 > 62 - 66

2009 Sixth International Conference on Fuzzy Systems and Knowledge Discovery (FSKD 2009)

Keywords normally carry large amount of category information. In order to fully utilize this kind of information for text classification, this paper proposes a new text feature conversion method based on the SKG model. The method uses the classified texts with the listed key words as the training data to train the

chapter

Fisher Kernels for Handwritten Word-spotting

F. Perronnin, J.A. Rodriguez-Serrano

2009 10th International Conference on Document Analysis and Recognition > 106 - 110

2009 10th International Conference on Document Analysis and Recognition (ICDAR)

The Fisher kernel is a generic framework which combines the benefits of generative and discriminative approaches to pattern classification. In this contribution, we propose to apply this framework to handwritten word-spotting. Given a word image and a keyword generative model, the idea is to generate a vector which

chapter

Web image annotation using word co-occurrence and association analysis

Yin-Fu Huang, Hsiu-Chen Liao

2008 23rd International Symposium on Computer and Information Sciences > 1 - 4

2008 23rd International Symposium on Computer and Information Sciences

. First, the related textual information associated with Web images is identified as the candidate annotations for Web images. Second, the word co-occurrence is utilized to eliminate irrelevant keywords for improving the annotation accuracy. Then, the keyword-based association analysis is exploited to further discover

chapter

Language model adaptation using WWW documents obtained by utterance-based queries

Andreas Tsiartas, Panayiotis Georgiou, Shrikanth Narayanan

2010 IEEE International Conference on Acoustics, Speech and Signal Processing > 5406 - 5409

2010 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2010

results in up to 1.1% absolute Word Error Rate (WER) improvement as compared to keyword-based approaches. The proposed approach reduces the WER by 6.3% absolute in our experiments, compared to an in-domain LM without considering any Web data.

chapter

A Framework for the Classification of Unstructured Data

D.A. Ostrowski

2009 IEEE International Conference on Semantic Computing > 373 - 377

2009 IEEE International Conference on Semantic Computing (ICSC)

mechanisms with a traditional indexing method. The goal is to identify a higher semantic content and more meaningful keyword combinations, considering both supervised and unsupervised techniques. Within a specific implementation both Bayesian learning as well as clustering are integrated to support a boost parameter towards

chapter

The improved features selection for text classification

Yun Yang, Yanan Wu

2010 2nd International Conference on Computer Engineering and Technology > 6 > V6-268 - V6-271

2010 2nd International Conference on Computer Engineering and Technology (ICCET)

of cultural information. Therefore, text categorization research has become more important. The paper improved the precision of the traditional text categorization by the process that we mended the weight of words and mined potential keywords, then found their relationship. In the end of the paper, an experiment was

Keywords:
TRAINING
TEXT ANALYSIS

Publication date

Set your own date range

INFONA - science communication portal

Search results

Web-based keyword adapted Language Modeling for Keyword Spotting

Keyword Spotting from Online Chinese Handwritten Documents Using One-vs-All Trained Character Classifier

LDA-based keyword selection in text categorization

Keyword Spotting in Online Handwritten Documents Containing Text and Non-text Using BLSTM Neural Networks

A corpus-based approach for keyword identification using supervised learning techniques

To Determine the Weight in a Weighted Sum Method for Domain-Specific Keyword Extraction

Adapting BLSTM Neural Network Based Keyword Spotting Trained on Modern Data to Historical Documents

Keyword-Labeled Classification with Auxiliary Unlabeled Documents

An improved method of keywords extraction based on short technology text

An evaluation of existing and new feature selection metrics in text categorization

Probabilistic neural network based text summarization

Document classification efficiency of phrase-based techniques

An Improved Method for Ranking of Search Results Based on User Interest

Automatic image annotation using group sparsity

A New Text Feature Conversion Method for Text Classification

Fisher Kernels for Handwritten Word-spotting

Web image annotation using word co-occurrence and association analysis

Language model adaptation using WWW documents obtained by utterance-based queries

A Framework for the Classification of Unstructured Data

The improved features selection for text classification

Filter options

Publication date

Content availability

Publication type

Keywords

INFONA - science communication portal

Search results

Add recipient

Sending message cancelled

Are you sure you want to cancel sending this message?

Send message

Filter options

Publication date

Date range setting

Set the date range to filter the displayed results. You can set a starting date, ending date or both. You can enter the dates manually or choose them from the calendar.

Content availability

Publication type

Keywords

Reporting an error / abuse

Sending the report failed

Accessibility options