Search results

Items from 1 to 13 out of 13 results

chapter

Keyword-Labeled Classification with Auxiliary Unlabeled Documents

Congle Zhang, Dikan Xing, Ke Zhou

2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology > 3 > 463 - 466

2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology

To reduce the human effort in labeling the training set for document classification, some learning algorithms ask users to give the representative keywords for each class rather than any labeled documents. The key challenge in such keyword-labeled classification is how to learn the high quality classifier with

chapter

Unsupervised Contextual Keyword Relevance Learning and Measurement using PLSA

S. Sudarsun, Dalou Kalaivendhan, M. Venkateswarlu

2006 Annual IEEE India Conference > 1 - 6

2006 Annual IEEE India Conference

In this paper, we have developed a probabilistic approach using PLSA for the discovery and analysis of contextual keyword relevance based on the distribution of keywords across a training text corpus. We have shown experimentally, the flexibility of this approach in classifying keywords into different domains based on

chapter

Topic Distillation and Clustering Algorithm Based on the Topology of Pages-Keywords

Jian-Shuang Deng, Qi-Lun Zheng, Hong Peng

2006 International Conference on Machine Learning and Cybernetics > 1581 - 1586

Proceedings of 2006 International Conference on Machine Learning and Cybernetics

easy to bring the problem of topic excursion. Hits algorithm requires a number of pages as the basic-set for calculating and cannot be used in plain texts. This paper introduces a new algorithm: PK-TDC which makes use of the iterative idea of Hits. PK-TDC searches the authority pages and keywords on the topology of pages

chapter

Document classification efficiency of phrase-based techniques

N. Kapalavayi, S.N.J. Murthy, Gongzhu Hu

2009 IEEE/ACS International Conference on Computer Systems and Applications > 174 - 178

2009 7th IEEE/ACS International Conference on Computer Systems and Applications (AICCSA-2009)

Due to the exponential growth of available text documents in digital form, it is of great importance to develop techniques for automatic document classification based on the textual contents. Earlier document classification techniques have used keyword-based features and related statistics to achieve good results when

chapter

An information arrangement technique for a text classification and summarization based on a summarization frame

S. Tsuchiya, E. Yoshimura, H. Watabe

2009 International Conference on Natural Language Processing and Knowledge Engineering > 1 - 5

2009 International Conference on Natural Language Processing and Knowledge Engineering (NLP-KE)

can be expected to be achieved in a QA system. Sentences are classified according to the content. Each classification is classified into a more detailed field. Important keywords are extracted from the sentences classified into the field. Moreover, the extracted keywords are classified into common and peculiar word for

chapter

An Examination of the Effectiveness of Social Tagging for Resource Discovery

D.H.-L. Goh, Chei Sian Lee, A.Y.K. Chua, K. Razikin

2008 International Workshop on Information-Explosion and Next Generation Search > 23 - 30

2008 International Workshop on Information-Explosion and Next Generation Search (INGS)

Social tagging allows users to assign keywords (tags) to resources, facilitating their future access by the tag creator, and possibly by other users. In terms of its support for resource discovery, social tagging has both proponents and critics. This paper investigates whether tags are an effective means for

chapter

A Cluster-based Approach to Filtering Spam under Skewed Class Distributions

Wen-feng Hsiao, Te-ming Chang, Guo-hsin Hu

2007 40th Annual Hawaii International Conference on System Sciences (HICSS'7) > 53

Proceedings of the 40th Annual Hawaii International Conference on System Sciences

The purpose of this research is to propose an appropriate classification approach to improving the effectiveness of spam filtering on the issue of skewed class distributions. A clustering-based classifier is proposed to first cluster documents into several groups, and then an equal number of keywords are extracted

chapter

Simple linguistic processing effect on multi-label emotion classification

Ye Wu, F. Ren

2009 International Conference on Natural Language Processing and Knowledge Engineering > 1 - 5

2009 International Conference on Natural Language Processing and Knowledge Engineering (NLP-KE)

events. And a huge resource of text-based emotion can be found from the World Wide Web nowadays. This paper reports a study to investigate the effectiveness of using SVM (Support Vector Machine) on linguistic features considering emotion keywords and negative words, and classify a collection of blog posts sentences tagged

chapter

A Method of Semantic Dictionary Construction from On-line Encyclopedia Classifications

Yun Li, Fang Tian, F. Ren, S. Kuroiwa, et al.

2007 International Conference on Natural Language Processing and Knowledge Engineering > 82 - 89

2007 IEEE International Conference on Natural Language Processing and Knowledge Engineering (NLP-KE '07)

This paper introduces a method of constructing a semantic dictionary automatically from the keywords and classify relations of the web encyclopedia Chinese WikiPedia. Semantic units, which are affixes (core/modifier) shared between many phrased-keywords, are selected using statistic method and string affix matching

chapter

Classifying Web Pages Using Information Extraction Patterns Preliminary Results and Findings

Lay-Ki Soon, Sang Ho Lee

2010 Sixth International Conference on Signal-Image Technology and Internet Based Systems > 195 - 202

Sixth International Conference on Signal-Image Technology & Internet-Based Systems (SITIS 2010)

Web page classification plays an essential role in facilitating more efficient information retrieval and information processing. Conventionally, web text documents are represented by term frequency matrix for classification purpose. However, considering the limitations of representing documents using terms or keywords

chapter

A novel term weighting scheme with distributional coefficient for text categorization with support vector machine

Yuan Ping, Ya-jian Zhou, Yi-xian Yang, Wei-ping Peng

2010 IEEE Youth Conference on Information, Computing and Telecommunications > 182 - 185

2010 IEEE Youth Conference on Information, Computing and Telecommunications (YC-ICT 2010)

In text categorization, vectorizing a document by probability distribution is an effective dimension reduction way to save training time. However, the data sets that share many common keywords between categories affect the classification performance seriously. To address that problem, firstly, we conduct an effective

chapter

Text categorization of Enron email corpus based on information bottleneck and maximal entropy

Man Wang, Yifan He, Minghu Jiang

IEEE 10th INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING PROCEEDINGS > 2472 - 2475

2010 10th International Conference on Signal Processing (ICSP 2010)

of the classifier. Our experimental results show that these measures can improve the classifier's performance, for keywords change too rapidly in emails while address groups are much steadier.

chapter

Hybrid text mining model for document classification

K A Vidhya, G Aghila

2010 The 2nd International Conference on Computer and Automation Engineering (ICCAE) > 1 > 210 - 214

2nd International Conference on Computer and Automation Engineering (ICCAE 2010)

likelihood in the entire training documents where the training and test data are split randomly into k-subsets like 2/3 for training and 1/3 for test data. In addition, it also utilizes two level hierarchy structures for training documents like features from title, keywords and content with the predefined knowledge available

Filter options

Keywords:
TEXT ANALYSIS
CLASSIFICATION

