Search results

Items from 1 to 11 out of 11 results

chapter

Text categorization of Enron email corpus based on information bottleneck and maximal entropy

Man Wang, Yifan He, Minghu Jiang

IEEE 10th INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING PROCEEDINGS > 2472 - 2475

2010 10th International Conference on Signal Processing (ICSP 2010)

This paper addresses text categorization of the Enron email corpus. We use the information bottleneck (IB) method to cluster the key words based on their distribution over different class labels, then we use threads and address groups as additional features for email texts, and the maximal entropy model to improve the accuracy of the classifier. Our experimental results show that these measures can improve...

chapter

A MultiExpert Approach for Bayesian Network Structural Learning

F. Colace, M. De Santo, M. Vento

2010 43rd Hawaii International Conference on System Sciences > 1 - 11

2010 43rd Hawaii International Conference on System Sciences (HICSS-43)

The determination of a Bayesian network structure, especially in the case of wide domains, can often be complex, time consuming and imprecise. Therefore the interest of the scientific community in learning Bayesian network structure from data is increasing: many techniques or disciplines, such as data mining, text categorization, and ontology building, can take advantage of structural learning. In literature...

chapter

An efficient k-means algorithm integrated with Jaccard distance measure for document clustering

M.-U.-S. Shameem, R. Ferdous

2009 First Asian Himalayas International Conference on Internet > 1 - 6

2009 First Asian Himalayas International Conference on Internet. AH-ICI 2009

Document Clustering is a widely studied problem in Text Categorization. It is the process of partitioning or grouping a given set of documents into disjoint clusters where documents in the same cluster are similar. K-means, one of the simplest unsupervised learning algorithms, solves the well known clustering problem following a simple and easy way to classify a given data set through a certain number...

chapter

Chinese text sentiment classification based on granule network

Xia Zhang, Suzhen Wang, Mingzhu Xu, Yixin Yin

2009 IEEE International Conference on Granular Computing > 775 - 778

2009 IEEE International Conference on Granular Computing (GrC 2009)

With the expansion of text comment information, text sentiment classification has become a hot issue. Domestic research on Chinese sentiment classification mainly focuses on segmentation and feature selection or on classification algorithms based on statistics. Rule mining is an important technique for text classification. This paper proposes a new approach which applies rule mining by...

chapter

Categorization, clustering and association rule mining on WWW

S.S. Bedi, H. Yadav, P. Yadav

2009 International Multimedia, Signal Processing and Communication Technologies > 173 - 177

2009 International Multimedia, Signal Processing and Communication Technologies (IMPACT-2009)

Clustering techniques have been used by many intelligent software agents in order to retrieve, filter, and categorize documents available on the World Wide Web. Clustering is also useful in extracting salient features of related Web documents to automatically formulate queries and search for other similar documents on the Web. Traditional clustering algorithms either use a priori knowledge of document...

chapter

A novel feature weight algorithm for text categorization

Wenqian Shang, Hongbin Dong, Haibin Zhu, Yongbin Wang

2008 International Conference on Natural Language Processing and Knowledge Engineering > 1 - 7

2008 International Conference on Natural Language Processing and Knowledge Engineering (NLP-KE)

With the development of the Web, large numbers of documents are put onto the Internet. More and more digital libraries, news sources and inner data of companies are available. Automatic text categorization becomes more and more important for dealing with massive data. However, text preprocessing is still the bottleneck of text categorization based on vector space model (VSM). The result of text preprocessing...

chapter

Information Extraction, Search, Interaction and Collaboration on the Web in Mexico

J.A. Sanchez, E. Chavez, M. Montes

2008 Latin American Web Conference > 156 - 164

2008 Latin American Web Conference (LA-WEB)

Web research in Mexico has been addressing issues related mainly to search mechanisms, information extraction, and mediating user interaction and group collaboration. In this paper we provide an overview of representative projects in the area and present a sample of recent advances by research groups in Mexican institutions. These include initiatives aimed at exploring extraction techniques that regard...

chapter

Text Classification Based on Rule Mining by Granule Network Constructing

Xia Zhang, Yixin Yin, Xiuyan Meng, Hailong Zhao

2008 Fifth International Conference on Fuzzy Systems and Knowledge Discovery > 2 > 514 - 518

2008 Fifth International Conference on Fuzzy Systems and Knowledge Discovery (FSKD)

Text classification is one of the practices of knowledge discovery. Designation of the classifier is the most important part of text classification. Compared with the methods based on statistical theory, classification based on rule learning is a better one in some situations. A granular computing approach is proposed to learn rules by constructing a granule network while classifying texts. The algorithm...

chapter

A Web Text Classification Rules Extraction Algorithm

He Liu, Dayou Liu, Xiaohu Shi

2008 Fourth International Conference on Natural Computation > 1 > 693 - 697

2008 Fourth International Conference on Natural Computation (ICNC)

Text classification is a very important technique for gathering Web information. A novel approach based on multi-population collaborative optimization is proposed for the extraction of Web text classification rules. The information entropy was applied for the initialization of the populations and the multi-population collaborative optimization was applied for the evolution of the populations. The...

chapter

Measuring the representativeness of index terms in literary texts: an experiment on the Quran

Hayati Abd Rahman, Shahrul Azman Noah, Hector Jimenez-Salazar

2008 International Symposium on Information Technology > 2 > 1 - 5

2008 International Symposium on Information Technology

Concept hierarchy is a hierarchically organized collection of domain concepts. It is particularly useful in many applications such as information retrieval, document browsing and document classification. One of the important tasks in the construction of concept hierarchy is the identification of suitable terms with appropriate size of domain vocabulary. One way of achieving such a size is by using...

chapter

A novel risk assessment system for port state control inspection

Zhong Gao, Guanming Lu, Mengjue Liu, Meng Cui

2008 IEEE International Conference on Intelligence and Security Informatics > 242 - 244

2008 IEEE International Conference on Intelligence and Security Informatics (ISI 2008)

Port state control (PSC) inspection is the most important mechanism to ensure world marine safety. Recently, some SVM-based risk assessment systems have been presented. They estimate the risk of each candidate ship based on its generic factors and history inspection factors to select high-risk ones before conducting on-board PSC inspection. However, how to improve the performance of the...

Filter options

Keywords:
DATA MINING
ENTROPY
