Search results

Items from 1 to 5 out of 5 results

chapter

Dynamic Fluzzy Clustering Algorithm for Web Documents Mining

Qi Luo

2010 International Conference on Computational Intelligence and Security > 64 - 67

2010 International Conference on Computational Intelligence and Security (CIS 2010)

This paper first studies the methods of web documents mining and text clustering, and summarizes the fuzzy clustering algorithms and similarity measure functions, then proposes a modified similarity function which can solve the problems of feature selection and feature extraction in high-dimensional space. Finally, this paper puts forward a dynamic fluzzy clustering algorithm (DCFCM) by combining...

chapter

An efficient k-means algorithm integrated with Jaccard distance measure for document clustering

M.-U.-S. Shameem, R. Ferdous

2009 First Asian Himalayas International Conference on Internet > 1 - 6

2009 First Asian Himalayas International Conference on Internet. AH-ICI 2009

Document Clustering is a widely studied problem in Text Categorization. It is the process of partitioning or grouping a given set of documents into disjoint clusters where documents in the same cluster are similar. K-means, one of the simplest unsupervised learning algorithms, solves the well known clustering problem following a simple and easy way to classify a given data set through a certain number...
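The title of this entry names the Jaccard distance as the measure used for clustering. As a rough illustration only, and not the authors' implementation, the sketch below computes the standard Jaccard distance between two documents treated as sets of whitespace-separated lowercase terms. The function name `jaccard_distance` and the tokenization are assumptions made here for the example.

```python
def jaccard_distance(doc_a: str, doc_b: str) -> float:
    """Return 1 - |A ∩ B| / |A ∪ B| over the sets of lowercase tokens.

    Tokenization by whitespace is an assumption for this sketch; the
    paper's own preprocessing is not given in the truncated abstract.
    """
    terms_a = set(doc_a.lower().split())
    terms_b = set(doc_b.lower().split())
    union = terms_a | terms_b
    if not union:
        # Convention chosen here: two empty documents are identical.
        return 0.0
    return 1.0 - len(terms_a & terms_b) / len(union)


if __name__ == "__main__":
    # The two documents share 2 of 4 distinct terms.
    print(jaccard_distance("web document clustering", "document clustering algorithm"))
```

A k-means variant could use such a distance when assigning documents to the nearest cluster representative, though how the paper integrates it is not stated in the excerpt above.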

chapter

Comparative Advantage Approach for Sparse Text Data Clustering

Jie Ji, T.Y.T. Chan, Qiangfu Zhao

2009 Ninth IEEE International Conference on Computer and Information Technology > 1 > 3 - 8

2009 Ninth IEEE International Conference on Computer and Information Technology. CIT 2009

Document clustering is the process of partitioning a set of n unlabeled documents into clusters such that documents in each cluster share some common concepts. Each concept is conveniently represented by some key terms. Using words as features, text data are represented as vectors in a very high dimensional vector space. However, most documents are sparse vectors, for example, more than ten thousand...
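To make the sparse-vector representation mentioned in this abstract concrete, here is a minimal sketch, assuming a plain bag-of-words model: each document stores only the terms it contains, and cosine similarity is computed over the shared terms. The names `to_sparse_vector` and `cosine_similarity` are hypothetical helpers for this example and are not taken from the paper.

```python
import math
from collections import Counter


def to_sparse_vector(text: str) -> Counter:
    """Map a document to term counts, storing only terms that occur."""
    return Counter(text.lower().split())


def cosine_similarity(u: Counter, v: Counter) -> float:
    """Cosine similarity of two sparse term-count vectors."""
    if len(u) > len(v):
        u, v = v, u  # iterate over the shorter vector
    dot = sum(weight * v[term] for term, weight in u.items() if term in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # an empty document is treated as dissimilar to everything
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    a = to_sparse_vector("sparse text data clustering")
    b = to_sparse_vector("text clustering of patents")
    print(cosine_similarity(a, b))
```

Because only nonzero entries are stored, the cost of a comparison depends on the number of distinct terms in the shorter document rather than on the size of the vocabulary.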

chapter

Development of a multilingual text mining approach for knowledge discovery in patents

Chung-Hong Lee, Hsin-Chang Yang, Yi-Ju Li

2009 IEEE International Conference on Systems, Man and Cybernetics > 2265 - 2269

2009 IEEE International Conference on Systems, Man and Cybernetics. SMC 2009

In this paper we describe our work on developing a novel technique for discovery of implicit knowledge about patents from multilingual patent information sources. In this work we developed a system platform to support locating similar and relevant multilingual patent documents. The platform was implemented using a multilingual vector space based on the latent semantic indexing (LSI) model, and utilizing...

chapter

Hybridization of K-Means and Harmony Search Methods for Web Page Clustering

R. Forsati, M.R. Meybodi, M. Mahdavi, A.G. Neiat

2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology > 1 > 329 - 335

2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology

Clustering is currently one of the most crucial techniques for dealing with massive amounts of heterogeneous information on the web, which is beyond human beings' capacity to digest. Recent studies have shown that the most commonly used partitioning-based clustering algorithm, the K-means algorithm, is more suitable for large datasets. However, the K-means algorithm can generate a local optimal...

Filter options

Keywords:
CLASSIFICATION ALGORITHMS
TEXT ANALYSIS
DOCUMENT CLUSTERING

INFONA - science communication portal
