Search results

Items from 1 to 4 out of 4 results

chapter

A bi-directional sampling based on K-means method for imbalance text classification

Jia Song, Xianglin Huang, Sijun Qin, Qing Song

2016 IEEE/ACIS 15th International Conference on Computer and Information Science (ICIS) > 1 - 5

2016 IEEE/ACIS 15th International Conference on Computer and Information Science (ICIS)

This paper studies the imbalanced data classification problem and proposes bi-directional sampling based on clustering (BDSK) to address it. The algorithm combines the SMOTE over-sampling algorithm with an under-sampling algorithm based on K-Means to solve both the within-class and the between-class imbalance problems. It not only avoids introducing too much noise but also...

chapter

An efficient k-means algorithm integrated with Jaccard distance measure for document clustering

M.-U.-S. Shameem, R. Ferdous

2009 First Asian Himalayas International Conference on Internet > 1 - 6

2009 First Asian Himalayas International Conference on Internet. AH-ICI 2009

Document clustering is a widely studied problem in text categorization. It is the process of partitioning or grouping a given set of documents into disjoint clusters where documents in the same cluster are similar. K-means, one of the simplest unsupervised learning algorithms, solves the well-known clustering problem by following a simple and easy way to classify a given data set through a certain number...

chapter

Improving Arabic text categorization using decision trees

F. Harrag, E. El-Qawasmeh, P. Pichappan

2009 First International Conference on Networked Digital Technologies > 110 - 115

2009 First International Conference on Networked Digital Technologies (NDT 2009)

This paper presents the results of classifying Arabic text documents using a decision tree algorithm. Experiments are performed over two self-collected data corpora, and the results show that the suggested hybrid approach of Document Frequency Thresholding using an embedded information gain criterion of the decision tree algorithm is the preferable feature selection criterion. The study concluded that...

chapter

Categorization, clustering and association rule mining on WWW

S.S. Bedi, H. Yadav, P. Yadav

2009 International Multimedia, Signal Processing and Communication Technologies > 173 - 177

2009 International Multimedia, Signal Processing and Communication Technologies (IMPACT-2009)

Clustering techniques have been used by many intelligent software agents in order to retrieve, filter, and categorize documents available on the World Wide Web. Clustering is also useful in extracting salient features of related Web documents to automatically formulate queries and search for other similar documents on the Web. Traditional clustering algorithms either use a priori knowledge of document...

INFONA - science communication portal
