Items from 1 to 20 out of 93 results

chapter

Hierarchical document clustering based on cosine similarity measure

Shraddha K. Popat, Pramod B. Deshmukh, Vishakha A. Metre

2017 1st International Conference on Intelligent Systems and Information Management (ICISIM) > 153 - 159

Clustering is one of the prime topics in data mining. Clustering partitions the data and classifies it into meaningful subgroups. Document clustering partitions a set of documents into groups such that different groups show different characteristics with respect to likeness. In this paper, an experimental exploration of a similarity-based method, HSC, for measuring the similarity between data objects particularly...

chapter

On multiclass text classification algorithm based on 1-a-r and multiconlitron

Yuping Qin, Fengfeng Qin, Qiangkui Leng, Aihua Zhang

2017 6th Data Driven Control and Learning Systems (DDCLS) > 370 - 373

2017 IEEE 6th Data Driven Control and Learning Systems Conference (DDCLS)

Aiming at the multiclass text categorization problem, a classification algorithm based on multiconlitron and the 1-a-r method is presented. The 1-a-r method is used to convert a multiclass categorization problem into several binary problems. A multiconlitron is constructed for each binary problem in the input space. For the text to be classified, its class is decided by the multiconlitrons. The classification experiments are...

chapter

Using KNN algorithm for classification of textual documents

Aiman Moldagulova, Rosnafisah Bte. Sulaiman

2017 8th International Conference on Information Technology (ICIT) > 665 - 671

The exponential growth in the generation of textual documents and the emergent need to structure them have increased attention to the automated classification of documents into predefined categories. There is a wide range of supervised learning algorithms that deal with text classification. This paper deals with an approach for building a machine learning system in R that uses K-Nearest Neighbors...

chapter

Naive Bayes classifiers for music emotion classification based on lyrics

Yunjing An, Shutao Sun, Shujuan Wang

2017 IEEE/ACIS 16th International Conference on Computer and Information Science (ICIS) > 635 - 638

There is a constantly growing interest in evaluating music information retrieval (MIR) systems that can provide effective management of music resources. A crucial characteristic of music is its emotion, which reflects human perception. To make the automatic classification of Chinese music emotions more effective, we use the lyrics of music to analyze and classify music based on emotion....

chapter

Fusing Gini Index and Term Frequency for Text Feature Selection

Lin Wu, Yongbin Wang, Shengyan Zhang, Yannan Zhang

2017 IEEE Third International Conference on Multimedia Big Data (BigMM) > 280 - 283

Automatic text classification is a key technology for processing and organizing large-scale text data. It is well known that the high dimensionality of the feature space is a main challenge for text classification. To mitigate this problem, and inspired by existing work, we propose an effective text feature selection algorithm that fuses the classical methodologies of Gini index and...

chapter

Text categorization using Rocchio algorithm and random forest algorithm

S. Thamarai Selvi, P. Karthikeyan, A. Vincent, V. Abinaya, et al.

2016 Eighth International Conference on Advanced Computing (ICoAC) > 7 - 12

Millions of file uploads and downloads happen every minute, resulting in big data creation, and manual text categorization is not possible. Hence, there is a need for automatic categorization of documents that makes storage and retrieval more efficient. This research paper proposes a hybrid text categorization model that combines the Rocchio algorithm and the Random Forest algorithm to perform Multi-label...

chapter

An empirical analysis and classification of crisis related tweets

J. Rexiline Ragini, P. M. Rubesh Anand

2016 IEEE International Conference on Computational Intelligence and Computing Research (ICCIC) > 1 - 4

Social media generates a large volume of data through tweets and text messages during and after any disaster. The analysis and classification of the data obtained at the time of a disaster is essential for conveying the information to the appropriate rescue personnel. In this paper, an automated text classification system is proposed in order to classify the data effectively. The classification of...

chapter

Feature selection for text classification using genetic algorithms

Noria Bidi, Zakaria Elberrichi

2016 8th International Conference on Modelling, Identification and Control (ICMIC) > 806 - 810

In text classification, feature selection is essential to improve classification effectiveness. This paper provides an empirical study of a feature selection method based on genetic algorithms for different text representation methods. This feature selection algorithm can accomplish two goals: on the one hand, the search for a feature subset such that the performance of the classifier is best; on the other...

chapter

Maximal frequent sequences for document classification

Hai Nguyen Thi Tuyet, Tan Hanh

2016 International Conference on Advanced Technologies for Communications (ATC) > 152 - 157

Document classification has attracted considerable attention from researchers due to the increase in digital documents and the need to organize them. One of the most popular approaches to this problem is based on machine learning techniques [1]. However, the result of classification depends heavily on the linguistic preprocessing and the document representation. The dependence...

chapter

Authorship identification of the Azerbaijani texts using n-grams

K.R. Aida-zade, S.Q. Talibov

2016 IEEE 10th International Conference on Application of Information and Communication Technologies (AICT) > 1 - 3

The purpose of this study is to show how n-grams are used for author recognition in the Azerbaijani language. Monograms and digrams are taken as the attribute vectors for authorship analysis. We have developed a new approach to determining the attribute vectors for recognizing the author of an unknown text.

chapter

Structural learning framework for binary short text classification

Wuying Liu, Lin Wang

2016 12th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD) > 1188 - 1193

2016 12th International Conference on Natural Computation and 13th Fuzzy Systems and Knowledge Discovery (ICNC-FSKD)

With the fast-paced prevalence of smartphones, binary short text classification (STC) is becoming a basic and challenging issue, and relevant STC algorithms can be successfully used in spam filtering for short message service (SMS), WeChat, microblogging, and so on. In this manuscript, we address the structural feature of SMS documents and propose a structural learning framework, which decomposes...

chapter

A hybrid statistical and semantic model for identification of mental health and behavioral disorders using social network analysis

Madan Krishnamurthy, Khalid Mahmood, Pawel Marcinek

2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM) > 1019 - 1026

The advent of social networking and open health web forums such as PatientsLikeMe, WebMD, ehealth forum etc. has provided avenues for social user data that can prove instrumental in suggesting futuristic trends in healthcare. Homophily in social networks is a vital contributor for analyzing patterns for medical conditions, diagnosis and treatment options. Since members with similar medical issues...

chapter

Research on Text Categorization of KNN Based on K-Means for Class Imbalanced Problem

Wang Yu, Xu Linying

2016 Sixth International Conference on Instrumentation & Measurement, Computer, Communication and Control (IMCCC) > 579 - 583

With the rapid development of the Web and the rapid expansion of text information, how to effectively organize and manage this information is a great challenge for current information science. Automatic text classification technology can effectively organize a large number of texts and help people to improve the efficiency of information retrieval. It has become one of the most important research...

chapter

An Improved Parallel Algorithm for Text Categorization

Wenchuan Yang, Yimin Fu, Dong Zhang

2016 International Symposium on Computer, Consumer and Control (IS3C) > 451 - 454

This paper proposes an approach using a MapReduce-based Rocchio relevance feedback algorithm, which improves the traditional Rocchio algorithm in the MapReduce paradigm, to resolve the problem of massive information filtering. Traditional text classification algorithms have a vital impact on information filtering.

chapter

Autonomous website categorization with pre-defined dictionary

Adsadawut Chanakitkarnchok, Kulit Na Nakorn, Kultida Rojviboolchai

2016 13th International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology (ECTI-CON) > 1 - 6

In this era of emerging technology, the number of websites is increasing dramatically. The content and categories of information are overflowing the Internet. Finding the right information among almost a billion websites is considerably hard, but finding accurate, high-quality information is even harder. Hence, the demand for website categorization is increasing tremendously. Unfortunately, the...

chapter

A novel text mining approach based on TF-IDF and Support Vector Machine for news classification

Seyyed Mohammad Hossein Dadgar, Mohammad Shirzad Araghi, Morteza Mastery Farahani

2016 IEEE International Conference on Engineering and Technology (ICETECH) > 112 - 116

With the development of weblogs and social networks, many news providers share their news headlines on different websites and weblogs. One of the main text mining topics is how to classify news into different groups. This study aims to classify news into various groups so that users can identify the most popular news group in the desired country at any given time. Based on Term Frequency-Inverse Document...

chapter

Document classification with a weighted frequency pattern tree algorithm

Froila Helixia Dsouza, Ananthanarayana V.S.

2016 International Conference on Data Mining and Advanced Computing (SAPIENCE) > 29 - 34

Document classification can be defined as the task of automatically categorizing collections of electronic documents into their annotated classes, based on their contents. It is an important problem in data mining. Due to the exponential growth of documents on the Internet and the emergent need to organize them, developing an efficient document classification method to automatically manipulate web...

chapter

Online analysis of sentiment on Twitter

Shokoufeh Salem Minab, Mehrdad Jalali, Mohammad Hossein Moattar

2015 International Congress on Technology, Communication and Knowledge (ICTCK) > 359 - 365

Social media such as Twitter create a space for expressing thoughts and opinions on various topics and events, and millions of users can share their ideas on this microblog. Twitter has therefore become a source for information exploration, decision making, and sentiment analysis. There is a sense in all of the texts, but it is more important to provide strategies for obtaining suitable...

chapter

Text categorization with machine learning and hierarchical structures

M. Krendzelak, F. Jakab

2015 13th International Conference on Emerging eLearning Technologies and Applications (ICETA) > 1 - 5

Text categorization with machine learning algorithms usually assumes a flat set of categories. Such classifiers are very domain specific and not reusable for other generic text classifications. It is very possible that a hierarchically structured set of categories might have a higher impact on the way classifiers are used and built. As presented in this document, the list of most common...

chapter

Improved Expected Cross Entropy Method for Text Feature Selection

Guohua Wu, Liuyang Wang, Nailiang Zhao, Hairong Lin

2015 International Conference on Computer Science and Mechanical Automation (CSMA) > 49 - 54

Feature selection plays an important role in text categorization, and contributes directly to the accuracy of the categorization. Because the traditional expected cross entropy algorithm does not consider document frequency during feature selection, we first improve the traditional expected cross entropy formula, and then propose an improved text feature selection...

Content availability:
Available
Keywords:
ALGORITHM DESIGN AND ANALYSIS
CLASSIFICATION ALGORITHMS
TEXT CATEGORIZATION

Publication type

book (92)
article (1)

Keywords

TRAINING (42)
TEXT ANALYSIS (39)
TEXT CLASSIFICATION (31)
SUPPORT VECTOR MACHINES (28)
PATTERN CLASSIFICATION (22)
FEATURE EXTRACTION (17)
ACCURACY (16)
CLASSIFICATION (14)
DATA MINING (14)
FEATURE SELECTION (14)
CLUSTERING ALGORITHMS (13)
SUPPORT VECTOR MACHINE CLASSIFICATION (12)
LEARNING (ARTIFICIAL INTELLIGENCE) (11)
MACHINE LEARNING ALGORITHMS (11)
SUPPORT VECTOR MACHINE (11)
COMPUTERS (9)
INTERNET (9)
KNN (9)
BAYES METHODS (8)
MACHINE LEARNING (8)
ENTROPY (7)
INFORMATION RETRIEVAL (7)
PATTERN CLUSTERING (7)
SVM (7)
TESTING (7)
TEXT MINING (7)
DATABASES (6)
KERNEL (6)
COMPUTER SCIENCE (5)
FILTERING (5)
PROBABILITY (5)
SEMANTICS (5)
STATISTICAL ANALYSIS (5)
TRAINING DATA (5)
VECTOR SPACE MODEL (5)
VECTORS (5)
GENETIC ALGORITHMS (4)
HEURISTIC ALGORITHMS (4)
INFORMATION GAIN (4)
NAIVE BAYES ALGORITHM (4)
WORLD WIDE WEB (4)
BAYESIAN METHODS (3)
CLASSIFICATION TREE ANALYSIS (3)
COMPLEXITY THEORY (3)
DECISION TREES (3)
DOCUMENT CLASSIFICATION (3)
EXPERT SYSTEMS (3)
FEATURE SELECTION ALGORITHM (3)
INDEXES (3)
INFORMATION MANAGEMENT (3)
K-NEAREST NEIGHBOR (3)
KNN ALGORITHM (3)
MATHEMATICAL MODEL (3)
MEASUREMENT (3)
MUTUAL INFORMATION (3)
NATURAL LANGUAGES (3)
NIOBIUM (3)
OPTIMIZATION (3)
SOFTWARE ALGORITHMS (3)
TEXT CLASSIFICATION ALGORITHM (3)
WEIGHT MEASUREMENT (3)
APPROXIMATION ALGORITHMS (2)
ASSOCIATION RULE MINING (2)
AUTOMATIC TEXT CLASSIFICATION (2)
CHINESE TEXT CLASSIFIER (2)
CLASSIFICATION ALGORITHM (2)
COMPUTATIONAL MODELING (2)
CORRELATION (2)
CORRELATION ANALYSIS (2)
DECISION MAKING (2)
DECISION SUPPORT SYSTEMS (2)
DIMENSION REDUCTION (2)
DISTANCE MEASUREMENT (2)
DOCUMENT HANDLING (2)
ENCODING (2)
GENETIC ALGORITHM (2)
GINI INDEX (2)
GRAPH THEORY (2)
INCREMENTAL LEARNING (2)
INDEXING (2)
INFERENCE MECHANISMS (2)
INFORMATION GAIN ALGORITHM (2)
K-MEANS ALGORITHM (2)
K-NN (2)
KNOWLEDGE ENGINEERING (2)
MAPREDUCE (2)
MARINE VEHICLES (2)
MEDIA (2)
NAïVE BAYES (2)
NAIVE BAYES (2)
NATURAL LANGUAGE PROCESSING (2)
NEAREST NEIGHBOR SEARCHES (2)
NOISE (2)
PARTITIONING ALGORITHMS (2)
PORTUGUESE LANGUAGE (2)
PREDICTION ALGORITHMS (2)
PRINCIPAL COMPONENT ANALYSIS (2)

INFONA - science communication portal
