Search results

Items from 1 to 20 out of 474 results

chapter

Hierarchical document clustering based on cosine similarity measure

Shraddha K. Popat, Pramod B. Deshmukh, Vishakha A. Metre

2017 1st International Conference on Intelligent Systems and Information Management (ICISIM) > 153 - 159

Clustering is one of the prime topics in data mining. Clustering partitions the data and classifies it into meaningful subgroups. Document clustering divides a set of documents into groups such that different groups show different characteristics with respect to likeness. In this paper, an experimental exploration of a similarity-based method, HSC, for measuring the similarity between data objects particularly...

chapter

A comprehensive study of text classification algorithms

Vikas K Vijayan, K. R. Bindu, Latha Parameswaran

2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI) > 1109 - 1113

A huge amount of data in today's world is stored in the form of electronic documents. Text mining is the process of extracting information from those textual documents. Text classification is the process of classifying text documents into a fixed number of predefined classes. The applications of text classification include spam filtering, email routing, sentiment analysis, language identification...

chapter

Emotion classification on youtube comments using word embedding

Julio Savigny, Ayu Purwarianti

2017 International Conference on Advanced Informatics, Concepts, Theory, and Applications (ICAICTA) > 1 - 5

YouTube is one of the most popular video sharing platforms in Indonesia. A person can react to a video by commenting on it. A comment may contain an emotion that can be identified automatically. In this study, we conducted experiments on emotion classification of Indonesian YouTube comments. A corpus containing 8,115 YouTube comments was collected and manually labelled using 6 basic emotion label...

chapter

Sentiment Classification: Feature Selection Based Approaches Versus Deep Learning

Alper Kursat Uysal, Yi Lu Murphey

2017 IEEE International Conference on Computer and Information Technology (CIT) > 23 - 30

Classification of text documents is commonly carried out using various bag-of-words models that are generated using feature selection methods. In these models, selected features are used as input to well-known classifiers such as Support Vector Machines (SVM) and neural networks. In recent years, a technique called word embeddings has been developed for text mining, and deep learning models using...

chapter

Reasearch on feature mapping based on labels information in multi-label text classification

Tao Wang, Tao Luo, Jianfeng Li, Cong Wang

2017 7th IEEE International Conference on Electronics Information and Emergency Communication (ICEIEC) > 452 - 456

Feature representation plays an important role in text classification. Feature mapping based on label information is an algorithm suitable for Binary Relevance. Compared with conventional text representations, it keeps the dimension of the text under control by means of word embedding. More importantly, it takes full advantage of the general characteristics of the label on text representation...

chapter

Sentiment Classification Incorporating User Profile

Yuyang Xu, Bing Li

2017 4th International Conference on Information Science and Control Engineering (ICISCE) > 663 - 667

With the emergence of Internet social shopping platforms, a large sentiment corpus is accumulating rapidly. Sentiment classification, which is a specific application of sentiment analysis, has received a lot of attention from researchers in the field of natural language processing. The traditional method of classifying sentiment text is usually limited to the content of the text. However,...

chapter

The evaluation of heterogeneous classifier ensembles for Turkish texts

Zeynep Hilal Kilimci, Selim Akyokus, Sevinc Ilhan Omurca

2017 IEEE International Conference on INnovations in Intelligent SysTems and Applications (INISTA) > 307 - 311

The basic idea behind classifier ensembles is to use more than one classifier in the expectation of improving overall accuracy. It is known that classifier ensembles boost overall classification performance depending on two factors, namely the individual success of the base learners and diversity. One way of providing diversity is to use the same or different types of base learners. When the...

chapter

Automatic keyword extraction system for Thai website categorization system

Adsadawut Chanakitkarnchok, Kulit Na Nakorn, Kultida Rojviboonchai

2017 14th International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology (ECTI-CON) > 206 - 209

In this information era, the number of websites on the Internet has increased dramatically over a few years. Any information and services can be retrieved from a website. However, the most valuable content of a website is still the text related to the topic or category of the website. Only a few studies have focused on categorizing Thai-language information. The rest of the research...

chapter

On multiclass text classification algorithm based on 1-a-r and multiconlitron

Yuping Qin, Fengfeng Qin, Qiangkui Leng, Aihua Zhang

2017 6th Data Driven Control and Learning Systems (DDCLS) > 370 - 373

2017 IEEE 6th Data Driven Control and Learning Systems Conference (DDCLS)

Aiming at the multiclass text categorization problem, a classification algorithm based on multiconlitron and the 1-a-r method is presented. The 1-a-r method is used to convert a multiclass categorization problem into several binary problems. A multiconlitron is constructed for each binary problem in the input space. For the text to be classified, its class is decided by the multiconlitrons. The classification experiments are...

chapter

Using KNN algorithm for classification of textual documents

Aiman Moldagulova, Rosnafisah Bte. Sulaiman

2017 8th International Conference on Information Technology (ICIT) > 665 - 671

Nowadays, the exponential growth in the generation of textual documents and the emerging need to structure them are increasing attention to the automated classification of documents into predefined categories. There is a wide range of supervised learning algorithms that deal with text classification. This paper deals with an approach for building a machine learning system in R that uses K-Nearest Neighbors...

chapter

A preprocessing method of AdaBoost for mislabeled data classification

Xiangyang Liu, Yaping Dai, Yan Zhang, Qiao Yuan, et al.

2017 29th Chinese Control And Decision Conference (CCDC) > 2738 - 2742

AdaBoost is one of the most popular algorithms for classification and has been successfully used for text classification, face detection and tracking. However, noise sensitivity is regarded as a major disadvantage, and previous works show that AdaBoost overfits when dealing with data sets containing noisy data. To improve the noise tolerance of conventional AdaBoost, this paper proposed a preprocessing...

chapter

Research review on key techniques of topic-based news elements extraction

Song Qing, Zhang Ying, Zhang Pengzhou

2017 IEEE/ACIS 16th International Conference on Computer and Information Science (ICIS) > 585 - 590

With the development of computer and network techniques and the explosion of digital Chinese news texts, a massive amount of unstructured news data has emerged. A better way of extracting and storing knowledge can, on the one hand, help readers understand the core content of news and, on the other hand, support reportage through accumulated news knowledge. In recent years, information extraction technology...

chapter

Naive Bayes classifiers for music emotion classification based on lyrics

Yunjing An, Shutao Sun, Shujuan Wang

2017 IEEE/ACIS 16th International Conference on Computer and Information Science (ICIS) > 635 - 638

There is a constantly growing interest in evaluating music information retrieval (MIR) systems that can provide effective management of music resources. The crucial characteristic of music is its emotion, which reflects human perception. To make the automatic classification of Chinese music emotions more effective, we use the lyrics of music to analyze and classify music based on emotion....

chapter

Fusing Gini Index and Term Frequency for Text Feature Selection

Lin Wu, Yongbin Wang, Shengyan Zhang, Yannan Zhang

2017 IEEE Third International Conference on Multimedia Big Data (BigMM) > 280 - 283

Automatic text classification is the key technology for processing and organizing large-scale text data. It is well known that the high dimensionality of the feature space is a main challenge for text classification. To attenuate this problem, and inspired by existing work, we propose an effective text feature selection algorithm by fusing, in a novel way, the classical methodologies of Gini index and...

chapter

Decision tree rule-based feature selection for large-scale imbalanced data

Haoyue Liu, MengChu Zhou

2017 26th Wireless and Optical Communication Conference (WOCC) > 1 - 6

A class imbalance problem often appears in many real world applications, e.g. fault diagnosis, text categorization, fraud detection. When dealing with a large-scale imbalanced dataset, feature selection becomes a great challenge. To confront it, this work proposes a feature selection approach based on a decision tree rule. The effectiveness of the proposed approach is verified by classifying a large-scale...

chapter

Research on text categorization model based on LDA — KNN

Weihua Chen, Xian Zhang

2017 IEEE 2nd Advanced Information Technology, Electronic and Automation Control Conference (IAEAC) > 2719 - 2726

In text classification, the similarity between texts needs to be calculated, but existing classification methods only consider the similarity between feature words and categories and do not involve the semantic similarity between feature words. In this paper, a new classification model, LDA (Latent Dirichlet Allocation) - KNN (K-Nearest Neighbor), is proposed. LDA is used to solve the problem...

chapter

An improved text classification model for mobile data security testing

Feng Xiaorong, Lin Jun, Mai Songtao, Jia Shizhun

2017 IEEE 2nd Advanced Information Technology, Electronic and Automation Control Conference (IAEAC) > 1732 - 1736

From the perspective of mobile data security detection, a text classification model can be implemented in the application layer to detect malicious attacks. Since the traditional C4.5 decision tree has the disadvantage of not considering the interaction between properties in attribute selection, an improved C4.5 decision tree model based on the AdaBoost algorithm is put forward. The problem in measuring the...

chapter

Effective text classification using multi-level fuzzy neural network

Shima Zobeidi, Marjan Naderan, Seyed Enayatollah Alavi

2017 5th Iranian Joint Congress on Fuzzy and Intelligent Systems (CFIS) > 91 - 96

Nowadays, large volumes of text data are being produced in real time due to expansion of communication. It is necessary to organize this data for exploitation and extraction of useful information. Text classification based on the topic is one of the efficient solutions to this problem. Efficient algorithms are applied for text classification if they address high dimensional data. In this paper, a...

chapter

Sentiment classification on big data using Naïve bayes and logistic regression

Anjuman Prabhat, Vikas Khullar

2017 International Conference on Computer Communication and Informatics (ICCCI) > 1 - 5

The huge expansion of the World Wide Web has brought about a contemporary fashion of conveying the attitudes or viewpoints of human beings. It is a channel where anybody can view the opinions and sentiments of different customers. It is also possible to see opinions classified into different categories and ratings given on different products. This information plays a supreme role in sentiment classification task...

chapter

Text categorization using Rocchio algorithm and random forest algorithm

S. Thamarai Selvi, P. Karthikeyan, A. Vincent, V. Abinaya, et al.

2016 Eighth International Conference on Advanced Computing (ICoAC) > 7 - 12

Millions of file uploads and downloads happen every minute, creating big data for which manual text categorization is not possible. Hence, there is a need for automatic categorization of documents that makes storage and retrieval more efficient. This research paper proposes a hybrid text categorization model that combines both the Rocchio algorithm and the Random Forest algorithm to perform Multi-label...

Keywords: CLASSIFICATION ALGORITHMS
Publication type: book
