Search results

Items from 1 to 20 out of 374 results

chapter

Vietnamese news classification based on BoW with keywords extraction and neural network

Toan Pham Van, Ta Minh Thanh

2017 21st Asia Pacific Symposium on Intelligent and Evolutionary Systems (IES) > 43 - 48

2017 21st Asia Pacific Symposium on Intelligent and Evolutionary Systems (IES)

Nowadays, text classification (TC) becomes the main applications of NLP (natural language processing). Actually, we have a lot of researches in classifying text documents, such as Random Forest, Support Vector Machines and Naive Bayes. However, most of them are applied for English documents. Therefore, the text classification researches on Vietnamese still are limited. By using a Vietnamese news corpus,...

chapter

Detection of cyberbullying on social media messages in Turkish

Selma Ayse Ozel, Esra Sarac, Seyran Akdemir, Hulya Aksu

2017 International Conference on Computer Science and Engineering (UBMK) > 366 - 370

2017 International Conference on Computer Science and Engineering (UBMK)

The increased use of the Internet and the ease of access to online communities like social media have provided an avenue for cybercrimes. Cyberbullying, which is a kind of cybercrime, is defined as an aggressive, intentional action against a defenseless person by using the Internet, social media, or other electronic contents. Researchers have found that many of the bullying cases have tragically ended...

chapter

Categorizing the Turkish web pages by data mining techniques

Secil Sekerci Husem, Ayla Gulcu

2017 International Conference on Computer Science and Engineering (UBMK) > 255 - 260

2017 International Conference on Computer Science and Engineering (UBMK)

Today, it is not possible to use human power alone to cope with the increasing amount of data. For this reason, some automated methods are needed to group similar documents together or to place documents in predefined categories according to certain rules. The use of automated classification techniques is becoming increasingly important for this reason. In this study, a database consisting of 22 thousand...

chapter

Effects of various preprocessing techniques to Turkish text categorization using n-gram features

Ayca Deniz, Hakan Ezgi Kiziloz

2017 International Conference on Computer Science and Engineering (UBMK) > 655 - 660

2017 International Conference on Computer Science and Engineering (UBMK)

Natural Language Processing (NLP) is a prominent subject which includes various subcategories such as text classification, error correction, machine translation, etc. Unlike other languages, there are limited number of Turkish NLP studies in literature. In this study, we apply text classification on Turkish documents by using n-gram features. Our algorithm applies different preprocessing techniques,...

chapter

A comprehensive study of text classification algorithms

Vikas K Vijayan, K. R. Bindu, Latha Parameswaran

2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI) > 1109 - 1113

2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI)

Huge amount of data in today's world are stored in the form of electronic documents. Text mining is the process of extracting the information out of those textual documents. Text classification is the process of classifying text documents into fixed number of predefined classes. The application of text classification includes spam filtering, email routing, sentiment analysis, language identification...

chapter

Emotion classification on youtube comments using word embedding

Julio Savigny, Ayu Purwarianti

2017 International Conference on Advanced Informatics, Concepts, Theory, and Applications (ICAICTA) > 1 - 5

2017 International Conference on Advanced Informatics, Concepts, Theory, and Applications (ICAICTA)

Youtube is one of the most popular video sharing platform in Indonesia. A person can react to a video by commenting on the video. A comment may contain an emotion that can be identified automatically. In this study, we conducted experiments on emotion classification on Indonesian Youtube comments. A corpus containing 8,115 Youtube comments is collected and manually labelled using 6 basic emotion label...

chapter

Sentiment Classification: Feature Selection Based Approaches Versus Deep Learning

Alper Kursat Uysal, Yi Lu Murphey

2017 IEEE International Conference on Computer and Information Technology (CIT) > 23 - 30

2017 IEEE International Conference on Computer and Information Technology (CIT)

Classification of text documents is commonly carried out using various models of bag-of-words that are generated using feature selection methods. In these models, selected features are used as input to well-known classifiers such as Support Vector Machines (SVM) and neural networks. In recent years, a technique called word embeddings has been developed for text mining and, deep learning models using...

chapter

Text genre classification research

Zhijuan Xu, Lizhen Liu, Wei Song, Chao Du

2017 International Conference on Computer, Information and Telecommunication Systems (CITS) > 175 - 178

2017 International Conference on Computer, Information and Telecommunication Systems (CITS)

Essays in different text genres have different ideas and writing method. Prediction the text genres firstly will help get a better accuracy when predicting the success of literary or finding the beautiful words and sentences in the essay. And it will help set a different standard for different text genres when scoring the writing by computer. Words and structure can be effective in discriminating...

chapter

Reasearch on feature mapping based on labels information in multi-label text classification

Tao Wang, Tao Luo, Jianfeng Li, Cong Wang

2017 7th IEEE International Conference on Electronics Information and Emergency Communication (ICEIEC) > 452 - 456

2017 7th IEEE International Conference on Electronics Information and Emergency Communication (ICEIEC)

Feature representation plays an important role in text classification. Feature mapping based on labels information is an algorithm suitable for Binary Relevance. Compared with the conventional text representation, it makes the dimension of the text under control by means of word embedding. More importantly, it takes full advantage of the general characteristics of the label on text representation...

chapter

The evaluation of heterogeneous classifier ensembles for Turkish texts

Zeynep Hilal Kilimci, Selim Akyokus, Sevinc Ilhan Omurca

2017 IEEE International Conference on INnovations in Intelligent SysTems and Applications (INISTA) > 307 - 311

2017 IEEE International Conference on INnovations in Intelligent SysTems and Applications (INISTA)

The basic idea behind the classifier ensembles is to use more than one classifier by expecting to improve the overall accuracy. It is known that the classifier ensembles boost the overall classification performance by depending on two factors namely, individual success of the base learners and diversity. One way of providing diversity is to use the same or different type of base learners. When the...

chapter

Automatic keyword extraction system for Thai website categorization system

Adsadawut Chanakitkarnchok, Kulit Na Nakorn, Kultida Rojviboonchai

2017 14th International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology (ECTI-CON) > 206 - 209

2017 14th International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology (ECTI-CON)

In this information era, the number of websites in the Internet has dramatically increased over a few years. Any information and services can be retrieved from the website. However, the most valuable content of the website is still a text which is related to the topic or category of the websites. But there has only few researches focusing on categorizing Thai language information. The rest of researches...

chapter

On multiclass text classification algorithm based on 1-a-r and multiconlitron

Yuping Qin, Fengfeng Qin, Qiangkui Leng, Aihua Zhang

2017 6th Data Driven Control and Learning Systems (DDCLS) > 370 - 373

2017 IEEE 6th Data Driven Control and Learning Systems Conference (DDCLS)

Aim to multiclass text categorization problem, a classification algorithm based on multiconlitron and 1-a-r method is presented. 1-a-r method is used to convert a multiclass categorization problem to several binary problems. Multiconlitron is constructed for each binary problem in input space. For the text to be classified, its class is decided by multiconlitrons. The classification experiments are...

chapter

Turkish tweet sentiment analysis with word embedding and machine learning

Deger Ayata, Murat Saraclar, Arzucan Ozgur

2017 25th Signal Processing and Communications Applications Conference (SIU) > 1 - 4

2017 25th Signal Processing and Communications Applications Conference (SIU)

This work includes processing and classification of tweets which are written in Turkish language. Four different sector tweet datasets are vectorized with Word Embedding model and classified with Support Vector Machine and Random Forests classifiers and results have been compared. We have shown that sector based tweet classification is more successful compared to general tweets. Accuracy rates for...

chapter

Feature extension for Chinese short text classification based on topical N-Grams

Baoshan Sun, Peng Zhao

2017 IEEE/ACIS 16th International Conference on Computer and Information Science (ICIS) > 477 - 482

2017 IEEE/ACIS 16th International Conference on Computer and Information Science (ICIS)

Because of the feature sparseness problem, conventional text classification methods hardly achieve a good effect on short texts. This paper presents a novel feature extension method based on the TNG model to solve this problem. This algorithm can infer not only the unigram words distribution but also the phrases distribution on each topic. We can build a feature extension library using TNG algorithm...

chapter

Analysis of Turkish parliament records in terms of party coherence

Ersin Esen, Savas Ozkan

2017 25th Signal Processing and Communications Applications Conference (SIU) > 1 - 4

2017 25th Signal Processing and Communications Applications Conference (SIU)

In natural language processing and text mining, highly successful applications are developed with the recently introduced techniques. Particularly, noticeable performance increases are achieved on countless applications by using word embedding method. In this paper, we propose a novel text mining method based on word embedding and Fisher vector. The automatic analysis of political records is selected...

chapter

An ensemble based NLP feature assessment in binary classification

Saurabh Kr. Srivasatava, Roshan Kumari, Sandeep Kr. Singh

2017 International Conference on Computing, Communication and Automation (ICCCA) > 345 - 349

2017 International Conference on Computing, Communication and Automation (ICCCA)

Text feature selection plays an important role in text mining. Terms are the key players in document representation. The document representation can help application in following areas-indexing, summarization, classification, clustering and filtering. Text instances come with a challenge of high dimensional feature space and using such features can be extremely useful in text analysis. Hence it is...

chapter

Fusing Gini Index and Term Frequency for Text Feature Selection

Lin Wu, Yongbin Wang, Shengyan Zhang, Yannan Zhang

2017 IEEE Third International Conference on Multimedia Big Data (BigMM) > 280 - 283

2017 IEEE Third International Conference on Multimedia Big Data (BigMM)

Automatic text classification is the key technology to process and organize large-scale text data. It is well known that the high dimensionality of feature space is a main challenge for text classification. In order to attenuate such a problem as well as inspired by existing arts, we propose an effective text feature selection algorithm by novelly fusing the classical methodologies of Gini index and...

chapter

Feature selection algorithm for hierarchical text classification using Kullback-Leibler divergence

Yao Lifang, Qin Sijun, Zhu Huan

2017 IEEE 2nd International Conference on Cloud Computing and Big Data Analysis (ICCCBDA) > 421 - 424

2017 IEEE 2nd International Conference on Cloud Computing and Big Data Analysis (ICCCBDA)

Text classification, a simple and effective method, is considered as the key technology to deal with and organize a large amount of text data. At present, the simple text classification is unable to meet the increasing of user's demand, hierarchical text classification has received extensive attention and has broad application prospects. Hierarchical feature selection algorithm is the key technology...

chapter

Decision tree rule-based feature selection for large-scale imbalanced data

Haoyue Liu, MengChu Zhou

2017 26th Wireless and Optical Communication Conference (WOCC) > 1 - 6

2017 26th Wireless and Optical Communication Conference (WOCC)

A class imbalance problem often appears in many real world applications, e.g. fault diagnosis, text categorization, fraud detection. When dealing with a large-scale imbalanced dataset, feature selection becomes a great challenge. To confront it, this work proposes a feature selection approach based on a decision tree rule. The effectiveness of the proposed approach is verified by classifying a large-scale...

chapter

Effective text classification using multi-level fuzzy neural network

Shima Zobeidi, Marjan Naderan, Seyed Enayatollah Alavi

2017 5th Iranian Joint Congress on Fuzzy and Intelligent Systems (CFIS) > 91 - 96

2017 5th Iranian Joint Congress on Fuzzy and Intelligent Systems (CFIS)

Nowadays, large volumes of text data are being produced in real time due to expansion of communication. It is necessary to organize this data for exploitation and extraction of useful information. Text classification based on the topic is one of the efficient solutions to this problem. Efficient algorithms are applied for text classification if they address high dimensional data. In this paper, a...

Keywords:
TEXT CATEGORIZATION

Content availability

Available (371)
None (3)

Publication type

book (349)
article (25)

Keywords

CLASSIFICATION ALGORITHMS (167)
TRAINING (159)
TEXT ANALYSIS (157)
FEATURE EXTRACTION (110)
TEXT CLASSIFICATION (94)
MACHINE LEARNING (88)
SUPPORT VECTOR MACHINE (88)
ACCURACY (85)
PATTERN CLASSIFICATION (77)
SVM (63)
CLASSIFICATION (57)
KERNEL (56)
FEATURE SELECTION (48)
DATA MINING (47)
LEARNING (ARTIFICIAL INTELLIGENCE) (45)
SUPPORT VECTOR MACHINE CLASSIFICATION (38)
INTERNET (34)
TEXT MINING (34)
NATURAL LANGUAGE PROCESSING (31)
ALGORITHM DESIGN AND ANALYSIS (30)
NIOBIUM (29)
INFORMATION RETRIEVAL (26)
TESTING (25)
BAYES METHODS (21)
MACHINE LEARNING ALGORITHMS (21)
COMPUTERS (18)
SEMANTICS (18)
VECTORS (18)
INDEXING (17)
CLUSTERING ALGORITHMS (16)
COMPUTATIONAL MODELING (16)
SENTIMENT ANALYSIS (14)
VECTOR SPACE MODEL (14)
ARTIFICIAL NEURAL NETWORKS (13)
DECISION TREES (13)
KNN (12)
NAIVE BAYES (12)
WEB PAGES (12)
EDUCATIONAL INSTITUTIONS (11)
ELECTRONIC MAIL (10)
ENCODING (10)
LOGISTICS (10)
MUTUAL INFORMATION (10)
SENTIMENT CLASSIFICATION (10)
STANDARDS (10)
STATISTICAL ANALYSIS (10)
SUPERVISED LEARNING (10)
TRAINING DATA (10)
WEB SITES (10)
DICTIONARIES (9)
DOCUMENT CLASSIFICATION (9)
DOCUMENT HANDLING (9)
NATURAL LANGUAGES (9)
PROBABILITY (9)
SVM CLASSIFIER (9)
CHINESE TEXT CATEGORIZATION (8)
COMPUTER SCIENCE (8)
CONFERENCES (8)
CORRELATION (8)
DISTANCE MEASUREMENT (8)
ENTROPY (8)
EQUATIONS (8)
INDEXES (8)
K-NEAREST NEIGHBOR (8)
MATHEMATICAL MODEL (8)
NOISE (8)
TERM WEIGHTING (8)
TEXT REPRESENTATION (8)
BAYESIAN METHODS (7)
COMPUTATIONAL LINGUISTICS (7)
CONTEXT (7)
DATABASES (7)
DECISION TREE (7)
DIMENSIONALITY REDUCTION (7)
DOCUMENT CATEGORIZATION (7)
FILTERING (7)
MEASUREMENT (7)
NEAREST NEIGHBOR SEARCHES (7)
NEURAL NETWORKS (7)
RADIO FREQUENCY (7)
RANDOM FOREST (7)
ROUGH SET (7)
ROUGH SET THEORY (7)
TWITTER (7)
VOCABULARY (7)
CLASSIFICATION TREE ANALYSIS (6)
CLUSTERING (6)
DIMENSION REDUCTION (6)
FREQUENCY MEASUREMENT (6)
GENETIC ALGORITHMS (6)
INFORMATION EXTRACTION (6)
INFORMATION GAIN (6)
LEARNING SYSTEMS (6)
ONTOLOGIES (ARTIFICIAL INTELLIGENCE) (6)
OPTIMIZATION (6)
PATTERN CLUSTERING (6)
PREDICTION ALGORITHMS (6)
SOFTWARE (6)

Data set

ieee (359)
Elsevier (7)
Springer (7)
BazTech (1)

INFONA - science communication portal
