Search results

Items from 1 to 20 out of 1,151 results

chapter

Vietnamese news classification based on BoW with keywords extraction and neural network

Toan Pham Van, Ta Minh Thanh

2017 21st Asia Pacific Symposium on Intelligent and Evolutionary Systems (IES) > 43 - 48

Text classification (TC) has become one of the main applications of natural language processing (NLP). Many methods have been studied for classifying text documents, such as Random Forest, Support Vector Machines, and Naive Bayes. However, most of them have been applied to English documents, so text classification research on Vietnamese is still limited. By using a Vietnamese news corpus,...

chapter

Recurrent convolution neural networks for classification of protein-protein interaction articles from biomedical literature

Sabenabanu Abdulkadhar, Gurusamy Murugesan, Jeyakumar Natarajan

2017 Third International Conference on Research in Computational Intelligence and Communication Networks (ICRCICN) > 192 - 197

Text classification (TC) is a task that assigns a text to one or more predefined classes or categories. Constructing text classifiers with high accuracy is a vital task in the biomedical field, given the wealth of information hidden in unlabelled documents. Because of large feature spaces, traditional discriminative approaches, such as logistic regression and support vector machines with n-gram and...

chapter

Detection of cyberbullying on social media messages in Turkish

Selma Ayse Ozel, Esra Sarac, Seyran Akdemir, Hulya Aksu

2017 International Conference on Computer Science and Engineering (UBMK) > 366 - 370

The increased use of the Internet and the ease of access to online communities like social media have provided an avenue for cybercrimes. Cyberbullying, which is a kind of cybercrime, is defined as an aggressive, intentional action against a defenseless person by using the Internet, social media, or other electronic contents. Researchers have found that many of the bullying cases have tragically ended...

chapter

Categorizing the Turkish web pages by data mining techniques

Secil Sekerci Husem, Ayla Gulcu

2017 International Conference on Computer Science and Engineering (UBMK) > 255 - 260

Today, human effort alone cannot cope with the increasing amount of data. Automated methods are therefore needed to group similar documents together or to place documents in predefined categories according to certain rules, and automated classification techniques are becoming increasingly important. In this study, a database consisting of 22 thousand...

chapter

Effects of various preprocessing techniques to Turkish text categorization using n-gram features

Ayca Deniz, Hakan Ezgi Kiziloz

2017 International Conference on Computer Science and Engineering (UBMK) > 655 - 660

Natural Language Processing (NLP) is a prominent subject that includes various subcategories such as text classification, error correction, and machine translation. Compared with other languages, there is a limited number of Turkish NLP studies in the literature. In this study, we apply text classification to Turkish documents by using n-gram features. Our algorithm applies different preprocessing techniques,...

chapter

Similarity detection between Turkish text documents with distance metrics

Mumine Kaya Keles, Selma Ayse Ozel

2017 International Conference on Computer Science and Engineering (UBMK) > 316 - 321

The aim of this study is to compare the success of various distance metrics and to determine the most appropriate methods for detecting similarities among textual documents written in Turkish. Computing similarities between text documents is the basic step of plagiarism detection and of text mining methods such as author detection, text classification, and clustering. Therefore, plagiarism detection...

chapter

A fuzzy logic-based text classification method for social media data

KeYuan Wu, MengChu Zhou, Xiaoyu Sean Lu, Li Huang

2017 IEEE International Conference on Systems, Man, and Cybernetics (SMC) > 1942 - 1947

Social media offer abundant information for studying people's behaviors, emotions and opinions during the evolution of various rare events such as natural disasters. It is useful to analyze the correlation between social media and human-affected events. This study uses Hurricane Sandy 2012 related Twitter text data to conduct information extraction and text classification. Considering that the original...

chapter

Hierarchical document clustering based on cosine similarity measure

Shraddha K. Popat, Pramod B. Deshmukh, Vishakha A. Metre

2017 1st International Conference on Intelligent Systems and Information Management (ICISIM) > 153 - 159

Clustering is one of the prime topics in data mining. It partitions the data into meaningful subgroups. Document clustering divides a set of documents into groups such that different groups show different characteristics with respect to likeness. In this paper, an experimental exploration of a similarity-based method, HSC, for measuring the similarity between data objects, particularly...

chapter

Document embedding approach for efficient authorship attribution

Hayri Volkan Agun, Ozgur Yilmazel

2017 2nd International Conference on Knowledge Engineering and Applications (ICKEA) > 194 - 198

Authorship attribution has been well studied in terms of text classification with many diverse feature sets. However, finding topic-independent features is hard, and models trained with hand-crafted features in one domain may not work in another domain. In this study, we used a semi-supervised neural language model known as document embeddings for the authorship attribution problem. This method...

chapter

A fusion method of text categorization based on key sentence extraction and neural network

Fang Fang, Zhen Wu, Luchen Zhang, Shi Wang, et al.

2017 2nd International Conference on Knowledge Engineering and Applications (ICKEA) > 166 - 172

In this paper, we present a novel method to classify directions of capital flows in Internet finance. Our method differs from previous text classification methods in that it extracts key sentences that may directly reflect the semantics of the input text before classification. We use the Bi-LSTM model as a classifier to process input sentences. In this paper, we represent the matrix of key sentences...

chapter

Character-Level neural networks for short text classification

Jingxue Liu, Fanrong Meng, Yong Zhou, Bing Liu

2017 International Smart Cities Conference (ISC2) > 1 - 7

Since short text is characterized by short length, sparse features, and strong context dependency, traditional models have limited precision. Motivated by this, this article offers an empirical exploration of a character-level model that combines a convolutional neural network (CNN) and recurrent neural networks (RNN) for short text classification. Including the highway networks...

chapter

Text classification on mahout with Naïve-Bayes machine learning algorithm

Mehmet Umut Salur, Sezai Tokat, Ibrahim Berkan Aydilek

2017 International Artificial Intelligence and Data Processing Symposium (IDAP) > 1 - 5

In daily life, we use the Internet for many purposes. The Internet makes our lives easier and has led to the emergence of new technologies. Smart devices that use the Internet infrastructure generate digital data in different formats and at different generation speeds. The evaluation of the generated data is carried out by the algorithms associated with the field of machine learning...

chapter

Haber metinlerinin farkli metin madenciliği yöntemleriyle siniflandirilmasi

Fatma Baskaya, Ilhan Aydin

2017 International Artificial Intelligence and Data Processing Symposium (IDAP) > 1 - 5

With the development of technology, people are entering the virtual world more and more. In parallel, the Internet becomes a bigger network every day and acquires a more complex structure as it grows. Reaching the desired information with structured data is becoming an increasingly important problem. One useful way to solve this problem is to divide this complex data into...

chapter

A distributed Arabic text classification approach using latent semantic analysis for big data

Hadeel Alazzam, Abdulsalam Alsmady

2017 12th International Scientific and Technical Conference on Computer Sciences and Information Technologies (CSIT) > 1 > 58 - 61

Recently, big data has attracted special attention from researchers because of the valuable information that can be collected from it. LSA performs effectively in classification and information retrieval, since it deals with the semantics of words. In this paper, we propose a distributed text classification approach, based on LSA and cosine similarity, that can be applied to big data. The proposed...

chapter

Deep encrypted text categorization

R. Vinayakumar, K. P. Soman, Prabaharan Poornachandran

2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI) > 364 - 370

Long short-term memory (LSTM) is a significant approach for capturing long-range temporal context in sequences of arbitrary length. It has shown astonishing performance in sentence and document modeling. To leverage this, we apply an LSTM network to encrypted text categorization at the character and word levels. These texts are transformed into dense word vectors by using bag-of-words embedding...

chapter

Novel hybrid feature selection models for unsupervised document categorization

Amol P. Bhopale, S. Sowmya Kamath

2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI) > 1471 - 1477

Dealing with high dimensional data is a challenging and computationally complex task in the data pre-processing phase of text clustering. Conventionally, union and intersection approaches have been used to combine results of different feature selection methods to optimize relevant feature space for document collection. Union method selects all features from considered sub-models, whereas, intersection...

chapter

A comprehensive study of text classification algorithms

Vikas K Vijayan, K. R. Bindu, Latha Parameswaran

2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI) > 1109 - 1113

A huge amount of data in today's world is stored in the form of electronic documents. Text mining is the process of extracting information from those textual documents. Text classification is the process of classifying text documents into a fixed number of predefined classes. Applications of text classification include spam filtering, email routing, sentiment analysis, language identification...

chapter

Feature Selection with Structural Sparse Mode for Text Categorization

Wenbin Zheng, Dan Tang, Haiqing Zhang, Hong Tang

2017 9th International Conference on Intelligent Human-Machine Systems and Cybernetics (IHMSC) > 1 > 359 - 362

Grouped structure has been successfully embedded in sparse models for feature selection; however, the semantic information of some groups generated by clustering methods might be difficult to interpret if the number of words in the group is very large. This paper proposes a novel approach in which a group structure is constructed and its corresponding sparse model is used to select features for...

chapter

Emotion classification on youtube comments using word embedding

Julio Savigny, Ayu Purwarianti

2017 International Conference on Advanced Informatics, Concepts, Theory, and Applications (ICAICTA) > 1 - 5

YouTube is one of the most popular video-sharing platforms in Indonesia. A person can react to a video by commenting on it. A comment may contain an emotion that can be identified automatically. In this study, we conducted experiments on emotion classification of Indonesian YouTube comments. A corpus containing 8,115 YouTube comments was collected and manually labelled using 6 basic emotion label...

chapter

Sentiment Classification: Feature Selection Based Approaches Versus Deep Learning

Alper Kursat Uysal, Yi Lu Murphey

2017 IEEE International Conference on Computer and Information Technology (CIT) > 23 - 30

Classification of text documents is commonly carried out using various bag-of-words models that are generated using feature selection methods. In these models, selected features are used as input to well-known classifiers such as Support Vector Machines (SVM) and neural networks. In recent years, a technique called word embeddings has been developed for text mining, and deep learning models using...

Publication type:
book

Content availability

Available (1,133)
None (18)

Keywords

TEXT CATEGORIZATION (1,151)
CLASSIFICATION ALGORITHMS (474)
TEXT ANALYSIS (461)
TRAINING (445)
SUPPORT VECTOR MACHINES (349)
FEATURE EXTRACTION (298)
TEXT CLASSIFICATION (259)
ACCURACY (225)
PATTERN CLASSIFICATION (196)
MACHINE LEARNING (181)
CLASSIFICATION (180)
DATA MINING (161)
FEATURE SELECTION (152)
SUPPORT VECTOR MACHINE CLASSIFICATION (131)
INFORMATION RETRIEVAL (116)
LEARNING (ARTIFICIAL INTELLIGENCE) (115)
INTERNET (112)
ALGORITHM DESIGN AND ANALYSIS (109)
TEXT MINING (95)
NATURAL LANGUAGE PROCESSING (94)
SUPPORT VECTOR MACHINE (92)
SEMANTICS (91)
SVM (73)
CLUSTERING ALGORITHMS (65)
COMPUTERS (64)
BAYES METHODS (62)
KERNEL (62)
COMPUTATIONAL MODELING (60)
TESTING (59)
NIOBIUM (54)
VECTORS (54)
ARTIFICIAL NEURAL NETWORKS (50)
ENTROPY (50)
VECTOR SPACE MODEL (47)
DATABASES (43)
TRAINING DATA (43)
INDEXING (42)
MACHINE LEARNING ALGORITHMS (41)
EDUCATIONAL INSTITUTIONS (40)
FILTERING (38)
PATTERN CLUSTERING (37)
VOCABULARY (37)
WEB PAGES (36)
ELECTRONIC MAIL (35)
MATHEMATICAL MODEL (35)
MUTUAL INFORMATION (35)
KNN (34)
DICTIONARIES (33)
ONTOLOGIES (33)
SENTIMENT ANALYSIS (32)
CONFERENCES (31)
INDEXES (31)
PROBABILITY (30)
CORRELATION (29)
DECISION TREES (29)
NAIVE BAYES (29)
STATISTICAL ANALYSIS (29)
BAYESIAN METHODS (27)
CONTEXT (27)
DOCUMENT HANDLING (26)
COMPUTER SCIENCE (25)
NATURAL LANGUAGES (24)
SOFTWARE (24)
DATA MODELS (23)
NEURAL NETWORKS (23)
ONTOLOGIES (ARTIFICIAL INTELLIGENCE) (23)
DOCUMENT CLASSIFICATION (22)
GENETIC ALGORITHMS (22)
OPTIMIZATION (22)
SEARCH ENGINES (22)
WEB SITES (22)
WORD PROCESSING (22)
INFORMATION FILTERING (21)
INFORMATION GAIN (21)
MEASUREMENT (21)
ROUGH SET THEORY (21)
STANDARDS (21)
ENCODING (20)
EQUATIONS (20)
NOISE (20)
CLUSTERING (19)
TERM WEIGHTING (19)
CLASSIFICATION TREE ANALYSIS (18)
TAXONOMY (18)
TF-IDF (18)
TWITTER (18)
DISTANCE MEASUREMENT (17)
NEURAL NETWORK (17)
PREDICTION ALGORITHMS (17)
SENTIMENT CLASSIFICATION (17)
SUPERVISED LEARNING (17)
COMPLEXITY THEORY (16)
FREQUENCY MEASUREMENT (16)
GENETIC ALGORITHM (16)
LARGE SCALE INTEGRATION (16)
MATRIX DECOMPOSITION (16)
PRAGMATICS (16)
TEXT CLUSTERING (16)
CHINESE TEXT CATEGORIZATION (15)
DECISION TREE (15)

Data set

IEEE (1,093)
Springer (58)

INFONA - science communication portal
