Search results

Items from 1 to 20 out of 29 results

chapter

Recurrent convolution neural networks for classification of protein-protein interaction articles from biomedical literature

Sabenabanu Abdulkadhar, Gurusamy Murugesan, Jeyakumar Natarajan

2017 Third International Conference on Research in Computational Intelligence and Communication Networks (ICRCICN) > 192 - 197

2017 Third International Conference on Research in Computational Intelligence and Communication Networks (ICRCICN)

Text classification (TC) is a task that assigns a text to one or more classes and predefined categories. Constructing text classifiers with high accuracy is a vital task in biomedical field, given the wealth of information hidden in unlabelled documents. Because of large feature spaces, traditionally discriminative approaches, such as logistic regression and support vector machines with n-gram and...

chapter

Haber metinlerinin farkli metin madenciliği yöntemleriyle siniflandirilmasi

Fatma Baskaya, Ilhan Aydin

2017 International Artificial Intelligence and Data Processing Symposium (IDAP) > 1 - 5

2017 International Artificial Intelligence and Data Processing Symposium (IDAP)

With the development of technology, people are entering the virtual world more and more. Parallel to this, the internet becomes a bigger network every day and it gets a complex structure depending on this growth. Achieving the desired information with structred data becomes an increasingly important problem. One of the useful ways to find solution for this problem is to divide this complex data into...

chapter

Turkish tweet sentiment analysis with word embedding and machine learning

Deger Ayata, Murat Saraclar, Arzucan Ozgur

2017 25th Signal Processing and Communications Applications Conference (SIU) > 1 - 4

2017 25th Signal Processing and Communications Applications Conference (SIU)

This work includes processing and classification of tweets which are written in Turkish language. Four different sector tweet datasets are vectorized with Word Embedding model and classified with Support Vector Machine and Random Forests classifiers and results have been compared. We have showed that sector based tweet classification is more successful compared to general tweets. Accuracy rates for...

chapter

Research on text categorization model based on LDA — KNN

Weihua Chen, Xian Zhang

2017 IEEE 2nd Advanced Information Technology, Electronic and Automation Control Conference (IAEAC) > 2719 - 2726

2017 IEEE 2nd Advanced Information Technology, Electronic and Automation Control Conference (IAEAC)

In the text classification, The similarity between the text need to be calculated, but the existing classification methods only consider the similarity between feature words and categories and does not involve the semantic similarity between feature words. In this paper, a new classification model LDA (Latent Dirichlet Allocation) — KNN (K-Nearest Neighbor) is proposed. LDA is used to solve the problem...

chapter

Bag-of-Concepts Document Representation for Bayesian Text Classification

Marcos Mourino-Garcia, Roberto Perez-Rodriguez, Luis Anido-Rifon, Miguel Gomez-Carballa

2016 IEEE International Conference on Computer and Information Technology (CIT) > 281 - 288

2016 IEEE International Conference on Computer and Information Technology (CIT)

The classification of text documents into a number of pre-defined categories has many application scenarios, for example the classification of news items into thematic sections. Documents to be classified are commonly represented by a bag-of-words feature vector. The bag-of-words model cannot handle two language phenomena: synonymy and polysemy, besides, dimensions of feature vectors are orthogonal...

chapter

Importance weighted feature selection strategy for text classification

Baoli Li

2016 International Conference on Asian Language Processing (IALP) > 344 - 347

2016 International Conference on Asian Language Processing (IALP)

Feature selection, which aims at obtaining a compact and effective feature subset for better performance and higher efficiency, has been studied for decades. The traditional feature selection metrics, such as Chi-square and information gain, fail to consider how important a feature is in a document. Features, no matter how much effective semantic information they hold, are treated equally. Intuitively,...

chapter

Semi-supervised learning using higher-order co-occurrence paths to overcome the complexity of data representation

Murat Can Ganiz

2016 IEEE International Conference on Systems, Man, and Cybernetics (SMC) > 2242 - 2247

2016 IEEE International Conference on Systems, Man, and Cybernetics (SMC)

We present a novel approach to semi-supervised learning for text classification based on the higher-order co-occurrence paths of words. We name the proposed method as Semi-Supervised Semantic Higher-Order Smoothing (S3HOS). The S3HOS is built on a tri-partite graph based data representation of labeled and unlabeled documents that allows semantics in higher-order co-occurrence paths between terms (words)...

chapter

Semantic text classification with tensor space model-based naïve Bayes

Han-joon Kim, Jiyun Kim, Jinseog Kim

2016 IEEE International Conference on Systems, Man, and Cybernetics (SMC) > 4206 - 4210

2016 IEEE International Conference on Systems, Man, and Cybernetics (SMC)

This paper presents a semantic naïve Bayes classification technique that is based upon our tensor space model for text representation. In our work, each of Wikipedia articles is defined as a single concept, and a document is represented as a 2^nd-order tensor. Our method expands the conventional naïve Bayes by incorporating the semantic concept features into term feature statistics under the tensor-space...

chapter

Unsupervised feature selection for text classification via word embedding

Weikang Rui, Jinwen Liu, Yawei Jia

2016 IEEE International Conference on Big Data Analysis (ICBDA) > 1 - 5

2016 IEEE International Conference on Big Data Analysis (ICBDA)

The key of big text documents data analysis is to classify those text documents. To classify those text documents, it is necessary to represent those text documents as vectors which is vector space model (VSM). A powerful vector space model should remain the classification information with dimensions as little as possible. To achieve that, it is important to select most effective features for text...

chapter

Spam filtering by semantics-based text classification

Wei Hu, Jinglong Du, Yongkang Xing

2016 Eighth International Conference on Advanced Computational Intelligence (ICACI) > 89 - 94

2016 Eighth International Conference on Advanced Computational Intelligence (ICACI)

Spam has been a serious and annoying problem for decades. Even though plenty of solutions have been put forward, there still remains a lot to be promoted in filtering spam emails more efficiently. Nowadays a major problem in spam filtering as well as text classification in natural language processing is the huge size of vector space due to the numerous feature terms, which is usually the cause of...

chapter

A novel feature selection based on Tibetan grammar for Tibetan text classification

Tao Jiang, Hongzhi Yu

2015 6th IEEE International Conference on Software Engineering and Service Science (ICSESS) > 445 - 448

2015 6th IEEE International Conference on Software Engineering and Service Science (ICSESS)

Feature selection is a strategy that aims at making text classifiers more efficient and accurate. In this paper, we proposed a novel feature selection method based on Tibetan grammar for Tibetan classification. Tibetan language express grammatical meaning through the function words and word order, and the function word has large proportions. By analyzing the Tibetan grammar and distribution of part...

chapter

Sentiment analysis of a document using deep learning approach and decision trees

Arman S. Zharmagambetov, Alexandr A. Pak

2015 Twelve International Conference on Electronics Computer and Computation (ICECCO) > 1 - 4

2015 Twelve International Conference on Electronics Computer and Computation (ICECCO)

The given paper describes modern approach to the task of sentiment analysis of movie reviews by using deep learning recurrent neural networks and decision trees. These methods are based on statistical models, which are in a nutshell of machine learning algorithms. The fertile area of research is the application of Google's algorithm Word2Vec presented by Tomas Mikolov, Kai Chen, Greg Corrado and Jeffrey...

chapter

Latent Factor SVM for Text Categorization

Xiaofei Zhou, Li Guo, Ping Liu, Yanbing Liu

2014 IEEE International Conference on Data Mining Workshop > 105 - 110

2014 IEEE International Conference on Data Mining Workshop (ICDMW)

Text categorization is an important research in nature language process and content analysis. In this paper, we present latent factor SVM (LF-SVM) for text categorization which use latent factor vectors for category representation on text categorization. We prove that latent factors extracted by PLSA (probability latent semantic analysis) can span convex structure to express text category. Based on...

chapter

Improving classification performance by extending documents terms

Widodo, Wahyu Catur Wibowo

2014 International Conference on Data and Software Engineering (ICODSE) > 1 - 5

2014 International Conference on Data and Software Engineering (ICODSE)

Classification is a technique in data mining for categorizing objects. Text Classification is re-challenged for classifying very short documents or text as shown in social media collection. This paper proposes a method to improve the performance of classification on short documents. In this work, we expand words in every document before the documents are classified We use TFIDF model, Hidden Markov...

chapter

A simple semantic kernel approach for SVM using higher-order paths

Berna Altinel, Murat Can Ganiz, Banu Diri

2014 IEEE International Symposium on Innovations in Intelligent Systems and Applications (INISTA) Proceedings > 431 - 435

2014 IEEE International Symposium on Innovations in Intelligent Systems and Applications (INISTA)

The bag of words (BOW) representation of documents is very common in text classification systems. However, the BOW approach ignores the position of the words in the document and more importantly, the semantic relations between the words. In this study, we present a simple semantic kernel for Support Vector Machines (SVM) algorithm. This kernel uses higher-order relations between terms in order to...

chapter

A novel higher-order semantic kernel for text classification

Berna Altlnel, Murat Can Ganiz, Banu Diri

2013 International Conference on Electronics, Computer and Computation (ICECCO) > 216 - 219

2013 International Conference on Electronics, Computer and Computation (ICECCO)

In conventional text categorization algorithms, documents are symbolized as “bag of words” (BOW) with the fact that documents are supposed to be independent from each other. While this approach simplifies the models, it ignores the semantic information between terms of each document. In this study, we develop a novel method to measure semantic similarity based on higher-order dependencies between...

chapter

Sparse Topic Model for text classification

Tao Liu

2013 International Conference on Machine Learning and Cybernetics > 4 > 1916 - 1920

2013 International Conference on Machine Learning and Cybernetics (ICMLC)

This paper addresses a new text classification method: Sparse Topic Model, which represents documents by the sparse coding of topics. Topics contain more semantic information than words, so it's more effective for feature representation of documents. Topics are extracted from documents by LDA in an unsupervised way. Based on these topics, sparse coding is applied to discover more high-level representation...

chapter

RLS-MARS: An Effective Feature Selection Tool for Text Classification

Li Xi, Dai Hang, Wang Mingwen

2012 Fourth International Conference on Multimedia Information Networking and Security > 254 - 257

2012 4th International Conference on Multimedia Information Networking and Security (MINES)

The RLS-MARS (Regularized Least Squares-Multi Angle Regression and Shrinkage) feature selection model is used to select the relevant information, in which both, the keeping and the leaving-out of the regularizer are present. The RLS-MARS model is to find a series of directions in multidimensional space, leading the gradient vectors to change along those directions which would make the gradient matrix's...

chapter

The Mining of Term Semantic Relationships and its Application in Text Classification

Sun Yueheng, Liu Xing, Cui Xiaoyuan

2012 Fifth International Conference on Intelligent Computation Technology and Automation > 356 - 359

2012 Fifth International Conference on Intelligent Computation Technology and Automation (ICICTA)

This paper proposes an approach for mining the semantic relationships between terms. Using a dependency model based on syntactic parsing, the syntactic features of a term are first extracted from large scale corpus, and then the vector representation for this term is constructed. By the cosine similarities between vectors, we can get the semantically related words for a term. We apply the semantic...

chapter

Feature selection algorithm based on the Community discovery

Xiaoqiang Jia

2011 Seventh International Conference on Computational Intelligence and Security > 455 - 458

2011 Seventh International Conference on Computational Intelligence and Security (CIS)

In order to overcome the SVM for text classification ignoring the context of semantic information and the use of a community to text classification, one boundary point can only belong to a community of view, the concept of contribution and overlapping coefficient based on the complex network diagram is introduced. And feature selection algorithm based on community discovery is proposed. Experiments...

Keywords:
TEXT CATEGORIZATION
SEMANTICS

Publication date

Set your own date range

Keywords

FEATURE EXTRACTION (11)
ACCURACY (9)
CLASSIFICATION ALGORITHMS (9)
TRAINING (9)
FEATURE SELECTION (7)
TEXT ANALYSIS (7)
MACHINE LEARNING (5)
SUPPORT VECTOR MACHINE CLASSIFICATION (5)
SUPPORT VECTOR MACHINES (5)
VECTORS (5)
NATURAL LANGUAGE PROCESSING (4)
PATTERN CLASSIFICATION (4)
COMPUTATIONAL MODELING (3)
COMPUTERS (3)
ELECTRONIC MAIL (3)
ENCYCLOPEDIAS (3)
MUTUAL INFORMATION (3)
SUPPORT VECTOR MACHINE (3)
BAYES METHODS (2)
BOW (2)
INDEXES (2)
INFORMATION GAIN (2)
INFORMATION RETRIEVAL (2)
INTERNET (2)
KERNEL (2)
MOTION PICTURES (2)
NIOBIUM (2)
SEMANTIC KERNEL (2)
SENTIMENT ANALYSIS (2)
SYNTACTICS (2)
TEXT MINING (2)
VECTOR SPACE MODEL (2)
WORD EMBEDDING (2)
ABSTRACTS (1)
ARTIFICIAL INTELLIGENCE (1)
ARTIFICIAL NEURAL NETWORKS (1)
AUTOMATIC TEXT CLASSIFICATION (1)
BAG OF WORD (1)
BAG-OF-CONCEPTS (1)
BAG-OF-WORDS (1)
BAGGING (1)
BANKING (1)
BAYESIAN METHODS (1)
BIG DATA (1)
BIOCREATIVE (1)
BOW REPRESENTATION (1)
BOW REPRESENTATION MODEL (1)
CAA SYSTEM (1)
CATEGORIZATION (1)
CHARACTERISTIC COLLECTION (1)
CHARACTERISTIC COLLECTION DEFLATION (1)
CHI-SQUARE (1)
CHINESE INFORMATION PROCESSING (1)
CHINESE TEXT CATEGORIZATION (1)
CHINESE TEXT CONCEPT (1)
CLASSIFICATION (1)
CLUSTERING (1)
COMMUNITIES (1)
COMMUNITY (1)
COMPLEX NETWORK GRAPH (1)
COMPLEX NETWORKS (1)
COMPUTER AIDED INSTRUCTION (1)
COMPUTER ASSISTED ASSESSMENT (1)
COMPUTER ASSISTED ASSESSMENT APPROACH (1)
CONCEPT ELEMENT (1)
CONCEPT ELEMENTS (1)
CONCEPTS (1)
CONFERENCES (1)
CONTEXT (1)
CONTRIBUTION (1)
CONVOLUTION (1)
COREFERENCE ANALYSIS (1)
CORRELATIVE RELATION (1)
CORRELATIVE RELATIONS (1)
DATA MINING (1)
DATA MODELS (1)
DATA STRUCTURES (1)
DEEP LEARNING (1)
DEPENDENCY MODEL (1)
DEPENDENT RELATION (1)
DEPENDENT RELATIONS (1)
DICTIONARIES (1)
DOCUMENT CLASSIFICATION PROCESS (1)
DOCUMENT MINING (1)
DOCUMENT REPRESENTATION (1)
DOMAIN RELEVANCE TERM WEIGHTING METHOD (1)
EDUCATIONAL ADMINISTRATIVE DATA PROCESSING (1)
EDUCATIONAL INSTITUTIONS (1)
ELECTRONIC PUBLISHING (1)
ERROR ANALYSIS (1)
ESSAY ASSESSMENT (1)
EXTEND WORDS (1)
FEATURE SELECT (1)
FEATURE VECTOR (1)
FILTERING (1)
FREE TEXT ANSWER (1)
FREE-TEXT ASSESMENT (1)
more

INFONA - science communication portal

Search results

Recurrent convolution neural networks for classification of protein-protein interaction articles from biomedical literature

Haber metinlerinin farkli metin madenciliği yöntemleriyle siniflandirilmasi

Turkish tweet sentiment analysis with word embedding and machine learning

Research on text categorization model based on LDA — KNN

Bag-of-Concepts Document Representation for Bayesian Text Classification

Importance weighted feature selection strategy for text classification

Semi-supervised learning using higher-order co-occurrence paths to overcome the complexity of data representation

Semantic text classification with tensor space model-based naïve Bayes

Unsupervised feature selection for text classification via word embedding

Spam filtering by semantics-based text classification

A novel feature selection based on Tibetan grammar for Tibetan text classification

Sentiment analysis of a document using deep learning approach and decision trees

Latent Factor SVM for Text Categorization

Improving classification performance by extending documents terms

A simple semantic kernel approach for SVM using higher-order paths

A novel higher-order semantic kernel for text classification

Sparse Topic Model for text classification

RLS-MARS: An Effective Feature Selection Tool for Text Classification

The Mining of Term Semantic Relationships and its Application in Text Classification

Feature selection algorithm based on the Community discovery

Filter options

Publication date

Keywords

INFONA - science communication portal

Search results

Add recipient

Sending message cancelled

Are you sure you want to cancel sending this message?

Send message

Filter options

Publication date

Date range setting

Set the date range to filter the displayed results. You can set a starting date, ending date or both. You can enter the dates manually or choose them from the calendar.

Keywords

Reporting an error / abuse

Sending the report failed

Accessibility options