Search results

Items from 1 to 20 out of 39 results

chapter

An application of strength pareto evolutionary algorithm for feature selection from crime data

Priyanka Das, Asit Kumar Das

2017 8th International Conference on Computing, Communication and Networking Technologies (ICCCNT) > 1 - 6

2017 8th International Conference on Computing, Communication and Networking Technologies (ICCCNT)

A genetic algorithm is a computational technique that searches for an optimal solution through natural selection and crossover, the basic steps of every evolutionary algorithm. The present work focuses on an application of a genetic algorithm named strength Pareto evolutionary algorithm (SPEA) for the selection of features from crime datasets. The proposed work extracts crime...

chapter

Crime analysis against women from online newspaper reports and an approach to apply it in dynamic environment

Priyanka Das, Asit Kumar Das

2017 International Conference on Big Data Analytics and Computational Intelligence (ICBDAC) > 312 - 317

2017 International Conference on Big Data Analytics and Computational Intelligence (ICBDAC)

Crime against women in India has become a prominent topic of discussion in recent years, and the issue has come to the foreground because of the increasing trend in crimes committed against women. Most of the crimes get reported and a massive dataset is being generated every year. Analysing the crime reports can help the law enforcement section to take preventive measures for reducing...

chapter

Dimensionality reduction approach for high dimensional text documents

G. Suresh Reddy

2016 International Conference on Engineering & MIS (ICEMIS) > 1 - 6

2016 International Conference on Engineering & MIS (ICEMIS)

Feature dimensionality has always been one of the key challenges in text mining as it increases complexity when mining documents with high dimensionality. High dimensionality introduces sparseness, noise, and boosts the computational and space complexities. Dimensionality reduction is usually addressed by implementing either feature reduction or feature selection techniques. In this work, the problem...

chapter

Text mining: An improvised feature based model approach

Shivaprasad KM, T Hanumantha Reddy

2016 2nd International Conference on Applied and Theoretical Computing and Communication Technology (iCATccT) > 38 - 42

2016 2nd International Conference on Applied and Theoretical Computing and Communication Technology (iCATccT)

In this knowledge era, a plethora of textual information is growing rapidly, usually as semistructured or unstructured data collected and stored in various databases. Discovery of knowledge from these available databases is not simple. Thus, an automatic feature selection approach is very much necessary in the processing of this unstructured data. The feature selection approach focuses on processing...

chapter

Effective 20 Newsgroups Dataset Cleaning

Khaled Albishre, Mubarak Albathan, Yuefeng Li

2015 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT) > 3 > 98 - 101

2015 IEEE / WIC / ACM International Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT)

The rapid increase in the number of text documents available on the Internet has created pressure to use effective cleaning techniques. Cleaning techniques are needed for converting these documents to structured documents. Text cleaning techniques are one of the key mechanisms in typical text mining application frameworks. In this paper, we explore the role of text cleaning in the 20 newsgroups dataset,...

chapter

Optimized Swarm Search-Based Feature Selection for Text Mining in Sentiment Analysis

Simon Fong, Elisa Gao, Raymond Wong

2015 IEEE International Conference on Data Mining Workshop (ICDMW) > 1153 - 1162

2015 IEEE International Conference on Data Mining Workshop (ICDMW)

Sentiment analysis emerged as an important computational domain to gain insights from snippets of texts, as social media recently gained popularity. Text mining has long been a fundamental data analytic for sentiment analysis. One of the popular preprocessing approaches in text mining is transforming text strings to word vectors which form a high-dimensional sparse matrix. This sparse matrix poses...

chapter

Iterative Term Weighting for Short Text Data

Chutao Zheng, Cheng Liu, Hau-San Wong

2015 IEEE International Conference on Systems, Man, and Cybernetics > 1687 - 1692

2015 IEEE International Conference on Systems, Man, and Cybernetics (SMC)

With the development of social media applications, short text mining is becoming more and more important. Due to the sparseness of short text data, both the feature correlation information (word co-occurrence) and data contiguity information (context information) are less reliable, thus most existing text mining methods which are designed to address regular text data are less efficient in short text...

chapter

Hybrid feature selection methods for online biomedical publication classification

Long Ma, Yanqing Zhang, Raj Sunderraman, Peter T. Fox, et al.

2015 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB) > 1 - 8

2015 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB)

We review several feature selection methods: Recursive Feature Elimination, Select K Best, and Random Forests, as elements of a processing chain for feature selection in a text mining task. The text mining task is a multi-label classification problem of label assignment; metadata that is usually applied to published scientific papers by expert curators. In the formulation of this classification task,...

chapter

Evaluating service quality in insurance customer complaint handling throught text categorization

Shuang Dong, Zhihong Wang

2015 International Conference on Logistics, Informatics and Service Sciences (LISS) > 1 - 5

2015 International Conference on Logistics, Informatics and Service Sciences (LISS)

The paper presents the findings of an industry-based study on the utility of text categorization. The purpose of the study is to explore a new approach to evaluating the service quality of customer complaint handling. The industrial research setting is a large Chinese insurance company. The text categorization methodologies used in this research include natural language processing and machine learning...

chapter

Feature selection for event extraction in biomedical text

Amit Majumder, Mohammed Hasanuzzaman, Asif Ekbal

2015 Eighth International Conference on Advances in Pattern Recognition (ICAPR) > 1 - 6

2015 Eighth International Conference on Advances in Pattern Recognition (ICAPR)

In this paper we report our work on multiobjective optimization (MOO) based feature selection approach for event extraction in biomedical texts. Event extraction deals with the detection and classification of expressions that represent complex biological phenomenon involving genes and proteins. We perform feature selection within the framework of a robust machine learning algorithm, namely Conditional...

chapter

How to protect investors? A GA-based DWD approach for financial statement fraud detection

Xinyang Li, Wei Xu, Xuesong Tian

2014 IEEE International Conference on Systems, Man, and Cybernetics (SMC) > 3548 - 3554

2014 IEEE International Conference on Systems, Man and Cybernetics - SMC

As one type of the financial fraud, financial statement fraud has not only led to a huge loss for individual investors and financial institutions, but also impacted the overall stability of the whole industry. This paper used financial and textual features extracted from annually submitted 10-k filings and combined data and text mining techniques for detection of financial statement fraud. When the...

chapter

Automatic Defect Categorization Based on Fault Triggering Conditions

Xin Xia, David Lo, Xinyu Wang, Bo Zhou

2014 19th International Conference on Engineering of Complex Computer Systems > 39 - 48

2014 19th International Conference on Engineering of Complex Computer Systems (ICECCS)

Due to the complexity of software systems, defects are inevitable. Understanding the types of defects could help developers to adopt measures in current and future software releases. In practice, developers often categorize defects into various types. One common categorization is based on fault triggers of defects. A fault trigger is a set of conditions which activate a defect (i.e., fault) and propagate...

chapter

Towards an Improvement of Bug Severity Classification

Nivir Kanti Singha Roy, Bruno Rossi

2014 40th EUROMICRO Conference on Software Engineering and Advanced Applications > 269 - 276

2014 40th EUROMICRO Conference on Software Engineering and Advanced Applications (SEAA)

Predicting the severity of bugs has been found in past research to improve triaging and the bug resolution process. For this reason, many classification/prediction approaches emerged over the years to provide an automated reasoning over severity classes. In this paper, we use text mining together with bi-grams and feature selection to improve the classification of bugs in severe/non-severe classes...

chapter

Mining Relevant Text Features for Retrieving Web Information

Luepol Pipanmaekaporn, Suwatchai Kamolsantiroj

2014 IIAI 3rd International Conference on Advanced Applied Informatics > 447 - 452

2014 IIAI 3rd International Conference on Advanced Applied Informatics (IIAIAAI)

It is a big challenge to develop effective methods that can discover high-quality and useful features in text documents. Most existing information retrieval and text mining methods focus on a term-based approach that often suffers from the problems of term variation and noise. This paper illustrates an innovative approach that discovers relevant knowledge to precisely describe text features for retrieving...

chapter

Automated Configuration Bug Report Prediction Using Text Mining

Xin Xia, David Lo, Weiwei Qiu, Xingen Wang, et al.

2014 IEEE 38th Annual Computer Software and Applications Conference > 107 - 116

2014 IEEE 38th Annual Computer Software and Applications Conference (COMPSAC)

Configuration bugs are one of the dominant causes of software failures. Previous studies show that a configuration bug could cause huge financial losses in a software system. The importance of configuration bugs has attracted various research studies, e.g., to detect, diagnose, and fix configuration bugs. Given a bug report, an approach that can identify whether the bug is a configuration bug could...

chapter

Designing and Implementing a Real-Time Speech Summarizer System

Ding Yuan Cheng, Chi Hua Chen, Yu Rou Wu, Chi Chun Lo, et al.

2014 International Symposium on Computer, Consumer and Control > 725 - 728

2014 International Symposium on Computer, Consumer and Control (IS3C)

As the number of speech and video documents increases on the Internet and portable devices proliferate, speech summarization becomes increasingly essential. Relevant research in this domain has typically focused on broadcasts and news; however, the automatic summarization methods used in the past may not apply to other speech domains (e.g., speech in lectures). Therefore, this study explores the lecture...

chapter

Educational data mining: A case study of teacher's classroom questions

Anwar Ali Yahya, Addin Osman, Ahmed Abdu Alattab

2013 13th International Conference on Intelligent Systems Design and Applications > 92 - 97

2013 13th International Conference on Intelligent Systems Design and Applications (ISDA)

This paper presents a new application of data mining techniques, particularly text mining, to analyze educational questions asked by teachers in classrooms. More specifically, it reports on the performance of four machine learning techniques and four feature selection approaches on the classification of teacher's questions into different cognitive levels identified in Bloom's taxonomy. In doing so,...

chapter

An Empirical Study on Improving Severity Prediction of Defect Reports Using Feature Selection

Cheng-Zen Yang, Chun-Chi Hou, Wei-Chen Kao, Ing-Xiang Chen

2012 19th Asia-Pacific Software Engineering Conference > 1 > 240 - 249

2012 19th Asia-Pacific Software Engineering Conference (APSEC)

In software maintenance, severity prediction on defect reports is an emerging issue attracting research attention due to the considerable triaging cost. In past research, several text mining approaches have been proposed to predict the severity using advanced learning models. Although these approaches demonstrate the effectiveness of predicting the severity, they do not discuss the problem...

chapter

Design and Implementation of Parallel Term Contribution Algorithm Based on Mapreduce Model

Peng Chao, Wu Bin, Deng Chao

2012 7th Open Cirrus Summit > 43 - 47

2012 7th Open Cirrus Summit (OCS)

MapReduce is a software framework introduced by Google in 2004 to support distributed computing on large datasets on clusters of computers. The term contribution (TC) algorithm is a relatively new algorithm in text mining to select features for clustering. In this paper, we design and implement a parallel term contribution (PTC) algorithm based on the MapReduce model. By experiment, we come to the conclusion...

chapter

Categorizing temporal events: A case study of domestic terrorism

Wingyan Chung

2012 IEEE International Conference on Intelligence and Security Informatics > 159 - 161

2012 IEEE International Conference on Intelligence and Security Informatics (ISI 2012)

In many emergency incidents, multiple reports and information sources are often used to help intelligence and security personnel to understand the situation during a short time period. Proper categorization and analysis of this information could enhance the efficiency of handling this large amount of potentially conflicting information, thus contributing to saving lives. The study of categorization...

Data set:
ieee
Keywords:
FEATURE SELECTION
Publication type:
book

Keywords

FEATURE EXTRACTION (23)
DATA MINING (14)
TEXT ANALYSIS (13)
TRAINING (12)
SUPPORT VECTOR MACHINES (11)
TEXT CATEGORIZATION (9)
ACCURACY (7)
CLASSIFICATION ALGORITHMS (7)
MACHINE LEARNING (7)
CLUSTERING ALGORITHMS (6)
NATURAL LANGUAGE PROCESSING (6)
CLASSIFICATION (5)
PATTERN CLUSTERING (5)
INFORMATION RETRIEVAL (4)
SVM (4)
DATA MODELS (3)
ELECTRONIC MAIL (3)
PATTERN CLASSIFICATION (3)
PREDICTIVE MODELS (3)
PROTEINS (3)
ROUGH SET THEORY (3)
STATISTICAL ANALYSIS (3)
SUPPORT VECTOR MACHINE CLASSIFICATION (3)
TESTING (3)
ALGORITHM DESIGN AND ANALYSIS (2)
ART (2)
BAYES METHODS (2)
BUILDINGS (2)
CLASSIFICATION TREE ANALYSIS (2)
COMPLEXITY THEORY (2)
COMPUTER BUGS (2)
COMPUTERS (2)
CONFERENCES (2)
CONTEXT (2)
DATABASES (2)
DOCUMENT CLUSTERING (2)
DOCUMENT HANDLING (2)
EDUCATIONAL INSTITUTIONS (2)
FEATURE REDUCTION (2)
FREQUENCY MEASUREMENT (2)
INTERNET (2)
K-MEANS CLUSTERING (2)
KERNEL (2)
MACHINE LEARNING ALGORITHMS (2)
MEDIA (2)
MEDICAL INFORMATION SYSTEMS (2)
MEDLINE (2)
NAÏVE BAYES (2)
NIOBIUM (2)
NOISE MEASUREMENT (2)
PARTICLE SWARM OPTIMIZATION (2)
PARTITIONING ALGORITHMS (2)
ROUGH SET (2)
ROUGH SETS (2)
SECURITY (2)
SEMANTICS (2)
SENTIMENT ANALYSIS (2)
TEXT CLASSIFICATION (2)
TEXT CLUSTERING (2)
20 NEWSGROUPS (1)
ABSTRACT-LEVEL SYSTEMS (1)
AGGRESSIVE FEATURE SELECTION (1)
AMBIGUITY MEASURE (1)
ANOMALY DETECTION (1)
APPROXIMATED C-MEDIODS (1)
APPROXIMATION METHODS (1)
ARABIC CORPUS (1)
ARABIC LANGUAGE PROCESSING (1)
ARABIC TEXT CATEGORIZATION (1)
ARABIC TEXT DOCUMENT CLASSIFICATION (1)
ARTIFICIAL INTELLIGENCE (1)
ARTIFICIAL NEURAL NETWORKS (1)
AUTOMATION (1)
BEHAVIOURAL SCIENCES COMPUTING (1)
BIBLIOGRAPHIC SYSTEMS (1)
BIOHAZARDS (1)
BIOINFORMATICS (1)
BIOLOGICAL CELLS (1)
BIOMEDICAL TEXT (1)
BIOTERRORISM (1)
BLOGS (1)
BLOOM'S TAXONOMY (1)
BOHRBUG (1)
BOUNDARY EXPANSION (1)
BREAD MODEL (1)
BUG SEVERITY CLASSIFICATION (1)
BUSINESS (1)
CAMERAS (1)
CANCER (1)
CANCER THERAPY (1)
CATALOGS (1)
CATEGORIZATION (1)
CHARACTER N-GRAM REPRESENTATION (1)
CHI SQUARE FEATURE SELECTION (1)
CHI-SQUARE (1)
CHINESE TEXT CATEGORIZATION (1)
CHINESE TEXT CLASSIFICATION (1)
CITIES AND TOWNS (1)

INFONA - science communication portal
