Search results

chapter

An Empirical Evaluation of Techniques for Feature Selection with Cost

Stephen Adams, Ryan Meekins, Peter A. Beling

2017 IEEE International Conference on Data Mining Workshops (ICDMW) > 834 - 841

Feature selection is the process of selecting a subset of relevant features from the larger set of collected features. As the amount of available data grows with technology, feature selection becomes a more important part of the system-design process. In real-world applications, there are several costs associated with the collection, processing, and storage of data. Given that these costs can vary...

chapter

Predicting hospital readmission from longitudinal healthcare data using graph pattern mining based temporal phenotypes

Xiangzhen Xu, Lizhen Cui, Shijun Liu, Hui Li, more

2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) > 824 - 829

The rapidly increasing availability of healthcare data from multiple heterogeneous sources has spearheaded the adoption of data-driven approaches for improved clinical research, decision making, and patient management. The patient healthcare data are usually longitudinal and can be expressed as medical event sequences, where the events include clinical diagnosis, medications, laboratory reports, etc...

chapter

Distributed Representations of Subgraphs

Bijaya Adhikari, Yao Zhang, Naren Ramakrishnan, B. Aditya Prakash

2017 IEEE International Conference on Data Mining Workshops (ICDMW) > 111 - 117

There has been a surge in research interest in learning feature representation of networks in recent times. Researchers, motivated by the recent successes of embeddings in natural language processing and advances in deep learning, have explored various means for network embedding. Network embedding is useful as it can exploit off-the-shelf machine learning algorithms for network mining tasks like...

chapter

Co-Training for Demographic Classification Using Deep Learning from Label Proportions

Ehsan Mohammady Ardehaly, Aron Culotta

2017 IEEE International Conference on Data Mining Workshops (ICDMW) > 1017 - 1024

Deep learning algorithms have recently produced state-of-the-art accuracy in many classification tasks, but this success is typically dependent on access to many annotated training examples. For domains without such data, an attractive alternative is to train models with light, or distant supervision. In this paper, we introduce a deep neural network for the Learning from Label Proportion (LLP) setting,...

chapter

Estimating Personality from Social Media Posts

Nasser Alsadhan, David Skillicorn

2017 IEEE International Conference on Data Mining Workshops (ICDMW) > 350 - 356

An individual's personality determines the probable repertoire of their reactions to a particular situation. A social robot is much more effective if it is able to learn and so take into account the properties of the humans around it, including personalities. We investigate how well personality can be estimated based on modest amounts of speech or writing, which a social robot might (over)hear. Such...

chapter

Post Rectifying Methods to Improve the Accuracy of Image Annotation

Artin Ghostan Khatchatoorian, Mansour Jamzad

2017 International Conference on Digital Image Computing: Techniques and Applications (DICTA) > 1 - 7

Image annotation methods construct a Tag distance matrix, whose entries show the relevance of tags for each test image. Greater accuracy in calculating this matrix provides better annotation results. The aim of our two methods is to improve the accuracy of the Tag distance matrix using the class information already available in most datasets. If the class information is not available, extracting important...

chapter

An ensemble of rule-based classifiers for incomplete data

Cao Truong Tran, Mengjie Zhang, Peter Andreae, Bing Xue, more

2017 21st Asia Pacific Symposium on Intelligent and Evolutionary Systems (IES) > 7 - 12

Many real-world datasets suffer from the problem of missing values. Imputation, which replaces missing values with plausible values, is a major method for classification with data containing missing values. However, powerful imputation methods, including multiple imputation, are usually computationally intensive for estimating missing values in unseen incomplete instances. Rule-based classification algorithms...

article

Maritime Traffic Probabilistic Forecasting Based on Vessels’ Waterway Patterns and Motion Behaviors

Zhe Xiao, Loganathan Ponnambalam, Xiuju Fu, Wanbing Zhang

IEEE Transactions on Intelligent Transportation Systems > 2017 > 18 > 11 > 3122 - 3134

Maritime traffic prediction is critical for ocean transportation safety management. In this paper, we propose a novel knowledge assisted methodology for maritime traffic forecasting based on a vessel’s waterway pattern and motion behavior. The vessel’s waterway pattern is extracted through a proposed lattice-based DBSCAN algorithm that significantly reduces the problem scale, and its motion behavior...

chapter

Missing value imputation methods for TCM medical data and its effect in the classifier accuracy

Dan Zeng, Dan Xie, Ran Liu, Xiaodong Li

2017 IEEE 19th International Conference on e-Health Networking, Applications and Services (Healthcom) > 1 - 4

Objective: Medical data mining is a research hotspot. But medical data often contains missing values, which brings difficulties to the medical data analysis. This work evaluates the performance of several imputation methods. Methods: In this paper, we first simulate the missing data set by completely deleting some data from the complete data set, and use the Euclidean distance KNN, the correlation...

chapter

Exploring risk factors and predicting UPDRS score based on Parkinson's speech signals

Jianxin Zhang, Weifeng Xu, Qiang Zhang, Bo Jin, more

2017 IEEE 19th International Conference on e-Health Networking, Applications and Services (Healthcom) > 1 - 6

The unified Parkinson's disease rating scale (UPDRS) is the most widely employed scale for tracking Parkinson's disease (PD) symptom progression. However, the conventional way to obtain a UPDRS score, based mainly on physical examinations of clinic patients performed by trained medical staff, is inconvenient and incurs high medical expense. Hence, in this study, we try to explore...

chapter

Reversible data hiding based on directional prediction and multiple histograms modification

Chang Song, Yifeng Zhang, Guojun Lu

2017 9th International Conference on Wireless Communications and Signal Processing (WCSP) > 1 - 6

Reversible data hiding is an emerging technology that uses the redundancy of the carrier (typically digital images) to embed secret information and ensure the reversibility of the carrier and hidden information. In recent years, a number of reversible data hiding algorithms based on prediction error expansion have been developed. In prediction error expansion, prediction on the center pixel is...

chapter

Improvement of ID3 algorithm based on simplified information entropy and coordination degree

Li Yi-bin, Wang Ying-ying, Rong Xue-wen

2017 Chinese Automation Congress (CAC) > 1526 - 1530

In data classification mining, the decision tree method is a key algorithm. The ID3 (Iterative Dichotomiser 3) algorithm, presented by Quinlan, is a well-known decision tree algorithm, but it has some shortcomings, such as highly complex computation of the information entropy expression, the multivalue bias problem in the process of selecting an optimal attribute, large scales, etc. In order...

chapter

Candidate Teacher Performance Prediction Using Classification Techniques: A Case Study of High Schools in Gaza-Strip

Mohammed KH. Zoroub, Ashraf Y. Maghari

2017 International Conference on Promising Electronic Technologies (ICPET) > 129 - 134

This paper aims to build a data mining model to predict the performance of candidate teachers who apply for employment in high school education in the Gaza Strip. We apply three classification algorithms to our dataset: Decision Tree, Naïve Bayes, and KNN. Our dataset contains 8000 teacher records collected from the Ministry of Education in the Gaza Strip. Although there are a lot of researchers...

chapter

Data mining classification experiments with decision trees over the forest covertype database

Laviniu Aurelian Badulescu

2017 21st International Conference on System Theory, Control and Computing (ICSTCC) > 236 - 241

The paper examines the behavior of Decision Tree (DT) algorithms on a large database with many cases and many attributes: Forest Covertype (FC) from the UCI Knowledge Discovery in Databases Archive. The classification experiments considered 22 splitting criteria and two pruning methods, whose performance is presented in terms of classification error rate on test data, data...

chapter

Predicting heart failure class using a sequence prediction algorithm

Carine Bou Rjeily, Georges Badr, Amir Hajjam Al Hassani, Emmanuel Andres

2017 Fourth International Conference on Advances in Biomedical Engineering (ICABME) > 1 - 4

Heart failure is one of the major causes of death in the world. This disease directly affects the heart's pumping function. Because of this disturbance, nutrients and oxygen are not well circulated and distributed. The New York Heart Association has classified this disease into four different classes based on patient symptoms. In this paper, we are using a data mining technique, more precisely a sequential...

chapter

Multiple early-termination scheme for TZ search algorithm based on data mining and decision trees

Paulo Goncalves, Guilherme Correa, Marcelo Porto, Bruno Zatt, more

2017 IEEE 19th International Workshop on Multimedia Signal Processing (MMSP) > 1 - 6

The latest video compression standards, such as the H.264/AVC and the High Efficiency Video Coding (HEVC), provide fast Motion Estimation (ME) algorithms in their reference software aiming at complexity reduction. Test Zone Search (TZS) is the state-of-the-art fast ME algorithm, currently deployed in the reference HEVC encoder due to its great coding efficiency. However, ME is still one of the main...

chapter

A novel clustering algorithm based on searched experiences

Chun-Wei Tsai, Yong-Chun Ding, Ming-Chao Chiang, Chu-Sing Yang

2017 IEEE International Conference on Systems, Man, and Cybernetics (SMC) > 804 - 808

How to reduce the computation time and how to improve the quality of the clustering result are the two major research issues. Although several efficient and effective clustering algorithms have been presented, none of them is perfect. As such, an effective clustering algorithm, which is based on the prediction of searching information to determine the search directions at later iterations and employs...

chapter

Short-term traffic flow prediction with Conv-LSTM

Yipeng Liu, Haifeng Zheng, Xinxin Feng, Zhonghui Chen

2017 9th International Conference on Wireless Communications and Signal Processing (WCSP) > 1 - 6

Accurate short-term traffic flow prediction can provide timely and accurate traffic condition information, which can help travelers make travel decisions and mitigate traffic jams. Deep learning (DL) provides a new paradigm for the analysis of big data generated by daily urban traffic. In this paper, we propose a novel end-to-end deep learning architecture which consists of two modules. We combine...

chapter

Feature selection software development using Artificial Bee Colony on DNA microarray data

Wildan Andaru, Iwan Syarif, Ali Ridho Barakbah

2017 International Electronics Symposium on Knowledge Creation and Intelligent Computing (IES-KCIC) > 6 - 11

DNA microarray data is high-dimensional data that enables researchers to analyze the expression of many genes in a single reaction quickly and efficiently. Its characteristics, such as small sample size, class imbalance, and data complexity, make it difficult to classify. Feature selection is a process that automatically selects features that are most relevant to the predictive...

chapter

Sequential Mining Classification

Carine Bou Rjeily, Georges Badr, Amir Hajjam El Hassani, Emmanuel Andres

2017 International Conference on Computer and Applications (ICCA) > 190 - 194

Sequential pattern mining is a data mining technique that aims to extract and analyze frequent subsequences from sequences of events or items with time constraints. Sequence data mining was introduced in 1995 with the well-known Apriori algorithm. The algorithm studied transactions through time in order to extract frequent patterns from the sequences of products related to a customer. Later, this...

