2017 IEEE International Conference on Data Mining Workshops (ICDMW)

chapter

Survival Random Forest to Predict Time to Fill

Summer M. Husband, Jason Roberts

2017 IEEE International Conference on Data Mining Workshops (ICDMW) > 195 - 198

Traditionally, the time-to-fill metric is used as a scorecard for past performance. An organization may use time to fill to assess the performance of its internal recruiting team, or as a way to set service level agreements with outsourced recruiting partners. By first developing a set of quantifiable job features and then applying survival analysis to historical time-to-fill data, we build a predictor...

chapter

Finding the Best Job Applicants for a Job Posting: A Comparison of Human Resources Search Strategies

Christopher G. Harris

2017 IEEE International Conference on Data Mining Workshops (ICDMW) > 189 - 194

Finding the best candidates to match a set of job requirements can be viewed as both an art and a science. In this paper, we conduct an empirical study using actual job candidates and job applicants. We compare the ranked lists generated by executive recruiting experts with the list generated by three search strategies: one using crowdworkers in a gamified environment, a second using information retrieval-based...

chapter

Data-Driven Job Search Engine Using Skills and Company Attribute Filters

Rohit Muthyala, Sam Wood, Yi Jin, Yixing Qin, et al.

2017 IEEE International Conference on Data Mining Workshops (ICDMW) > 199 - 206

According to a report online [34], more than 200 million unique users search for jobs online every month. This incredibly large and fast-growing demand has enticed software giants such as Google and Facebook to enter this space, which was previously dominated by companies such as LinkedIn, Indeed, Dice and CareerBuilder. Recently, Google released their “AI-powered Jobs Search Engine”, “Google For Jobs”...

chapter

Principled Multilayer Network Embedding

Weiyi Liu, Pin-yu Chen, Sailung Yeung, Toyotaro Suzumura, et al.

2017 IEEE International Conference on Data Mining Workshops (ICDMW) > 134 - 141

Multilayer network analysis has become a vital tool for understanding different relationships and their interactions in a complex system, where each layer in a multilayer network depicts the topological structure of a group of nodes corresponding to a particular relationship. The interactions among different layers imply how the interplay of different relations affects the topology of each layer. For a...

chapter

Network Model Selection for Task-Focused Attributed Network Inference

Ivan Brugere, Chris Kanich, Tanya Y. Berger-Wolf

2017 IEEE International Conference on Data Mining Workshops (ICDMW) > 118 - 125

Networks are models representing relationships between entities. Often these relationships are explicitly given, or we must learn a representation which generalizes and predicts observed behavior in underlying individual data (e.g. attributes or labels). Whether given or inferred, choosing the best representation affects subsequent tasks and questions on the network. This work focuses on model selection...

chapter

Deep and Confident Prediction for Time Series at Uber

Lingxue Zhu, Nikolay Laptev

2017 IEEE International Conference on Data Mining Workshops (ICDMW) > 103 - 110

Reliable uncertainty estimation for time series prediction is critical in many fields, including physics, biology, and manufacturing. At Uber, probabilistic time series forecasting is used for robust prediction of number of trips during special events, driver incentive allocation, as well as real-time anomaly detection across millions of metrics. Classical time series models are often used in conjunction...

chapter

Message from the Conference Chairs

2017 IEEE International Conference on Data Mining Workshops (ICDMW) > xvii - xviii

Presents the introductory welcome message from the conference proceedings. May include the conference officers' congratulations to all involved with the conference event and publication of the proceedings record.

chapter

[Title page i]

2017 IEEE International Conference on Data Mining Workshops (ICDMW) > i

Presents the title page of the proceedings record.

chapter

Dealing with Class Imbalance the Scalable Way: Evaluation of Various Techniques Based on Classification Grade and Computational Complexity

Bernhard Schlegel, Bernhard Sick

2017 IEEE International Conference on Data Mining Workshops (ICDMW) > 69 - 78

Highly imbalanced datasets continue to be a challenge in many data mining applications. It is surprising that state-of-the-art techniques countering class imbalances are usually very computationally expensive and therefore unscalable. Most research effort has been directed into enhancing those techniques, e.g., by focusing on borderline examples or combining multiple techniques. This is usually accompanied...

chapter

Co-Training for Demographic Classification Using Deep Learning from Label Proportions

Ehsan Mohammady Ardehaly, Aron Culotta

2017 IEEE International Conference on Data Mining Workshops (ICDMW) > 1017 - 1024

Deep learning algorithms have recently produced state-of-the-art accuracy in many classification tasks, but this success is typically dependent on access to many annotated training examples. For domains without such data, an attractive alternative is to train models with light, or distant supervision. In this paper, we introduce a deep neural network for the Learning from Label Proportion (LLP) setting,...

chapter

Anomaly Detection for a Water Treatment System Using Unsupervised Machine Learning

Jun Inoue, Yoriyuki Yamagata, Yuqi Chen, Christopher M. Poskitt, et al.

2017 IEEE International Conference on Data Mining Workshops (ICDMW) > 1058 - 1065

In this paper, we propose and evaluate the application of unsupervised machine learning to anomaly detection for a Cyber-Physical System (CPS). We compare two methods: Deep Neural Networks (DNN) adapted to time series data generated by a CPS, and one-class Support Vector Machines (SVM). These methods are evaluated against data from the Secure Water Treatment (SWaT) testbed, a scaled-down but fully...

chapter

Pattern-Based Contextual Anomaly Detection in HVAC Systems

Mohsin Munir, Steffen Erkel, Andreas Dengel, Sheraz Ahmed

2017 IEEE International Conference on Data Mining Workshops (ICDMW) > 1066 - 1073

This paper presents a detailed anomaly detection evaluation on operational time-series data of Internet of Things (IoT) based household devices in general and Heating, Ventilation and Air Conditioning (HVAC) systems in particular. Due to the number of issues observed during evaluation of widely used distance-based, statistical-based, and cluster-based anomaly detection techniques, we also present a pattern-based...

chapter

EmTaggeR: A Word Embedding Based Novel Method for Hashtag Recommendation on Twitter

Kuntal Dey, Ritvik Shrivastava, Saroj Kaushik, L. Venkata Subramaniam

2017 IEEE International Conference on Data Mining Workshops (ICDMW) > 1025 - 1032

The hashtag recommendation problem addresses recommending (suggesting) one or more hashtags to explicitly tag a post made on a given social network platform, based upon the content and context of the post. In this work, we propose a novel methodology for hashtag recommendation for microblog posts, specifically Twitter. The methodology, EmTaggeR, is built upon a training-testing framework that builds...

chapter

Are Words Commensurate with Actions? Quantifying Commitment to a Cause from Online Public Messaging

Zhao Wang, Jennifer Cutler, Aron Culotta

2017 IEEE International Conference on Data Mining Workshops (ICDMW) > 1050 - 1057

Public entities such as companies and politicians increasingly use online social networks to communicate directly with their constituencies. Often, this public messaging is aimed at aligning the entity with a particular cause or issue, such as the environment or public health. However, as a consumer or voter, it can be difficult to assess an entity’s true commitment to a cause based on public messaging...

chapter

Discovering Cooperative Structure Among Online Items for Attention Dynamics

Kanji Matsutani, Masahito Kumano, Masahiro Kimura, Kazumi Saito, et al.

2017 IEEE International Conference on Data Mining Workshops (ICDMW) > 1033 - 1041

Social Media allows people to post widely and share the posted online-items. Such items gain their popularity by the amount of attention received. Thus, studies on modeling the arrival process of attention to an individual item have recently attracted a great deal of interest. In this paper, we propose, by combining a Dirichlet process with a Hawkes process in a novel way, a probabilistic model, called...

chapter

Live on TV, Alive on Twitter: Quantifying Continuous Partial Attention of Viewers During Live Television Telecasts

Rohit Saxena, Savita Bhat, Niranjan Pedanekar

2017 IEEE International Conference on Data Mining Workshops (ICDMW) > 1042 - 1049

Even while engaged in an attention-consuming activity such as watching TV, social media users often end up paying attention to one or more social media. This is an example of a behavioral phenomenon called Continuous Partial Attention (CPA). Quantification of user attention can be a valuable metric in understanding user behavior under scenarios where their attention is divided. In this study, we propose...

chapter

TweeLoc: A System for Geolocalizing Tweets at Fine-Grain

Pavlos Paraskevopoulos, Giovanni Pellegrini, Themis Palpanas

2017 IEEE International Conference on Data Mining Workshops (ICDMW) > 1178 - 1183

The recent rise in the use of social networks has resulted in an abundance of information on different aspects of everyday social activities that is available online. In the analysis of information originating from social networks, and especially Twitter, an important aspect is the geographic coordinates, i.e., geolocalisation, of the relevant information. Geolocalized...

chapter

Ranking from Crowdsourced Pairwise Comparisons via Smoothed Matrix Manifold Optimization

Jialin Dong, Kai Yang, Yuanming Shi

2017 IEEE International Conference on Data Mining Workshops (ICDMW) > 949 - 956

With the blooming development of data mining in social computing systems (e.g., crowdsourcing systems), statistical inference from crowdsourced data serves as a powerful tool to provide diversified services. To support critical applications (e.g., recommendation), in this paper we focus on collaborative ranking problems and construct a system whose input is crowdsourced pairwise comparisons...

chapter

A Pattern Tree Based Method for Mining Conditional Contrast Patterns of Multi-source Data

Li Li, Sarah Erfani, Christopher Leckie

2017 IEEE International Conference on Data Mining Workshops (ICDMW) > 916 - 923

Contrast patterns are itemsets that frequently occur in one dataset while not in another. These patterns have been successfully applied to many data mining domains, such as prediction, classification and clustering. However, none of the previous studies has considered extracting contrast patterns from different types of datasets. In this paper, we introduce a new type of contrast pattern, Conditional...

chapter

Improving Multivariate Time Series Forecasting with Random Walks with Restarts on Causality Graphs

Piotr Przymus, Youssef Hmamouche, Alain Casali, Lotfi Lakhal

2017 IEEE International Conference on Data Mining Workshops (ICDMW) > 924 - 931

Forecasting models that utilize multiple predictors are gaining popularity in a variety of fields. In some cases they allow constructing more precise forecasting models, leveraging the predictive potential of many variables. Unfortunately, in practice we do not know which observed predictors have a direct impact on the target variable. Moreover, adding unrelated variables may diminish the quality...
