2017 IEEE International Conference on Data Mining (ICDM)

chapter

Iterative Grammar-Based Framework for Discovering Variable-Length Time Series Motifs

Yifeng Gao, Jessica Lin, Huzefa Rangwala

2017 IEEE International Conference on Data Mining (ICDM) > 111 - 116

In recent years, finding repetitive similar patterns in time series has become a popular problem. These patterns are called time series motifs. Recent studies show that using grammar compression algorithms to find repeating patterns from the symbolized time series holds promise in discovering approximate motifs with variable length. However, grammar compression algorithms are traditionally designed...

chapter

Matrix Profile VIII: Domain Agnostic Online Semantic Segmentation at Superhuman Performance Levels

Shaghayegh Gharghabi, Yifei Ding, Chin-Chia Michael Yeh, Kaveh Kamgar, more

2017 IEEE International Conference on Data Mining (ICDM) > 117 - 126

Unsupervised semantic segmentation in the time series domain is a much-studied problem due to its potential to detect unexpected regularities and regimes in poorly understood data. However, the current techniques have several shortcomings, which have limited the adoption of time series semantic segmentation beyond academic settings for three primary reasons. First, most methods require setting/learning...

chapter

Overlapping Community Detection via Constrained PARAFAC: A Divide and Conquer Approach

Fatemeh Sheikholeslami, Georgios B. Giannakis

2017 IEEE International Conference on Data Mining (ICDM) > 127 - 136

The task of community detection over complex networks is of paramount importance in a multitude of applications. The present work puts forward a top-to-bottom community identification approach, termed DC-EgoTen, in which an egonet-tensor (EgoTen) based algorithm is developed in a divide-and-conquer (DC) fashion for breaking the network into smaller subgraphs, out of which the underlying communities...

chapter

Scalable Algorithms for Locally Low-Rank Matrix Modeling

Qilong Gu, Joshua D. Trzasko, Arindam Banerjee

2017 IEEE International Conference on Data Mining (ICDM) > 137 - 146

We consider the problem of modeling data matrices with locally low rank (LLR) structure, a generalization of the popular low rank structure widely used in a variety of real world application domains ranging from medical imaging to recommendation systems. While LLR modeling has been found to be promising in real world application domains, limited progress has been made on the design of scalable algorithms...

chapter

A Self-Adaptive Sliding Window Based Topic Model for Non-uniform Texts

Jin He, Lei Li, Xindong Wu

2017 IEEE International Conference on Data Mining (ICDM) > 147 - 156

The contents generated from different data sources are usually non-uniform, such as long texts produced by news websites and short texts produced by social media. Uncovering topics over large-scale non-uniform texts becomes an important task for analyzing network data. However, the existing methods may fail to recognize the difference between long texts and short texts. To address this problem, we...

chapter

Kernel Conditional Clustering

Xiao He, Thomas Gumbsch, Damian Roqueiro, Karsten Borgwardt

2017 IEEE International Conference on Data Mining (ICDM) > 157 - 166

Clustering results are often affected by covariates that are independent of the clusters one would like to discover. Traditionally, Alternative Clustering algorithms can be used to solve such a problem. However, these suffer from at least one of the following problems: i) continuous covariates or non-linearly separable clusters cannot be handled; ii) assumptions are made about the distribution of...

chapter

Data-Driven Utilization-Aware Trip Advisor for Bike-Sharing Systems

Ji Hu, Zidong Yang, Yuanchao Shu, Peng Cheng, more

2017 IEEE International Conference on Data Mining (ICDM) > 167 - 176

The rapid development of bike-sharing systems has brought people enormous convenience during the past decade. On the other hand, high transport flexibility comes with a dynamic distribution of shared bikes, leading to unbalanced bike usage and growing maintenance costs. In this paper, we consider rebalancing bicycle utilization by directing users to different stations. For the first time, we...

chapter

Multi-task Multi-modal Models for Collective Anomaly Detection

Tsuyoshi Ide, Dzung T. Phan, Jayant Kalagnanam

2017 IEEE International Conference on Data Mining (ICDM) > 177 - 186

This paper proposes a new framework for anomaly detection when collectively monitoring many complex systems. The prerequisite for condition-based monitoring in industrial applications is the capability of (1) capturing multiple operational states, (2) managing many similar but different assets, and (3) providing insights into the internal relationship of the variables. To meet these criteria, we propose...

chapter

Exploratory Analysis of Graph Data by Leveraging Domain Knowledge

Di Jin, Danai Koutra

2017 IEEE International Conference on Data Mining (ICDM) > 187 - 196

Given the soaring amount of data being generated daily, graph mining tasks are becoming increasingly challenging, leading to tremendous demand for summarization techniques. Feature selection is a representative approach that simplifies a dataset by choosing features that are relevant to a specific task, such as classification, prediction, and anomaly detection. Although it can be viewed as a way to...

chapter

Efficiently Discovering Locally Exceptional Yet Globally Representative Subgroups

Janis Kalofolias, Mario Boley, Jilles Vreeken

2017 IEEE International Conference on Data Mining (ICDM) > 197 - 206

Subgroup discovery is a local pattern mining technique to find interpretable descriptions of sub-populations that stand out on a given target variable. That is, these sub-populations are exceptional with regard to the global distribution. In this paper we argue that in many applications, such as scientific discovery, subgroups are only useful if they are additionally representative of the global distribution...

chapter

Visually-Aware Fashion Recommendation and Design with Generative Image Models

Wang-Cheng Kang, Chen Fang, Zhaowen Wang, Julian McAuley

2017 IEEE International Conference on Data Mining (ICDM) > 207 - 216

Building effective recommender systems for domains like fashion is challenging due to the high level of subjectivity and the semantic complexity of the features involved (i.e., fashion styles). Recent work has shown that approaches to 'visual' recommendation (e.g. clothing, art, etc.) can be made more accurate by incorporating visual signals directly into the recommendation objective, using 'off-the-shelf'...

chapter

AutoLearn — Automated Feature Generation and Selection

Ambika Kaul, Saket Maheshwary, Vikram Pudi

2017 IEEE International Conference on Data Mining (ICDM) > 217 - 226

In recent years, the importance of feature engineering has been confirmed by the exceptional performance of deep learning techniques, which automate this task for some applications. For others, feature engineering requires substantial manual effort in designing and selecting features and is often tedious and non-scalable. We present AutoLearn, a regression-based feature learning algorithm. Being data-driven,...

chapter

Collective Entity Resolution in Familial Networks

Pigi Kouki, Jay Pujara, Christopher Marcum, Laura Koehly, more

2017 IEEE International Conference on Data Mining (ICDM) > 227 - 236

Entity resolution in settings with rich relational structure often introduces complex dependencies between co-references. Exploiting these dependencies is challenging - it requires seamlessly combining statistical, relational, and logical dependencies. One task of particular interest is entity resolution in familial networks. In this setting, multiple partial representations of a family tree are provided,...

chapter

Scalable and Adaptive Algorithms for the Triangle Interdiction Problem on Billion-Scale Networks

Alan Kuhnle, Victoria G. Crawford, My T. Thai

2017 IEEE International Conference on Data Mining (ICDM) > 237 - 246

Motivated by the relevance of clustering or transitivity to a variety of network applications, we study the Triangle Interdiction Problem (TIP), which is to find a minimum-size set of edges that intersects all triangles of a network. As existing approximation algorithms for this NP-hard problem either do not scale well to massive networks or have poor solution quality, we formulate two algorithms,...

chapter

Online Learning of Acyclic Conditional Preference Networks from Noisy Data

Fabien Labernia, Bruno Zanuttini, Brice Mayag, Florian Yger, more

2017 IEEE International Conference on Data Mining (ICDM) > 247 - 256

We deal with online learning of acyclic Conditional Preference networks (CP-nets) from data streams, possibly corrupted with noise. We introduce a new, efficient algorithm relying on (i) information-theoretic measures defined over the induced preference rules, which allow us to deal with corrupted data in a principled way, and on (ii) the Hoeffding bound to define an asymptotically optimal decision...

chapter

GoGP: Fast Online Regression with Gaussian Processes

Trung Le, Khanh Nguyen, Vu Nguyen, Tu Dinh Nguyen, more

2017 IEEE International Conference on Data Mining (ICDM) > 257 - 266

One of the most challenging current problems in Gaussian process regression (GPR) is handling large-scale datasets and accommodating an online learning setting where data arrive irregularly on the fly. In this paper, we introduce a novel online Gaussian process model that can scale to massive datasets. Our approach is formulated based on an alternative representation of the Gaussian process under...

chapter

HiMuV: Hierarchical Framework for Modeling Multi-modality Multi-resolution Data

Jianbo Li, Jingrui He, Yada Zhu

2017 IEEE International Conference on Data Mining (ICDM) > 267 - 276

Many real-world applications are characterized by temporal data collected from multiple modalities, each sampled with a different resolution. Examples include manufacturing processes and financial market prediction. In these applications, an interesting observation is that within the same modality, we often have data from multiple views, thus naturally forming a 2-level hierarchy: with the multiple...

chapter

Linear Time Complexity Time Series Classification with Bag-of-Pattern-Features

Xiaosheng Li, Jessica Lin

2017 IEEE International Conference on Data Mining (ICDM) > 277 - 286

Time series classification has attracted much attention due to the ubiquity of time series. With the advance of technologies, the volume of available time series data becomes huge and the content is changing rapidly. This requires time series data mining methods to have low computational complexities. In this paper, we propose a parameter-free time series classification method that has a linear time...

chapter

An Analysis of Boosted Linear Classifiers on Noisy Data with Applications to Multiple-Instance Learning

Rui Liu, Soumya Ray

2017 IEEE International Conference on Data Mining (ICDM) > 287 - 296

An interesting observation about the well-known AdaBoost algorithm is that, though theory suggests it should overfit when applied to noisy data, experiments indicate it often does not do so in practice. In this paper, we study the behavior of AdaBoost on datasets with one-sided uniform class noise using linear classifiers as the base learner. We show analytically that, under some ideal conditions,...

chapter

BiCycle: Item Recommendation with Life Cycles

Xinyue Liu, Yuanfang Song, Charu Aggarwal, Yao Zhang, more

2017 IEEE International Conference on Data Mining (ICDM) > 297 - 306

Recommender systems, which help users explore new items in many applications, have attracted much attention in recent decades. As a popular technique in recommender systems, item recommendation works by recommending items to users based on their historical interactions. Conventional item recommendation methods usually assume that users and items are stationary, which is not always the case in...
