2017 IEEE International Conference on Data Mining (ICDM)

chapter

MDL for Causal Inference on Discrete Data

Kailash Budhathoki, Jilles Vreeken

2017 IEEE International Conference on Data Mining (ICDM) > 751 - 756

The algorithmic Markov condition states that the most likely causal direction between two random variables X and Y can be identified as the direction with the lowest Kolmogorov complexity. This notion is very powerful as it can detect any causal dependency that can be explained by a physical process. However, due to the halting problem, it is also not computable. In this paper we propose a computable...

chapter

Mining the Demographics of Political Sentiment from Twitter Using Learning from Label Proportions

Ehsan Mohammady Ardehaly, Aron Culotta

2017 IEEE International Conference on Data Mining (ICDM) > 733 - 738

Opinion mining and demographic attribute inference have many applications in social science. In this paper, we propose models to infer daily joint probabilities of multiple latent attributes from Twitter data, such as political sentiment and demographic attributes. Since it is costly and time-consuming to annotate data for traditional supervised classification, we instead propose scalable Learning...

chapter

Online and Distributed Robust Regressions Under Adversarial Data Corruption

Xuchao Zhang, Liang Zhao, Arnold P. Boedihardjo, Chang-Tien Lu

2017 IEEE International Conference on Data Mining (ICDM) > 625 - 634

In today's era of big data, robust least-squares regression becomes a more challenging problem when considering adversarial corruption along with the explosive growth of datasets. Traditional robust methods can handle the noise but suffer from several challenges when applied to huge datasets, including 1) computational infeasibility of handling an entire dataset at once, 2) existence of heterogeneously...

chapter

Differentially Private Mixture of Generative Neural Networks

Gergely Acs, Luca Melis, Claude Castelluccia, Emiliano De Cristofaro

2017 IEEE International Conference on Data Mining (ICDM) > 715 - 720

Generative models are used in an increasing number of applications that rely on large amounts of contextually rich information about individuals. Owing to possible privacy violations, however, publishing or sharing generative models is not always viable. In this paper, we introduce a novel solution for privately releasing generative models and entire high-dimensional datasets produced by these models...

chapter

Matrix Profile VII: Time Series Chains: A New Primitive for Time Series Data Mining (Best Student Paper Award)

Yan Zhu, Makoto Imamura, Daniel Nikovski, Eamonn Keogh

2017 IEEE International Conference on Data Mining (ICDM) > 695 - 704

Since their introduction over a decade ago, time series motifs have become a fundamental tool for time series analytics, finding diverse uses in dozens of domains. In this work we introduce Time Series Chains, which are related to, but distinct from, time series motifs. Informally, time series chains are a temporally ordered set of subsequence patterns, such that each pattern is similar to the pattern...

chapter

Exploiting Hierarchical Structures for POI Recommendation

Pengpeng Zhao, Xiefeng Xu, Yanchi Liu, Ziting Zhou, et al.

2017 IEEE International Conference on Data Mining (ICDM) > 655 - 664

With the rapid development of location-based social networks, Point-of-Interest (POI) recommendation has played an important role in helping people discover attractive locations. However, existing POI recommendation methods assume a flat structure of POIs, which are better described in a hierarchical structure in reality. Furthermore, we discover that both users' content and spatial preferences exhibit...

chapter

BiCycle: Item Recommendation with Life Cycles

Xinyue Liu, Yuanfang Song, Charu Aggarwal, Yao Zhang, et al.

2017 IEEE International Conference on Data Mining (ICDM) > 297 - 306

Recommender systems, which can help users explore new items in many applications, have attracted much attention in recent decades. As a popular technique in recommender systems, item recommendation works by recommending items to users based on their historical interactions. Conventional item recommendation methods usually assume that users and items are stationary, which is not always the case in...

chapter

GoGP: Fast Online Regression with Gaussian Processes

Trung Le, Khanh Nguyen, Vu Nguyen, Tu Dinh Nguyen, et al.

2017 IEEE International Conference on Data Mining (ICDM) > 257 - 266

One of the most challenging current problems in Gaussian process regression (GPR) is to handle large-scale datasets and to accommodate an online learning setting where data arrive irregularly on the fly. In this paper, we introduce a novel online Gaussian process model that can scale to massive datasets. Our approach is formulated based on an alternative representation of the Gaussian process under...

chapter

Telling Cause from Effect Using MDL-Based Local and Global Regression

Alexander Marx, Jilles Vreeken

2017 IEEE International Conference on Data Mining (ICDM) > 307 - 316

We consider the fundamental problem of inferring the causal direction between two univariate numeric random variables X and Y from observational data. The two-variable case is especially difficult to solve since it is not possible to use standard conditional independence tests between the variables. To tackle this problem, we follow an information theoretic approach based on Kolmogorov complexity...

chapter

Online Nearest Neighbor Search in Binary Space

Sepehr Eghbali, Hassan Ashtiani, Ladan Tahvildari

2017 IEEE International Conference on Data Mining (ICDM) > 853 - 858

We revisit the K Nearest Neighbors (KNN) problem in large binary datasets, which is of major importance in several applied areas. The goal is to find the K nearest items in a dataset to a query point where both the query and the items lie in the Hamming cube. The problem is addressed in its online setting, that is, data items are inserted sequentially into the dataset. To accommodate efficient similarity...

chapter

Topological Recurrent Neural Network for Diffusion Prediction

Jia Wang, Vincent W. Zheng, Zemin Liu, Kevin Chen-Chuan Chang

2017 IEEE International Conference on Data Mining (ICDM) > 475 - 484

In this paper, we study the problem of using representation learning to assist information diffusion prediction on graphs. In particular, we aim at estimating the probability of an inactive node to be activated next in a cascade. Despite the success of recent deep learning methods for diffusion, we find that they often underexplore the cascade structure. We consider a cascade as not merely a sequence...

chapter

STExNMF: Spatio-Temporally Exclusive Topic Discovery for Anomalous Event Detection

Sungbok Shin, Minsuk Choi, Jinho Choi, Scott Langevin, et al.

2017 IEEE International Conference on Data Mining (ICDM) > 435 - 444

Understanding newly emerging events or topics associated with a particular region on a given day can provide deep insight into the critical events occurring in highly evolving metropolitan cities. We propose herein a novel topic modeling approach on text documents with spatio-temporal information (e.g., when and where a document was published) such as location-based social media data to discover prevalent...

chapter

Tensor Based Relations Ranking for Multi-relational Collective Classification

Chao Han, Qingyao Wu, Michael K. Ng, Jiezhang Cao, et al.

2017 IEEE International Conference on Data Mining (ICDM) > 901 - 906

In this paper, we study relations ranking and object classification for multi-relational data where objects are interconnected by multiple relations. The relations among objects should be exploited to achieve good classification. While most existing approaches exploit them either by directly counting the number of connections among objects or by learning the weight of each relation from labeled data...

chapter

Accurate Detection of Automatically Spun Content via Stylometric Analysis

Usman Shahid, Shehroze Farooqi, Raza Ahmad, Zubair Shafiq, et al.

2017 IEEE International Conference on Data Mining (ICDM) > 425 - 434

Spammers use automated content spinning techniques to evade plagiarism detection by search engines. Text spinners help spammers in evading plagiarism detectors by automatically restructuring sentences and replacing words or phrases with their synonyms. Prior work on spun content detection relies on the knowledge about the dictionary used by the text spinning software. In this work, we propose an approach...

chapter

GANG: Detecting Fraudulent Users in Online Social Networks via Guilt-by-Association on Directed Graphs

Binghui Wang, Neil Zhenqiang Gong, Hao Fu

2017 IEEE International Conference on Data Mining (ICDM) > 465 - 474

Detecting fraudulent users in online social networks is a fundamental and urgent research problem as adversaries can use them to perform various malicious activities. Global social structure based methods, which are known as guilt-by-association, have been shown to be promising at detecting fraudulent users. However, existing guilt-by-association methods either assume symmetric (i.e., undirected)...

chapter

Importance Sketching of Influence Dynamics in Billion-Scale Networks

Hung T. Nguyen, Tri P. Nguyen, NhatHai Phan, Thang N. Dinh

2017 IEEE International Conference on Data Mining (ICDM) > 337 - 346

The blooming availability of traces for social, biological, and communication networks opens up unprecedented opportunities in analyzing diffusion processes in networks. However, the sheer size of today's networks raises serious challenges in computational efficiency and scalability. In this paper, we propose a new hyper-graph sketching framework for influence dynamics in networks. The core of...

chapter

Bayesian Optimization in Weakly Specified Search Space

Vu Nguyen, Sunil Gupta, Santu Rane, Cheng Li, et al.

2017 IEEE International Conference on Data Mining (ICDM) > 347 - 356

Bayesian optimization (BO) has recently emerged as a powerful and flexible tool for hyper-parameter tuning and more generally for the efficient global optimization of expensive black-box functions. Systems implementing BO have successfully solved difficult problems in automatic design choices and machine learning hyper-parameter tuning. Many recent advances in the methodologies and theories underlying...

chapter

Matrix Profile VI: Meaningful Multidimensional Motif Discovery

Chin-Chia Michael Yeh, Nickolas Kavantzas, Eamonn Keogh

2017 IEEE International Conference on Data Mining (ICDM) > 565 - 574

Time series motifs are approximately repeating patterns in real-valued time series data. They are useful for exploratory data mining and are often used as inputs for various time series clustering, classification, segmentation, rule discovery, and visualization algorithms. Since the introduction of the first motif discovery algorithm for univariate time series in 2002, multiple efforts have been made...

chapter

Large Scale Kernel Methods for Online AUC Maximization

Yi Ding, Chenghao Liu, Peilin Zhao, Steven C.H. Hoi

2017 IEEE International Conference on Data Mining (ICDM) > 91 - 100

Learning to optimize AUC performance for classifying label-imbalanced data in online scenarios has been extensively studied in recent years. Most of the existing work has attempted to address the problem directly in the original feature space, which may not be suitable for non-linearly separable datasets. To solve this issue, some kernel-based learning methods have been proposed for non-linearly separable...

chapter

Exploratory Analysis of Graph Data by Leveraging Domain Knowledge

Di Jin, Danai Koutra

2017 IEEE International Conference on Data Mining (ICDM) > 187 - 196

Given the soaring amount of data being generated daily, graph mining tasks are becoming increasingly challenging, leading to tremendous demand for summarization techniques. Feature selection is a representative approach that simplifies a dataset by choosing features that are relevant to a specific task, such as classification, prediction, and anomaly detection. Although it can be viewed as a way to...
