2017 IEEE International Conference on Data Mining (ICDM)

chapter

Telling Cause from Effect Using MDL-Based Local and Global Regression

Alexander Marx, Jilles Vreeken

2017 IEEE International Conference on Data Mining (ICDM) > 307 - 316

We consider the fundamental problem of inferring the causal direction between two univariate numeric random variables X and Y from observational data. The two-variable case is especially difficult to solve since it is not possible to use standard conditional independence tests between the variables. To tackle this problem, we follow an information theoretic approach based on Kolmogorov complexity...

chapter

Distributing Frank-Wolfe via Map-Reduce

Armin Moharrer, Stratis Ioannidis

2017 IEEE International Conference on Data Mining (ICDM) > 317 - 326

Large-scale optimization problems abound in data mining and machine learning applications, and the computational challenges they pose are often addressed through parallelization. We identify structural properties under which a convex optimization problem can be massively parallelized via map-reduce operations using the Frank-Wolfe (FW) algorithm. The class of problems that can be tackled this way...

chapter

Glocalized Weisfeiler-Lehman Graph Kernels: Global-Local Feature Maps of Graphs

Christopher Morris, Kristian Kersting, Petra Mutzel

2017 IEEE International Conference on Data Mining (ICDM) > 327 - 336

Most state-of-the-art graph kernels only take local graph properties into account, i.e., the kernel is computed with regard to properties of the neighborhood of vertices or other small substructures. On the other hand, kernels that do take global graph properties into account may not scale well to large graph databases. Here we propose to start exploring the space between local and global graph kernels,...

chapter

Importance Sketching of Influence Dynamics in Billion-Scale Networks

Hung T. Nguyen, Tri P. Nguyen, NhatHai Phan, Thang N. Dinh

2017 IEEE International Conference on Data Mining (ICDM) > 337 - 346

The growing availability of traces for social, biological, and communication networks opens up unprecedented opportunities for analyzing diffusion processes in networks. However, the sheer size of today's networks raises serious challenges in computational efficiency and scalability. In this paper, we propose a new hyper-graph sketching framework for influence dynamics in networks. The core of...

chapter

Bayesian Optimization in Weakly Specified Search Space

Vu Nguyen, Sunil Gupta, Santu Rana, Cheng Li, et al.

2017 IEEE International Conference on Data Mining (ICDM) > 347 - 356

Bayesian optimization (BO) has recently emerged as a powerful and flexible tool for hyper-parameter tuning and, more generally, for the efficient global optimization of expensive black-box functions. Systems implementing BO have successfully solved difficult problems in automatic design choices and machine learning hyper-parameter tuning. Many recent advances in the methodologies and theories underlying...

chapter

Relational Mixture of Experts: Explainable Demographics Prediction with Behavioral Data

Masafumi Oyamada, Shinji Nakadai

2017 IEEE International Conference on Data Mining (ICDM) > 357 - 366

Given a collection of basic customer demographics (e.g., age and gender) and their behavioral data (e.g., item purchase histories), how can we predict sensitive demographics (e.g., income and occupation) that not every customer makes available? This demographics prediction problem is modeled as a classification task in which a customer's sensitive demographic y is predicted from his feature vector x. So...

chapter

Unsupervised Feature Learning with Discriminative Encoder

Gaurav Pandey, Ambedkar Dukkipati

2017 IEEE International Conference on Data Mining (ICDM) > 367 - 376

In recent years, deep discriminative models have achieved extraordinary performance on supervised learning tasks, significantly outperforming their generative counterparts. However, their success relies on the presence of a large amount of labeled data. How can one use the same discriminative models for learning useful features in the absence of labels? We address this question in this paper, by jointly...

chapter

Learning Doubly Stochastic Affinity Matrix via Davis-Kahan Theorem

Jiwoong Park, Taejeong Kim

2017 IEEE International Conference on Data Mining (ICDM) > 377 - 384

Building an ideal graph which reveals the exact intrinsic structure of the data is critical in graph-based clustering. There have been a lot of efforts to construct an affinity matrix satisfying such a need in terms of a similarity measure. A recent approach attracting attention is on using doubly stochastic normalization of the affinity matrix to improve the clustering performance. In this paper,...

chapter

Adaptive Laplace Mechanism: Differential Privacy Preservation in Deep Learning

NhatHai Phan, Xintao Wu, Han Hu, Dejing Dou

2017 IEEE International Conference on Data Mining (ICDM) > 385 - 394

In this paper, we focus on developing a novel mechanism to preserve differential privacy in deep neural networks, such that: (1) The privacy budget consumption is totally independent of the number of training steps; (2) It has the ability to adaptively inject noise into features based on the contribution of each to the output; and (3) It could be applied in a variety of different deep neural networks...

chapter

A Short-Term Rainfall Prediction Model Using Multi-task Convolutional Neural Networks

Minghui Qiu, Peilin Zhao, Ke Zhang, Jun Huang, et al.

2017 IEEE International Conference on Data Mining (ICDM) > 395 - 404

Precipitation prediction, such as short-term rainfall prediction, is a very important problem in the field of meteorological service. In practice, most recent studies focus on leveraging radar data or satellite images to make predictions. However, there is another scenario where a set of weather features is collected by various sensors at multiple observation sites. The observations of a site...

chapter

Scalable Hashing-Based Network Discovery

Tara Safavi, Chandra Sripada, Danai Koutra

2017 IEEE International Conference on Data Mining (ICDM) > 405 - 414

Discovering and analyzing networks from non-network data is a task with applications in fields as diverse as neuroscience, genomics, energy, economics, and more. In these domains, networks are often constructed out of multiple time series by computing measures of association or similarity between pairs of series. The nodes in a discovered graph correspond to time series, which are linked via edges...

chapter

Benchmark Generator for Dynamic Overlapping Communities in Networks

Neha Sengupta, Michael Hamann, Dorothea Wagner

2017 IEEE International Conference on Data Mining (ICDM) > 415 - 424

We describe a dynamic graph generator with overlapping communities that is capable of simulating community scale events while at the same time maintaining crucial graph properties. Such a benchmark generator is useful to measure and compare the responsiveness and efficiency of dynamic community detection algorithms. Since the generator allows the user to tune multiple parameters, it can also be used...

chapter

Accurate Detection of Automatically Spun Content via Stylometric Analysis

Usman Shahid, Shehroze Farooqi, Raza Ahmad, Zubair Shafiq, et al.

2017 IEEE International Conference on Data Mining (ICDM) > 425 - 434

Spammers use automated content spinning techniques to evade plagiarism detection by search engines. Text spinners help spammers in evading plagiarism detectors by automatically restructuring sentences and replacing words or phrases with their synonyms. Prior work on spun content detection relies on the knowledge about the dictionary used by the text spinning software. In this work, we propose an approach...

chapter

STExNMF: Spatio-Temporally Exclusive Topic Discovery for Anomalous Event Detection

Sungbok Shin, Minsuk Choi, Jinho Choi, Scott Langevin, et al.

2017 IEEE International Conference on Data Mining (ICDM) > 435 - 444

Understanding newly emerging events or topics associated with a particular region of a given day can provide deep insight on the critical events occurring in highly evolving metropolitan cities. We propose herein a novel topic modeling approach on text documents with spatio-temporal information (e.g., when and where a document was published) such as location-based social media data to discover prevalent...

chapter

A Probabilistic Approach for Learning with Label Proportions Applied to the US Presidential Election

Tao Sun, Dan Sheldon, Brendan O'Connor

2017 IEEE International Conference on Data Mining (ICDM) > 445 - 454

Ecological inference (EI) is a classical problem from political science: modeling the voting behavior of individuals given only aggregate election results. Flaxman et al. recently formulated EI as a machine learning problem using distribution regression, and applied it to analyze US presidential elections. However, distribution regression unnecessarily aggregates individual-level covariates available from...

chapter

Edge-Based Wedge Sampling to Estimate Triangle Counts in Very Large Graphs

Duru Turkoglu, Ata Turk

2017 IEEE International Conference on Data Mining (ICDM) > 455 - 464

The number of triangles in a graph is useful to deduce a plethora of important features of the network that the graph is modeling. However, finding the exact value of this number is computationally expensive. Hence, a number of approximation algorithms based on random sampling of edges, or wedges (adjacent edge pairs) have been proposed for estimating this value. We argue that for large sparse graphs...
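As a rough illustration of the wedge-sampling idea mentioned in this abstract, the sketch below implements the classic uniform wedge-sampling estimator, not the edge-based variant the paper proposes. The function names (`build_adjacency`, `exact_triangles`, `estimate_triangles`) are hypothetical and chosen for this example only.

```python
import random
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from edge pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def exact_triangles(adj):
    """Count triangles exactly by enumerating every wedge (adjacent edge pair)."""
    closed = 0
    for nbrs in adj.values():
        for a, b in combinations(sorted(nbrs), 2):
            if b in adj[a]:
                closed += 1
    # Each triangle contains exactly three closed wedges, one per corner.
    return closed // 3


def estimate_triangles(adj, samples, rng):
    """Estimate the triangle count from uniformly sampled wedges."""
    centers = [v for v in adj if len(adj[v]) >= 2]
    # A node of degree d is the center of d * (d - 1) / 2 wedges.
    weights = [len(adj[v]) * (len(adj[v]) - 1) // 2 for v in centers]
    total_wedges = sum(weights)
    if total_wedges == 0:
        return 0.0
    closed = 0
    # Picking a center proportionally to its wedge count, then a uniform
    # neighbor pair, yields a uniformly random wedge.
    for center in rng.choices(centers, weights=weights, k=samples):
        a, b = rng.sample(sorted(adj[center]), 2)
        if b in adj[a]:
            closed += 1
    # Fraction of closed wedges, scaled to all wedges, divided by 3 per triangle.
    return (closed / samples) * total_wedges / 3
```

The estimator inspects only the sampled wedges, so its cost depends on the sample size rather than on the total number of wedges. Its accuracy depends on how many wedges are sampled and on the fraction of wedges that are closed.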

chapter

GANG: Detecting Fraudulent Users in Online Social Networks via Guilt-by-Association on Directed Graphs

Binghui Wang, Neil Zhenqiang Gong, Hao Fu

2017 IEEE International Conference on Data Mining (ICDM) > 465 - 474

Detecting fraudulent users in online social networks is a fundamental and urgent research problem as adversaries can use them to perform various malicious activities. Global social structure based methods, which are known as guilt-by-association, have been shown to be promising at detecting fraudulent users. However, existing guilt-by-association methods either assume symmetric (i.e., undirected)...

chapter

Topological Recurrent Neural Network for Diffusion Prediction

Jia Wang, Vincent W. Zheng, Zemin Liu, Kevin Chen-Chuan Chang

2017 IEEE International Conference on Data Mining (ICDM) > 475 - 484

In this paper, we study the problem of using representation learning to assist information diffusion prediction on graphs. In particular, we aim at estimating the probability of an inactive node to be activated next in a cascade. Despite the success of recent deep learning methods for diffusion, we find that they often underexplore the cascade structure. We consider a cascade as not merely a sequence...

chapter

Multi-task Survival Analysis

Lu Wang, Yan Li, Jiayu Zhou, Dongxiao Zhu, et al.

2017 IEEE International Conference on Data Mining (ICDM) > 485 - 494

Collecting labeling information of time-to-event analysis is naturally very time consuming, i.e., one has to wait for the occurrence of the event of interest, which may not always be observed for every instance. By taking advantage of censored instances, survival analysis methods internally consider more samples than standard regression methods, which partially alleviates this data insufficiency problem...

chapter

Tracking Hit-and-Run Vehicle with Sparse Video Surveillance Cameras and Mobile Taxicabs

Yang Wang, Wuji Chen, Wei Zheng, He Huang, et al.

2017 IEEE International Conference on Data Mining (ICDM) > 495 - 504

Due to the sparse distribution of road video surveillance cameras, precise trajectory tracking for hit-and-run vehicles remains a challenging task. Previous research on vehicle trajectory recovery mostly focuses on recovering trajectory with low-sampling-rate GPS coordinates by retrieving road traffic flow patterns from collected GPS information. However, to the best of our knowledge, none of them...
