2017 IEEE International Conference on Data Mining (ICDM)

Chapter

Discovering Truths from Distributed Data

Yaqing Wang, Fenglong Ma, Lu Su, Jing Gao

2017 IEEE International Conference on Data Mining (ICDM) > 505 - 514

In the big data era, the information about the same object collected from multiple sources is inevitably conflicting. The task of identifying true information (i.e., the truths) among conflicting data is referred to as truth discovery, which incorporates the estimation of source reliability degrees into the aggregation of multi-source data. However, in many real-world applications, large-scale data...

Chapter

Local Bayes Risk Minimization Based Stopping Strategy for Hierarchical Classification

Yu Wang, Qinghua Hu, Yucan Zhou, Hong Zhao, et al.

2017 IEEE International Conference on Data Mining (ICDM) > 515 - 524

In large-scale data classification tasks, it is becoming increasingly challenging to find the true class among a huge number of candidate categories. Fortunately, a hierarchical structure usually exists in these massive categories. The task of utilizing this structure for effective classification is called hierarchical classification. It usually follows a top-down fashion which predicts a sample...

Chapter

AWDA: An Adaptive Wishart Discriminant Analysis

Haoyi Xiong, Wei Cheng, Wenqing Hu, Jiang Bian, et al.

2017 IEEE International Conference on Data Mining (ICDM) > 525 - 534

Linear Discriminant Analysis (LDA) is widely used for supervised dimension reduction and linear classification. Classical LDA, however, suffers from the ill-posed estimation problem on data with high dimension and low sample size (HDLSS). To cope with this problem, we propose an Adaptive Wishart Discriminant Analysis (AWDA) for classification that makes predictions in an ensemble way...

Chapter

Generating Medical Hypotheses Based on Evolutionary Medical Concepts

Guangxu Xun, Kishlay Jha, Vishrawas Gopalakrishnan, Yaliang Li, et al.

2017 IEEE International Conference on Data Mining (ICDM) > 535 - 544

Literature-based discovery (LBD) is a task that aims to uncover hidden associations between non-interacting scientific concepts by rationally connecting independent nuggets of information. Broadly, prior approaches to LBD include the use of: a) distributional statistics and explicit representation, b) graph-theoretic measures, and c) supervised machine learning methods to find associations. However, purely...

Chapter

HistoSketch: Fast Similarity-Preserving Sketching of Streaming Histograms with Concept Drift

Dingqi Yang, Bin Li, Laura Rettig, Philippe Cudre-Mauroux

2017 IEEE International Conference on Data Mining (ICDM) > 545 - 554

Histogram-based similarity has been widely adopted in many machine learning tasks. However, measuring histogram similarity is a challenging task for streaming data, where the elements of a histogram are observed in a streaming manner. First, the ever-growing cardinality of histogram elements makes any similarity computation inefficient. Second, the concept-drift issue in the data streams also impairs...

Chapter

Mining Customer Valuations to Optimize Product Bundling Strategy

Li Ye, Hong Xie, Weijie Wu, John C.S. Lui

2017 IEEE International Conference on Data Mining (ICDM) > 555 - 564

Product bundling is widely adopted for information goods and online services because it can increase profit for companies. For example, cable companies often bundle Internet access and video streaming services together. However, it is challenging to obtain an optimal bundling strategy, not only because it is computationally expensive, but also because customers’ private information (e.g., valuations...

Chapter

Matrix Profile VI: Meaningful Multidimensional Motif Discovery

Chin-Chia Michael Yeh, Nickolas Kavantzas, Eamonn Keogh

2017 IEEE International Conference on Data Mining (ICDM) > 565 - 574

Time series motifs are approximately repeating patterns in real-valued time series data. They are useful for exploratory data mining and are often used as inputs for various time series clustering, classification, segmentation, rule discovery, and visualization algorithms. Since the introduction of the first motif discovery algorithm for univariate time series in 2002, multiple efforts have been made...

Chapter

Deep Similarity-Based Batch Mode Active Learning with Exploration-Exploitation

Changchang Yin, Buyue Qian, Shilei Cao, Xiaoyu Li, et al.

2017 IEEE International Conference on Data Mining (ICDM) > 575 - 584

Active learning aims to reduce manual labeling efforts by proactively selecting the most informative unlabeled instances to query. In real-world scenarios, it's often more practical to query a batch of instances rather than a single one at each iteration. To achieve this, we need to preserve not only the informativeness of the instances but also their diversity. Many heuristic methods have been proposed...

Chapter

SPTF: A Scalable Probabilistic Tensor Factorization Model for Semantic-Aware Behavior Prediction

Hongzhi Yin, Hongxu Chen, Xiaoshuai Sun, Hao Wang, et al.

2017 IEEE International Conference on Data Mining (ICDM) > 585 - 594

With the rapid rise of various e-commerce and social network platforms, users are generating large amounts of heterogeneous behavior data, such as purchase history, adding-to-favorite, adding-to-cart and click activities. This kind of user behavior data is usually binary, only reflecting a user's action or inaction (i.e., implicit feedback data). Tensor factorization is a promising means of modeling...

Chapter

Supervised Belief Propagation: Scalable Supervised Inference on Attributed Networks

Jaemin Yoo, Saehan Jo, U. Kang

2017 IEEE International Conference on Data Mining (ICDM) > 595 - 604

Given an undirected network where some of the nodes are labeled, how can we classify the unlabeled nodes with high accuracy? Loopy Belief Propagation (LBP) is an inference algorithm widely used for this purpose with various applications including fraud detection, malware detection, web classification, and recommendation. However, previous methods based on LBP have problems in modeling complex structures...

Chapter

BL-MNE: Emerging Heterogeneous Social Network Embedding Through Broad Learning with Aligned Autoencoder

Jiawei Zhang, Congying Xia, Chenwei Zhang, Limeng Cui, et al.

2017 IEEE International Conference on Data Mining (ICDM) > 605 - 614

Network embedding aims at projecting the network data into a low-dimensional feature space, where each node is represented as a unique feature vector and the network structure can be effectively preserved. In recent years, more and more online application service sites can be represented as massive and complex networks, which are extremely challenging for traditional machine learning algorithms to deal...

Chapter

Data-Driven Immunization

Yao Zhang, Arvind Ramanathan, Anil Vullikanti, Laura Pullum, et al.

2017 IEEE International Conference on Data Mining (ICDM) > 615 - 624

Given a contact network and coarse-grained diagnostic information like electronic Healthcare Reimbursement Claims (eHRC) data, can we develop efficient intervention policies to control an epidemic? Immunization is an important problem in multiple areas, especially epidemiology and public health. However, most existing studies focus on developing pre-emptive strategies assuming prior epidemiological...

Chapter

Online and Distributed Robust Regressions Under Adversarial Data Corruption

Xuchao Zhang, Liang Zhao, Arnold P. Boedihardjo, Chang-Tien Lu

2017 IEEE International Conference on Data Mining (ICDM) > 625 - 634

In today's era of big data, robust least-squares regression becomes a more challenging problem when considering adversarial corruption along with the explosive growth of datasets. Traditional robust methods can handle the noise but suffer from several challenges when applied to huge datasets, including 1) computational infeasibility of handling an entire dataset at once, 2) existence of heterogeneously...

Chapter

MetaLDA: A Topic Model that Efficiently Incorporates Meta Information

He Zhao, Lan Du, Wray Buntine, Gang Liu

2017 IEEE International Conference on Data Mining (ICDM) > 635 - 644

Besides the text content, documents and their associated words usually come with rich sets of meta information, such as categories of documents and semantic/syntactic features of words, like those encoded in word embeddings. Incorporating such meta information directly into the generative process of topic models can improve modelling accuracy and topic quality, especially in the case where the word-occurrence...

Chapter

Collaborative Filtering with Social Local Models

Huan Zhao, Quanming Yao, James T. Kwok, Dik Lun Lee

2017 IEEE International Conference on Data Mining (ICDM) > 645 - 654

Matrix Factorization (MF) is a very popular method for recommendation systems. It assumes that the underlying rating matrix is low-rank. However, this assumption can be too restrictive to capture complex relationships and interactions among users and items. Recently, Local LOw-Rank Matrix Approximation (LLORMA) has been shown to be very successful in addressing this issue. It just assumes the rating...

Chapter

Exploiting Hierarchical Structures for POI Recommendation

Pengpeng Zhao, Xiefeng Xu, Yanchi Liu, Ziting Zhou, et al.

2017 IEEE International Conference on Data Mining (ICDM) > 655 - 664

With the rapid development of location-based social networks, Point-of-Interest (POI) recommendation has played an important role in helping people discover attractive locations. However, existing POI recommendation methods assume a flat structure of POIs, which are better described in a hierarchical structure in reality. Furthermore, we discover that both users' content and spatial preferences exhibit...

Chapter

AnySCAN: An Efficient Anytime Framework with Active Learning for Large-Scale Network Clustering

Weizhong Zhao, Gang Chen, Xiaowei Xu

2017 IEEE International Conference on Data Mining (ICDM) > 665 - 674

Network clustering is an essential approach to finding latent clusters in real-world networks. As the scale of real-world networks grows, the existing network clustering algorithms fail to discover meaningful clusters efficiently. In this paper, we propose a framework called AnySCAN, which applies anytime theory to the structural clustering algorithm for networks (SCAN). Moreover,...

Chapter

SCED: A General Framework for Sparse Tensor Decomposition with Constraints and Elementwise Dynamic Learning

Shuo Zhou, Sarah M. Erfani, James Bailey

2017 IEEE International Conference on Data Mining (ICDM) > 675 - 684

CANDECOMP/PARAFAC Decomposition (CPD) is one of the most popular tensor decomposition methods that has been extensively studied and widely applied. In recent years, sparse tensors that contain a huge portion of zeros but a limited number of non-zeros have attracted increasing interest. Existing techniques are not directly applicable to sparse tensors, since they mainly target dense ones and usually...

Chapter

A Randomized Approach for Crowdsourcing in the Presence of Multiple Views

Yao Zhou, Jingrui He

2017 IEEE International Conference on Data Mining (ICDM) > 685 - 694

Driven by the dramatic growth of data in terms of both size and sources, learning from heterogeneous data is emerging as an important research direction for many real applications. One of the biggest challenges of this type of problem is how to meaningfully integrate heterogeneous data to considerably improve the generality and quality of the learning model. In this paper, we first present a unified...

Chapter

Matrix Profile VII: Time Series Chains: A New Primitive for Time Series Data Mining (Best Student Paper Award)

Yan Zhu, Makoto Imamura, Daniel Nikovski, Eamonn Keogh

2017 IEEE International Conference on Data Mining (ICDM) > 695 - 704

Since their introduction over a decade ago, time series motifs have become a fundamental tool for time series analytics, finding diverse uses in dozens of domains. In this work we introduce Time Series Chains, which are related to, but distinct from, time series motifs. Informally, time series chains are a temporally ordered set of subsequence patterns, such that each pattern is similar to the pattern...

INFONA - science communication portal
