2014 IEEE 30th International Conference on Data Engineering (ICDE)

chapter

KnowLife: A knowledge graph for health and life sciences

Patrick Ernst, Cynthia Meng, Amy Siu, Gerhard Weikum

2014 IEEE 30th International Conference on Data Engineering > 1254 - 1257

Knowledge bases (KBs) contribute to advances in semantic search, Web analytics, and smart recommendations. Their coverage of domain-specific knowledge is limited, though. This demo presents the KnowLife portal, a large KB for health and life sciences, automatically constructed from Web sources. Prior work on biomedical ontologies has focused on molecular biology: genes, proteins, and pathways. In...

chapter

GQBE: Querying knowledge graphs by example entity tuples

Nandish Jayaram, Mahesh Gupta, Arijit Khan, Chengkai Li, et al.

2014 IEEE 30th International Conference on Data Engineering > 1250 - 1253

We present GQBE, a system that offers a simple and intuitive mechanism to query large knowledge graphs. Answers to tasks such as “list university professors who have designed some programming languages and also won an award in Computer Science” are best found in knowledge graphs that record entities and their relationships. Real-world knowledge graphs are difficult to use due to their sheer size...

chapter

Guaranteed authenticity and integrity of data from untrusted servers

Rohit Jain, Sunil Prabhakar

2014 IEEE 30th International Conference on Data Engineering > 1282 - 1285

Data are often stored at untrusted database servers. The lack of trust arises naturally when the database server is owned by a third party, as in the case of cloud computing. It also arises if the server may have been compromised, or there is a malicious insider. Ensuring the trustworthiness of data retrieved from such untrusted databases is of utmost importance. Trustworthiness of data is defined...

chapter

XQuery streaming by Forest Transducers

Shizuya Hakuta, Sebastian Maneth, Keisuke Nakano, Hideya Iwasaki

2014 IEEE 30th International Conference on Data Engineering > 952 - 963

Streaming of XML transformations is a challenging task and only a few existing systems support streaming. Research approaches generally define custom fragments of XQuery and XPath that are amenable to streaming, and then design custom algorithms for each fragment. These languages have several shortcomings. Here we take a more principled approach to the problem of streaming XQuery-based transformations...

chapter

MELODY-JOIN: Efficient Earth Mover's Distance similarity joins using MapReduce

Jin Huang, Rui Zhang, Rajkumar Buyya, Jian Chen

2014 IEEE 30th International Conference on Data Engineering > 808 - 819

The Earth Mover's Distance (EMD) similarity join retrieves pairs of records with EMD below a given threshold. It has a number of important applications such as near-duplicate image retrieval and pattern analysis in probabilistic datasets. However, the computational cost of EMD is super-cubic in the number of bins in the histograms used to represent the data objects. Consequently, the EMD similarity...

chapter

Near neighbor join

Herald Kllapi, Boulos Harb, Cong Yu

2014 IEEE 30th International Conference on Data Engineering > 1120 - 1131

An increasing number of Web applications, such as friend recommendation, depend on the ability to join objects at scale. The traditional approach is nearest neighbor join (also called similarity join), whose goal is to find, based on a given join function, the closest set of objects or all the objects within a distance threshold to each object in the input. The scalability of techniques utilizing...

chapter

Exploration of the effect of Category Match Score in search advertising

Youngchul Cha, Junghoo Cho, Jian Yuan, Tak Yan

2014 IEEE 30th International Conference on Data Engineering > 1156 - 1161

Categorical (topic) similarity between a web page and an advertisement (ad) text has long been used for contextual advertising. In this paper, we explore the use of the categorical similarity score, referred to as Category Match Score (CMS), in the context of search advertising. In particular, we explore the effect of CMS on various ad-effectiveness prediction tasks, including user-judgment prediction,...

chapter

Random-walk domination in large graphs

Rong-Hua Li, Jeffrey Xu Yu, Xin Huang, Hong Cheng

2014 IEEE 30th International Conference on Data Engineering > 736 - 747

We introduce and formulate two types of random-walk domination problems in graphs motivated by a number of applications in practice (e.g., item-placement problem in online social networks, Ads-placement problem in advertisement networks, and resource-placement problem in P2P networks). Specifically, given a graph G, the goal of the first type of random-walk domination problem is to target k nodes...

chapter

Scalable serializable snapshot isolation for multicore systems

Hyuck Han, SeongJae Park, Hyungsoo Jung, Alan Fekete, et al.

2014 IEEE 30th International Conference on Data Engineering > 700 - 711

Since the 1990s, Snapshot Isolation (SI) has been widely studied, and it has been successfully deployed in commercial and open-source database engines. Berenson et al. showed that data consistency can be violated under SI. Recently, a new class of Serializable SI algorithms (SSI) has been proposed to achieve serializable execution while still allowing concurrency between reads and updates.

chapter

PHiDJ: Parallel similarity self-join for high-dimensional vector data with MapReduce

Sergej Fries, Brigitte Boden, Grzegorz Stepien, Thomas Seidl

2014 IEEE 30th International Conference on Data Engineering > 796 - 807

Join processing on large-scale vector data is an important problem in many applications, as vectors are a common representation for various data types. In particular, several data analysis tasks, such as near-duplicate detection, density-based clustering, or data cleaning, are based on similarity self-joins, a special type of join. For huge data sets, MapReduce proved to be a suitable, error-tolerant...

chapter

Geometry approach for k-regret query

Peng Peng, Raymond Chi-Wing Wong

2014 IEEE 30th International Conference on Data Engineering > 772 - 783

Returning tuples that users may be interested in is one of the most important goals for multi-criteria decision making. Top-k queries and skyline queries are two representative queries. A top-k query has the merit of returning a limited number of tuples to users but requires users to give their exact utility functions. A skyline query has the merit that users do not need to give their exact utility...

chapter

Message from the ICDE 2014 program committee and general chairs

Isabel Cruz, Elena Ferrari, Yufei Tao, Elisa Bertino, et al.

2014 IEEE 30th International Conference on Data Engineering > i - ii

Established in 1984, ICDE has become a premier forum for the dissemination of data management research results among researchers, users, practitioners, and developers. The 30th IEEE International Conference on Data Engineering takes place in Chicago, IL, USA, from March 31 to April 4, 2014. We are proud to present its proceedings.

chapter

The end of indexes

2014 IEEE 30th International Conference on Data Engineering > 1

Presents the blank page that is displayed when users of the electronic proceedings record are at the end of a document.

chapter

Practical k nearest neighbor queries with location privacy

Xun Yi, Russell Paulet, Elisa Bertino, Vijay Varadharajan

2014 IEEE 30th International Conference on Data Engineering > 640 - 651

In mobile communication, spatial queries pose a serious threat to user location privacy because the location of a query may reveal sensitive information about the mobile user. In this paper, we study k nearest neighbor (kNN) queries where the mobile user queries the location-based service (LBS) provider about k nearest points of interest (POIs) on the basis of his current location. We propose a solution...

chapter

Omid: Lock-free transactional support for distributed data stores

Daniel Gomez Ferro, Flavio Junqueira, Ivan Kelly, Benjamin Reed, et al.

2014 IEEE 30th International Conference on Data Engineering > 676 - 687

In this paper, we introduce Omid, a tool for lock-free transactional support in large data stores such as HBase. Omid uses a centralized scheme and implements snapshot isolation, a property that guarantees that all read operations of a transaction are performed on a consistent snapshot of the data. In a lock-based approach, the unreleased, distributed locks that are held by a failed or slow client...

chapter

Complete discovery of high-quality patterns in large numerical tensors

Loic Cerf, Wagner Meira

2014 IEEE 30th International Conference on Data Engineering > 448 - 459

Many datasets are numerical tensors, i.e., associate n-tuples with numerical values. Until recently, the discovery of relevant local patterns in such numerical and multidimensional data has received little attention despite the broad applicative perspectives offered by this general framework. Even in the simpler 2-dimensional case, almost every proposal so far is either incomplete (i.e., it does...

chapter

In-RDBMS inverted indexes revisited

Ian Rae, Alan Halverson, Jeffrey F. Naughton

2014 IEEE 30th International Conference on Data Engineering > 352 - 363

Every major open-source and commercial RDBMS offers some form of support for full-text search using inverted indexes. When providing this support, some developers have implemented specialized indexes that adapt techniques from the Information Retrieval (IR) community to work in a database setting, while others have opted to rely on the standard relational query engine to process inverted index lookups...

chapter

Leveraging metadata for identifying local, robust multi-variate temporal (RMT) features

Xiaolan Wang, K. Selcuk Candan, Maria Luisa Sapino

2014 IEEE 30th International Conference on Data Engineering > 388 - 399

Many applications generate and/or consume multi-variate temporal data, yet experts often lack the means to adequately and systematically search for and interpret multi-variate observations. In this paper, we first observe that multi-variate time series often carry localized multi-variate temporal features that are robust against noise. We then argue that these multi-variate temporal features can be...

chapter

Incremental discovery of prominent situational facts

Afroza Sultana, Naeemul Hassan, Chengkai Li, Jun Yang, et al.

2014 IEEE 30th International Conference on Data Engineering > 112 - 123

We study the novel problem of finding new, prominent situational facts, which are emerging statements about objects that stand out within certain contexts. Many such facts are newsworthy—e.g., an athlete's outstanding performance in a game, or a viral video's impressive popularity. Effective and efficient identification of these facts assists journalists in reporting, one of the main goals of computational...

chapter

MassJoin: A mapreduce-based method for scalable string similarity joins

Dong Deng, Guoliang Li, Shuang Hao, Jiannan Wang, et al.

2014 IEEE 30th International Conference on Data Engineering > 340 - 351

String similarity join is an essential operation in data integration. The era of big data calls for scalable algorithms to support large-scale string similarity joins. In this paper, we study scalable string similarity joins using MapReduce. We propose a MapReduce-based framework, called MASSJOIN, which supports both set-based similarity functions and character-based similarity functions. We extend...
