Pietro Michiardi

article

DiNoDB: An Interactive-Speed Query Engine for Ad-Hoc Queries on Temporary Data

Yongchao Tian, Ioannis Alagiannis, Erietta Liarou, Anastasia Ailamaki, et al.

IEEE Transactions on Big Data > 2017 > 3 > 3 > 320 - 333

As data sets grow in size, analytics applications struggle to get instant insight into large datasets. Modern applications involve heavy batch processing jobs over large volumes of data and at the same time require efficient ad-hoc interactive analytics on temporary data. Existing solutions, however, typically focus on one of these two aspects, largely ignoring the need for synergy between the two...

chapter

Resource Management for Parallel Processing Frameworks with Load Awareness at Worker Side

Son-Hai Ha, Patrick Brown, Pietro Michiardi

2017 IEEE International Congress on Big Data (BigData Congress) > 161 - 168

Many resource management systems and large-scale data processing frameworks use a reservation-based model for managing resources and scheduling tasks. We observe from the reported traces of Facebook and Google that this model leads to resources being wasted because tasks do not effectively use the resources allocated to them. We confirm the problem with a trace of our production cluster. We propose an...

chapter

Too Big to Eat: Boosting Analytics Data Ingestion from Object Stores with Scoop

Yosef Moatti, Eran Rom, Raul Gracia-Tinedo, Dalit Naor, et al.

2017 IEEE 33rd International Conference on Data Engineering (ICDE) > 309 - 320

Extracting value from data stored in object stores, such as OpenStack Swift and Amazon S3, can be problematic in common scenarios where analytics frameworks and object stores run in physically disaggregated clusters. One of the main problems is that analytics frameworks must ingest large amounts of data from the object store prior to the actual computation; this incurs a significant resources and performance...

chapter

On the impact of virtualization on the I/O performance of analytic workloads

Son-Hai Ha, Daniele Venzano, Patrick Brown, Pietro Michiardi

2016 2nd International Conference on Cloud Computing Technologies and Applications (CloudTech) > 31 - 38

In this work we study the I/O performance of long, sequential workloads that mimic those of Big Data applications, to understand the implications of system virtualization on data-intensive frameworks such as Apache Hadoop and Spark, which are frequently run in clusters of Virtual Machines (VMs). We do so through an experimental measurement campaign that collects low-level traces and metrics, to show...

chapter

PaMPa-HD: A Parallel MapReduce-Based Frequent Pattern Miner for High-Dimensional Data

Daniele Apiletti, Elena Baralis, Tania Cerquitelli, Paolo Garza, et al.

2015 IEEE International Conference on Data Mining Workshop (ICDMW) > 839 - 846

Frequent closed itemset mining is among the most complex exploratory techniques in data mining, and provides the ability to discover hidden correlations in transactional datasets. The explosion of Big Data is leading to new parallel and distributed approaches. Unfortunately, most of them are designed to cope with low-dimensional datasets, whereas no distributed high-dimensional frequent closed itemset...

chapter

A Measurement Study of Data-Intensive Network Traffic Patterns in a Private Cloud

Daniele Venzano, Pietro Michiardi

2013 IEEE/ACM 6th International Conference on Utility and Cloud Computing (UCC) > 476 - 481

In this work we investigate the impact of virtualization on the raw network performance attainable by "data-intensive" applications deployed in a private cloud. To this end we developed a new software tool, called OSMeF, to take repeatable measurements on our OpenStack-based platform. We also discuss the implications of our measurement results toward informed deployments of distributed...
