The state-of-the-art scheduler for containerized cloud services considers load balance as the only criterion and neglects many others, such as application performance. In the era of Big Data, however, applications have evolved to be highly data-intensive and thus perform poorly in existing systems. This particularly holds for Platform-as-a-Service environments that encourage an application model of stateless...
Large amounts of data are being generated every day, creating new challenges and opportunities that lead to extraordinary new knowledge and discoveries in many application domains, ranging from science and engineering to business. One of the main challenges in this era of Big Data is how to efficiently manage and analyse data at such scale. This is challenging not only due to the size of the data,...
With the rise of mobile technologies (e.g., smartphones, wearable technologies) and location-aware Internet browsers, a massive amount of spatial data is being collected, since such tools allow users to geo-tag their content (e.g., photos, tweets). Meanwhile, cloud computing providers such as Amazon and Microsoft allow users to lease computing resources where users are charged based on the amount of...
Sensor and smartphone technologies are driving a data explosion, with data streamed and collected from heterogeneous devices every second. Analyzing these large datasets can reveal previously unknown behaviors and help optimize approaches to city-wide applications or societal use cases. However, collecting and handling these massive datasets presents challenges in how to perform...
Cloud services allow one to perform intensive big data calculations without personally owning a sufficiently powerful machine. Different cloud-based virtual machines, however, offer different processor speeds at different costs, and the most cost-effective machine size may not always be obvious. We investigated different virtual machine sizes on the Microsoft Azure cloud service and also different...
In recent years, the exponential growth of data, its generation speed, and its expected consumption rate have presented one of the most important challenges in IT, both for industry and research. For these reasons, the ALOJA research project was created by BSC and Microsoft as an open initiative to increase cost-efficiency and the general understanding of Big Data systems via automation and learning...
Workflow makespan is the total execution time for running a workflow in the Cloud. The makespan depends significantly on how the workflow tasks and datasets are allocated and placed in a distributed computing environment such as the Cloud. Incorporating data and task allocation strategies to minimize makespan delivers significant benefits to scientific users in receiving their results in time...
The rapid convergence of big data and IoT (Internet of Things) technologies provides more opportunities in the area of road traffic applications. In this paper, we discuss a timeline visualization tool that enables us to better understand traffic behaviors from road traffic big data.
Healthcare applications typically require big data management as well as intensive computation. This is especially true of recently developed next-generation sequencing technology, which has increased interest in processing huge amounts of information in a timely fashion. In this paper, we focus on testing whether healthcare applications can scale well on commercial big data platforms that implement...
We propose a media storm indexing algorithm using Map-Reduce in our recently proposed CDVC framework. In this study, CDVC is built on Flink, an open-source platform for stream data processing. The question we answer is how to store massive image collections arriving, for instance, at over one million images per second and at a varying incoming rate. In our experiments with two benchmark datasets...
Cloud services are widely used across the globe to store and analyze Big Data. These days the news seems full of stories about security breaches of these services, resulting in the exposure of huge amounts of private data. This paper studies the current security threats to Cloud Services, Big Data, and Hadoop. The paper analyzes a newly proposed Big Data security system based on the EnCoRe system...
We are developing a new, holistic data management system for genomics, which uses cloud-based computing for querying thousands of heterogeneous genomic datasets. In our project, it is essential to leverage a modern cloud computing framework, so as to encode our query expressions into high-level operations provided by the framework. After releasing our first implementation using Pig and Hadoop...