Projects in Process-Aware Information Systems (PAIS) for medical scientific research have as requirements all sorts of data coming from many interdependent processes. More concretely, in an autopsy study, data are collected from two main interdependent processes: 1) material collection and processing; and 2) interview. These processes are the groundwork to optimize the association between biological...
Multilingual support in global applications that integrate and filter social media data is a significant challenge due to the cost of manually developing such social media filters for each language. Using LITMUS landslide information system as an experimental platform, we compared six design alternatives with varied combinations of manually developed filters and automatically translated filters for...
High reliability, efficient I/O performance and flexible consistency provided with low storage cost are all desirable properties of cloud storage systems. Due to the inherent conflicts, however, simultaneously achieving optimum on all these properties is impractical. N-way Replication and Erasure Coding, two extensively-applied storage schemes with high reliability, adopt opposite and unbalanced strategies...
With the development of next-generation sequencing (NGS), DNA/RNA sequencing has become cheaper and more efficient. Today, a whole human genome can be sequenced for under $1,000, providing opportunities for large-scale bioinformatic analysis on big datasets. However, most existing bioinformatic analysis tools are programmed for single-server computing platforms and are not suitable to process such...
With the growing popularity of smart things and the omnipresence of wireless communications, the Internet continues to evolve from the Internet of hosts, to the Internet of People, to the Internet of Things (IoT). Intelligent and scalable orchestration of massive IoT objects using a multitier architecture is critical to embrace the vision of IoT. This paper presents our vision and our initial...
Soft resource allocation is an important factor of system configuration which plays a critical role in guaranteeing the performance of multi-tier web service systems. There is a tradeoff between real-time performance and resource consumption, and thus the real-time adjustment of soft resource allocation in response to dynamic workload is quite challenging. In this paper, we propose a real-time soft...
LITMUS is a real-time online and openly accessible service that collects high quality information on landslide events from social media. This service uses disaster related keywords, such as "landslide" and "mudslide", to analyze messages posted by English speaking users. However, comprehensive coverage of disasters must include multilingual support as there are events that are...
Modern distributed systems are often considered to be black boxes that greatly limit the potential to understand behaviors at the level of detail necessary to diagnose some of the most important types of performance problems. Recently researchers have found abnormal response time delays, one to two orders of magnitude longer than the average response time, that exist in short periods and cause economic...
Performance analysis is crucial to the successful development of the cloud computing paradigm. It is especially important for a cloud computing center serving parallelizable application jobs, because determining a proper degree of parallelism can reduce the mean service response time and thus improve the performance of cloud computing. In this paper, taking the cloud based rendering service...
The scalability of n-tier systems relies on effective load balancing to distribute load among the servers of the same tier. We found that load balancing mechanisms (and some policies) in servers used in typical n-tier systems (e.g., Apache and Tomcat) have issues of instability when very long response time (VLRT) requests appear due to millibottlenecks, very short bottlenecks that last only tens to...
Long-tail latency of web-facing applications continues to be a serious problem. Most of the previously published research addresses two classes of long latency problems: uneven workloads such as web search, and resource saturation in single nodes. We describe an experimental study of a third class of long tail latency problems that are specific to distributed systems: Cross-Tier Queue Overflow (CTQO)...
In multi-tier cloud service systems, performance evaluation relies on numerous experiments in order to collect key metrics such as resource usage. This approach can be highly time-consuming in practice. In this paper, we propose an automated framework for performance tracking, data management and analysis to minimize human intervention in multi-tier cloud service systems. The framework support...
In this paper we propose and evaluate three approaches for automated classification of texts in over 60 languages without the need for a manually annotated dataset in those languages. All approaches are based on the randomized Explicit Semantic Analysis method using multilingual Wikipedia articles as their knowledge repository. We evaluate the proposed approaches by classifying a Twitter dataset in...
The performance of n-tier web-facing applications often suffers from the response time long-tail problem. With relatively low resource utilization (less than 50%) and the majority of requests returning within a few milliseconds, a non-negligible number of normally short requests may take seconds to return. We propose the millibottleneck theory of performance bugs (that lead to long-tail problems). Several...
We study the problem of using Social Media to detect natural disasters, of which we are interested in a special kind, namely landslides. Employing information from Social Media presents unique research challenges, as there exists a considerable amount of noise due to multiple meanings of the search keywords, such as "landslide" and "mudslide". To tackle these challenges, we propose...
Modern world data come from an increasing number of sources, including data from physical sensors like weather satellites and seismographs as well as social networks and weblogs. While progress has been made in the filtering of individual social networks, there are significant advantages in the integration of big data from multiple sources. For physical events, the integration of physical sensors...
Rumors may cause undesirable effects such as widespread panic in the general public. In particular, with the unprecedented growth of different types of social and enterprise networks, rumors can reach a larger audience than before. Many researchers have proposed different approaches to analyze and detect rumors in social networks. However, most of them either study on theoretical models...
Distributed data stream processing has become an increasingly popular computational framework due to many emerging applications which require real-time processing of data such as dynamic content delivery and security event analysis. These distributed data stream processing applications are often run on shared, multi-tenant clusters as companies try to consolidate from dedicated clusters for each application...
Recently, more and more enterprises have adopted hybrid cloud strategies to simultaneously enjoy the security of on-premise clouds and the low cost of public clouds. The key challenge of hybrid clouds, though, stems from the difficulty of specifying where the data should be stored and where the information could flow efficiently. In order to meet security concerns and performance requirements, we...
Modern datacenters employ server virtualization and consolidation to reduce the cost of operation and to maximize profit. However, interference among consolidated virtual machines (VMs) has barred mission-critical applications due to unpredictable performance. Through extensive measurements of RUBBoS n-tier benchmark, we found a major source of performance unpredictability: the memory thrashing caused...