File transfers between the decentralized storage sites over dedicated wide-area connections are becoming increasingly important in high-performance computing and big data scenarios. Designing such scientific workflows for large file transfers is extremely challenging as they depend on the file, I/O, host, and local- and wide-area network subsystems, and their interactions. To gain insights into file-transfer...
Distributed computing platforms provide a robust mechanism to perform large-scale computations by splitting the task and data among multiple locations, possibly located thousands of miles apart geographically. Although such distribution of resources can lead to benefits, it also comes with its associated problems such as rampant duplication of file transfers increasing congestion, long job completion...
We study the discursive practices of politicians and journalists on social media. For this we need more annotated data than we currently have but the annotation process is time-consuming and costly. In this paper we examine machine learning methods for automatically annotating unseen tweets based on a small set of manually annotated tweets. For improving the performance of the learner, we focus on methods...
The Internet of Things (IoT) describes the emerging paradigm that connects sensors, often located at the edge of the network, to stream processing engines located at the core of the network to enable online data-driven monitoring, management, and control. As IoT applications require increasing volumes of streaming data to be processed by complex workflows in a timely manner, it is becoming important...
With the growth of online services, a large number of files have been generated by users or by the services themselves. To make it easier to serve users with different network environments and devices, online services usually keep different versions of the same file in various sizes. For users with high-speed networks and top-of-the-line displays, a large file with high precision can be supplied...
Resource selection and task placement for distributed execution pose conceptual and implementation difficulties. Although resource selection and task placement are at the core of many tools and workflow systems, the methods are ad hoc rather than being based on models. Consequently, partial and non-interoperable implementations proliferate. We address both the conceptual and implementation difficulties...
Deep learning has emerged as a successful approach for diagnostic imaging. Along with the increasing demands for dental healthcare, the automation of diagnostic imaging is increasingly desired in the field of orthodontics for many reasons (e.g., remote assessment, cost reduction, etc.). However, orthodontic diagnoses generally require dental and medical scientists to diagnose a patient...
The pace of discovery in eScience is increasingly dependent on a scientist’s ability to acquire, curate, integrate, analyze, and share large and diverse collections of data. It is all too common for investigators to spend inordinate amounts of time developing ad hoc procedures to manage their data. In previous work, we presented DERIVA, a Scientific Asset Management System, designed to...
In the collections of natural history, mounted skeletons are among the most complex objects. They are composed of hundreds of different bones, tedious to digitize accurately in 3D because many surfaces remain hidden to the scanning device. A group of researchers from Pierre et Marie Curie (Paris 6) and Grenoble Universities teamed up with researchers from the National Museum of Natural History in...
The emergence of New Data Sources (NDS) in healthcare is revolutionising traditional electronic health records in terms of data availability, storage, and access. Increasingly, clinicians are using NDS to build a virtual holistic image of a patient's health condition. This research is focused on a review and analysis of the current legislation and privacy rules available for healthcare professionals...
This paper presents the design and prototyping of hardware and software to address the problem of rapid and reliable 3D digitization of very large collections of pinned insects. Using the collection at the Field Museum of Natural History (FMNH) as a use case, a pipeline to ingest the entire collection of 4.5 million specimens in circa 1-2 years imposes a limit of a few seconds on average processing time...
In this paper, we describe the application TaRDIS, a visual analytics system for spatial and temporal data designed for the needs of archaeo-related disciplines that supports domain experts in analyzing their data. The temporal data is visualized in the form of an interactive Harris Matrix that illustrates the temporal position of the layers. The 2D and 3D visualization sketches the spatial position of...
Sequence comparison is a fundamental task in computational biology, traditionally dominated by alignment-based methods such as the Smith-Waterman and Needleman-Wunsch algorithms, or by alignment-based heuristics such as BLAST, the ubiquitous Basic Local Alignment Search Tool. For more than a decade researchers have examined a range of alignment-free alternatives to these approaches, citing concerns...
The computing systems used by LHC experiments have historically consisted of the federation of hundreds to thousands of distributed resources, ranging from small to mid-size resources. In spite of the impressive scale of the existing distributed computing solutions, the federation of small to mid-size resources will be insufficient to meet projected future demands. This paper is a case study of how...
Coral reefs are of global economic and biological significance but are subject to increasing threats. As a result, it is essential to understand the risk of coral reef ecosystem collapse and to develop an assessment process for those ecosystems. The International Union for Conservation of Nature (IUCN) Red List of Ecosystems (RLE) is a framework to assess the vulnerability of an ecosystem. Importantly,...
Over the last decade, the development of a range of Next Generation Sequencing (NGS) technologies has led to an enormous increase in the size of the data sets available in molecular biology. The scale of these data presents new challenges for researchers, and visualisation is widely regarded as an essential tool for exploration and detailed analysis of candidate relationships. Inevitably, there are...
Significant increases in computational resources have enabled the development of more complex and spatially better resolved weather and climate models. As a result, the amount of output generated by data assimilation systems and by weather and climate simulations is rapidly increasing, e.g. due to higher spatial resolution, more realisations, and higher-frequency data. However, while compute performance...
We present a novel computational framework that connects Blue Waters, the NSF-supported, leadership-class supercomputer operated by NCSA, to the Laser Interferometer Gravitational-Wave Observatory (LIGO) Data Grid via Open Science Grid technology. To enable this computational infrastructure, we configured, for the first time, a LIGO Data Grid Tier-1 Center that can submit heterogeneous LIGO workflows...
Presents the introductory welcome message from the conference proceedings. May include the conference officers' congratulations to all involved with the conference event and publication of the proceedings record.