Data curation is critical for scientific data digitization, sharing, integration, and use. This paper presents Kurator, a software package for automating data curation pipelines in the Kepler scientific workflow system. Several curation tools and services are integrated into this package as actors to enable construction of workflows to perform and document various data curation tasks. The integration...
Scientific workflow systems are used to integrate existing software components (actors) into larger analysis pipelines to perform in silico experiments. Current approaches for handling data in nested-collection structures, as required in many scientific domains, lead to many record-management actors (shims) that make the workflow structure overly complex, and as a consequence hard to construct, evolve...
Scientific collaboration increasingly involves data sharing between separate groups. We consider a scenario where data products of scientific workflows are published and then used by other researchers as inputs to their workflows. For proper interpretation, shared data must be complemented by descriptive metadata. We focus on provenance traces, a prime example of such metadata which describes the...
Environmental data arriving constantly from satellites and weather stations are used to compute weather coefficients that are essential for agriculture and viticulture. For example, the reference evapotranspiration (ET0) coefficient, overlaid on regional maps, is provided each day by the California Department of Water Resources to local farmers and turf managers to plan daily water use. Scaling out...
XML process networks are a simple, yet powerful programming paradigm for loosely coupled, coarse-grained dataflow applications such as data-centric scientific workflows. We describe a framework called Delta-XML that is well-suited for applications in which pipelines of data processors modify parts ("deltas") of XML data collections while keeping the overall collection structure intact. We...
Although the processing of data streams has been the focus of many research efforts in several areas, the case of remotely sensed streams in scientific contexts has received little attention. We present an extensible architecture to compose streaming image processing pipelines spanning multiple nodes on a network using a scientific workflow approach. This architecture includes (i) a mechanism for...
Data-centric scientific workflows are often modeled as dataflow process networks. The simplicity of the dataflow framework facilitates workflow design, analysis, and optimization. However, modeling "control-flow intensive" tasks using dataflow constructs often leads to overly complicated workflows that are hard to comprehend, reuse, and maintain. We describe a generic framework, based on...