The adoption of Big Data technologies can potentially boost the scalability of data-driven biology and health workflows by orders of magnitude. Consider, for instance, that technologies in the Hadoop ecosystem have been successfully used in data-driven industries to scale their processes to levels much larger than any biological- or health-driven work attempted thus far. In this work we demonstrate the...
Inflammatory Bowel Disease (IBD) is an autoimmune condition that is observed to be associated with major alterations in the gut microbiome taxonomic composition. Here we classify major changes in microbiome protein family abundances between healthy subjects and IBD patients. We use machine learning to analyze results obtained previously from computing relative abundance of ∼10,000 KEGG orthologous...
This paper proposes an intercloud brokerage method for system infrastructure deployments of genomic big data analytics workflows. The proposed method uses a conjunction of universally quantified atomic formulas to describe requirements given by users, and selects combinations of cloud services based on logical reasoning by the replacement of definite clause sets created from conjunction of the...
The Polystore architecture revisits the federated approach to accessing and querying standalone, independent databases in a uniform and optimized fashion, but this time in the context of heterogeneous data and specialized analyses. In light of this architectural philosophy, and of the major data architecture development efforts at the US Department of Veterans Administration (VA),...
Assigning global unique persistent identifiers (GUPIs) to datasets has the goal of improving their accessibility and simplifying how they are referenced and reused. However, as repositories receive more, and more complex, data, attesting to the identity of datasets attached to persistent identifiers over time is becoming more challenging. This is due to the nature of scientific research data, which is generated...
We introduce GraphFlow, a big graph framework that is able to encode complex data science experiments as a set of high-level workflows. GraphFlow combines the Spark big data processing platform and the Galaxy workflow management system to offer a set of components for graph processing using a novel interaction model for creating and using complex workflows. GraphFlow contributes an easy-to-use interface...
We propose a novel iterative unified clustering algorithm for data with both continuous and categorical variables, in the big data environment. Clustering is a well-studied problem and finds several applications. However, none of the big data clustering works discuss the challenge of mixed attribute datasets, with both categorical and continuous attributes. We study an application in the health care...
Privacy-preserving string search is a crucial task for analyzing genomics-driven big data. In this work, we propose a cryptographic protocol that uses Fully Homomorphic Encryption (FHE) to enable a client to search on a genome sequence database without leaking his/her query to the server. Though FHE supports both addition and multiplication over encrypted data, random noise inside ciphertexts grows...