Distributed gene expression modelling for exploring variability in epigenetic function

David M. Budden; Edmund J. Crampin

doi:10.1186/s12859-016-1313-1

Source

BMC Bioinformatics, 2016, 17(1), 1-8

Abstract

Background

Predictive gene expression modelling is an important tool in computational biology due to the volume of high-throughput sequencing data generated by recent consortia. However, the scope of previous studies has been restricted to a small set of cell-lines or experimental conditions due to an inability to leverage distributed processing architectures for large, sharded data-sets.

Results

We present a distributed implementation of gene expression modelling using the MapReduce paradigm and prove that performance improves as a linear function of available processor cores. We then leverage the computational efficiency of this framework to explore the variability of epigenetic function across fifty histone modification data-sets from a variety of cancerous and non-cancerous cell-lines.
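As an illustration of how a regression-style expression model can be fitted in a MapReduce style, the sketch below assumes an ordinary least-squares model of expression on per-gene histone modification signals; the abstract does not state which model or framework the authors used. Each shard is mapped to its sufficient statistics (X^T X and X^T y), which are then summed in the reduce step and solved once. All function names and the toy data are hypothetical.

```python
from functools import reduce


def map_shard(shard):
    """Map step: summarise one shard of (features, expression) rows as
    the least-squares sufficient statistics (X^T X, X^T y)."""
    p = len(shard[0][0]) + 1  # +1 for the intercept column
    xtx = [[0.0] * p for _ in range(p)]
    xty = [0.0] * p
    for features, expression in shard:
        row = [1.0] + list(features)
        for i in range(p):
            xty[i] += row[i] * expression
            for j in range(p):
                xtx[i][j] += row[i] * row[j]
    return xtx, xty


def reduce_stats(a, b):
    """Reduce step: element-wise sum of two partial statistics."""
    xtx = [[u + v for u, v in zip(ra, rb)] for ra, rb in zip(a[0], b[0])]
    xty = [u + v for u, v in zip(a[1], b[1])]
    return xtx, xty


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_distributed(shards):
    """Fit [intercept, coefficient_1, ...] from sharded data."""
    xtx, xty = reduce(reduce_stats, map(map_shard, shards))
    return solve(xtx, xty)


# Toy data (hypothetical): two histone-mark signals per gene, with
# expression generated as 0.5 + 1.5 * mark_1 + 2.0 * mark_2.
shards = [
    [((1.0, 2.0), 6.0), ((2.0, 0.0), 3.5)],
    [((0.0, 1.0), 2.5), ((3.0, 3.0), 11.0)],
]
coefficients = fit_distributed(shards)
```

Because the map step touches each shard independently and the reduce step is a plain sum, the shards can in principle be processed on separate cores or machines; this sketch runs them sequentially and makes no claim about the scaling reported above.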

Conclusions

We demonstrate that the genome-wide relationships between histone modifications and mRNA transcription are lineage-, tissue- and karyotype-invariant, and that models trained on matched -omics data from non-cancerous cell-lines are able to predict cancerous expression with equivalent genome-wide fidelity.

Identifiers

Journal e-ISSN: 1471-2105
DOI: 10.1186/s12859-016-1313-1

Authors

David M. Budden

Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, USA
Melbourne School of Engineering, the University of Melbourne, Systems Biology Laboratory, Parkville, Australia

Edmund J. Crampin

Melbourne School of Engineering, the University of Melbourne, Systems Biology Laboratory, Parkville, Australia
ARC Centre of Excellence in Convergent Bio-Nano Science and Technology, Parkville, Australia
Department of Mathematics and Statistics, the University of Melbourne, Parkville, Australia
School of Medicine, the University of Melbourne, Parkville, Australia

Keywords

Gene expression; Epigenetics; Histone modifications; MapReduce

Additional information

Publication languages: English

Data set: Springer

Publisher

BioMed Central

