Summary.
EhV-86 is a large double-stranded DNA virus with a 407,339 base pair circular genome that infects the globally important microalga Emiliania huxleyi. It belongs to a new genus of viruses termed the Coccolithoviridae within the algal virus family Phycodnaviridae. By plotting the EhV-86 genome against itself in a dot-plot analysis, we identified three distinct families of repeat sequences in the genome, designated Families A, B and C. Family A repeats are non-coding, are found immediately upstream of 86 predicted coding sequences (CDSs), and are likely to play a crucial role in controlling the expression of the associated CDSs. Family B repeats are GC-rich and coding, and correspond to possible calcium-binding sites in 22 proline-rich domains found in the protein products of eight predicted EhV-86 CDSs. Family C repeats are AT-rich and non-coding, and are likely to form part of the origin of replication. We suggest that these repeat regions are of fundamental importance during virus propagation, being involved in transcriptional control (Family A), virus adsorption/release (Family B) and DNA replication (Family C).
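The self-against-self dot-plot mentioned above can be illustrated with a minimal sketch. It is not the software or the settings used in this study: the function name `self_dotplot`, the word size `k` and the toy sequence are assumptions chosen for illustration, and the sketch ignores reverse-complement matches and the circularity of the genome. A dot is recorded at (i, j) whenever the words of length k starting at positions i and j are identical, so a repeated region shows up as a diagonal run of dots away from the main diagonal.

```python
from collections import defaultdict


def self_dotplot(seq, k=5):
    """Return sorted (i, j) pairs, i < j, whose k-letter words are identical.

    Illustrative only: forward strand, linear sequence, exact matches.
    """
    # Index every word of length k by the positions at which it starts.
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)

    # Every pair of start positions sharing a word is one off-diagonal dot.
    dots = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                dots.append((starts[a], starts[b]))
    return sorted(dots)


# Hypothetical toy sequence: the motif GATTACA occurs twice, 10 bases apart.
toy = "GATTACACCCGATTACA"
dots = self_dotplot(toy, k=5)
print(dots)  # [(0, 10), (1, 11), (2, 12)]
```

The three consecutive dots share the offset j - i = 10, which is how a repeat appears as a short diagonal line in a dot-plot. On a genome of several hundred kilobases a larger word size would normally be chosen to suppress chance matches.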