The work of Rosalind Franklin, and then of Watson and Crick [1], established the architecture of deoxyribonucleic acid (DNA), the carrier of all genetic information. The idea that DNA is structurally organised as a double helix comprising two antiparallel and complementary polymer chains was one of the great scientific discoveries of the twentieth century. It revealed not only the way in which genetic information is stored, but also the mechanism by which the genetic code is read, and the way this code can be faithfully copied from one cell to another during cell division.
The structural organisation of genomic DNA varies significantly from one organism to another, or from one cell to another, since it depends on the physiological constraints specific to each organism or tissue. This complexity can be observed in particular in the diversity of genomic sequences: the human genome comprises roughly 3 gigabases for about 30,000 genes, whereas yeast, a lower eukaryotic organism, possesses only 6,200 genes for a size of 13 megabases (see Table 1.1). The fraction of protein-coding sequences is also highly variable (1.4% for the human genome, 68% for the yeast genome), as is the size of the genes. Particularly interesting is the variation in the content of G+C bases, which determines the overall stability of the DNA helices. Sequences rich in G+C bases are involved in the key processes regulating gene expression, and probably play a dominant role in dynamical processes. An important point is the possibility of methylating cytosines, especially those in CpG sequences, a crucial process in the control of gene expression. The presence of alternating sequences of GC base pairs, associated with the methylation of the cytosines in these sequences, favours in particular the transition from the B to the Z conformation (see below). Within a given genome, the G+C content can vary significantly, reaching 80% in some regions of mammalian genomes, and there seems to be a correlation between the GC base content (especially GCs3) and the gene density in the relevant region.
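The quantities discussed above (overall G+C content, its variation along a genome, and the number of CpG sites) can be made concrete with a short computation. The following is only a minimal sketch in Python, not a method taken from the text: the function names, the toy sequence, and the window and step sizes are all hypothetical choices made for illustration, and real analyses would use much longer sequences and windows.

```python
from typing import List


def gc_fraction(seq: str) -> float:
    """Fraction of G and C among the A, C, G, T bases of seq (0.0 if none)."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / counted


def cpg_count(seq: str) -> int:
    """Number of CpG dinucleotides, i.e. a C immediately followed by a G."""
    return seq.upper().count("CG")


def windowed_gc(seq: str, window: int, step: int) -> List[float]:
    """G+C fraction in successive windows, to expose regional variation."""
    return [
        gc_fraction(seq[start:start + window])
        for start in range(0, len(seq) - window + 1, step)
    ]


# Hypothetical toy sequence: an A+T-rich stretch followed by alternating GC.
toy = "ATATATATGCGCGCGC"
overall = gc_fraction(toy)          # 0.5
regional = windowed_gc(toy, 8, 8)   # [0.0, 1.0]
cpg_sites = cpg_count(toy)          # 3
```

In this toy case the overall G+C content of 50% hides two regions of 0% and 100%, which is the kind of within-genome variation described above, and the alternating GC stretch is where all the CpG sites are found.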