Organisation of DNA
The organisation of DNA in a nucleus or nucleoid is critically dependent on the nature and topology of the nucleic acid. Nucleic acids may occur in double or single stranded forms, most typically dsDNA (double stranded DNA) and ssRNA (single stranded RNA). Single stranded pieces of nucleic acid can pair with themselves to form hairpin loops, cruciforms, internal loops and bulges. These are important in the termination of transcription and in the sites recognised by restriction enzymes (palindromic sequences pair readily with themselves). The secondary and tertiary structure of RNAs can give them catalytic activity.
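As a rough illustration of what 'palindromic' means for a nucleic acid, the sketch below checks whether a DNA sequence equals its own reverse complement, which is the property that lets a strand pair with itself. The function names are hypothetical, and the example sequences are not from the text: GAATTC and GGATCC are the familiar EcoRI and BamHI recognition sites, and GATTACA is an arbitrary non-palindromic control.

```python
# Watson-Crick partner of each DNA base.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def is_palindromic(seq: str) -> bool:
    """True if the sequence reads the same 5'->3' on both strands."""
    return seq.upper() == reverse_complement(seq)


# Illustrative sequences only (assumptions, not taken from the text).
for site in ("GAATTC", "GGATCC", "GATTACA"):
    print(site, is_palindromic(site))
```

A sequence that passes this test can, in single stranded form, fold back on itself to make a hairpin, or, in double stranded form, extrude a cruciform.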

Predicted 2° structure for human 5S rRNA. Note the internal loops where
pairing does not occur, two hairpin loops, and a large bulge in the
centre of a 3-way cruciform. Note also that it is actually rather
difficult to predict the folding of 5S rRNA with accuracy, as it has
unusual non-Watson-Crick basepairs, so this structure probably doesn't
correspond too marvellously with reality!
Nucleic acids also exist in at least three helical isomeric forms: A, B and Z.
- B form is typical of dsDNA: 10 bp per turn, 0.33 nm 'rise' in 'height' per base pair.
- A form is typical of dsRNA and DNA/RNA hybrids: 11 bp per turn.
- Z form is probably very rare. It is left handed, only has one groove, and is only stable under conditions of high ionic strength and/or in very GC rich DNA.
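The helical parameters in the list above translate directly into simple arithmetic. The following is a minimal sketch, using only the values quoted here (10 bp per turn and 0.33 nm rise for B form, 11 bp per turn for A form); the function names are made up for illustration, and no rise is assumed for A form because none is given.

```python
# Base pairs per helical turn, as quoted in the list above.
BP_PER_TURN = {"A": 11, "B": 10}

# Rise per base pair for B form, in nanometres (only B form is quoted).
B_RISE_NM = 0.33


def helical_turns(n_bp: float, form: str) -> float:
    """Number of helical turns made by n_bp base pairs in the given form."""
    return n_bp / BP_PER_TURN[form]


def b_form_length_nm(n_bp: float) -> float:
    """Contour length in nm of n_bp base pairs of B-form dsDNA."""
    return n_bp * B_RISE_NM


print(helical_turns(1000, "B"))   # turns in 1 kbp of B-form DNA
print(b_form_length_nm(1000))     # length of 1 kbp of B-form DNA in nm
```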
A-form in dsRNA duplex (left) from phenylalanine tRNA; A-form in an
RNA/DNA hybrid (middle) - note the wider, more closely packed helix,
with equal minor and major groove, compared to B-form dsDNA. B-form in
dsDNA duplex (right) - note the narrower, less closely packed helix
(compared to the short hybrid strand), with distinctly wider major
groove.
In A and (particularly) B forms, it is easier to distinguish bases in the major groove: in the minor groove, a C≡G cannot be distinguished from a G≡C, nor an A=T from a T=A.


RNA is capable of forming complex secondary and tertiary structures.

An RNA hairpin, showing the unusual base-pairing that can occur when the
regularity of a double helix is disturbed by bending.
The self-splicing group I intron is a catalytic RNA (ribozyme) with complex 2° and 3° structures. Secondary structures include A-helices and hairpin loops; tertiary structures include several hairpin-loop to bulge contacts.
Prokaryotic genome structure
Prokaryotic DNA is largely a simple loop of dsDNA, although some viral and other parasitic ssDNA, ssRNA, dsRNA, etc., may be present. The majority of the genome is present on a single supercoiled chromosome, which in Escherichia coli is about 5 Mbp in size.
Supercoiling adds (or removes) extra twists to the already coiled DNA. Negative supercoiling (underwinding) makes the duplex part more easily; negative supercoils are introduced by DNA gyrase. Most chromosomes are negatively supercoiled, particularly bacterial ones. Adding a single negative supercoil to a DNA molecule requires energy of the same magnitude as that required to separate one base pair. Supercoiling can therefore be used to part ('melt') the DNA double helix, and may be involved in the initiation of replication.
Prokaryotic DNA has associated proteins, but these are not structural; rather, they are proteins such as DNA polymerase, RNA polymerase, repressors and activators. A famous example is the catabolite activator protein, which under conditions of nutrient stress binds cyclic AMP, and then binds to the lac operon, promoting the expression of β-galactosidase (lactase).
Prokaryotes have polycistronic genes, with several related genes in a contiguous block of DNA, under the control of a single promoter. Eukaryotes have discrete genes, each with its own promoter. Polycistrons can be transcribed and translated simultaneously. This is impossible in eukaryotes because mature ribosomes are excluded from the nucleus, and mRNA requires extensive processing before it can be translated.
Eukaryotic genome structure
In eukaryotes, DNA is organised into many linear chromosomes, containing long stretches of dsDNA. Whereas prokaryotes contain nearly 100% 'useful' DNA, eukaryotes vary widely in the quantity of seemingly functionless DNA they contain. The minimum value of C for a given taxon (the size of the haploid genome in base pairs) is in rough approximation to the complexity of the organism. However, the maximum value is anything up to a thousand times greater than this number, particularly in plants, amphibia, insects and teleost fish. A housefly contains six times as much DNA as a fruitfly!
This extra DNA is mostly selfish, i.e. like all DNA, it exists only because it is good at replicating, and it can survive despite having no useful (or even having detrimental) phenotypic effects. This junk DNA falls into many classes:
- B-chromosomes. Entire spare chromosomes, often segregation distorters (which subvert the meiotic lottery of 50% survival, and get into more than their fair share of gametes).
- Introns. Most introns are junk inserted within genes. A few (such as those in the immune system genes) are responsible for generating variety in antibodies, and for producing membrane bound and cytosolic versions of the same protein, by the optional inclusion of an intron encoding a hydrophobic domain.
- Pseudogenes. 'Dead', non-functional copies of genes present elsewhere in the genome, but no longer of any use.
- Retropseudogenes. Like pseudogenes, but have been processed, i.e. lack introns. Produced by the action of reverse transcriptase (RT) on mRNA, and subsequent incorporation of the cDNA into the genome.
- Transposons. Jumping genes, which splice themselves in and out of the genome (in DNA form) randomly, by the action of transposase.
- Retrotransposons (such as LINEs). Transcribed into an mRNA, which encodes an RT enzyme, which then copies the mRNA back to DNA and incorporates it into the genome. e.g. LINE-1, 15% of the human genome.
- SINEs (such as Alu). These use the RT of other retro-elements to make copies of themselves. e.g. Alu, 10% of the human genome, a tiny pseudogene derived from the 7SL RNA gene. They are therefore hyperparasites: parasitic on the RT of another parasite.
- Microsatellites. Multiply by mispairing during replication of DNA. These can be found within genes, such as the huntingtin gene of Huntington's chorea, and are frequently CXG repeats.
- Retroviruses. Such as HIV. Retrotransposons are probably pseudoretroviruses, i.e. the dead remains of retroviruses that no longer code for protein coats.
- Segmental duplications. Wholesale copies of chunks of genome.
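Microsatellites of the kind described in the list above are easy to spot computationally, because they are just tandem runs of a short unit. The sketch below finds runs of a trinucleotide (CAG by default) using a regular expression. The function name, the `min_copies` threshold and the test sequence are all hypothetical choices made for illustration; the sequence is invented and is not part of any real gene.

```python
import re


def find_triplet_repeats(seq: str, unit: str = "CAG", min_copies: int = 3):
    """Return (start, copies) for every run of at least min_copies
    tandem copies of `unit` in `seq` (0-based start positions)."""
    # Builds a pattern such as (?:CAG){3,}
    pattern = re.compile(f"(?:{re.escape(unit)}){{{min_copies},}}")
    return [
        (match.start(), len(match.group()) // len(unit))
        for match in pattern.finditer(seq.upper())
    ]


# Invented sequence: five tandem CAGs, then a short run of only two.
toy = "ATG" + "CAG" * 5 + "TTT" + "CAG" * 2
print(find_triplet_repeats(toy))
```

Only the run of five copies is reported, because the run of two falls below the threshold.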
Some of the DNA is 'useful junk'. This sort of DNA is often highly repetitive and heterochromatic, such as:
- Centromeres. Allow the attachment of kinetochore and spindle during division.
- Telomeres. The ends of linear chromosomes.
Other types of DNA that are not transcribed, but which serve a useful 'purpose' are:
- Enhancers. Which allow fine tuning of RNA polymerase recruitment to genes.
- Multicopy genes. rDNA exists in tandem repeats which allows for the vast amount of transcription required for these essential genes. These also tend to multiply due to mispairings at meiosis, like microsatellites.
Unsurprisingly therefore, only 1.5% of the human genome actually codes for protein or functional RNA. Even the majority of an mRNA primary transcript is composed of introns, which are discarded.

Typical values for genome size in eukaryotes and prokaryotes:
| | Escherichia coli | Homo sapiens |
|---|---|---|
| Size of genome | 4.6 Mbp | 3.3 Gbp |
| Size of typical gene | 1 kbp | 10 kbp (4 exons : 1350 bp) |
| Size of typical polypeptide | 350 aa | 450 aa |
| Number of genes | 4 377 | c. 25 000 |
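The figures in this table allow a back-of-envelope estimate of how much of each genome is taken up by genes. The sketch below is only a rough calculation under stated assumptions: it treats the whole 1 kbp of a typical E. coli gene as coding, counts only the 1350 bp of exons for a typical human gene, and ignores everything else (functional RNA genes, regulatory DNA, overlap). The dictionary keys and function name are hypothetical.

```python
# Values copied from the table above; 'coding_bp_per_gene' is the assumption
# that 1 kbp (E. coli) or 1350 bp of exons (human) per gene is coding.
GENOMES = {
    "Escherichia coli": {"genome_bp": 4.6e6, "genes": 4377, "coding_bp_per_gene": 1000},
    "Homo sapiens": {"genome_bp": 3.3e9, "genes": 25000, "coding_bp_per_gene": 1350},
}


def coding_fraction(genome: dict) -> float:
    """Fraction of the genome occupied by coding sequence (rough estimate)."""
    return genome["genes"] * genome["coding_bp_per_gene"] / genome["genome_bp"]


for name, genome in GENOMES.items():
    print(f"{name}: {coding_fraction(genome):.1%}")
```

On these assumptions E. coli comes out at roughly 95% and humans at roughly 1%, which is the same order as the figures quoted in the text.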
Packaging of DNA
Homo sapiens has 3 Gbp per (haploid) genome, so there is c. 2 m of DNA per diploid nucleus (c. 5 µm in radius); however, a mitotic chromosome is only around 2 µm long. This is a pretty impressive feat of packaging, especially when you consider that the volume of DNA in a nucleus is about 6 µm³, and the nucleus itself is only 500 µm³: the DNA takes up about 1% of the available volume, yet it is readily accessible and untangled.
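The numbers in this paragraph can be checked with a few lines of arithmetic. The sketch below assumes the 0.33 nm rise per base pair quoted earlier for B-form DNA, and models the DNA as a cylinder 2 nm in diameter; that diameter is an assumption introduced here, not a figure from the text. The 500 µm³ nuclear volume is taken from the paragraph above, and the function names are hypothetical.

```python
import math

HAPLOID_GENOME_BP = 3e9       # from the text
RISE_NM = 0.33                # B-form rise per base pair, from the text
DNA_DIAMETER_NM = 2.0         # assumption: approximate width of the double helix
NUCLEUS_VOLUME_UM3 = 500.0    # from the text


def diploid_dna_length_m() -> float:
    """Total contour length of DNA in a diploid nucleus, in metres."""
    return 2 * HAPLOID_GENOME_BP * RISE_NM * 1e-9


def diploid_dna_volume_um3() -> float:
    """Volume of that DNA, treated as a long thin cylinder, in cubic micrometres."""
    radius_um = (DNA_DIAMETER_NM / 2) * 1e-3
    length_um = diploid_dna_length_m() * 1e6
    return math.pi * radius_um ** 2 * length_um


length = diploid_dna_length_m()
volume = diploid_dna_volume_um3()
print(f"length: {length:.2f} m")
print(f"volume: {volume:.1f} um^3")
print(f"fraction of nucleus: {volume / NUCLEUS_VOLUME_UM3:.1%}")
```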
In eukaryotes (and archaea) only, the initial packaging of DNA is done by histones. There are five highly conserved types, with only 2 amino acid differences between the sequences of H3 in cow and pea.
- H1 - 220 amino acids long.
- H2a, H2b, H3, H4 - c. 120 amino acids long.
Histones are highly basic (they contain a lot of arginine (R) and lysine (K)) so they can bind readily to DNA, which is an acid. All the core histones possess a common tertiary structure, the 'histone fold': a Z-shaped helix-turn-helix-turn-helix motif.

A nucleosome contains eight histone proteins, forming a core around
which a double turn of dsDNA is wound.
Histones bind DNA into nucleosomes, which are little beads of histones wrapped by DNA. Each nucleosome has a core of two each of H2a, H2b, H3 and H4, and the DNA wraps round this octamer twice. Nucleosomes are generally spaced every c. 200 bp of DNA in eukaryotes: 146 bp are bound to the histone core, with a 54 bp linker between the nucleosomes. An average eukaryote gene therefore spans c. 50 nucleosomes. The '11 nm fibre' (otherwise known as 'beads on a string' chromatin) that is formed by the DNA/histone complex reduces the length to about one third that of the unpackaged DNA.
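The nucleosome spacing quoted above gives the '50 nucleosomes per gene' figure directly. This is a minimal sketch of that arithmetic, using the 146 bp core and 54 bp linker from the text and the 10 kbp typical gene from the table earlier; the function names are hypothetical.

```python
CORE_BP = 146     # DNA wrapped around the histone octamer (from the text)
LINKER_BP = 54    # linker DNA between nucleosomes (from the text)
REPEAT_BP = CORE_BP + LINKER_BP


def nucleosome_count(length_bp: float) -> float:
    """Approximate number of nucleosomes along length_bp of chromatin."""
    return length_bp / REPEAT_BP


def core_bound_fraction() -> float:
    """Fraction of DNA wrapped on histone cores rather than in linkers."""
    return CORE_BP / REPEAT_BP


print(nucleosome_count(10_000))   # a typical 10 kbp eukaryotic gene
print(nucleosome_count(6e9))      # a diploid human genome (2 x 3 Gbp)
print(core_bound_fraction())
```

The last figure shows that roughly three quarters of the DNA is in contact with a histone core at any one time.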
As the DNA bound to the nucleosome is less available to proteins than the linker region, this means that packaging of DNA may help regulate transcription and DNA replication. However, most nucleosomes seem to be randomly arranged on DNA and do not show much evidence of specific placement. In rare cases (such as the 5S rRNA gene), the nucleosomes seem to be in phase (i.e. in exactly the same place on the DNA) across different cells. However, the DNA has to be bent quite violently around the nucleosome core, so this seems to be due to A=T pairs (which are easier to bend) being preferred in the minor groove (indicated by the curly braces below).

Most DNA seems to be bound to the nucleosome during transcription, leading one to wonder how RNA polymerase negotiates the twists and turns of the DNA around the histones without falling off:
- Does the nucleosome divide into two halves?
- Is it transiently knocked off?
- Are some of the histones removed?
- Is it pushed along the DNA?
- Is it modified in some more subtle way?

The jury is still out, but it seems 'subtle modification' may be the answer. Histones have 'tails' which are able to interact with other proteins and DNA. These tails can also be acetylated, methylated and/or phosphorylated. Acetylation by histone acetyltransferases reduces binding of tails to DNA, and allows easier access to the DNA by RNA polymerase. This modification can be very specific: gene silencing is caused by methylation of lysine K9 on histone H3. In addition to modification of the histones, chromatin remodelling 'machines' (large protein complexes) also alter the binding of DNA to the nucleosome core. The fact that nucleosomal DNA with cross-linked histones can still be transcribed indicates that the modification is mostly to DNA topology and supercoiling, not to the histone core itself.
The 11 nm fibre is only the first level in DNA packaging. It is itself coiled to form a higher order structure, the 30 nm fibre, whose length is about one twelfth that of the 11 nm fibre within it.
The most recent model of the 30 nm fibre has a double helix of nucleosomes with the DNA zigzagging between them.
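The compaction ratios quoted so far (about one third for the 11 nm fibre, and a further one twelfth for the 30 nm fibre) can be chained to see how short a stretch of DNA becomes. The sketch below applies them to a typical 10 kbp gene, assuming the 0.33 nm rise per base pair given earlier for B-form DNA; the names are hypothetical and the ratios are the approximate ones from the text.

```python
RISE_NM = 0.33               # B-form rise per base pair (from the text)
FIBRE_11NM_RATIO = 1 / 3     # 11 nm fibre length relative to naked DNA
FIBRE_30NM_RATIO = 1 / 12    # 30 nm fibre length relative to the 11 nm fibre


def packed_lengths_nm(n_bp: float) -> dict:
    """Length in nm of n_bp of DNA at each of the first levels of packaging."""
    naked = n_bp * RISE_NM
    fibre_11 = naked * FIBRE_11NM_RATIO
    fibre_30 = fibre_11 * FIBRE_30NM_RATIO
    return {"naked DNA": naked, "11 nm fibre": fibre_11, "30 nm fibre": fibre_30}


for level, length in packed_lengths_nm(10_000).items():
    print(f"{level}: {length:.0f} nm")
```

Taken together, the two steps shorten the DNA about 36-fold.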

Histone H1 binds the 11 nm fibre to make the 30 nm fibre. It appears to alter the direction of exit of DNA from the nucleosome, and 'locks it off'. The change of direction allows the nucleosomes to pack more closely, condensing the 11 nm fibre into the 30 nm fibre. H1 exists as at least 7 isotypes (including H5, which is only expressed in certain tissues).
The binding of H1 reduces gene expression, probably because it slows chromatin remodelling. So called 'active' DNA has a low affinity for H1. In SV40 such active DNA includes the origin of replication, where proteins other than histones need to bind.
Active DNA can be identified because it is more readily degraded by DNase. This can be observed by DNA footprinting, where fragments of a radiolabelled DNA sequence are generated by DNase in the presence or absence of DNA-binding protein. A footprint is formed because the protein protects the DNA it binds to from degradation. DNA hypersensitive to DNases can often be shown to be actively transcribed.
Bands are produced by using DNase to cut randomly a DNA molecule labelled at one end with radioactive ³²P. If the DNase is present only at low concentrations, fragments of every possible length will be produced, which will show up on an autoradiograph of an electrophoresis gel as shown in the upper diagram. If part of the DNA is protected by a protein as in the lower diagram, a patch of bands will be missing.
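The logic of a footprint can be mimicked with a toy model. In the sketch below, an end-labelled molecule is cut once at each accessible position, and each cut leaves a labelled fragment whose length equals the cut position; positions covered by a bound protein are never cut, so those bands vanish. This is a deliberately simplified illustration (single cuts, uniform cutting, a made-up 10 bp molecule), and the function name and `protected` parameter are hypothetical.

```python
def footprint_bands(dna_length: int, protected=None) -> list:
    """Lengths of labelled fragments seen when end-labelled DNA of
    dna_length bp is cut once at every accessible position.

    `protected` is an optional (first, last) pair of cut positions,
    inclusive, that are shielded by a bound protein.
    """
    bands = []
    for cut in range(1, dna_length):
        if protected is not None and protected[0] <= cut <= protected[1]:
            continue  # the protein blocks DNase here, so no band appears
        bands.append(cut)
    return bands


print(footprint_bands(10))                    # naked DNA: a full ladder
print(footprint_bands(10, protected=(4, 6)))  # bound protein: a gap in the ladder
```

The gap in the second ladder is the 'footprint' of the protein.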
The 30 nm fibre is then packed into looped domains. The 30 nm fibre is looped around a central proteinaceous nuclear scaffold, which contains enzymes such as topoisomerase II, and ribonucleoproteins. Each looped domain is c. 300 kbp long, and contains c. ten eukaryotic genes. It is possible that each domain forms some sort of functional unit. Prokaryotes also seem to have looped domains (of DNA, rather than chromatin).

Looped domains can be seen clearly in certain unusual interphase chromosomes, such as lamp-brush chromosomes in salamander eggs. The giant polytene chromosomes of Drosophila larva salivary glands show 'puffing', where looped domains decondense to allow gene expression.
The chromatin in eukaryotes exists in two main forms:
- Euchromatin, which is uncondensed, lightly staining and (relatively) active, with looped domains of twisted 30 nm fibres.
- Heterochromatin, which is highly condensed, heavily staining and inactive, with the looped domains seeming to be 'zipped up' by protein. Barr bodies (the unneeded extra X-chromosome in female human cells) are a good example of an entirely heterochromatic chromosome.
Telomeres are also heterochromatic. This probably helps prevent damage to the loose (and single-stranded) ends of the chromosome. In yeast (at least), the telomere is bound into a specific structure called the telosome, which lacks nucleosomes. Rap and Sir proteins prevent the ssDNA at the telomere from being degraded.

Unsurprisingly, the other major feature of a eukaryotic chromosome, the centromere, is also heterochromatic. This is probably to withstand stresses from the kinetochore during mitosis. 15 nucleosomes are precisely placed around the centromere sequence in yeast, and these nucleosomes also contain a modified histone H3 (called CENP-A). The centromere in humans may form spontaneously on chromosome fragments (microdissected chromosomes may spontaneously form new centromeres), although specific 'α satellite DNA' sequences appear to promote centromere formation.
During mitosis, the interphase chromosomes condense even further. Phosphorylation of five serine residues in the H1 protein appears to promote this. Condensin proteins then hydrolyse ATP to make the looped domains condense into chromatids and bind ribonucleoproteins on their surfaces. The exact nature of higher order packaging is still not clear: rosettes and coils and other higher order structures have been proposed, but none have been well attested yet.

The highest order packaging of DNA is the synaptonemal complex, which binds together two chromosomes during meiosis and is responsible for crossing over. Lateral elements (cohesin proteins) bind two sister chromatids as a single chromosome, and central elements bind two homologous chromosomes and carry recombination nodules.
Summary
- Nucleic acids are not bland helices: they may have strandedness, supercoiling, 2°/3° structure and helical isomers.
- Prokaryotic DNA is not packaged by histones.
- DNA in eukaryotes is packaged by histones into an 11 nm fibre, by H1 into a 30 nm fibre, then by proteins into looped domains and higher order structures.
- Access to RNA polymerase is limited by packaging: acetylation of histones and modification of DNA topology by remodelling machines are needed to allow transcription.
- Telomere, centromere and some other DNA is highly compacted into transcriptionally inactive heterochromatin.
Test yourself
- How do transcriptionally active and inactive euchromatin differ?
- How does telomeric heterochromatin differ from euchromatin?
- What is the approximate length of 60 000 µm of DNA (i.e. about one chromosome's worth) at the following levels of packing: 11 nm fibre, 30 nm fibre, looped domain, metaphase chromosome?
- Prove that a diploid human cell contains 2 m of DNA.
- Compare and contrast prokaryotic and eukaryotic gene structures.
- When is junk DNA useful?
Bibliography
- Alberts, B., et al. (2002). Molecular biology of the cell. 4th edition. Garland Science, New York. 13-28. "The diversity of genomes and the Tree of Life"
- Alberts, B., et al. (2002). Molecular biology of the cell. 4th edition. Garland Science, New York. 192-197. "The structure and function of DNA"
- Alberts, B., et al. (2002). Molecular biology of the cell. 4th edition. Garland Science, New York. 198-216. "Chromosomal DNA and its packaging in the chromatin fiber"
- Alberts, B., et al. (2002). Molecular biology of the cell. 4th edition. Garland Science, New York. 216-233. "The global structure of chromosomes"
- Berg, J. M., Tymoczko, J. L. and Stryer, L. (2006). Biochemistry. 6th edition. W. H. Freeman and Company, New York. 111-117. "A pair of nucleic acid chains with complementary sequences can form a double-helical structure"
- Hayes, J. J., Pruss, D. and Wolffe, A. P. (1994). Contacts of the globular domain of histone H5 and core histones with DNA in a "chromatosome". Proceedings of the National Academy of Sciences USA 91:7817-7821. http://dx.doi.org/10.1073/pnas.91.16.7817
- Pereira, S. L., et al. (1997). Archaeal nucleosomes. Proceedings of the National Academy of Sciences USA 94:12633-12637. http://dx.doi.org/10.1073/pnas.94.23.12633
- Schalch, T., et al. (2005). X-ray structure of a tetranucleosome and its implications for the chromatin fibre. Nature 436:138-141. http://dx.doi.org/10.1038/nature03686



