Genetics

Basic genetic concepts

Genetics is the study of the inheritance of characteristics. The things that influence an organism's characteristics are external factors, such as the environment, internal factors, which are called genes, and their interactions. Genes are made of DNA, or RNA in some viruses.

A gene may exist in one of several forms, called alleles, but only one allele may be present at a single place in the genome (a locus, pl. loci). A gene will affect a character (or several characters), and each of its alleles may have different effects on this character. The sum total of all the alleles an organism possesses at its loci is called its genotype. The resulting characteristics you can directly observe are called its phenotype. Diploid organisms, such as humans and fruit flies, have two copies of each chromosome (chromosomes are huge linear DNA molecules containing many genes). One copy is inherited from each parent, so humans, fruit flies and all other diploids have two loci at which alleles of a particular gene may reside. Consequently, diploid organisms may be homozygous for a given gene (i.e. have the same allele at both loci, 'AA' or 'aa'), or heterozygous (different alleles at the two loci, 'Aa').

Drosophila melanogaster, the fruit fly beloved of the geneticist.

Mendelian genetics and crosses

If a population of sexual organisms is repeatedly inbred (crossed with itself), variation will be lost until only one allele exists at every locus in every organism in the population (the allele becomes 'fixed'). Such inbred (parental or 'P') populations, which are very nearly clonal, are exceedingly useful for genetic experiments, as they will be homozygous at almost all loci. If two such populations are created, one homozygous for an allele at a particular locus (AA), the other homozygous for another allele at the same locus (aa), crosses between them will yield useful results, which may be analysed to determine how the genes and characteristics are inherited. Parental populations may differ from each other at just one locus, or at many (AABBCCdd and AAbbccdd differ at the B and C loci). When parental organisms are crossed, the offspring will be heterozygous at every locus at which the parents differ (AABBCCdd × AAbbccdd → AABbCcdd). The offspring of this cross are called the first filial generation, or F1. Crossing the F1 with itself (selfing it), or crossing the F1 back to a multiply recessive parent (test crossing it), generates an F2 generation, which is also useful for genetic analysis.

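If we consider the inheritance of just one gene, e.g. one determining the height of a plant, the F1 will all have the same genotype and phenotype (AA (tall) × aa (short) → Aa (tall)). The phenotype of the F1 is very important, because it shows how the alleles relate to each other: here the heterozygous F1 is tall, so the A (tall) allele is dominant to the a (short) allele.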

When the F1 are selfed, the genes from the parents will recombine with each other, and may produce a characteristic ratio. If only one locus is 'segregating' (i.e. if we're only looking at one locus and one character, as with the A/a locus and tall/short phenotype above), the F2 phenotypes will usually have a 1:2:1 or 3:1 ratio. This can be demonstrated by a Punnett square:

 

                               Gametes from second parent
                               A (p = 0.5)       a (p = 0.5)
Gametes from first parent
  A (p = 0.5)                  AA (p = 0.25)     Aa (p = 0.25)
  a (p = 0.5)                  Aa (p = 0.25)     aa (p = 0.25)

Since an Aa individual will produce haploid sperm and eggs containing just an A or an a allele in equal proportions (p = 0.5), then, all things being equal, the offspring will be produced in a 1 AA : 2 Aa : 1 aa ratio.

If A is dominant to a, AA and Aa phenotypes will be indistinguishable, and a 3:1 ratio occurs. If Aa organisms are distinguishable from both parents (e.g. if there is partial dominance, and Aa are intermediate between AA and aa), a 1:2:1 ratio will result. These ratios are called monohybrid ratios, because only one gene ('mono') is segregating.
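The monohybrid ratios can be checked by enumerating the Punnett square directly. The following is a minimal Python sketch, not part of the original text; the function name `monohybrid_f2` and the tall/short labels are illustrative assumptions, and complete dominance of A over a is assumed for the phenotype count.

```python
from collections import Counter
from itertools import product

def monohybrid_f2(parent1="Aa", parent2="Aa"):
    """Count offspring genotypes from every equally likely pairing of gametes."""
    # Each parent passes on one of its two alleles with probability 0.5.
    # Sorting puts the upper-case allele first, so "aA" is recorded as "Aa".
    return Counter("".join(sorted(g1 + g2)) for g1, g2 in product(parent1, parent2))

genotypes = monohybrid_f2()

# Phenotypes, assuming A is completely dominant to a.
phenotypes = Counter("tall" if "A" in g else "short" for g in genotypes.elements())

print(dict(genotypes))   # 1 AA : 2 Aa : 1 aa
print(dict(phenotypes))  # 3 tall : 1 short
```

With partial dominance the three genotypes would be distinguishable, so the genotype counts themselves would give the 1:2:1 phenotypic ratio.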

However, if two loci are segregating, various dihybrid ratios may occur in the F2:

 

                               Gametes from second parent
                               AB        Ab        aB        ab
Gametes from first parent
  AB                           AABB      AABb      AaBB      AaBb
  Ab                           AABb      AAbb      AaBb      Aabb
  aB                           AaBB      AaBb      aaBB      aaBb
  ab                           AaBb      Aabb      aaBb      aabb

The gametes AB, aB, Ab and ab are produced in a 1:1:1:1 ratio. In the simple case, where A and B are dominant to a and b respectively, a dihybrid ratio of:

9 A-B- : 3 A-bb : 3 aaB- : 1 aabb

occurs. The - represents either allele: if A is dominant to a, it doesn't matter what the second allele is if the organism has an A allele already: since the Aa and AA genotypes are indistinguishable phenotypically, they may be pooled for the purposes of counting. If there is incomplete dominance, other ratios will occur, but we will not go into these. The 9:3:3:1 ratio shows a number of things about the two genes: firstly they are independently assorted i.e. they are inherited separately (a gamete possessing A is no more likely than chance to also have B), and secondly, there is no interaction between the loci, i.e. A and B do not interact to give some interesting change to the phenotype. If these two assumptions are violated, other ratios occur.
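The 9:3:3:1 ratio can be reproduced by enumerating the sixteen cells of the dihybrid square. This is a small Python sketch under the two assumptions just stated (independent assortment and no interaction between loci), plus complete dominance at both loci; the helper names `gametes` and `phenotype_class` are made up for illustration.

```python
from collections import Counter
from itertools import product

def gametes(genotype):
    """Gametes of a two-locus genotype such as 'AaBb', assuming independent assortment."""
    locus_a, locus_b = genotype[:2], genotype[2:]
    return [a + b for a, b in product(locus_a, locus_b)]

def phenotype_class(g1, g2):
    """Phenotype class such as 'A-B-', assuming complete dominance at both loci."""
    a = "A-" if "A" in (g1[0], g2[0]) else "aa"
    b = "B-" if "B" in (g1[1], g2[1]) else "bb"
    return a + b

f1 = "AaBb"
# Selfing the F1: every gamete pairs with every gamete with equal probability.
f2 = Counter(phenotype_class(g1, g2) for g1, g2 in product(gametes(f1), repeat=2))
print(dict(f2))  # 9 A-B- : 3 A-bb : 3 aaB- : 1 aabb
```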

When an F1 is backcrossed to a multiply recessive homozygous inbred line in a 'test cross':

AaBb × aabb → 1 AaBb : 1 aaBb : 1 Aabb : 1 aabb

the typical ratio is 1:1:1:1 when there are two genes segregating (it is 1:1 for a single gene, i.e. Aa × aa → 1 Aa : 1 aa). Deviations from this ratio are also caused by violations of the two assumptions mentioned above in the dihybrid cross. Test crosses to a multiply recessive parent are useful because they show the ratio of gametes produced in the F1 during meiosis. Deviations from the F2 dihybrid ratio are more difficult to analyse.

Linkage and epistasis

If the assumption of independent assortment (Mendel's second law) is violated, the loci are said to show linkage. Such linkage occurs because the loci are close together on the same chromosome (they are both close and syntenic). If genes are on different chromosomes (nonsyntenic), they will be inherited independently, because the chromosomes are segregated into gametes during meiosis completely independently of each other. Likewise, syntenic genes on the same chromosome that are far from each other will almost always be recombined by crossing over (the swapping of bits of parental chromosomes during meiosis), and will segregate independently too. However, the closer two loci are to each other, the less likely a crossover becomes, and hence, the gametes produced will not be in a simple 1:1:1:1 ratio.

If for example, a plant with genotype AaBb is test crossed to aabb, the offspring will be 1 AaBb: 1 aaBb : 1 Aabb : 1 aabb if there is no linkage, as shown above. However, if there is linkage, there will be fewer recombinant offspring. For the test cross, we will show the actual associations of the genes : a slash will divide the genes inherited from the maternal and paternal parents:

P: AB / AB × ab / ab

F1: AB / ab

Gametes produced by the F1 will mostly have chromosomes still carrying the parental gene combinations (AB and ab). Only a few Ab and aB gametes will be produced by recombination. If we then do a test cross (all the gametes from the test cross parent are ab), the majority of the F2 will be either AaBb or aabb, with only a few Aabb and aaBb progeny. Say we have the actual numbers: 24 and 28 offspring in the two parental classes (AaBb and aabb), and 1 and 3 in the two recombinant classes (Aabb and aaBb).

The thing to notice is that because A and B are tightly linked (as are a and b), they tend to be inherited together, as an intact unit, down the generations. Consequently, there will be an excess of the parental arrangement of the genes in the gametes of the F1, and hence an excess of AaBb and aabb organisms in the test cross progeny. The excess can be tested for using a χ² test, to see how far the observed ratio deviates from the expected 1:1:1:1 ratio. The recombination frequency is defined as the number of recombinants divided by the total number of organisms in the test cross progeny. Here it is (1 + 3) ⁄ (24 + 1 + 3 + 28) = 4 ⁄ 56 ≈ 7%. Recombination frequencies are a measure of how far apart two genes are on a chromosome: nearby loci have low recombination frequencies. A map of all the loci on a chromosome can be constructed from such test crosses (particularly those involving three loci, a 3-point test cross, rather than the two here). A genetic map is simply a line with the distances between genes marked in recombination frequencies:

<-----4%-------><----------7%------------->

<--------------------10.5%-------------------->

A-------------------B--------------------------C

Note that AB + BC (4% + 7% = 11%) is slightly more than AC (10.5%): these recombination frequencies show good, but not perfect, additivity, because of undetected multiple crossovers between the loci.
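The recombination frequency and the χ² test against a 1:1:1:1 expectation can both be worked through for the test-cross counts quoted above (24 and 28 in the parental classes, 1 and 3 in the recombinant classes). The Python below is an illustrative sketch only; the function names are assumptions, and the 5% critical value for three degrees of freedom (7.81) is taken from standard χ² tables rather than from the text.

```python
def recombination_frequency(parental_counts, recombinant_counts):
    """Recombinants divided by the total number of test-cross progeny."""
    recombinants = sum(recombinant_counts)
    total = recombinants + sum(parental_counts)
    return recombinants / total

def chi_square_equal_classes(observed):
    """Chi-square statistic against an expectation of equal numbers in every class."""
    expected = sum(observed) / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)

parental = [24, 28]     # AaBb and aabb progeny
recombinant = [1, 3]    # Aabb and aaBb progeny

rf = recombination_frequency(parental, recombinant)
chi2 = chi_square_equal_classes(parental + recombinant)

print(f"recombination frequency = {rf:.1%}")
print(f"chi-square = {chi2:.2f} with 3 degrees of freedom")
# Standard tables give 7.81 as the 5% critical value for 3 degrees of freedom.
print("linkage indicated" if chi2 > 7.81 else "no evidence of linkage")
```

The statistic is far above the critical value, so these counts are not an acceptable fit to 1:1:1:1, which is consistent with tight linkage.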

A special sort of linkage is sex linkage. Genes carried on the non-sex chromosomes (autosomes) show autosomal linkage, as described above. Genes carried on the X and Y chromosomes, which determine sex in mammals (XX is female and XY is male; it is the opposite way round in birds), behave unusually because the Y chromosome carries almost no genes: a male is effectively haploid for the genes carried on the X. Hence if a female carrier of the recessive gene for colour blindness (c) mates with a non-colour-blind man, the following results:

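X^C X^c mated with X^C Y gives offspring in a 1:1:1:1 ratio of: X^C X^C (non-colour-blind female), X^C X^c (carrier female), X^C Y (non-colour-blind male) and X^c Y (colour-blind male).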

A typical 3:1 ratio results (3 non-colour-blind, including carriers : 1 sufferer), but note that all sufferers from such crosses are male. Several other diseases are sex-linked, including haemophilia and Duchenne muscular dystrophy.
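The colour-blindness cross can be enumerated in the same way as an autosomal cross, keeping track of which gamete carries the Y. This is a hedged sketch: the string labels "XC", "Xc" and "Y" and the function `describe` are notation invented for the example, and the Y is assumed to carry no allele of the gene.

```python
from collections import Counter
from itertools import product

# "XC" = X carrying the normal allele, "Xc" = X carrying the recessive
# colour-blindness allele, "Y" = Y chromosome (assumed to carry no allele).
MOTHER_GAMETES = ["XC", "Xc"]   # carrier mother
FATHER_GAMETES = ["XC", "Y"]    # non-colour-blind father

def describe(egg, sperm):
    """Return (sex, phenotype) for one egg/sperm combination."""
    sex = "male" if sperm == "Y" else "female"
    x_chromosomes = [g for g in (egg, sperm) if g != "Y"]
    if "XC" not in x_chromosomes:
        phenotype = "colour-blind"      # no normal allele present
    elif "Xc" in x_chromosomes:
        phenotype = "carrier"           # one normal and one recessive allele
    else:
        phenotype = "non-colour-blind"
    return sex, phenotype

offspring = Counter(describe(e, s) for e, s in product(MOTHER_GAMETES, FATHER_GAMETES))
for (sex, phenotype), count in sorted(offspring.items()):
    print(f"{count} {phenotype} {sex}")
```

Each of the four classes appears once, and the only colour-blind class is male, as described above.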

The second reason for deviations from 9:3:3:1 and 1:1:1:1 is gene interactions between loci. If a gene P is required to make a colourless precursor chemical into a purple pigment, and a further gene, R, is required to turn the purple pigment red, the relevant dihybrid F2 ratio is:

9 P-R- (red) : 3 ppR- (colourless) : 3 P-rr (purple) : 1 pprr (colourless)

Because the R allele can only do its job if the P allele is also present, we get a modified 9:3:3:1 ratio, here a 9:4:3 ratio. Other ratios are possible depending on the exact interactions. The phenomenon is called epistasis, and in this case P is said to show recessive epistasis over R, because if two p alleles are present, it doesn't matter what is at locus R, the colour will be lacking. Notice that linkage characteristically modifies a simple dihybrid a:b:c:d Mendelian ratio away from the expected values; epistasis characteristically merges one or more of the character classes to produce a:b:c or a:c ratios. The main sorts of gene interaction are:

To determine whether a ratio deviates from the one expected, we can use a χ² test.

χ² = ∑ (O − E)² ⁄ E

That is, the sum (for each character class) of the observed minus the expected, all squared, divided by the expected. If we have a sample size of 40 peas, and we're expecting a 3:1 ratio of green to yellow skins, we expect to get 40 × ¾ = 30 green, and 40 × ¼ = 10 yellow. If we actually get 33 green and 7 yellow, the χ² is

χ² = ∑ (O − E)² ⁄ E = (33 − 30)² ⁄ 30 + (7 − 10)² ⁄ 10 = 0.3 + 0.9 = 1.2

This value of 1.2 has one degree of freedom (this is generally the number of categories minus one, i.e. 2 (yellow/green) − 1), and is not significant at p = 0.05 since it is smaller than the value from tables, 3.84. This means that 33 green to 7 yellow is an acceptable fit to a 3:1 ratio.
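The same arithmetic can be written as a short function that takes observed counts and an expected ratio. This is a minimal Python sketch; the function name `chi_square` is an assumption, and the critical value 3.84 is simply the tabulated figure quoted above, hard-coded rather than computed.

```python
def chi_square(observed, expected_ratio):
    """Sum of (O - E)^2 / E over all character classes."""
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    # Scale the expected ratio up to the observed sample size.
    expected = [total * r / ratio_sum for r in expected_ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

CRITICAL_5_PERCENT_1_DF = 3.84  # from chi-square tables, as quoted in the text

stat = chi_square([33, 7], [3, 1])   # 33 green, 7 yellow against 3:1
print(f"chi-square = {stat:.2f}")
if stat > CRITICAL_5_PERCENT_1_DF:
    print("significant deviation from 3:1")
else:
    print("acceptable fit to 3:1")
```

For more than two classes the critical value must be looked up for the appropriate number of degrees of freedom (classes minus one).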

Quantitative genetics

Most of what has so far been discussed is qualitative genetics: comparing red and blue flowers, or dwarf and giant beans, or colourblind and colour vision. However, what about continuous characteristics like human height, where (the qualitative exceptions of dwarves and acromegalic giants aside), height is continuously distributed? Such characteristics are often under the influence of many genes (sometimes called polygenes), each having a small, usually additive effect. Such effects are often well modelled by the following:

AAB′B′CC (10 cm) × A′A′BBCC (10 cm) → AA′B′BCC (10 cm)

Each primed (′) allele gives 5 cm of height. If the F1 were selfed, a large number of different offspring would be produced, some of whom would inherit A′A′B′B′CC, and so be taller than either parent at 20 cm, and some would be just AABBCC (at 0 cm). This is called transgression: offspring more extreme than either parent, in either direction. The opposite is regression, where two extreme individuals are crossed and produce less extreme offspring. Because quantitative features like this are common and easy to select for in breeding programs, a special measurement called narrow-sense heritability (h²) is calculated for them.

h² = (variation from additive genetic effects) ⁄ (total variation from additive genetics, dominance genetics, environment and their interactions)

Heritability is used in breeding programs for improving crops and livestock, since it can be used to predict the characteristics of the offspring of crosses between different breeds. Broad-sense heritability (H²) includes both additive and dominance effects, and may be thought of as a measure of how strong 'nature' is versus 'nurture'. It is less useful in crop breeding programs because the effects of dominance are more difficult to exploit. In humans, heritability is often estimated using twin studies: observing how similar identical twins raised in different environments are.
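The additive model used above (each primed allele adding 5 cm, with the F1 heterozygous at the A and B loci) can be enumerated to show the spread of heights expected when the F1 is selfed. This is only a sketch in Python: the function name is hypothetical, the loci are assumed to assort independently, and environmental variation is ignored.

```python
from collections import Counter
from itertools import product

CM_PER_PRIMED_ALLELE = 5

def f2_height_distribution(heterozygous_loci=2):
    """Heights of F2 offspring from selfing an F1 heterozygous at the given loci.

    Loci that are identical in both parents (such as CC) contribute no
    primed alleles and are therefore left out of the enumeration.
    """
    # A gamete carries either the unprimed (0) or primed (1) allele at each locus.
    gametes = list(product([0, 1], repeat=heterozygous_loci))
    heights = Counter()
    for egg, pollen in product(gametes, repeat=2):
        primed = sum(egg) + sum(pollen)
        heights[primed * CM_PER_PRIMED_ALLELE] += 1
    return heights

dist = f2_height_distribution()
total = sum(dist.values())
for height in sorted(dist):
    print(f"{height:2d} cm: {dist[height]}/{total}")
```

Under these assumptions the F2 heights fall in a 1:4:6:4:1 ratio from 0 cm to 20 cm, so 2 of every 16 offspring are more extreme than either 10 cm parent, which is the transgression described above.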

Mutation

The change of a gene from one allele to another can occur in a vast number of ways and be caused by a vast number of physical (radiation), chemical (mutagen) and biochemical (base tautomerism) mechanisms. This damage may be classified by the scale at which it occurs: a single base, a portion of a chromosome, or the whole genome.

At the level of a single base of DNA, the possible mutations are a substitution (e.g. a change from A to G) or a frameshift (caused by an indel, short for insertion or deletion), which is the loss or addition of one or two bases. Substitutions may be missense mutations (changing an amino acid in the specified protein for another), or nonsense mutations (changing a codon to a stop codon, and causing premature termination of protein synthesis). Frameshifts change the reading frame of the gene, rendering all that follows the mutation gibberish.
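The difference between a substitution and a frameshift is easy to see by splitting a sequence into codons before and after the change. The Python below is a small illustration; the 15-base sequence is entirely made up, and `codons` is a hypothetical helper that reads triplets from the first base.

```python
def codons(sequence):
    """Split a DNA sequence into complete triplets, reading from the first base."""
    usable = len(sequence) - len(sequence) % 3   # drop any incomplete final codon
    return [sequence[i:i + 3] for i in range(0, usable, 3)]

original = "ATGGCCAAGTTTGGA"                      # hypothetical coding sequence
substituted = original[:4] + "A" + original[5:]   # C -> A substitution at index 4
deleted = original[:4] + original[5:]             # deletion of the base at index 4

print("original:    ", codons(original))
print("substitution:", codons(substituted))   # only the second codon differs
print("deletion:    ", codons(deleted))       # every codon after the first differs
```

The substitution alters a single codon, whereas the single-base deletion shifts the reading frame so that every downstream codon is changed.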

Ethidium bromide is both a mutagen and a useful tool: it binds to DNA and makes it fluorescent.

At the level of portions of chromosome, large deletions, duplications, inversions (the 'turning around' of sections of chromosome), translocations (movement of a portion of chromosome to another), deficiencies (loss of the end of a chromosome), fusion and fission of entire chromosomes, and various aberrant chromosomes are possible.

At the whole genome level, duplications of the entire set of chromosomes (polyploidy) may occur, and the loss or duplication of one or more individual chromosomes (aneuploidy) also occurs. In humans, a number of syndromes are due to chromosome aberrations: Down syndrome (trisomy 21: three copies of this chromosome, a case of aneuploidy; also translocation of part of 21 to one of the other chromosomes), Turner syndrome (X0: only one X chromosome), Klinefelter syndrome (XXY), and others. Polyploidy in plants is rather common, and often results in larger, more vigorous plants, although fertility is often impaired.

Ancient polyploidy events (palaeopolyploidy) can be detected in many plant (and some animal) lineages.

Mutation contributes to variation in a population's gene pool. μ is the symbol given to a particular gene's mutation rate, or to the total mutation rate (in mutations per organism per generation).

Selection, drift and Hardy-Weinberg

Selection is the loss of certain alleles from a gene pool based on their fitness (i.e. how many copies of themselves they leave to future generations). This may occur naturally or artificially. Selection may be stabilising (loss of those organisms with extreme phenotypes), disruptive (loss of those with typical phenotypes, i.e. survival of extreme individuals only), or directional (like disruptive, but in one direction only). Selection tends to use up variation in a population's gene pool. It is symbolised by s, the proportion of organisms carrying the allele that fail to reproduce, over and above the proportion for a reference allele. An allele with s = 1 is therefore lethal.

Not all change is selective. Random, non-selective change in a gene's frequency, which is particularly marked in small populations, is termed genetic drift. Drift, selection and other factors may be tested for using the Hardy-Weinberg equilibrium model. In a sexually reproducing population, the gamete pool for a locus A/a will consist of the gamete A at a frequency p, and gamete a at a frequency q (so that p + q = 1). Consequently, the frequency of homo- and heterozygotes in the next generation will be:

AA : p²     Aa : 2pq     aa : q²

if matings occur at random. If the combinations are not at these values, it may be inferred that either there is considerable drift (if the population is small, many A gametes may die by pure bad luck), or selection against a particular phenotype is occurring (aa individuals die in the womb), or mutation rates are particularly high and biased, or matings do not occur at random (AA individuals seek out other AA individuals, or all AA individuals are at one end of the species' range).
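A comparison of observed genotype counts with Hardy-Weinberg expectations might be sketched as follows. This is illustrative Python only: the function names are assumptions and the genotype counts are invented for the example, not data from the text.

```python
def allele_frequency(n_AA, n_Aa, n_aa):
    """Frequency p of allele A, counting two alleles per diploid individual."""
    return (2 * n_AA + n_Aa) / (2 * (n_AA + n_Aa + n_aa))

def hardy_weinberg(p):
    """Expected genotype frequencies p^2, 2pq and q^2 under random mating."""
    q = 1 - p
    return {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}

# Hypothetical genotype counts, invented for illustration.
observed = {"AA": 360, "Aa": 480, "aa": 160}
n = sum(observed.values())

p = allele_frequency(observed["AA"], observed["Aa"], observed["aa"])
expected = {g: f * n for g, f in hardy_weinberg(p).items()}

for genotype in observed:
    print(f"{genotype}: observed {observed[genotype]}, expected {expected[genotype]:.1f}")
```

If observed and expected counts differ, a χ² test can be used to judge whether the difference is larger than chance alone would explain, before invoking drift, selection, mutation or non-random mating.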

DNA recombination and repair

DNA recombination occurs during meiosis, and is basically an elaborated form of double-stranded damage repair, which may well be its molecular origin (and therefore the origin of, if not the reason for, sex).

The initial step in DNA recombination involves making a single-strand nick in one of the chromatids involved in the crossover event. DNA polymerase enters the gap, and synthesises a new strand of DNA, displacing the old strand as it goes. This old strand then invades the duplex of a neighbouring chromatid, forming asymmetric hybrid DNA with one strand, and causing a displacement loop in the other. The displaced DNA in the invaded chromatid is nicked at its base, and degraded. The end of this nick is then joined to the end of the newly synthesised strand on the other chromatid, making an IXI shape called a Holliday junction, which is then able to migrate downwards for some distance, creating symmetric hybrid DNA.

After some length, the X is broken and rejoined. The other two strands are then broken and rejoined, and the crossover is complete. The hybrid DNAs will contain mismatched basepairs, which may be corrected, leading to the phenomenon of gene conversion.

DNA can be damaged in a number of ways. Ionising radiation, chemical mutagens (of the intercalating variety, which cause frameshifts, or of the base analogue variety, which cause substitutions), retroviruses, transposons, and UV light can all cause damage. An inability to fix such damage can lead to diseases such as cancer, and cancer-predisposing syndromes such as xeroderma pigmentosum (faulty in the dithymidine dimer correction pathway) and ataxia telangiectasia (similar to XP, but less well understood), and so on.

DNA may be repaired in a number of ways. Some enzymes are capable of catalysing the reverse of the reaction that caused the damage (e.g. for dithymidine dimers); other enzymes are capable of nicking DNA at a mutation, and repairing the damaged strand by synthesising a new strand complementary to the undamaged one. In diploids, it is possible for double-strand damage to be corrected by recombination with the undamaged copy from the other 'half' of the genome. All these require the cell to be able to recognise damage: some damage causes unusual DNA topology (dithymidine dimers, mispairings and intercalations), but some is not so obvious.

Test yourself

  1. What is the difference between the terms allele, locus and gene?
  2. Humans are diploid and have only two copies of each gene, allowing them to have one of three genotypes AA, Aa or aa if only two alleles can be present at the locus. How many combinations of two alleles could a hexaploid species such as wheat have?
  3. Explain the sex ratio in human beings based on the existence of a single sex-determining locus, with two alleles (X and Y), for which heterozygous XY are male, homozygous XX are female, and homozygous YY is lethal.
  4. What is the expected ratio in a test cross between a multiply recessive strain, and an F1 multiple heterozygote produced by crossing a pea with round (RR), green (yy) seeds with a pea having shrunken (rr), yellow (YY) seeds?
  5. If the genes were closely linked, which of the character classes would be present in excess, and which would there be fewer of?
  6. In maize, the kernels on a corn-cob may have many different phenotypes. They may be coloured purple (under the influence of an allele at the colour locus, which we will term C), or colourless (c). Due to their starch and sugar content, when dry, they may appear smooth (Sh at the 'shrunken' locus) or shrunken and wrinkly (sh). The data below are kernel-counts from a test cross between a multiply heterozygous F1 and a multiply recessive (sh/sh,c/c) test-cross parent. Are the data indicative of linkage? Prove this using χ2, and calculate the distance between the genes in recombination units. Were the P generation (sh/sh,c/c) and (Sh/Sh,C/C) (a coupling cross) or (sh/sh,C/C) and (Sh/Sh,c/c) (a repulsion cross)?

    Phenotype               Count
    Coloured, smooth        432
    Colourless, smooth      23
    Coloured, wrinkled      14
    Colourless, wrinkled    459

  7. Are two very intelligent people more or less likely to produce children more intelligent than themselves, than two people of average intelligence, if IQ is assumed to be under highly-heritable polygenic control?
  8. How heritable is the number of legs a human being has?
  9. Why are foeti (foetuses) with sex chromosome aberrations, such as Klinefelter syndrome (XXY) and Turner syndrome (X0), viable, whereas all autosomal aberrations except Down syndrome (trisomy-21), are generally lethal before, or shortly after, birth?
  10. Cystic fibrosis is caused by a recessive mutation in an autosomal gene that regulates chloride transport in the intestinal epithelium. Carriers (Cf/cf) compose about 2.96% of the population, sufferers (cf/cf) about 0.00093%. Is this population at Hardy-Weinberg equilibrium? If not, what reasons can you suggest for it not being at equilibrium?
  11. Why is the deamination of 5-methylcytosine to thymine far more likely to lead to mutation in DNA than the deamination of cytosine to uracil?
