Contents
- Manipulating cells.
- Protein purification.
- Protein assay.
- Determining protein and gene sequences.
- Manipulating genes.
Manipulating cells
This page will concentrate on a long-term experiment that tries to find out where DNA polymerase is located in the cell, and how it gets there. We will do this from first principles, just as you would have to if you were working on an entirely novel protein, and explain the techniques required at each stage.
We would like to know how DNA polymerase gets into the nucleus. Questions of where things are located in the cell are the domain of cytochemistry, which investigates what is present in a cell. Cytochemical investigation can answer questions such as "how many cells do we have?" and "what is in them?". Simple staining with dyes can be used to locate some chemicals: haematoxylin and methylene blue bind DNA, RNA, acidic proteins, etc. However, it is rare to find stains specific for a single protein like DNA polymerase. Consequently, we need to rely on molecular biological techniques to manipulate protein and DNA to investigate their localisation and properties.
The first thought for finding DNA polymerase in a cell might be to use fluorescent antibodies against it. Just like immunogold staining, we can use primary antibodies raised to the protein in e.g. a mouse, then use secondary anti-mouse antibodies raised in e.g. a rabbit to stain these. Rather than colloidal gold, we can tag the anti-mouse antibodies with a fluorophore like rhodamine or fluorescein. However, we need lots of purified protein to raise an antibody. If we don't have that, we'll need to make some…
Proteins are usually investigated in model organisms, whose biochemistry and physiology are well understood, and which reproduce rapidly:
- Escherichia coli.
- Saccharomyces cerevisiae (yeast).
- HeLa human carcinoma cell line. HeLa cells are named after Henrietta Lacks, an American woman who died of a particularly invasive cervical cancer. This cell line was derived from this carcinoma.
Cells are cultured in growth media. Chemically defined media (made from exactly specified ingredients) can be used for yeasts and bacteria; but protein growth factors are required for mammalian cells, so undefined 'serum' is often added for human cell lines. Culture media will usually contain:
- Source of carbon and energy: 10 mM glucose.
- Source of nitrogen: 5 mM glutamine.
- Salts: 50 mM NaCl, 5 mM KCl, 1 mM CaCl2, MgCl2, Na2HPO4, NaH2PO4, 1 µM Fe, Zn, Se, Cu salts.
- Buffer (pH 7.4): 30 mM NaHCO3, phenol red indicator, 5% CO2 atmosphere.
- Essential amino acids (0.2 mM each; mnemonic "Pvt. Tim Hall"): phenylalanine, valine, tryptophan, threonine, isoleucine, methionine/cysteine, histidine, arginine, leucine, lysine.
- Vitamins (1 µM each): thiamin, riboflavin, nicotinamide, pantothenate, pyridoxal, biotin, folate, cobalamin, choline, inositol.
- Antibiotics (0.1%): streptomycin, penicillin.
- Growth factors and protease inhibitors: insulin, EGF, FGF, etc., and/or whole serum.
Proteins may be secreted extracellularly, but DNA polymerase is not. Consequently, we must disrupt the cells containing it to release it. The method used to break open a cell depends on its robustness. Animal cells have only a weak plasmalemma (plasma membrane), which can be easily burst using osmotic stress or sonication. Plants, bacteria and fungi (including yeasts) have a tougher cell wall, which requires enzymatic digestion, or bead-beating, or even the use of a French press, which squeezes the cells at high pressure through a tiny hole to make them explode. PMSF (a protease inhibitor), protamine sulfate (which precipitates nucleic acids), detergent (which disrupts membranes) and dithiothreitol (a reducing agent) may be added at this stage.
The cell lysate contains a vast array of subcellular debris. Centrifugation separates items by their speed of sedimentation by using a rapidly rotating rotor to pellet-down particles (e.g. organelles) from suspensions (e.g. culture medium).

Simple velocity sedimentation spins particles down according to how fast, and for how long, the centrifuge is run. Density gradient centrifugation (using gradients of CsCl or sucrose solutions) allows particles to come to equilibrium according to their buoyancy in the gradient. The speed of sedimentation is measured in Svedberg units (the rate of sedimentation scaled for the radial acceleration of the rotor): this is the 'S' in 70S ribosome, etc.
Typical centrifugation speeds and times are:
- 1000 × g for 10 min : eukaryotic cells, nuclei, cytoskeletons.
- 10 000 × g for 20 min : mitochondria, bacteria.
- 100 000 × g for 60 min : vesicles.
- 150 000 × g for 3 hr : macromolecules.
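The speeds in the list above are quoted in multiples of g, but a centrifuge is usually set in revolutions per minute. The conversion depends on the rotor radius, so the sketch below assumes a hypothetical 8 cm rotor; the function names are illustrative only.

```python
import math

STANDARD_GRAVITY = 9.80665  # m/s^2


def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (multiples of g) at a given speed and radius."""
    omega = 2 * math.pi * rpm / 60.0            # angular velocity in rad/s
    accel = omega ** 2 * (radius_cm / 100.0)    # centripetal acceleration in m/s^2
    return accel / STANDARD_GRAVITY


def rpm_for_rcf(rcf: float, radius_cm: float) -> float:
    """Rotor speed (rpm) needed to reach a target multiple of g."""
    omega = math.sqrt(rcf * STANDARD_GRAVITY / (radius_cm / 100.0))
    return omega * 60.0 / (2 * math.pi)


if __name__ == "__main__":
    radius_cm = 8.0  # hypothetical rotor radius, not taken from the text
    for target in (1_000, 10_000, 100_000, 150_000):
        rpm = rpm_for_rcf(target, radius_cm)
        print(f"{target:>7} x g needs about {rpm:,.0f} rpm")
```

Because the force scales with the square of the speed, a hundred-fold increase in g only needs a ten-fold increase in rpm for the same rotor.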
DNA polymerase is found in nuclei, so after lysing our yeast cells, we can spin out the nuclei, break them open with a little detergent, then use a sucrose density gradient to separate the enzymes in the nucleus (left in the supernatant) from the other junk (which will be pelleted down).
Protein purification
Centrifugation will separate organelles from macromolecules, but the supernatant will still contain many, many enzymes other than the one we wish to work on. Consequently, the supernatant will need to be purified to separate these different proteins. Chromatography is the most important method for this, although salting out and dialysis may be used to give an initial cut.
Salting out with ammonium sulfate separates proteins on the basis of their solubility. Adding large quantities of a kosmotropic (water-structuring) salt like ammonium sulfate ties up the water that would otherwise solvate the proteins, making them drop out of solution over a relatively small concentration range, where they can be harvested by centrifugation. This can be used to effect a crude initial fractionation of proteins from a cytosolic extract.
Dialysis separates proteins on the basis of their size. Protein solutions that have been salted out or have come off a chromatography column are usually full of salt, which must be removed. Dialysis is the usual way to do this: the proteins are put in a semi-permeable bag of Visking membrane, which is then put in a big bucket of buffer. Salt and water can cross the membrane, but proteins are retained, hence the protein is desalted. A variation on this is to add the solution to a centrifuge tube containing a dialysis membrane half way down it, and spin it. You can buy different membranes with different sized holes, so you can even achieve a little purification too (small proteins get spun out into the lower fraction, which is discarded).
A dialysed ammonium sulfate cut can then be subjected to several rounds of chromatography. Chromatography is the separation of substances due to their differential solubility in mobile and stationary phases: the retention time (how long a molecule stays stuck to a column) varies, depending on how firmly bound it is to the stationary phase, as the mobile phase washes through the column. The mobile phase may be varied to wash off (elute) strongly bound species. Often, ion exchange chromatography is used for the initial separation, followed by gel filtration and affinity chromatography.
Ion exchange separates proteins by charge: anion exchange groups like DEAE (diethylaminoethyl) are positively charged, so will reversibly bind negatively charged anions. Cation exchangers like CM (carboxymethyl) are negatively charged, so bind cations. These groups are covalently bound onto a cellulose or Sephadex (cross-linked dextran) matrix. Proteins can be eluted with increasing salt concentrations, whose ions compete for binding to the stationary phase. The eluate will contain a lot of salt, so needs to be dialysed afterwards.

Because DNA polymerase is full of basic residues like arginine, lysine and histidine, at pH 7.2, it is positively charged, so we could use carboxymethylcellulose to purify it.
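The choice between an anion and a cation exchanger comes down to the sign of the protein's net charge at the working pH. A rough way to estimate this is to sum Henderson-Hasselbalch terms over the ionisable groups. The sketch below assumes approximate textbook pKa values (real side chains vary with their environment) and uses made-up peptides, not the real DNA polymerase sequence.

```python
# Approximate textbook pKa values; these are assumptions, not measured values.
PKA_POSITIVE = {"K": 10.5, "R": 12.5, "H": 6.0}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}
PKA_N_TERMINUS = 9.0
PKA_C_TERMINUS = 3.1


def net_charge(sequence: str, ph: float) -> float:
    """Estimate the net charge of a peptide (one-letter codes) at a given pH."""
    # Fraction of each basic group that is protonated (+1 each).
    charge = 1.0 / (1.0 + 10 ** (ph - PKA_N_TERMINUS))
    # Fraction of each acidic group that is deprotonated (-1 each).
    charge -= 1.0 / (1.0 + 10 ** (PKA_C_TERMINUS - ph))
    for residue in sequence:
        if residue in PKA_POSITIVE:
            charge += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[residue]))
        elif residue in PKA_NEGATIVE:
            charge -= 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[residue] - ph))
    return charge


def suggest_exchanger(sequence: str, ph: float) -> str:
    """Pick the resin type that should bind the protein at this pH."""
    if net_charge(sequence, ph) > 0:
        return "cation exchanger (e.g. CM)"
    return "anion exchanger (e.g. DEAE)"


if __name__ == "__main__":
    basic_peptide = "GKKRKVHAG"   # hypothetical basic stretch
    acidic_peptide = "GDDEEDAG"   # hypothetical acidic stretch
    for peptide in (basic_peptide, acidic_peptide):
        print(peptide, round(net_charge(peptide, 7.2), 2), suggest_exchanger(peptide, 7.2))
```

Note that histidine (pKa about 6) contributes little positive charge at pH 7.2, so most of the positive charge comes from lysine and arginine.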
Size exclusion chromatography (a.k.a. gel filtration) uses crosslinked dextran, agarose or polyacrylamide beads with pores that trap proteins below a particular size. Small molecules are retarded, whilst larger molecules fall through more quickly.

Affinity chromatography separates proteins by specific binding. Antibodies (for known proteins) or substrates (for enzymes) can be used to specifically bind proteins of interest. These can be eluted ionically, or by flooding with other species that compete for binding. For DNA polymerase, cellulose coated with denatured ssDNA will work well.

Protein assay
During purification, it is essential to monitor two things: the total quantity of protein in the eluate, and the enzyme activity of the eluate. The whole point of purification is to increase the ratio of the activity to the protein.
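The ratio of activity to protein is usually called the specific activity, and purification is tracked by tabulating it at each step. The following is a minimal sketch of that bookkeeping; the step names follow this page, but every number is invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    total_protein_mg: float
    total_activity_units: float

    @property
    def specific_activity(self) -> float:
        """Activity per mg of protein: the figure purification should raise."""
        return self.total_activity_units / self.total_protein_mg


def purification_table(steps):
    """Return specific activity, fold purification and yield for each step."""
    first = steps[0]
    rows = []
    for step in steps:
        rows.append({
            "step": step.name,
            "specific_activity": step.specific_activity,
            "fold_purification": step.specific_activity / first.specific_activity,
            "yield_percent": 100.0 * step.total_activity_units / first.total_activity_units,
        })
    return rows


# Hypothetical values only; not data from a real purification.
EXAMPLE_STEPS = [
    Step("Nuclear extract", 1000.0, 10000.0),
    Step("Ammonium sulfate cut", 400.0, 9000.0),
    Step("CM-cellulose", 50.0, 7000.0),
    Step("ssDNA-cellulose", 2.0, 5000.0),
]

if __name__ == "__main__":
    for row in purification_table(EXAMPLE_STEPS):
        print(f"{row['step']:<22} {row['specific_activity']:>8.1f} U/mg "
              f"{row['fold_purification']:>7.1f}-fold {row['yield_percent']:>6.1f}% yield")
```

A good step raises the fold purification sharply while losing little of the total activity.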
Identifying a functional enzyme requires an assay. An enzyme is defined by its catalytic activity, so to see if we are actually purifying the enzyme during chromatography, we must measure its activity. DNA polymerase makes dsDNA from ssDNA and dNTPs, and dsDNA is insoluble in cold trichloroacetic acid. Therefore, 3H-dTTP can be used: we measure the radioactivity in the precipitate in cold TCAA after a set time to see how active our enzyme is. Spectrophotometric assays are easier though (e.g. most oxidoreductases can be monitored by their effect on NADH, which has a characteristic absorbance at 340 nm).
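For the spectrophotometric case, the Beer-Lambert law converts a change in absorbance at 340 nm into a reaction rate. The sketch below assumes the commonly quoted molar absorption coefficient for NADH of 6220 M⁻¹ cm⁻¹ and a 1 cm light path; the function name and example reading are hypothetical.

```python
NADH_EXTINCTION_340 = 6220.0  # M^-1 cm^-1, commonly quoted value for NADH at 340 nm


def nadh_rate_umol_per_min(delta_a340_per_min: float,
                           assay_volume_ml: float,
                           path_cm: float = 1.0) -> float:
    """Convert an absorbance slope (A340 per minute) into micromoles of NADH per minute."""
    # Beer-Lambert: A = epsilon * c * l, so the concentration change is dA / (epsilon * l).
    molar_per_min = abs(delta_a340_per_min) / (NADH_EXTINCTION_340 * path_cm)
    # mol/L multiplied by mL gives mmol; multiply by 1000 for micromoles.
    return molar_per_min * assay_volume_ml * 1000.0


if __name__ == "__main__":
    # Hypothetical reading: absorbance falls by 0.0622 per minute in a 1 mL assay.
    print(nadh_rate_umol_per_min(-0.0622, 1.0))
```

The absolute value is taken because NADH consumption gives a falling absorbance, whereas NADH production gives a rising one.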
Assaying the total mass of proteins in an eluate may be achieved in several ways, depending on the sensitivity and accuracy required:
- Amino acid analysis. This is probably the most accurate method. The sample is hydrolysed using 6M HCl at 100°C in a sealed tube for 22 hours and the level of each amino acid is measured using chromatography. This method is subject to several errors. Firstly, cysteine, cystine and methionine are partially destroyed during the hydrolysis. To correct for this, samples are also hydrolysed for 48 and 72 hours, and the levels of these amino acids are extrapolated to zero time to obtain their initial levels. Secondly, it is subject to interference by non-protein amino acids in the sample, which have to be removed by dialysis before analysis. Finally, the hydrolysis deaminates asparagine and glutamine; however, the free ammonia can be detected on the column and appropriate corrections made.
- Total nitrogen estimations (Kjeldahl method). This is another reasonably accurate method, which is particularly useful if you have many samples. The nitrogen of the protein is converted to ammonia by boiling with sulphuric acid, which converts it to ammonium sulphate. The conversion is speeded up by a catalyst of ferrous sulphate and selenium. Once the conversion is complete the solution is made alkaline, the ammonia is distilled into a known volume of standard acid and the neutralisation of acid is then determined by titration. This allows the amount of nitrogen in the sample to be calculated. The total nitrogen is then multiplied by a factor to convert it to protein. The factor normally used is 6.25, which represents the average level of nitrogen in proteins (i.e. 16%). This method is also subject to several errors. Firstly, the factor 6.25 is an average value and not all proteins contain an equal percentage of nitrogen. Secondly, the nitrogen in the side-chain amide groups of asparagine and glutamine cannot be distinguished from that in the peptide bond, and consequently this leads to overestimation. Finally, the presence of non-protein nitrogen causes problems, although this can be minimised using methods already described for the amino acid analysis.
Both of these methods are very time consuming, and require large and complex equipment. Consequently, efforts have been made to develop simple and rapid spectroscopic methods. None of these methods is entirely satisfactory. The choice of method depends on the nature of the protein, the nature of other components in the sample, and the speed, accuracy and sensitivity required. Furthermore, to obtain accurate estimates of protein concentrations using these spectrophotometric methods, it is important to calibrate the method with an appropriate standard, which has a similar amino acid composition to the protein being analysed.
- Biuret method. Compounds that contain two or more peptide bonds (i.e. proteins) develop a characteristic purple Biuret complex when treated with dilute copper sulphate in alkaline solution. The Biuret test is the most accurate of the spectrophotometric assays and is fairly reproducible with many different proteins. However, it is relatively insensitive and requires large amounts of protein (1 to 20 mg per assay) to form the coloured complex.
- Lowry method. This is a much more sensitive method than the Biuret test and will detect as little as 5 µg of protein. This method uses an alkaline copper reagent that detects the peptide bonds, as in the Biuret test, and the reduction of a phosphomolybdate-phosphotungstate complex by tyrosine and tryptophan residues, as originally described by Folin and Ciocalteu. This method is less reliable because it assumes all proteins contain equal proportions of tryptophan and tyrosine, which is not always the case.
- Bradford (Bio-Rad) method. This assay is based on the observation that the absorbance maximum of acidic solutions of the anionic dye Coomassie Brilliant Blue G-250 shifts from 465 nm to 595 nm when it binds electrostatically to -NH3+ groups in the side chains of amino acids in proteins. This method is very simple, requiring the addition of only one reagent, and is very sensitive, capable of measuring as little as 1 µg of protein. However, it is not very reliable because it relies on the assumption that all proteins have the same number of lysine or arginine amino acids with suitable -NH3+ groups in their side chains. In particular, the common calibrant BSA (bovine serum albumin) is packed full of -NH3+ groups, hence using it as a calibrant will grossly underestimate the amount of less endowed proteins. Use immunoglobulin G (IgG) instead.
- Ultraviolet absorption method. This is a very rapid method, which depends upon the absorption of light at 280 nm by the aromatic amino acids tyrosine and tryptophan. If the absorption coefficient of a protein is known then it is possible to calculate its concentration directly by determining the absorbance at 280 nm in quartz or UV-transparent plastic cuvettes. The method is simple to use since no reagents are necessary, and as the protein is not modified it can be used for subsequent experiments. It is also fairly sensitive and capable of measuring as little as 10 μg of protein. However, it is the most unreliable of the spectroscopic methods since it depends upon the assumption that all proteins contain the same proportion of tyrosine and tryptophan. It is also subject to severe interference by nucleic acids, which have a maximal absorbance at 260 nm that trails into the 280 nm region. To reduce this interference, the absorbance is measured not only at 280 nm but also at 260 nm, which quantifies the level of nucleic acids. The level of protein is then calculated using a fudge factor based on the ratio of A280 ⁄ A260. As a rough rule of thumb, an absorbance of 1 unit is equivalent to a concentration of about 1 mg mL−1.
The UV method is generally used in molecular biology, because it is cheap, quick and non-destructive.
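Two of the calculations described in the list above are simple enough to sketch directly. The Kjeldahl conversion uses the factor of 6.25 given in the text. For the A280 ⁄ A260 correction, the sketch assumes the widely used Warburg-Christian approximation (protein in mg mL−1 ≈ 1.55 × A280 − 0.76 × A260); other correction formulae exist, and the function names are illustrative.

```python
def protein_from_nitrogen(nitrogen_mg: float, factor: float = 6.25) -> float:
    """Kjeldahl estimate: protein mass from total nitrogen (6.25 = 100 / 16% N)."""
    return nitrogen_mg * factor


def protein_from_uv(a280: float, a260: float) -> float:
    """Warburg-Christian style estimate of protein (mg/mL) corrected for nucleic acid.

    Assumes a 1 cm light path and the approximation 1.55 * A280 - 0.76 * A260.
    """
    return 1.55 * a280 - 0.76 * a260


if __name__ == "__main__":
    # Hypothetical readings for illustration only.
    print(protein_from_nitrogen(16.0))   # mg protein from 16 mg nitrogen
    print(protein_from_uv(1.0, 0.5))     # mg/mL from A280 = 1.0, A260 = 0.5
```

The more nucleic acid there is in the sample, the larger A260 becomes relative to A280, and the more the raw A280 reading is marked down.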
During purification, we would also like to know how many proteins are still contaminating our enzyme preparation. SDS-PAGE (sodium dodecyl sulfate polyacrylamide gel electrophoresis) is used for this.

Hot mercaptoethanol is used to break disulfide bonds and denature the protein. SDS (a detergent) is then used to swamp the protein's natural charge, so that all proteins have the same charge to mass ratio (SDS binds at a roughly constant ratio of about 1.4 g SDS per gram of protein, i.e. about one SDS per two amino acids). When these proteins are loaded onto a gel, and an electric field is applied, the proteins migrate through the pores in the polyacrylamide gel: the gel separates the proteins based on how well they can move, and the distance migrated falls off roughly linearly with the logarithm of their molecular mass.
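Because migration depends on the logarithm of molecular mass, the mass of an unknown band can be estimated from a standard curve of marker proteins run on the same gel. The sketch below fits a straight line of log10(mass) against relative mobility (Rf) by least squares; the marker values are hypothetical, not taken from a real gel.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_mass_kda(rf_unknown, markers):
    """Estimate mass (kDa) of a band from (Rf, mass in kDa) marker pairs."""
    rfs = [rf for rf, _ in markers]
    log_masses = [math.log10(mass) for _, mass in markers]
    slope, intercept = fit_line(rfs, log_masses)
    return 10 ** (slope * rf_unknown + intercept)


# Hypothetical marker ladder: (relative mobility, mass in kDa).
MARKERS = [(0.10, 200.0), (0.25, 116.0), (0.40, 66.0), (0.60, 31.0), (0.80, 14.4)]

if __name__ == "__main__":
    print(f"Band at Rf 0.50 is roughly {estimate_mass_kda(0.50, MARKERS):.0f} kDa")
```

The fitted slope is negative: larger proteins travel a shorter distance through the gel.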
At each chromatographic step, the number of protein peaks decreases, both spectrophotometrically, and as measured by SDS-PAGE. The specific activity (activity per mg of protein) also increases, as measured by the cold TCAA assay.

Determining protein and gene sequences
We now have a purified DNA polymerase extract. Now we are in a position to find its amino acid sequence, and thence its gene.
Protein sequencing is performed by the Edman degradation using phenyl isothiocyanate. This can be used to determine the entire 1° structure, but more usually, it is used to get the N-terminal sequence of the protein, because it is slow and not 100% efficient. Often, it is easier to probe a DNA library for the gene, and then generate the protein sequence from that.

Edman degradation systematically forms amino acid derivatives from the N-terminal end of the protein, which can be identified by chromatography.
Genomic DNA libraries are made by digesting all the DNA from an organism using restriction enzymes (defensive enzymes from bacteria that cut up DNA at specific sequences).

Restriction enzymes are dimeric endonucleases that cut DNA at specific (usually palindromic) sequences. This is the EcoRI enzyme from E. coli, which cuts G|AATTC.
The library is made by inserting these millions of fragments of DNA en masse into λ bacteriophage vectors. This allows the genes to be grown up (cloned) in E. coli. If you are interested in what genes are being expressed in a cell, a cDNA library can be made instead by using reverse transcriptase to generate DNA from the total mRNA pool in the cell (cDNA libraries have the additional advantage of not containing introns, so they can be expressed in bacteria directly).

Libraries are then screened with degenerate probes. If the Edman degradation generated a protein sequence of NH3+-SRVIVHVD, there are many different DNA sequences that could code for this (because of the redundancy of the genetic code). A 'degenerate' mixture of oligonucleotide probes can be synthesised, which would bind to any possible sequences for the DNA polymerase gene.
Note that either of the two sequences below would produce the same SRVIVHVD peptide, so the degenerate mixture must contain both (and tens of thousands of others!)
5′-AGCAGAGTAATAGTCCACGTTGAC-3′
5′-TCTCGTGTCATTGTACATGTGGAT-3′
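The size of the degenerate mixture follows directly from the genetic code: multiply together the number of codons for each residue. The sketch below builds the standard codon table, checks that both sequences above translate to SRVIVHVD, and counts the possible coding sequences. The helper names are illustrative.

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding-strand DNA sequence (length a multiple of 3)."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def degeneracy(peptide: str) -> int:
    """Number of distinct DNA sequences that encode the peptide."""
    total = 1
    for residue in peptide:
        total *= sum(1 for aa in CODON_TABLE.values() if aa == residue)
    return total


if __name__ == "__main__":
    for probe in ("AGCAGAGTAATAGTCCACGTTGAC", "TCTCGTGTCATTGTACATGTGGAT"):
        print(probe, translate(probe))
    print(degeneracy("SRVIVHVD"), "possible coding sequences")
```

Serine and arginine have six codons each, which is why this particular peptide is so degenerate. In practice, a stretch rich in methionine or tryptophan (one codon each) makes a much smaller probe mixture.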
After growing up the library phages on bacteria, the plaques (areas of dead cells) they produce can be transferred to a nitrocellulose membrane, and washed with the radiolabelled probe mixture. Only plaques that contain the DNA polymerase gene will bind the probe strongly. After washing, the binding can be easily observed by autoradiography onto a photographic film.

So now we have our gene (inside a phage), but we need to find its sequence. To do this, we need a lot of it, so we can use PCR to multiply it up.

The polymerase chain reaction (PCR) uses the enzymes involved in cellular DNA replication to multiply up a chosen DNA sequence to many million times its original concentration. DNA polymerase from Thermus aquaticus (Taq polymerase) is used because of its stability at very high temperatures. The reaction requires four things:
- A polymerase (obviously), and any magnesium, pH and temperature requirements to be fulfilled.
- A single stranded DNA (ssDNA) template.
- A short section of double-stranded DNA (dsDNA), usually provided by adding primers: short lengths of ssDNA complementary to the beginning (and end) of the template(s).
- dNTPs, the building blocks of DNA.
The procedure is designed to amplify dsDNA exponentially, so two primers are needed: one for the negative DNA strand, and one for the positive. These primers together delimit the portion of the template DNA to be multiplied. The procedure is as follows:
- DNA is heated to 95°C to separate (melt) the duplex into two ssDNA strands.
- The temperature is reduced to 50°C, where annealing of the primers to the two ssDNA strands occurs.
- Polymerisation to dsDNA ('elongation') is then made to occur by raising the temperature to 70°C. The polymerase may overshoot the required endpoint (i.e. polymerise beyond the portion of the template delimited by the other primer), but as more cycles of PCR take place, these over-long chains are completely overwhelmed by correct length strands, since both the beginning and end of the sequence are delimited by primers.
- 30 cycles of this will give a 10⁹-fold increase (2³⁰ ≈ 10⁹) in the gene we want.
- The PCR products are purified to remove remaining template and dNTPs. The products may then be run on an agarose gel and stained with the intercalating agent ethidium bromide to show bands of different RMM. Alternatively, the primers may be radiolabelled for identification of the dsDNA.
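The amplification figure comes from doubling the number of copies in every cycle. A minimal sketch of that arithmetic is given below; the `efficiency` parameter is an assumption added here to show what happens when each cycle copies less than every strand, which is typical in practice.

```python
def pcr_fold_amplification(cycles: int, efficiency: float = 1.0) -> float:
    """Fold increase after a number of cycles.

    efficiency is the fraction of strands copied per cycle (1.0 = perfect doubling).
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    for eff in (1.0, 0.9, 0.8):
        fold = pcr_fold_amplification(30, eff)
        print(f"30 cycles at {eff:.0%} efficiency: {fold:.2e}-fold")
```

Small losses per cycle compound quickly, so the ideal billion-fold figure is best treated as an upper limit.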
After producing a large amount of DNA, the next objective is often to sequence it. For this, a quantity of ssDNA is prepared from dsDNA, usually by a PCR-like method, but only using a single primer rather than two. ddNTP sequencing uses dideoxynucleotides, which can be incorporated into a DNA strand as usual by DNA polymerase, but due to their lack of the 3′-OH group, they prevent continued polymerisation thereafter. Therefore, they terminate the polymerisation at the base that they complement. ddNTP sequencing requires:
- An ssDNA template.
- A primer complementary to the topmost bases of the template.
- Polymerase, etc.
- dNTPs.
- Low concentrations of ddATP, ddTTP, ddCTP or ddGTP.
Four reactions are carried out: one contains a trace of ddATP, one a trace of ddTTP, etc. The ddNTPs terminate sequencing at the base they complement. The concentration of the specific ddNTP is carefully set so that polymerisations are terminated at all possible sites along the newly synthesised strand: too high a concentration would terminate all polymerisations at the first complementary base, too low a concentration would cause no terminations at all. So, adding a little ddTTP to the reaction will yield DNA fragments of all lengths up to an A on the parent strand, and likewise for the other three bases. When all four reactions have been carried out, the products are run on a gel: shorter DNA strands migrate to a lower position on the gel, so the bands are effectively sorted by DNA length as you move up the gel. Consequently, in the ddTTP lane, bands will show up wherever an A occurred in the original strand, and so on. The sequence can be read off very simply.
In a useful modification, which can be automated, ddNTPs can be tagged with fluorescent dyes: ddATP can be tagged with fluorescein, etc. The sequencing reactions can be carried out together in the same pot, and the bands identified by their colour rather than by which lane (ddATP, ddTTP, etc.) they came from.

ddTTP will terminate the synthesis of a strand complementary to the ACGACT… parental strand when it pairs with an A. By using low concentrations of ddTTP, some chains are terminated at the first A, some at the second, and some at the third, so we get a range of fragments, whose lengths correspond to the positions of A in the parent strand. The DNA fragments can be separated on the basis of length by denaturing polyacrylamide gel electrophoresis (similar in principle to SDS-PAGE). This can be done for all four ddNTPs separately, or together by using ddNTPs that have been linked to fluorescent molecules.
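The logic of reading a sequencing gel can be sketched in a few lines. This is a simplified model: the template is written in the order the polymerase copies it, fragment lengths are counted from the end of the primer, and every possible termination is assumed to be represented. Only the six bases shown in the figure caption are used.

```python
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def termination_lengths(template: str, dd_base: str) -> list:
    """Lengths of new strands that end with the given ddNTP.

    A ddNTP is incorporated wherever the template carries its complement.
    """
    return [pos for pos, base in enumerate(template, start=1) if PAIR[base] == dd_base]


def read_gel(template: str) -> str:
    """Rebuild the new strand by reading bands from shortest to longest."""
    lanes = {dd: termination_lengths(template, dd) for dd in "ACGT"}
    band_to_base = {length: dd for dd, lengths in lanes.items() for length in lengths}
    return "".join(band_to_base[length] for length in sorted(band_to_base))


if __name__ == "__main__":
    template = "ACGACT"  # the parental strand from the figure, first six bases only
    for dd in "ACGT":
        print(f"dd{dd}TP lane: bands at lengths {termination_lengths(template, dd)}")
    print("Sequence read from the gel:", read_gel(template))
```

The sequence read from the gel is that of the newly made strand, so it is the complement of the parental template.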
We now have the gene sequence of our DNA polymerase, and can predict the protein 1° structure. We can also use our phage library to make masses of the protein for analysis. If we are lucky, we can make enough of the protein to form a crystal for X-ray crystallography. X-rays diffract as they pass through a protein crystal. The diffraction pattern can be number-crunched to produce a 3D electron density map, and further crunching fits the 1° structure into it.
From protein crystal to diffraction pattern, to protein structure.
Manipulating genes
So now we have the gene sequence and protein structure for DNA polymerase. Now we want to see how it works, and (to answer our original question), where it is located in the cell, and how it gets there.
The process described above is a right pain. Since we now have the gene in a phage, we can now make our lives slightly easier by modifying it:
- The DNA for a histidine (H6-COO−) tag, which binds Ni2+, can be added to the end of the gene, making chromatography much easier (a nickel column can be used to purify the protein very easily).
- The gene for green fluorescent protein (GFP), which glows under blue light, can be linked to our gene so a 'fusion protein' is produced when the gene is expressed, so wherever the protein goes, it takes a little fluorescent 'tail' with it.
- The gene for glutathione-S-transferase, which binds glutathione (GSH), can be fused to our gene, allowing our protein to be detected chemically or by antibodies to GST.
- The promoter can be chosen to allow the gene to be switched on and off: IPTG (a lactose-like sugar) can be used as a chemical switch for the lac promoter, allowing over-expression of the protein on demand.

GFP is an extremely useful protein from a jellyfish that glows green under blue light. It can be used to tag proteins and show where they go; or can be used downstream of a promoter of unknown function, to show where/when the promoter is active.
Cells expressing fluorescent proteins can also be readily automatically sorted by fluorescence-activated cell sorters:

Site directed mutagenesis can now be used to modify signals and active sites. DNA polymerase gets into the nucleus because it has a nuclear localisation sequence (NLS), which is a stretch of highly basic amino acids that targets it for karyopherin-mediated import into the nucleus. We can identify which of the (probably several) basic stretches on the enzyme is the NLS by modifying them so they are no longer basic. Site directed mutagenesis uses an oligonucleotide with a mismatch to add specific mutations to a gene. After the oligo is annealed to the DNA we wish to mutate, it is extended using the Klenow fragment of DNA polymerase. This allows us to modify our putative nuclear localisation signals, e.g. by converting lysine to isoleucine. If we use a GFP-fusion protein, we can then simply look and see where our tagged DNA polymerase ends up.
5′---GGA GTA AAA AAA TTA ATG AAC---3′
3′---CCT CAT TAT TTT AAT TAC TTG---5′
AAA codes for lysine. By annealing an oligo with a deliberate mismatch (TAT in the lower strand above), we can introduce a mutation. Klenow fragment extends the oligo along the template to complete the lower strand.
5′---GGA GTA ATA AAA TTA ATG AAC---3′
3′---CCT CAT TAT TTT AAT TAC TTG---5′
After one round of replication, and synthesis of the upper complementary strand, we have a specifically modified gene that encodes isoleucine (ATA) rather than lysine (AAA) at this position.
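The two pairs of sequences above can be checked mechanically: find where the oligo fails to pair with the template, then translate the strand made by copying the oligo. The sketch below uses only the partial sequences shown; the variable names are illustrative.

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Base-by-base complement, keeping the same left-to-right alignment."""
    return "".join(PAIR[base] for base in strand)


def translate(coding: str) -> str:
    """Translate a coding strand written 5'->3' (length a multiple of 3)."""
    return "".join(CODON_TABLE[coding[i:i + 3]] for i in range(0, len(coding), 3))


TEMPLATE_TOP = "GGAGTAAAAAAATTAATGAAC"  # wild-type upper strand, 5'->3'
OLIGO_BOTTOM = "CCTCATTATTTTAATTACTTG"  # mutagenic lower strand, written 3'->5'

# Positions (0-based) where the oligo does not pair with the template.
mismatches = [
    i for i, (top, bottom) in enumerate(zip(TEMPLATE_TOP, OLIGO_BOTTOM))
    if PAIR[top] != bottom
]

# Copying the oligo in the next round of replication gives the mutant upper strand.
mutant_top = complement(OLIGO_BOTTOM)

if __name__ == "__main__":
    print("Mismatch positions:", mismatches)
    print("Wild type:", translate(TEMPLATE_TOP))
    print("Mutant:   ", translate(mutant_top))
```

A single mismatched base in the middle of the oligo is enough to change one codon, while the flanking matched bases keep the oligo annealed to the template.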
The GFP tag allows us to show very easily that the lysine in the putative NLS is required for nuclear import, by comparing the lysine wildtype to the isoleucine mutant we have created.


Left of each pair of images shows GFP fluorescence. Right of each pair shows location of nuclei.
Summary
- Centrifugation and chromatography can be used to harvest and purify organelles and enzymes.
- SDS-PAGE and enzyme assay are used to monitor the purification.
- Partial protein sequences can be obtained from Edman degradation, allowing probes to be made for screening DNA libraries.
- Full gene and protein sequences are determined from ddNTP sequencing, and protein structure obtained from X-ray crystallography.
- Site-directed mutagenesis and tagging can be used to study the roles of parts of the protein, and make manipulation easier.
Test yourself
- Why is fluorescent ddNTP DNA sequencing more convenient than conventional ddNTP sequencing?


