Contents
- Molecular representation.
- IUPAC nomenclature.
- Functional groups.
- Geometric isomery.
- Optical isomery.
Molecular representation
Sometimes you want to completely represent a molecular structure down to the nitty-gritty detail, as with the very complex haemoglobin structure showing every individual atom below:
Sometimes, it may be more useful to bring out particular highlights, like the structure of haemoglobin's peptide backbone and prosthetic groups:
However we choose to represent our molecules, we have many choices available to us (right click on the above structure and you can change the view to see many of the below):
- Ball-and-stick models, where atoms are represented by balls, and bonds by sticks.
- Space-filling models, where the atoms are represented by their van der Waals radii.
- Extended formulae, such as CH3COOH.
- Line bond notation.
Line bond notation is a quick way of representing the molecular structure of biochemicals. We miss out most of the detail, so we only see important things like functional groups:
- Carbon and (most) hydrogens are not drawn explicitly. A carbon atom is implied at any angle and at the end of any line, with as many hydrogen atoms as are needed to make sure the carbon has four things bonded to it.
- Functional groups are drawn explicitly, but in a condensed way.
- Rings with alternating double and single bonds may be written with a circle inside the ring (as in benzene). Heterocyclic rings (those with some non-carbon atoms in them, like quinoline) are merely written with the heteroatom in the ring.
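The rule for implied hydrogens can be sketched in a few lines of Python. This is only an illustration for neutral carbon atoms: the function name `implicit_hydrogens` and its input format (a list of bond orders drawn at one vertex) are assumptions made for this example, not part of any standard notation.

```python
# Carbon is assumed to be neutral and to form exactly four bonds.
CARBON_VALENCE = 4

def implicit_hydrogens(bond_orders):
    """Return the number of hydrogens implied on one carbon vertex.

    bond_orders lists the order of every bond actually drawn at that
    vertex (1 = single, 2 = double, 3 = triple).
    """
    used = sum(bond_orders)
    if used > CARBON_VALENCE:
        raise ValueError("a carbon atom cannot have more than four bonds")
    return CARBON_VALENCE - used

print(implicit_hydrogens([1]))     # end of a line: CH3, prints 3
print(implicit_hydrogens([1, 1]))  # angle in a chain: CH2, prints 2
print(implicit_hydrogens([1, 2]))  # carbon of a C=C in a chain: CH, prints 1
```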
Here are some examples of line bond notation:
An extended structural formula for Ecstasy, converted into line bond notation.
An extended structural formula for aspartame, converted into line bond notation.
Line bond notation for limonene, converted into an extended structural formula.
IUPAC nomenclature
The International Union of Pure and Applied Chemistry (IUPAC) publishes recommendations on how chemicals should be named. The IUPAC rules for naming compounds are extremely long, as they have to deal with everything from the correct name for table salt (sodium chloride), to the systematic name for cotton bollworm pheromones ((7E, 11Z)-hexadeca-7,11-dienyl ethanoate). IUPAC systematic names are supposed to replace the frightening plethora of trivial names applied to compounds. For example, something as simple as CH3CN, which is properly called ethanenitrile, has also been called methyl cyanide, cyanomethane, ethanonitrile, ethyl nitrile and methylcarbonitrile, and those are just the English trivial names for it. IUPAC nomenclature is supposed to do away with such redundancy so that a chemical has one correct name, and no other. It's also supposed to make naming compounds systematic (i.e. a few simple rules should generate the name of any compound). IUPAC naming has gone some way to making the world of the inorganic chemist simpler, and has made some inroads into the world of the organic chemist.
Unfortunately, IUPAC has run headlong into the wonderful world of biochemistry, and the 'few simple rules' have mushroomed into a lot of complex rules with exceptions and special cases everywhere. Biochemical compounds have been named after the things they were found in (digitoxin from Digitalis), for their odour (putrescine), for their colour (bromophenol blue), amongst other less sensible ideas. So although the IUPAC would probably dearly love you to call alanine (S)-2-aminopropanoic acid, it's not going to happen any time soon! However, it's worth familiarising yourself with the IUPAC naming scheme, so you can work out what (2R, 3S)-threonine and its like mean.
The first thing to do when giving an organic chemical an IUPAC name is to find the longest chain of carbons (ignoring any in benzene rings) in the molecule:
[Figure: dec-3-enal]
The longest chain here is ten carbons long (not 6, or anything else, look closely), so it is considered a derivative of decane, CH3(CH2)8CH3. So we start with 'dec' as the beginning of this chemical's name. For other lengths of skeleton, you will need this table:
| Length of carbon skeleton | Root name |
|---|---|
| 1 | meth |
| 2 | eth |
| 3 | prop |
| 4 | but |
| 5 | pent |
| 6 | hex |
| 7 | hept |
| 8 | oct |
| 9 | non |
| 10 | dec |
The second thing to do is work out where to start numbering the carbon chain from, so you can specify where all the bits that aren't decane are. There is no simple system for this, just two rules, and a list of priorities. The first rule: if there are no interesting substituents (i.e. no OH or CHO or COOH groups, just carbon, hydrogen and nothing else in a long unbranching chain), then the numbering scheme starts from whichever end gives the lowest numbers to multiple bonds, if any are present. So…

is non-2-ene, not non-7-ene. The -ene means there is a double bond (here from carbon 2 to carbon 3), making this chemical an alkene (old name olefin); alkenes are chemicals that have one or more carbon-carbon double bonds. If it were a triple bond instead, the -yne suffix would be used, and the compound would be termed an alkyne (old name acetylene). If there are no double or triple bonds at all, the -ane suffix is used (as in decane), and the compound is called an alkane (old name paraffin). If there were two double bonds (say at carbons 2 and 4), you'd write nona-2,4-diene.

Here comes another table, this time for what to put in front of three occurrences of a double bond (-triene), or three chlorine groups (trichloro-) or two double bonds and three triple bonds (-dienetriyne), etc.
| Number of occurrences | Prefix |
|---|---|
| 2 | di (bis) |
| 3 | tri (tris) |
| 4 | tetra (tetrakis) |
| 5 | penta |
| 6 | hexa |
| 7 | hepta |
| 8 | octa |
| 9 | nona |
| 10 | deca |
The ones in brackets are for ambiguous cases: triphosphate is a chain of three phosphates (like in ATP), trisphosphate is a molecule with three separate PO4 groups in different places.
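As a rough illustration of how the two tables combine with the lowest-number rule, here is a minimal Python sketch for unbranched hydrocarbons with only double bonds. The function name `name_alkene` and its arguments are invented for this example, and real IUPAC naming has refinements that it ignores (for instance, the locant is normally dropped in ethene and propene).

```python
# Root names and multiplying prefixes, copied from the two tables above.
ROOT = {1: "meth", 2: "eth", 3: "prop", 4: "but", 5: "pent",
        6: "hex", 7: "hept", 8: "oct", 9: "non", 10: "dec"}
MULTIPLIER = {1: "", 2: "di", 3: "tri", 4: "tetra", 5: "penta",
              6: "hexa", 7: "hepta", 8: "octa", 9: "nona", 10: "deca"}

def name_alkene(chain_length, double_bond_starts=()):
    """Name an unbranched hydrocarbon from its C=C positions.

    double_bond_starts holds the lower-numbered carbon of each double
    bond, counted from either end of the chain.
    """
    root = ROOT[chain_length]
    forward = sorted(set(double_bond_starts))
    if any(p < 1 or p >= chain_length for p in forward):
        raise ValueError("double bond position outside the chain")
    if not forward:
        return root + "ane"
    # The same bonds numbered from the other end of the chain.
    backward = sorted(chain_length - p for p in forward)
    # Lists compare element by element, so min() picks the numbering
    # with the lower locant at the first point of difference.
    locants = min(forward, backward)
    multiplier = MULTIPLIER[len(locants)]
    if multiplier:
        root += "a"  # nona-2,4-diene, not non-2,4-diene
    return f"{root}-{','.join(map(str, locants))}-{multiplier}ene"

print(name_alkene(9, [7]))     # non-2-ene
print(name_alkene(9, [2, 4]))  # nona-2,4-diene
print(name_alkene(10))         # decane
```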
So that deals with simple hydrocarbons, those chemicals containing just hydrogen and carbon. However, most biochemicals are more interesting than this, so the second rule is that if there are interesting substituents, like OH, Cl, CHO, etc., you need to find the most important one in the list of priorities below, and number from the end that gives that group the lowest number, rather than from the end that gives the lowest number to multiple bonds. Here is yet another list of the more important substituents, in descending order of priority:
| Group | Chemical type | Prefix | Suffix |
|---|---|---|---|
| -COOH | Carboxylic acid | N/A | -oic acid |
| -CHO | Aldehyde | oxo- | -al |
| -CO- | Ketone | oxo- | -one |
| -OH | Alcohol | hydroxy- | -ol |
| -Cl | Chloride | chloro- | -chloride |
| -NH2 | Amine | amino- | -amine |
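The way this priority list is used can be sketched as a small lookup: the highest-priority group present supplies the suffix, and every other group is demoted to a prefix. The function name `split_groups` is hypothetical, and the list covers only the six groups tabulated above.

```python
# (group, prefix, suffix) in descending priority, as in the table above.
PRIORITY = [
    ("-COOH", None, "-oic acid"),
    ("-CHO", "oxo-", "-al"),
    ("-CO-", "oxo-", "-one"),
    ("-OH", "hydroxy-", "-ol"),
    ("-Cl", "chloro-", "-chloride"),
    ("-NH2", "amino-", "-amine"),
]

def split_groups(groups):
    """Return (principal group, its suffix, prefixes of the other groups)."""
    if not groups:
        raise ValueError("no functional groups given")
    rank = {group: i for i, (group, _, _) in enumerate(PRIORITY)}
    # Sort the groups present by their position in the priority list.
    ordered = sorted(set(groups), key=lambda g: rank[g])
    principal = ordered[0]
    suffix = PRIORITY[rank[principal]][2]
    prefixes = [PRIORITY[rank[g]][1] for g in ordered[1:]]
    return principal, suffix, prefixes

# A molecule carrying both an OH and a CHO group is named as an aldehyde.
print(split_groups(["-OH", "-CHO"]))
```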
Our molecule's highest priority group is the CHO at the end. Hence, it is an aldehyde, and gets the suffix -al. So it is a derivative of decanal (you could call it decan-1-al, but this is unnecessary, as a CHO group always has to be at the end of a molecule). The other groups (OH, etc.) will be added onto this root as prefixes. However, first we need to deal with the double bond. For this, we will need to explicitly number the molecule from the CHO group, as we just decided:
[Figure: dec-3-enal (numbered)]
The double bond runs from carbon 3 to 4, so this is a derivative of dec-3-enal. ((3E)-dec-3-enal to be precise, as this is the E isomer, see below). We now need to deal with the three groups attached to carbons 5, 6 and 7. The OH group at carbon 7 makes this 7-hydroxydec-3-enal, as you can work out from the table of group priorities just given. The CH3CH2- group at carbon 6 is an ethyl group: the eth comes from the fact it is a two-carbon chain (see the table of carbon skeleton names), and the -yl means 'stick this onto the rest of the molecule'. So this is a derivative of 6-ethyl-7-hydroxydec-3-enal. The 'ethyl' bit comes before the 'hydroxy' bit because 'e' comes before 'h' in the alphabet.
The final group to add is the benzene ring connected to the ethyl group connected to the rest of the molecule. Knowing that benzene rings are called 'phenyl' in the same way OH groups are called 'hydroxy', it should come as no surprise that this group is called 2-phenylethyl. The 2- comes from the fact that the phenyl group is stuck to the second carbon along, counting away from the place the ethyl group is attached to the main part of the molecule. In some older schemes, these 'sub-numberings' are often indicated with a prime ('), like on the 3' and 5' of DNA. So, this molecule's (nearly) full name is 6-ethyl-7-hydroxy-5-(2-phenylethyl)dec-3-enal. The observant amongst you may have noticed the possibility of optical isomery at carbons 5, 6 and 7. The configuration can be found using the rules below, making this molecule:
(3E, 5S, 6R, 7R)-6-ethyl-7-hydroxy-5-(2-phenylethyl)dec-3-enal
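The alphabetical ordering of the prefixes can be sketched as follows. The helper `assemble_name` and its input format are assumptions made for this illustration; it takes the parent name and locants as given and does not work them out from a structure.

```python
import re

def assemble_name(substituents, parent):
    """Join (locant, prefix) pairs onto a parent name in alphabetical order.

    Leading digits, hyphens and brackets are ignored when sorting, so
    '(2-phenylethyl)' is filed under 'p'.
    """
    def sort_key(item):
        _, prefix = item
        return re.sub(r"^[^a-z]+", "", prefix.lower())

    ordered = sorted(substituents, key=sort_key)
    parts = [f"{locant}-{prefix}" for locant, prefix in ordered]
    return "-".join(parts) + parent

name = assemble_name(
    [(5, "(2-phenylethyl)"), (7, "hydroxy"), (6, "ethyl")],
    "dec-3-enal",
)
print(name)  # 6-ethyl-7-hydroxy-5-(2-phenylethyl)dec-3-enal
```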
That's it, you'll be happy to hear. The IUPAC rules are much more extensive than this, but you should now be able to work out exactly why, for example, the IUPAC-preferred name for threonine:

is 2-amino-3-hydroxybutanoic acid, but don't hold your breath thinking anyone will actually call it that!
Functional groups
We briefly touched on functional groups above, giving a list of some of the more common ones. We will now discuss a few of them in more detail.
Alcohols are chemicals containing a hydroxyl group or groups (-OH). Alcohols are weakly acidic (-OH → -O− + H+). This is especially true of aromatic alcohols, those with an OH directly attached to a benzene ring, like phenol and resorcinol.

Primary alcohols have an OH attached to a carbon attached to just one other carbon atom (CH3CH2OH). Secondary alcohols have an OH attached to a carbon attached to two other carbon atoms (CH3CHOHCH3). Tertiary alcohols have an OH attached to a triply alkylated carbon atom ((CH3)3COH). Water could be considered the lowest in the alcohol series: H2O water, CH3OH methanol, CH3CH2OH ethanol, etc. Alcohols are very important in biological systems: carbohydrates (sugars) are bristling with OH groups, as is the glycerol of triglyceride fats.
Amines possess an amino group (-NH2), and are moderately basic (-NH2 + H+ → -NH3+). Primary amines (RNH2), secondary amines (R2NH), tertiary amines (R3N) and quaternary ammonium compounds (R4N+) differ by the number of carbon-containing R groups that are attached to the nitrogen. Ammonia (NH3) may be considered the smallest amine, like water was the smallest alcohol.

Amino acids are all primary amines (except proline, which is a secondary amine, and is often loosely called an imino acid).

Alanine.

Proline.
Alkaloids are another group of amine-like compounds, which contain a heterocyclic nitrogen, i.e. one that is in a ring.

They are common toxins in plants, and many drugs are derived from them. The bases of the DNA and RNA nucleotides are closely related to the alkaloids, and the hallucinogen psilocybin is based on the alkaloid indole. Indole is also the basis of the plant growth hormone auxin.
The carbonyl group, C=O, is very common in biology, and occurs in a number of important functional groups.
- Ketones (R-CO-R) - sugars (ketoses).
- Aldehydes (R-CHO) - sugars (aldoses).
- Carboxylic acids (R-COOH) - amino acids.
- Acid chlorides (R-COCl) - reaction intermediates in chiral separations.
- Thiolate esters (R-COS-R) - coenzymes.
- Thiocarboxylic acids (R-COSH) - coenzymes.
- Esters (R-COO-R) - pheromones.
- Amides (R-CO-NH-R) - proteins. Amides are formed by the reaction of an amine with a carboxylic acid.
The carbonyl group is readily attacked by negatively charged species like hydroxide ions (OH−), because the C=O bond is very polar. Halogenated groups like -Cl, -Br and -I are found in many compounds, many of which are pesticides: these also make compounds susceptible to this sort of nucleophilic attack.
Geometric isomery
In a molecule held flat by a double bond, it is possible to construct two different structures when each carbon of the C=C double bond carries two different groups. Because there is no free rotation about a double bond at room temperature, these two 'geometric isomers' are both physically and chemically different.

E-2-chloropent-2-ene

Z-2-chloropent-2-ene
Above are the two geometric isomers of 2-chloropent-2-ene. Note the Cl and CH3 are on different sides of the double bond in these two geometric isomers.
- Different physical properties : boiling point, melting point, crystalline form.
- Different chemical properties : reactivity with other chemicals.
- Different biological properties : enzyme inhibition.
- Different biological effects : odour, drug effects.
In the case of multiple double bonds, two double bonds means the possibility of four geometric isomers, and generally up to 2ⁿ isomers if there are n double bonds. To name these isomers, we used to use cis and trans, but these don't actually work very well, since they are ambiguous for all but the simplest cases. Instead we have the newer descriptors 'E' and 'Z', which are completely general. To name geometric isomers, we need to assign Cahn-Ingold-Prelog (CIP) priorities to the four groups attached to the C=C double bond, which is about as complicated as it sounds ☺
The CIP rules:
- Look at the atoms directly attached to each side of the double bond (the C of CH3- and the Cl- on the right hand side, or the C of CH3CH2- and the H- on the left hand side).
- The atom with the highest priority is the one with the highest atomic number.
- Right hand side, Cl has higher priority than C.
- Left hand side, C has higher priority than H.
- In cases like this, where the atoms directly attached to the C=C of the double bond are different, the priority of the whole group is the same as the priority of the atoms.
- So here, Cl > CH3 and CH3CH2 > H.
- The isomer is Z (zusammen) if the highest priority group on the left and the highest priority group on the right are on the same side of the double bond. Otherwise it is the E (entgegen) isomer.
- So for the isomers shown above the one on the left is the E form, the one on the right is the Z form.
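For the simple case covered by these rules, where the atoms directly attached to each end of the double bond differ, the E/Z assignment can be sketched in Python. The function names and the (upper, lower) way of describing each end of the double bond are assumptions made for this example; ties are reported rather than resolved.

```python
# Atomic numbers for a few common elements.
ATOMIC_NUMBER = {"H": 1, "C": 6, "N": 7, "O": 8, "F": 9,
                 "Cl": 17, "Br": 35, "I": 53}

def top_group_position(upper_atom, lower_atom):
    """Say whether the higher-priority atom is drawn 'upper' or 'lower'."""
    z_upper = ATOMIC_NUMBER[upper_atom]
    z_lower = ATOMIC_NUMBER[lower_atom]
    if z_upper == z_lower:
        raise ValueError("tie: the chains must be followed further")
    return "upper" if z_upper > z_lower else "lower"

def ez_descriptor(left_end, right_end):
    """Each end is an (upper atom, lower atom) pair attached to one C of C=C."""
    same_side = top_group_position(*left_end) == top_group_position(*right_end)
    return "Z" if same_side else "E"

# 2-chloropent-2-ene: left carbon bears CH3CH2- (C) and H,
# right carbon bears Cl and CH3- (C).
print(ez_descriptor(("C", "H"), ("Cl", "C")))  # Z: ethyl and Cl on the same side
print(ez_descriptor(("C", "H"), ("C", "Cl")))  # E: ethyl and Cl on opposite sides
```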
However, it's perfectly possible that you can't assign priorities on the atoms directly attached to the double bond (if they are both carbon for example). In such a case, you need to follow down the chain of atoms until you can. So if you are comparing the groups -CH2Cl and -CH2OH, we notice that the carbons have equal priority, so we have to go down the chain to the next atoms, here Cl and O, before we can say the -CH2Cl group has a higher priority than the -CH2OH group. One further complication is dealing with double bonds in the side groups. The most common are C=O groups in carboxylic acids, etc. Conventionally, you expand out C=O groups as shown:

Where multiple bonds are present, we can label them with IUPAC numbering. This is (7E, 11Z)-hexadeca-7,11-dienyl ethanoate.
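The 'follow down the chain' comparison described above can be sketched as a sphere-by-sphere comparison of atomic numbers. This is a deliberately simplified model, not the full CIP procedure: each group is written by hand as a list of spheres (the attached atom, then the atoms attached to it, and so on), a doubly bonded atom is simply listed twice, and branches are not explored hierarchically. The names `sphere_key` and `higher_priority` are invented for this example.

```python
ATOMIC_NUMBER = {"H": 1, "C": 6, "N": 7, "O": 8, "F": 9,
                 "Cl": 17, "Br": 35, "I": 53}

def sphere_key(atoms):
    """Atomic numbers of one sphere, highest first."""
    return sorted((ATOMIC_NUMBER[a] for a in atoms), reverse=True)

def higher_priority(group_a, group_b):
    """Return 'a', 'b', or None if no difference is found.

    Spheres are compared in order; the first sphere that differs decides.
    """
    for atoms_a, atoms_b in zip(group_a, group_b):
        key_a, key_b = sphere_key(atoms_a), sphere_key(atoms_b)
        if key_a != key_b:
            return "a" if key_a > key_b else "b"
    return None

ch2cl = [["C"], ["Cl", "H", "H"]]
ch2oh = [["C"], ["O", "H", "H"]]
print(higher_priority(ch2cl, ch2oh))  # a: -CH2Cl outranks -CH2OH

# C=O expanded: the doubly bonded O is counted twice.
cooh = [["C"], ["O", "O", "O"]]
cho = [["C"], ["O", "O", "H"]]
print(higher_priority(cooh, cho))     # a: -COOH outranks -CHO
```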

Optical isomery
Optical isomery (or chirality) occurs because a tetrahedral carbon atom can have four groups attached to it. If these groups are all different, they can take up two distinct configurations around the carbon, which is then termed a chiral centre. Chiral molecules therefore have 'left and right handed' forms, which are nonsuperimposable mirror images of one another, like these two generic amino acids:

D-amino acid

L-amino acid
They are termed optical isomers because they rotate the plane of polarised light in equal but opposite directions. Unfortunately, there are (at least) three naming schemes, all of which we'll come across:
- S/R based on configuration. S and R are used to describe the actual arrangement of the groups around the chiral carbon atom.
- +/− based on effect on light. These describe what the isomer actually does to light.
- D/L based on biology/guesswork/dodgy lists of rules. These are purely conventional.
Dot and wedge notation can be used to show the arrangements of the groups in space: wedges pop out of the page, dots go away from you. Again, note the two nonsuperimposable mirror images.
If a carbon is attached to four different groups, it can be chiral, and potentially have optical isomers. A single chiral centre gives only two nonsuperimposable mirror images called enantiomers. The above are the two enantiomers of glyceraldehyde. However, two chiral centres will give four isomers (two pairs of enantiomers), and in general, the maximum number of isomers is 2ⁿ where n is the number of chiral centres. However, there may be fewer if there is internal symmetry, as below:

In general, if there are an odd number of chiral centres, the compound must be chiral, and have at least two isomers. If there is an even number of centres, the compound will usually be chiral, but it may have fewer isomers than expected if there is internal symmetry, or it may even be non-chiral, with all the isomers being equivalent. These 'should be chiral but are not' isomers are often called meso forms. In fact, you don't even need chiral centres to form 'handed' compounds, as hexahelicene shows:
To name optical isomers, we need to assign CIP priorities to the four groups around the chiral centre, as we did for geometric isomers. We then hide the lowest priority group behind the chiral centre, and count round 1…2…3 from highest to lowest priority.

If this describes a clockwise movement, it is the R (rectus) isomer. If anticlockwise, it is the S (sinister) isomer. This is therefore the S isomer of bromochlorofluoromethane. The R/S tells you absolutely nothing about the +/- effect on polarised light, which we'll cover in a moment. To name multiple centres, we just assign each an S/R descriptor, and IUPAC number the carbons: (2S, 3R)-threonine, etc. (Note that the numbering for threonine runs from the COOH group to the CH3 group).
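If three-dimensional coordinates are available, the 'hide the lowest priority group and count round' procedure can be expressed as the sign of a triple product. The sketch below assumes a right-handed coordinate system and that the CIP priorities have already been assigned; the function `rs_descriptor` and the coordinates used are made up for illustration.

```python
def sub(u, v):
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]

def rs_descriptor(p1, p2, p3, p4):
    """p1, p2, p3 are the positions of the groups ranked 1 to 3,
    p4 is the position of the lowest-priority group."""
    a, b, c = sub(p1, p4), sub(p2, p4), sub(p3, p4)
    volume = dot(a, cross(b, c))
    if abs(volume) < 1e-9:
        raise ValueError("groups are coplanar: no handedness")
    # Seen with p4 pointing away from the viewer, a positive volume
    # means 1 -> 2 -> 3 runs anticlockwise (S); negative is clockwise (R).
    return "S" if volume > 0 else "R"

# Lowest-priority group points down the -z axis, away from a viewer at +z.
lowest = (0.0, 0.0, -1.0)
g1 = (1.0, 0.0, 0.33)
g2 = (-0.5, 0.87, 0.33)
g3 = (-0.5, -0.87, 0.33)
print(rs_descriptor(g1, g2, g3, lowest))  # S: anticlockwise seen from +z
print(rs_descriptor(g1, g3, g2, lowest))  # R: swapping two groups inverts it
```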

(2S, 3S)-threonine (2R, 3S)-threonine
(2S, 3R)-threonine (2R, 3R)-threonine
The tragic case of thalidomide shows what can go wrong when you don't understand chirality properly. Animal tests showed (R)-thalidomide was a sedative, but that (S) was a teratogen. Unfortunately, it turned out that even the pure (R) isomer racemised (converted to the (S) form) slowly in the body, enough to cause birth defects.

(S)-thalidomide
Polarimetry can be used to measure the extent to which optical isomers rotate the plane of polarised light. Enantiomers rotate the plane of polarised light to equal but opposite extents.
- (+) isomer rotates the plane to the right.
- (−) isomer rotates the plane to the left.
A polarimeter measures the rotation caused to polarised light from the D-line of sodium emission (as used in street lamps). The specific rotation, α, is defined as the angle of rotation (in degrees) divided by the product of the cuvette path length and the sample concentration. This requires rather stupid units: the cuvette path in dm and the concentration in g mL−1. The specific rotation of a 50:50 mixture of two enantiomers of a compound is 0, since the two isomers will exactly balance one another's equal but opposite effects on the rotation of polarised light.
From this, we can define the optical purity (OP) of a mixture of enantiomers. The OP equals αmixture ⁄ αpure (+) isomer. The OP also happens to equal the proportion of (+) enantiomer minus the proportion of (−) enantiomer. A racemic mixture contains equal concentrations of (+) and (−) isomers, hence has an OP of 0 (or 0%).
Here is a sample calculation for specific rotation and optical purity. Specific rotation is:
α = rotation (°) ⁄ (path (dm) × concentration (g mL−1)).
So it's just a matter of plugging in the figures given. Note that path lengths are often quoted in cm, so divide them by 10 to give them in dm. Concentrations are usually quoted in molar, so you may need to multiply by the Mr (RMM) to give you the concentration in g L−1, and then divide by 1000 to give it in g mL−1.
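The unit juggling is where most mistakes happen, so here is a small Python sketch of the calculation. The function names are invented for this example and the polarimeter reading is a made-up figure, not experimental data; the only real value is the Mr of alanine (89.09).

```python
def cm_to_dm(path_cm):
    """Convert a path length from cm to dm."""
    return path_cm / 10.0

def molar_to_g_per_ml(conc_molar, mr):
    """Convert mol per litre to g per mL: multiply by Mr, divide by 1000."""
    return conc_molar * mr / 1000.0

def specific_rotation(rotation_deg, path_dm, conc_g_per_ml):
    """alpha = rotation / (path in dm x concentration in g per mL)."""
    if path_dm <= 0 or conc_g_per_ml <= 0:
        raise ValueError("path length and concentration must be positive")
    return rotation_deg / (path_dm * conc_g_per_ml)

# Hypothetical reading: +2.5 degrees in a 20 cm cuvette at 0.025 g/mL.
alpha = specific_rotation(2.5, cm_to_dm(20.0), 0.025)
print(round(alpha, 3))  # 50.0

# A 0.1 M solution of alanine (Mr 89.09) expressed in g/mL.
print(molar_to_g_per_ml(0.1, 89.09))
```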
To calculate optical purity:
OP = αmixture ⁄ α+
To calculate the proportion of an enantiomer from a specific rotation, we note that OP is also equal to:
OP = proportion of (+) enantiomer − proportion of (−) enantiomer = (+) − (−).
So to calculate the proportion of (+) enantiomer in a sample from its optical purity just takes a little bit of algebra. Say I have a mixture of enantiomers with a specific rotation of +25°. The pure (+) enantiomer has a specific rotation of +50°. What is the proportion of (+) enantiomer in the sample?
First work out the optical purity from the equation for OP above:
- OP = 25 ⁄ 50 = 0.5 (or 50%).
This is equal to the proportion of (+) minus the proportion of (−), which obviously must add up to 1. So:
- (+) + (−) = 1.
- (−) = 1 − (+).
You can substitute this into the equation for OP and rearrange it.
- OP = (+) − (−) = (+) − (1 − (+)) = 2(+) − 1.
- (+) = (OP + 1) ⁄ 2.
So solving for (+).
- (+) = (OP + 1) ⁄ 2 = (0.5 + 1) ⁄ 2 = 1.5 ⁄ 2 = 0.75 = 75%.
So the answer is the proportion of (+) isomer is 75% and that of (−) isomer 25% (since they must add to 1 or 100%). One thing to be careful about: the (+) isomer is the one with a positive rotation, by definition. Beware of trick-questions where you are given the specific rotation of the (−) enantiomer (which is obviously negative): the specific rotation of the (+) isomer is equal but opposite in sign, so you can still do the calculation. Also be very careful about whether the (+) isomer and the (R) or (D) isomer are the same thing, since they only are the same thing about 50% of the time.
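The algebra above reduces to two lines of code. The sketch below reproduces the worked example; the function name `enantiomer_fractions` is an assumption for this illustration, and it expects the specific rotation of the pure (+) isomer, so if you are given the (−) isomer's value, flip its sign first.

```python
def enantiomer_fractions(alpha_mixture, alpha_pure_plus):
    """Return (fraction of (+), fraction of (-)) from specific rotations."""
    if alpha_pure_plus <= 0:
        raise ValueError("the pure (+) isomer has a positive rotation")
    optical_purity = alpha_mixture / alpha_pure_plus
    if not -1.0 <= optical_purity <= 1.0:
        raise ValueError("mixture cannot rotate more than a pure isomer")
    plus = (optical_purity + 1.0) / 2.0
    return plus, 1.0 - plus

# Worked example from the text: mixture +25, pure (+) isomer +50.
print(enantiomer_fractions(25.0, 50.0))  # (0.75, 0.25)

# If only the pure (-) isomer's rotation is quoted (say -50), negate it.
alpha_pure_minus = -50.0
print(enantiomer_fractions(25.0, -alpha_pure_minus))  # (0.75, 0.25)
```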
We will often want to separate the two members of an enantiomeric pair; however, this is easier said than done because their physical properties are identical (bar their interaction with polarised light). Say we want to isolate pure (S)-warfarin from a racemic mixture, because the S isomer is ten times more effective than the R isomer at being a blood thinner, rat poison and stroke treatment. How can we separate the enantiomers?
Well, if there is more than one chiral centre in a compound, any given pair of these optical isomers may be either:
- Enantiomers, or
- Diastereomers
These are relative relationships between isomers, not absolute descriptions. Enantiomers have the opposite configuration around every chiral centre, and are nonsuperimposable mirror images. Chemically, these will only differ in their interactions with other chiral compounds, e.g. (2S, 3S)-threonine and (2R, 3R)-threonine in the diagram below.

(2S, 3S)-threonine (2R, 3S)-threonine
(2S, 3R)-threonine (2R, 3R)-threonine
Diastereomers, on the other hand, have the same R/S configuration at at least one chiral centre (but not at all of them), and they are not mirror images of one another. They have slightly different properties (full stop) from one another, e.g. (2S, 3S)-threonine and (2S, 3R)-threonine.
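Given the R/S descriptors of two stereoisomers of the same compound, the relationship between them can be worked out mechanically. The following is a minimal sketch with a hypothetical function name; it assumes the descriptors are listed in the same carbon order for both isomers, and it does not detect meso forms or other symmetry.

```python
def stereo_relationship(isomer_a, isomer_b):
    """Compare two tuples of 'R'/'S' descriptors, one entry per chiral centre."""
    if len(isomer_a) != len(isomer_b):
        raise ValueError("isomers must have the same number of centres")
    differing = sum(a != b for a, b in zip(isomer_a, isomer_b))
    if differing == 0:
        return "identical"
    if differing == len(isomer_a):
        return "enantiomers"    # opposite at every centre
    return "diastereomers"      # same at some centres, opposite at others

# Threonine, with descriptors listed for carbons 2 and 3.
print(stereo_relationship(("S", "S"), ("R", "R")))  # enantiomers
print(stereo_relationship(("S", "S"), ("S", "R")))  # diastereomers
```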
Enantiomers have identical physical properties unless examined in a chiral environment, i.e.:
- Interactions with polarised light.
- Interactions with other chiral molecules.
- Interactions with chiral macromolecules, like enzymes: all amino acids but glycine are chiral and all are present in only one enantiomeric form in protein (although bacteria have some odd amino acids in their cell walls). An enzyme is a chiral environment because (R)Enzyme(R)substrate is a 'diastereomer' of (R)Enzyme(S)substrate and these complexes will have different properties (e.g. ability to form a product of interest).
However, since the properties of diastereomers differ, they are easily separated. So to separate our warfarin enantiomers, all we need do is convert them to diastereomers.
To do this, we just need to react the racemic enantiomeric mixture of (S)-warfarin and (R)-warfarin with the purified (S) or (R) enantiomer of another chiral compound, e.g. pure (2S)-butanol. This will form a diastereomeric mixture of (S, S)-butyl-warfarin and (S, R)-butyl-warfarin, whose properties will differ sufficiently to allow separation. The plan of action is therefore to add a suitable chiral resolving reagent (single enantiomer), then separate by conventional techniques like chromatography or fractional crystallisation. There are a few chemical considerations: for example, warfarin might not react with butanol, so we may need to use some chemical, such as thionyl chloride (SOCl2), to convert it to something more reactive.
For carboxylic acids, we need to activate with thionyl chloride first:
- Racemic mixture of a chiral carboxylic acid + thionyl chloride → racemic acid chloride mixture
- Racemic acid chloride mixture + single enantiomer of a chiral alcohol/amine → diastereomeric mixture of esters/amides.
Alcohols and amines are usually reactive enough and can be mixed with resolving agent immediately:
- Racemic alcohol/amine + single enantiomer of chiral acid chloride → diastereomeric mixture of esters/amides.
Aldehydes and ketones are also reactive enough:
- Racemic aldehyde/ketone + single enantiomer of chiral alcohol → diastereomeric mixture of acetals.
Besides using a chiral resolving agent, you can also separate enantiomers directly using chiral chromatography: a chromatography column with a chiral stationary phase forms a 'temporary diastereomer' in situ. You may also be wondering where we get our pure (S)-isomer of butanol from in the first place: in fact, many biochemicals are pure enantiomers, by virtue of being formed by chiral enzymes, so we can extract them from bacteria.
Since warfarin is a ketone, we can react it with pure (S)-butanol to form two diastereomeric derivatives, which we can separate by conventional chromatography.
Test yourself
- Draw the following molecule in line-bond notation, and name it.
- How many optical isomers are there of the compound cyclohexane-1,3-diol? What sort of stereoisomeric relationships do they have to one another?
- Which of the following statements are true?
- Diastereomers are superimposable mirror images of each other.
- Diastereomers have different melting points.
- Diastereomers are different from one another at all chiral centres.
- Enantiomers have different binding constants to enzymes.
- Enantiomers are superimposable mirror images of each other.
- A 50:50 mixture of two diastereomers will have no effect on the plane of polarised light.
- Is this the R or S isomer of erythrulose? Name this sugar with an IUPAC systematic name.
- The pure (R) isomer of alanine has a specific rotation of −14.2 (please note the minus!). A mixture of isomers rotates the plane of polarised light by 0.0669° at a concentration of 3.00 g 100 mL−1 using a 15 cm (note cm!) pathlength cuvette. What is the specific rotation of this mixture? What is the ratio of (R) isomer to (S) isomer in the mixture (careful about +/− R/S)?
- Why would a racemic mixture of glucose exposed to bacteria rotate the plane of polarised light to the left?
- How could one separate a racemic mixture of the two enantiomers of alanine?








