Proteins are amino acid polymers with myriad roles in creating and maintaining cell shape and structure, recognising molecules within cells or at the cell membrane and, critically, catalysing biochemical reactions. More than half the dry weight of every cell is protein, with each individual protein composed of specific amino acids in a sequence and structural arrangement that confers properties vital for correct function.
Amino acids are small organic molecules characterised by an amine group (NH2), carboxylic acid group (COOH) and a specific side-group or 'residue' joined to a central or 'alpha' carbon atom. All living organisms share 20 common amino acids, each encoded within messenger RNA (mRNA) sequences (after transcription from DNA) by nucleotide base triplets called codons. A few organisms use an extra amino acid, selenocysteine, and some archaea use pyrrolysine. mRNA sequences are read or 'translated' within ribosomes, RNA-protein complexes that bind to mRNA and move along it, recruiting transfer RNA (tRNA) molecules complementary to each codon, each tRNA bound to a specific amino acid. A polypeptide chain forms at the ribosome as each new amino acid is linked to the previous one before the corresponding tRNA is released and the ribosome moves to the next stretch of mRNA. Amino acids are linked by peptide bonds through condensation between amine and carboxylic acid groups, with resulting polypeptide sequences described from the amine or 'N' terminal on the left to the carboxylic acid or 'C' terminal on the right. Whereas the building blocks of DNA and RNA are chemically very similar, each amino acid is chemically distinct, which helps to explain why proteins have evolved (from a primordial "RNA world" with few polypeptides) to become the main catalysts of cellular reactions.
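The codon-by-codon reading of mRNA described above can be illustrated with a short sketch. It is a simplification under stated assumptions: it uses the standard genetic code only, starts at the first AUG it finds, stops at the first stop codon, and ignores the special recoding that inserts selenocysteine or pyrrolysine. The names CODON_TABLE and translate are illustrative, not taken from any particular library.

```python
# Minimal sketch of translation using the standard genetic code.
# Assumptions: input is an mRNA string over A, C, G, U; reading starts
# at the first AUG and ends at the first stop codon (or end of sequence).

BASES = "UCAG"
# One-letter amino acid codes for all 64 codons in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = dict(
    zip(
        (a + b + c for a in BASES for b in BASES for c in BASES),
        AMINO_ACIDS,
    )
)


def translate(mrna: str) -> str:
    """Return the polypeptide (N terminal first) encoded by an mRNA string."""
    mrna = mrna.upper()
    start = mrna.find("AUG")  # start codon, which also encodes methionine (M)
    if start == -1:
        return ""  # no start codon, so nothing is translated
    peptide = []
    # Step along the message one triplet at a time, as the ribosome does.
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon: release the chain
            break
        peptide.append(amino_acid)
    return "".join(peptide)


print(translate("GGAUGGCUUGGUAAGG"))  # codons AUG GCU UGG UAA give "MAW"
```

The returned string is written from the N terminal to the C terminal, matching the convention for polypeptide sequences. Characters other than A, C, G and U would raise a KeyError in this sketch.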
The basic subunits (or monomers) of protein structure are termed 'domains' and consist of several distinct structural motifs (e.g. alpha-helix or beta-pleated sheet) combined in a single, highly folded globule. Amino acids have non-polar, acidic, basic or uncharged polar side-groups. Those with non-polar side-groups are hydrophobic and so orient themselves away from water, on the inside of the protein globule, while more polar, hydrophilic ones array themselves towards the external, aqueous environment. At binding sites, specific external residues are arranged to allow an exact fit with another molecule, termed a ligand. When catalytic proteins, or enzymes, subsequently modify the bound ligand (or substrate) they accelerate reactions, sometimes to astonishing rates. They owe this capacity to the close-fitting structure of the binding site and to their use of energy released at unstable intermediate stages of the reaction pathway. Protein subunits may assemble to form sheets, tubes, rings or helices, and functionally related proteins may also bind together to form 'multi-enzyme complexes', in which the binding sites for the successive substrates of a chain of reactions are all located in close proximity, so concentrating the relevant substrates locally and promoting optimal reaction rates.
Chemically active proteins such as enzymes or respiratory proteins provide many examples of convergence. Among enzymes, identical active site structure may evolve, such as the serine-histidine-aspartate 'triad' in trypsin and subtilisin, or different routes may be taken by structurally distinct proteins to act on an identical substrate (e.g. β-lactamases can function via serine protease or zinc 'metalloprotease' groups). Carbonic anhydrase (CA) converts CO2 + H2O → HCO3- (bicarbonate) + H+, a reversible reaction critical for processes as diverse as photosynthesis, respiration, biomineralisation and kidney function. Not surprisingly, therefore, CA enzymes are ubiquitous, and yet they appear to have evolved at least five (or six) times independently, and various CA families are known in animals, plants, bacteria and certain algae (e.g. diatoms, haptophytes). Peroxidases are typical anti-microbial enzymes, creating toxic oxidising conditions via generation of highly reactive hydrogen peroxide (H2O2) or organic hydroperoxides. An array of peroxidases is known to have evolved in animals, plants, fungi and bacteria, variously depending on heme (iron-based) cofactors, active cysteine or selenocysteine to function. Luciferases have evolved at least 30 times to power bioluminescence, being recruited each time from pre-existing enzymes such as oxidases and synthetases. Astonishingly, the copper-based respiratory protein haemocyanin evolved independently in arthropods and molluscs, while iron-based β-haemoglobins show convergent gene duplication patterns in birds and mammals (as well as between monotreme and therian mammals). Lipocalin proteins have evolved independently in many animals, plants and Gram-negative bacteria, and in many cases act to bind and transport hydrophobic molecules such as lipids and sterols. Lipocalins have been found in the milk secretions of placental mammals (e.g. ruminants) and marsupials (e.g. kangaroos, wallabies and possums) as well as insects such as the viviparous cockroach Diploptera punctata. Lipocalins are secreted in insects and some mammals (e.g. hamsters and mice) to carry critical pheromones, and have been observed to elicit allergenic activity in mammals and cockroaches.
A range of proteins with key structural or anatomical roles demonstrate convergence clearly. Eye lenses are transparent due to the properties of crystallin proteins, independently recruited from microbes many times in animal evolution and co-opted from roles in stress resistance (e.g. heat shock proteins). Intriguingly, genes for crystallins made of very different proteins are driven by almost identical promoter regions in the genomes of scallops and vertebrates! Collagen is an essential structural protein in the ligaments and skin of animals, composed of three intertwined helices with frequent triplet-repeats of glycine-proline/hydroxyproline and a third residue. Surprisingly, a collagen-like protein with proline-threonine-glycine triplets occurs in Bacillus anthracis (anthrax) spores, and a collagen gene was acquired by lateral gene transfer in the cyanobacterium Trichodesmium erythraeum, where its protein assists in cellular adhesion during blooms. Organised layers of high and lower refractive index collagen fibrils evolved independently at least 50 times in birds, resulting in structural colouration of feathers by 'thin film reflection' of incident light. Although fully reflective thin-film interfaces such as cats' eyes are typically built of insoluble guanine, collagen-based thin-film reflection is found in the deep-sea squid Vampyroteuthis, whereas in the reverse-eye of the squid Euprymna, convergence on reflective properties is based on entirely different proteins called reflectins. Proteins that form the celebrated bacterial flagellar motor are now well understood, in terms of the genes that code for them and their co-option to novel roles in the motor complex. Significantly, the flagellar motor has evolved more than once and is convergent in the eubacteria and archaea.
Proteins with elastic properties show convergences at various levels, from elastins in the vertebrate and cephalopod aorta to the shared characteristics of resilin, abductin, elastin and even gluten from plant seeds. Bivalves such as the well-known mussels have attachment structures with both elastic and more rigid silk fibroin-like regions, and the 'pen shell' Pinna has a byssus of 'sea-silk' threads, reminiscent of arthropod silk. Silk evolved for various functions at least three times in arachnids (spiders, spider mites and some pseudoscorpions) and many more times in the insects, especially at larval stages. To name but a few examples: spiders, spider mites and caddis fly larvae build webs; silk nests are built by weaver ants, leafhoppers, 'webspinners' and pseudoscorpions; the silkworm Bombyx makes silk cocoons; a few (brave) spiders and hilarinid flies offer silken 'nuptial gifts'; and spiders, spider mites and moth larvae use silk lines for 'ballooning' into the air.