Degenerate PCR is in most respects identical to ordinary PCR, but with
one major difference. Instead of using specific PCR primers with a given sequence,
you use mixed PCR primers. That is, if you do not know the exact sequence
of the gene you are going to amplify, you insert "wobbles" in the PCR primers
at positions where there is more than one possibility. For instance, if you just have a
protein motif, you can back-translate the protein motif to the corresponding
nucleotide motif (protein --> sequence).
Example of a degenerate PCR primer designed after a protein motif:
Trp Asp Thr Ala Gly Gln Glu
5' TGG GAY ACN GCN GGN CAR GA 3' (This gives a mix of 256 different oligonucleotides.)
where Y = C or T,
R = G or A,
N = G, A, T or C.
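The back-translation shown above can be sketched in a few lines of Python. This is only an illustration: the codon table and the function name `back_translate` are assumptions made for this example, and for the six-codon amino acids (Leu, Arg, Ser) a single degenerate codon also matches a few codons of other amino acids. The `trim_3prime_wobble` option reproduces the example primer, which stops at GA so that the 3' end is not a wobble.

```python
# One degenerate IUPAC codon per amino acid (standard genetic code).
# Assumption for this sketch: Leu (YTN), Arg (MGN) and Ser (WSN) are
# written as a single codon, which also matches some codons of other
# amino acids (e.g. YTN also matches Phe TTY).
DEGENERATE_CODON = {
    "A": "GCN", "C": "TGY", "D": "GAY", "E": "GAR", "F": "TTY",
    "G": "GGN", "H": "CAY", "I": "ATH", "K": "AAR", "L": "YTN",
    "M": "ATG", "N": "AAY", "P": "CCN", "Q": "CAR", "R": "MGN",
    "S": "WSN", "T": "ACN", "V": "GTN", "W": "TGG", "Y": "TAY",
}


def back_translate(motif, trim_3prime_wobble=True):
    """Back-translate a one-letter protein motif into a degenerate primer."""
    primer = "".join(DEGENERATE_CODON[aa] for aa in motif.upper())
    if trim_3prime_wobble:
        # Drop degenerate symbols at the 3' end so the primer ends on a
        # defined base (anything other than A, C, G, T is a wobble).
        primer = primer.rstrip("RYSWKMBDHVN")
    return primer


# Trp Asp Thr Ala Gly Gln Glu
print(back_translate("WDTAGQE"))  # TGGGAYACNGCNGGNCARGA
```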
The more wobbles you introduce in the PCR primer the more degenerate it gets.
(The degeneracy of the primer is produced during DNA synthesis. You
do not need to order 256 different primers to get a 256-fold mix, which would
mean a lot of paperwork and expense!)
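If you want to check the degeneracy of a primer yourself, multiply the number of possible bases at each position. The sketch below does this with the standard IUPAC ambiguity codes and can also list every oligonucleotide in the mix. The function names `degeneracy` and `expand` are made up for this example.

```python
from itertools import product
from math import prod

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def _clean(primer):
    """Remove the spaces between codons and normalise to upper case."""
    return primer.replace(" ", "").upper()


def degeneracy(primer):
    """Number of different oligonucleotides in the primer mix."""
    return prod(len(IUPAC[symbol]) for symbol in _clean(primer))


def expand(primer):
    """Yield every specific sequence contained in the degenerate primer."""
    for bases in product(*(IUPAC[symbol] for symbol in _clean(primer))):
        yield "".join(bases)


primer = "TGG GAY ACN GCN GGN CAR GA"
print(degeneracy(primer))  # 256
```

Listing the whole mix with `expand` is only practical for modest degeneracies. For very degenerate primers, just count.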
Why use degenerate PCR?
Degenerate PCR has proven to be a very powerful tool to find "new"
genes or gene families. Most genes come in families which share structural
similarities. By aligning the sequences from a number of related
proteins you can determine which parts are conserved and which are
variable. Based on this information you can use conserved protein motifs
as starting points for designing degenerate PCR primers.
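As a rough sketch of how conserved blocks can be picked out of an alignment, the code below scans aligned protein sequences for runs of identical, gap-free columns. The three sequences are invented toy data built around the example motif and do not come from any real protein family. The function name `conserved_blocks` and the `min_len` cut-off are likewise assumptions for illustration.

```python
def conserved_blocks(aligned, min_len=6):
    """Return (start, motif) for each run of fully conserved columns.

    `aligned` is a list of equal-length aligned sequences with "-" for
    gaps. A column counts as conserved only if every sequence has the
    same residue there and none has a gap.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")

    blocks = []
    run_start = None
    # Go one position past the end so a run reaching the last column is closed.
    for i in range(length + 1):
        conserved = (
            i < length
            and aligned[0][i] != "-"
            and all(seq[i] == aligned[0][i] for seq in aligned)
        )
        if conserved and run_start is None:
            run_start = i
        elif not conserved and run_start is not None:
            if i - run_start >= min_len:
                blocks.append((run_start, aligned[0][run_start:i]))
            run_start = None
    return blocks


# Toy alignment (hypothetical sequences, for illustration only).
toy_alignment = [
    "MKLVWDTAGQERFSA",
    "MRIVWDTAGQEKYTS",
    "MQ-LWDTAGQEHFNA",
]
print(conserved_blocks(toy_alignment))  # [(4, 'WDTAGQE')]
```

A strict identity test like this misses partially conserved motifs, which can still be usable for primer design.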
Degenerate PCR applies to a number of scientific settings:
- You have isolated a protein and managed to sequence
some amino acids from it. You want to find the corresponding gene!
Why not try degenerate PCR?
- You have found a human gene and want to clone the homologous gene
from e.g. mouse or Drosophila. Of course, you can try
low stringency hybridizations, but how many false positives do you
have to sequence before you find the correct one?
- You have found an interesting gene in yeast or C. elegans and want
to find the human homolog (if it exists). Why not
try degenerate PCR?
- Phylogenetic and evolutionary studies of genes: e.g. you can find
specific orthologous genes from a number of related species and compare them.
This type of information can reveal potential active sites,
regulatory regions and much more.
- Studies of gene families, e.g. "How many members of the Rab family
exist in green algae?" or "Do they differ from those of the higher plants?".
These are just a few examples of possible applications of degenerate PCR.
Requirements: What kind of sequence information do you need to get started?
Two blocks of conserved amino acids/DNA sequence. The length of the primers should
be a minimum of 20 bp.
The protein motif does not have to be 100%
conserved. Sometimes a partially conserved protein motif is sufficient. Examples
of commonly found substitutions are Glu --> Asp and Arg --> Lys.
If you use the degenerate codon GAN, it covers both Glu and Asp. Similarly,
if you use the MGN codon (M = C or A) where you know there should be a basic
amino acid (Arg or Lys), the MGN codon partially covers the Lys codon AAR.
However, if there is a Lys residue you will have a G/T mismatch at the
second base. This is normally no problem as long as this mismatch occurs
in the middle or the 5' part of the primer. (Remember your biochemistry:
the enol form of thymine can pair with guanine.)
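To see what a degenerate codon actually covers, you can expand it and translate each possibility with the standard genetic code. The sketch below does this for GAN and MGN and counts the mismatches against the Lys codons. The helper names are invented for this example. Note that, under the standard code, MGN also matches the Ser codons AGC and AGT.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def encoded_amino_acids(degenerate_codon):
    """Set of amino acids (one-letter code) matched by a degenerate codon."""
    options = (IUPAC[symbol] for symbol in degenerate_codon.upper())
    return {CODON_TABLE["".join(codon)] for codon in product(*options)}


def mismatches(degenerate_codon, target_codon):
    """Positions where the target base is not allowed by the degenerate codon."""
    return sum(
        base not in IUPAC[symbol]
        for symbol, base in zip(degenerate_codon.upper(), target_codon.upper())
    )


print(sorted(encoded_amino_acids("GAN")))  # ['D', 'E']
print(sorted(encoded_amino_acids("MGN")))  # ['R', 'S']
print(mismatches("MGN", "AAA"), mismatches("MGN", "AAG"))  # 1 1
```

The single mismatch against AAA and AAG is the second-base mismatch described above.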
The N-terminal part of a protein (obtained from protein sequencing) often gives enough
sequence information to be used for degenerate PCR.
If the N-terminal sequence is 20-30 amino
acids long, it is often possible to make two degenerate primers, and you can
amplify a 50-90 bp cDNA fragment which you can use as a probe to screen a cDNA
library. Alternatively, you can make two degenerate primers and try a 3' RACE
to amplify the rest of the cDNA.
Hint: The easiest way is normally to amplify a fragment from the N-terminal region,
sequence this fragment and then make specific primers for 3' RACE.
cDNA or genomic DNA?
In general, cDNA works best as a template because of its lower complexity
(in eukaryotes a small percentage of the genomic DNA encodes proteins).
Also, the size of the PCR fragment is "predictable", because there are no introns.
If you are uncertain as to whether your gene is expressed in the tissue/developmental stage
you have chosen, genomic DNA can be used as a starting material. The drawback is that you
often have to sift through a lot of junk DNA before you find your gene.
How degenerate can PCR primers be and still function?
1,000- to 10,000-fold degeneracy is not uncommon.
The degeneracy of the primers can be kept down by substituting four-base wobbles with inosines,
e.g. GGI instead of GGN.
Example motif: CVGG(M/L)NRRP (found in p53 proteins).
Without inosines: 131072 mix: 5' TGY GTN GGN GGN MTN AAY MGN MGN CC 3'
With inosines: 512 mix: 5' TGY GTI GGI GGI MTI AAY MGN MGN CC 3'
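The two numbers above can be reproduced with a short calculation. In this sketch inosine (I) is counted as a single species, since one base is incorporated at that position during synthesis. The function names and the `protect_3prime` parameter are assumptions for this example: the parameter simply leaves the last bases untouched, which reproduces the primer above, where the two MGN codons near the 3' end keep their N.

```python
from math import prod

# Number of alternatives per IUPAC symbol; inosine (I) counts as one.
WOBBLE_SIZE = {
    "A": 1, "C": 1, "G": 1, "T": 1, "I": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3, "N": 4,
}


def _clean(primer):
    """Remove the spaces between codons and normalise to upper case."""
    return primer.replace(" ", "").upper()


def degeneracy(primer):
    """Number of different oligonucleotides in the primer mix."""
    return prod(WOBBLE_SIZE[symbol] for symbol in _clean(primer))


def substitute_inosine(primer, protect_3prime=0):
    """Replace N with I, leaving the last `protect_3prime` bases unchanged."""
    seq = _clean(primer)
    cut = len(seq) - protect_3prime
    return seq[:cut].replace("N", "I") + seq[cut:]


without_inosine = "TGY GTN GGN GGN MTN AAY MGN MGN CC"
with_inosine = substitute_inosine(without_inosine, protect_3prime=8)

print(degeneracy(without_inosine))  # 131072
print(with_inosine)                 # TGYGTIGGIGGIMTIAAYMGNMGNCC
print(degeneracy(with_inosine))     # 512
```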
Choosing PCR conditions
- Try standard conditions with a slightly lower annealing temperature, 35-50 cycles.
If standard conditions fail, run the first four cycles at 5-10 °C lower than
recommended, i.e. 42-46 °C. (PCR primers with multiple mismatches will be extended,
and hopefully some stick to your gene.)
- If the primers are very degenerate (512-fold mixes or more), competitive inhibition can lead to
problems. (Primers bind the correct template but are not extended by the polymerase
because of unstable 3' ends.) This means that the first PCR cycles are very inefficient,
and you sometimes have to run 50 cycles to see even a faint band of your gene.
- Remember not to use a DNA polymerase with 3' --> 5' exonuclease activity (the PCR primers will
be degraded). Taq polymerase works OK.
What types of genes are "easy" to find by degenerate PCR?
Many proteins have structural similarities with other proteins and often share a common
Proteins with ancient conserved motifs (ACMs) are in general "easy"
to find. More than 500 families of proteins with ACMs are known! (Some of
these families are huge: Ser/Thr/Tyr kinases in humans number around 1000
genes.) As of 2002, the complete sequences of 8 eukaryotic genomes are known: human,
Drosophila melanogaster (fruit fly), Anopheles gambiae (the mosquito), C. elegans (nematode),
S. cerevisiae (baker's yeast), Schizosaccharomyces pombe (fission yeast),
Arabidopsis thaliana (plant), Plasmodium falciparum (protist)
and pretty soon
In addition, tens of bacterial genomes are completed. These genomes provide a wealth of information
regarding the evolution of various gene families and can be used as a starting point to find genes
in even more obscure organisms. Start by making a protein alignment of your protein of interest.
Include as many proteins as you can find. If the protein is not well conserved, try to find regions
that have some conserved amino acids, and if you know the sequence from a closely related organism,
use this as a guiding sequence. Sometimes you can gamble on the sequence and get lucky.
By using degenerate PCR you can find most genes from yeast and animals
irrespective of organism (cow, frog, snail, beetle, worm or fungi). Problems
may arise if you try to catch the fast evolving genes. If not, you are pretty
sure to find what you are looking for by using degenerate PCR. The case may
be a bit harder if you look for genes in protists, such as the cryptomonads,
where many genes have undergone massive genetic drift and have changed a
lot compared to other eukaryotes. Apart from that, the limitations are in general:
- The conserved amino acids you design the PCR primers from are composed mainly of
Ser, Arg and Leu. These are the amino acids that give the most wobbles, and this might result in
primers so degenerate that they amplify virtually anything, normally just a lot of junk.
- There is a limit to the size of the region you can amplify. As a rule of thumb, avoid
PCR products larger than 1000 bp.
- If the DNA of the organism you try to amplify the fragment from has a very high GC content,
you might end up amplifying a lot of incorrect fragments. This is also the case if the DNA
of the organism has a very low GC content and you have designed your primers too short
(a high AT content gives a low melting point for your primers).
- If the gene you are looking for does not exist in the organism you have chosen, you are out of luck.
There are a few examples of "dinosaur genes", genes which have disappeared in certain lineages
during evolution (for instance Rac genes in S. cerevisiae).
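To get a feeling for how base composition affects the melting point of a degenerate primer, you can estimate the lowest and highest melting temperature in the mix. The sketch below uses the Wallace rule of thumb (2 °C per A or T, 4 °C per G or C), which is not part of the text above and is only a rough estimate for short oligonucleotides. The function name `wallace_tm_range` is invented for this example.

```python
# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def wallace_tm_range(primer):
    """Return (lowest, highest) Wallace-rule Tm in deg C across the mix.

    Assumption: Tm = 2 * (A + T) + 4 * (G + C), a rule of thumb for
    short oligonucleotides. At each wobble position the most AT-rich
    choice gives the lower bound and the most GC-rich the upper bound.
    """
    seq = primer.replace(" ", "").upper()
    low = high = 0
    for symbol in seq:
        contributions = [4 if base in "GC" else 2 for base in IUPAC[symbol]]
        low += min(contributions)
        high += max(contributions)
    return low, high


print(wallace_tm_range("TGG GAY ACN GCN GGN CAR GA"))  # (60, 70)
```

A wide gap between the two values means that the members of the mix anneal under quite different conditions.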