I read a couple of articles a couple of years ago explaining endogenous retroviruses as evidence for evolution - it seemed quite convincing to me.

I have just read Coyne's and Dawkins' books on the evidence for evolution, and the ERV line of evidence does not appear in either of them. Is this because there is a problem with it? Are ERVs really as convincing as evidence for evolution/common descent as they seemed to me at the time?

No rush for an answer but I would just like to know whether I am barking up the wrong tree when using this as evidence to my creationist relatives.

No, you aren't barking up the wrong tree (phylogenetically or otherwise). In fact, don't limit yourself to just ERVs. ERVs are only one of several types of transposable elements in genomes that provide evidence for evolutionary relationships. I hope you don't mind if I include a little background for those who might not have read as much as you.

Transposable elements are DNA sequences that can make copies of themselves in a genome. They come in two types (Class I and Class II). Class II elements aren't important for this discussion. Class I elements are also known as retrotransposons. This means that they make an RNA copy of themselves and then that RNA copy is converted back into DNA and placed back into the genome in some random spot. The end result is that the original copy of the element is still where you left it and a new copy is added somewhere else in the genome. This is why nearly half of our genome is made up of retrotransposons: every time they mobilize, they add more DNA sequence to the genome. There are several subtypes of Class I elements. There are the ERVs you've already mentioned. There are also LINEs (long interspersed elements) and SINEs (short interspersed elements). In primate genomes the most common element by far is a SINE known as Alu. Alu elements make up about 10% of the human genome. Each Alu is up to ~300 bp long. If you do the math, you can see that this means there are around 1 million copies in the 3 billion bp human genome (10% of 3 billion bp = 300 million bp of Alu sequence; 300 million bp / 300 bp per Alu = 1 million copies).
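
For anyone who wants to check the Alu arithmetic, here is a minimal Python sketch. It uses only the rounded figures quoted above (a 3 billion bp genome, about 10% Alu, about 300 bp per copy); the function and variable names are just for illustration.

```python
# Rounded figures from the post; real values vary by genome assembly.
GENOME_SIZE_BP = 3_000_000_000  # approximate human genome size
ALU_PERCENT = 10                # approximate share of the genome that is Alu
ALU_LENGTH_BP = 300             # approximate length of a full-length Alu


def estimate_copy_number(genome_bp, percent, element_bp):
    """Estimate copies as (bp occupied by the element) / (bp per copy)."""
    occupied_bp = genome_bp * percent // 100
    return occupied_bp // element_bp


alu_copies = estimate_copy_number(GENOME_SIZE_BP, ALU_PERCENT, ALU_LENGTH_BP)
print(f"Estimated Alu copies: {alu_copies:,}")
```

Since many Alu copies are shorter than 300 bp, this is a rough order-of-magnitude estimate and not a count.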

So, let's examine why these are such powerful evidence of common descent. As I mentioned, when retrotransposons mobilize they deposit a copy at a random location in the genome. This means that when an Alu is looking for a place to put a new copy, it has ~3 billion places to choose from (between any two bp in the genome). The chance of that same location also being the home of an independently inserted Alu in a chimp or gorilla or a monkey or any other primate is essentially 1 out of 3 billion, which is pretty small. That's not to say it can't happen, but it is very, very unlikely.
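
To get a feel for how quickly coincidence becomes implausible, here is a rough Python sketch. It assumes every one of the ~3 billion sites is equally likely and that insertions are independent, which is a simplification. The function name is made up for illustration.

```python
import math

GENOME_SITES = 3_000_000_000  # ~3 billion possible insertion points (rounded)


def log10_chance_of_coincidence(shared_loci, sites=GENOME_SITES):
    """log10 of the chance that `shared_loci` matching insertions all arose
    independently, with every site treated as equally likely (a simplification)."""
    return -shared_loci * math.log10(sites)


for k in (1, 2, 10):
    print(f"{k} shared loci: roughly 10^{log10_chance_of_coincidence(k):.1f}")
```

Working in log10 keeps the numbers readable, because the raw probabilities shrink toward zero very fast as the number of shared loci grows.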

What's more likely is that if you find an Alu at exactly the same location in two or more genomes, it got there as a single event and that occurred in the genome of the common ancestor of the organisms.  In other words, if humans and chimpanzees share a common ancestor, there should be some Alu elements that are shared by them.  Those elements should be in the exact same locations in both genomes.  They inserted there and they've just been carried along for the ride as the genomes have diverged into 'human' and 'chimp'.  At the same time, there should be some elements that are in unique locations in both genomes.  These are the elements that mobilized after humans and chimpanzees diverged from that common ancestor.  There are analyses you can do to determine the ages of these elements.  And, as you'd expect, the ones that are shared by humans and chimps are older than the ones that are unique to human or unique to chimp. 
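
The shared-versus-unique idea can be sketched with plain Python sets. The loci below are made up for illustration, and the sketch assumes the insertion positions from both genomes have already been mapped onto a common coordinate system. Real comparisons need genome alignments.

```python
# Hypothetical Alu insertion loci as (chromosome, position) pairs.
# These coordinates are invented for illustration only.
human_alus = {("chr1", 1000), ("chr2", 5000), ("chr3", 42)}
chimp_alus = {("chr1", 1000), ("chr2", 5000), ("chr7", 900)}

# Present at the same locus in both: candidates for insertion in the common ancestor.
shared = human_alus & chimp_alus

# Present in only one genome: candidates for insertion after the lineages diverged.
human_only = human_alus - chimp_alus
chimp_only = chimp_alus - human_alus

print("shared:", sorted(shared))
print("human only:", sorted(human_only))
print("chimp only:", sorted(chimp_only))
```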

I and others have examined just these sorts of questions during our research.  You can find copies of some of these papers at my website (http://www.crocoduck.bch.msstate.edu/Pubs.htm).  Just look for citations 6, 11, 14, 21, 23, 27, and 30.  These particular papers deal only with primates for the most part.  Others have done the same using different elements including ERVs, LINEs and SINEs for whales, fish, rodents, mammals in general and several other organisms. Norihiro Okada has directed many of these studies and you can find examples of his papers at his website (http://www.evolution.bio.titech.ac.jp/r … 010_e.html)

Regarding why Coyne and Dawkins left out this kind of data, I can't be sure. I loved both books but was also disappointed not to find it there. Both of these authors were taking very broad looks at the evidence, ranging from biogeography to fossils to developmental data. It could just be that they had to take space into consideration and had to cut something. There are several books that do include this information, in particular Sean Carroll's The Making of the Fittest. This book focuses on DNA-based evidence for common descent and includes one of the figures from citation 6 above.

So, the short answer comes down to this:
No, you aren't barking up the wrong tree.  All types of retrotransposons including ERVs are solid evidence for evolution.

Good luck with the creationist relatives.  I have some myself.

Last edited by David Ray (22nd Apr 2010 17:40:54)

I got this in a private e-mail:

I noted this answer you gave at:


“This means that they make an RNA copy of themselves
and then that RNA copy is converted back into DNA and placed back into the
genome in some random spot.”

“As I mentioned, when retrotransposons
mobilize they deposit a copy at a random location
in the genome.”

My question is - how do you know the placement is “random”?

Here was my response for anyone who's interested:

I'm glad you asked.  This gives me an excuse to edit the post.  I had
considered including what I'm about to write in my original answer. 
However, based on what I've seen on this site my answer was already
overlong, so I didn't.

The truth is that it isn't absolutely random.  For ERVs it's very close
to random because their preferred target site is very short (4-5 bp)
and the sequence depends on the element family.  Alu elements and LINEs
have a loosely defined target site of TTAAAA.  I say loosely defined
because, while TTAAAA is the preferred site, it is entirely possible
for an element to integrate into a site that is TTGGAA, or TCAATA, or
ATAAAA, or CTAAAG.  It just needs to be close to the consensus sequence
of TTAAAA.  As a result of this loose target site preference, they tend
to integrate into AT-rich regions.  This tendency does reduce the
actual number of potential insertion sites but only slightly.  The
result is that, while it isn't truly random, it is close enough to
random for all practical purposes.  In other words, it may not be a one
in 3 billion chance of two elements integrating into a given site in
two different genomes.  Instead it may be a one in 2.5 billion chance.
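
As a rough illustration of what "close to the consensus" means, here is a Python
sketch that counts mismatches against TTAAAA. The cutoff of two mismatches is
my assumption for the example (it happens to admit all four variant sites
listed above); it is not a measured property of the integration machinery.

```python
CONSENSUS = "TTAAAA"  # preferred Alu/LINE target site from the post


def mismatches(site, consensus=CONSENSUS):
    """Count positions where `site` differs from the consensus sequence."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    return sum(a != b for a, b in zip(site, consensus))


def candidate_sites(sequence, max_mismatches=2, consensus=CONSENSUS):
    """Return start positions of windows within `max_mismatches` of the consensus.

    The default threshold of 2 is an assumed cutoff for illustration.
    """
    k = len(consensus)
    return [
        i
        for i in range(len(sequence) - k + 1)
        if mismatches(sequence[i:i + k], consensus) <= max_mismatches
    ]


for site in ("TTAAAA", "TTGGAA", "TCAATA", "ATAAAA", "CTAAAG"):
    print(site, mismatches(site))
```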

Now, on to the real question.  How do we know it's essentially random? 
We do this by examining the way elements have accumulated in various
genomes over time.  If you look at the locations of the elements that
have integrated relatively recently, within the last 2-3 million years,
you can actually take a look at all of the potential sites that are
available and at the sites where the recent elements have inserted and
there is no particular pattern to it when you compare genomes.  Yes,
there is a slight preference for AT-rich regions of the genome but
apart from that, no pattern is observable with regard to insertion site
preference. The data looks like we would expect it to look if there
were a set of sites available and the element just inserted into the
first one it happens to find and latch onto. 

You might notice that I said that we know this from looking at the
newer elements.  There's a reason for looking at the newer elements. 
If you look at older elements, you will see that they have had a
tendency to accumulate in GC-rich regions of the genome.  In other
words, the older elements tend to be found in GC-rich regions and the
newer elements tend to be found in the AT-rich regions.  The reason for
this difference appears to be a result of the only process available to
genomes to get rid of these elements.  There is no specific mechanism
to remove them exactly (just the element and no sequence on either
side).  The only way to get rid of them is when large deletions occur,
usually through a process called non-homologous recombination.  This
process of deletion is tolerated much better by genomes when it occurs
in regions that are gene poor (not well populated with coding
sequences).  You can imagine why.  If such a deletion occurs in a
region with few genes, it's unlikely to cause a problem.  This deletion
process is not tolerated well when it occurs in gene rich regions. 
This is because you are more likely to cause functional damage in these
parts of the genome.  Now, I bet you can guess what parts of the genome
are more gene rich.  If you guessed the GC-rich regions, you're right. 
That's why transposable elements tend to accumulate in these
regions.  They don't preferentially insert there, they just tend to stay
there because it's more dangerous for the genome to remove them from
GC-rich regions because it is more likely to result in functional
damage.
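
The retention argument above can be sketched as a toy model in Python. Every
number in it is hypothetical: I assume that 40% of new insertions land in
GC-rich regions and that a copy is lost by deletion more often per time step in
AT-rich (gene-poor) regions than in GC-rich (gene-rich) ones. It illustrates
the logic only and is not fitted to real data.

```python
def surviving_fraction_in_gc(age_steps, insert_gc=0.4, loss_at=0.05, loss_gc=0.01):
    """Fraction of surviving elements of a given age that sit in GC-rich regions.

    insert_gc: assumed share of new insertions landing in GC-rich regions.
    loss_at / loss_gc: assumed per-step chance that a copy is removed by a
    tolerated deletion in AT-rich / GC-rich regions (hypothetical values).
    """
    surviving_gc = insert_gc * (1 - loss_gc) ** age_steps
    surviving_at = (1 - insert_gc) * (1 - loss_at) ** age_steps
    return surviving_gc / (surviving_gc + surviving_at)


for age in (0, 10, 50, 100):
    print(f"age {age:>3}: {surviving_fraction_in_gc(age):.2f} in GC-rich regions")
```

Under these assumptions the youngest elements are mostly in AT-rich regions
and the oldest survivors are mostly in GC-rich regions, even though the
insertion preference never changes.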

Thanks for asking.  I'm new to this site and need to find the happy
medium between simplifying the concepts enough for everyone to
understand and being so simplistic that I leave out important