Simple Sequence Repeats
A substantial fraction of many eukaryote genomes consists of simple sequence repeats (SSRs), tandemly repeated arrays of short sequences, from two to several hundred base pairs long (Box 13.3; Fig. 21.19). (Note that sequenced genomes are deliberately chosen to be small, and highly repeated regions may be omitted; organisms with larger genomes contain much larger fractions of SSRs.) Large tandem arrays were originally termed satellite DNA, because they were detected as “satellite” bands with a distinct GC:AT content and hence a distinct density. Short arrays are known as microsatellites or minisatellites; their variability in length makes them useful genetic markers (Box 13.3). These relatively short tandem arrays may be interspersed among coding sequences, whereas longer arrays are condensed into heterochromatin.
Simple sequence repeats can change length in several ways (Fig. 12.18). Strands may slip during replication, leading to duplication or deletion; unequal crossing over produces two recombinant products, one shorter and one longer than the original; and large arrays may produce extrachromosomal circles of DNA that replicate and then reinsert. All three processes lead to random increases and decreases in length. Because these changes are random, any array must eventually go extinct, which happens when it drifts down to a single copy. However, there is some chance that it will drift to very high copy number during its life (Fig. WN21.3). This process is essentially the same as random drift in allele frequency (Chapter 15), except that there is no definite upper limit to copy number.
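The drift in copy number described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that every event (slippage, unequal crossing over, or reinsertion) adds or removes exactly one repeat with equal probability; real events can change length by several repeats at once. The function name `simulate_array` and its parameters are hypothetical and are not taken from the text.

```python
import random


def simulate_array(start_copies=10, max_events=100_000, seed=None):
    """Follow one tandem array as its copy number drifts up and down.

    Simplifying assumption: each event changes the array by exactly one
    repeat, up or down with equal probability. The array counts as
    extinct once it has drifted down to a single copy.
    """
    rng = random.Random(seed)
    copies = start_copies
    peak = copies  # highest copy number reached during the array's life
    for event in range(1, max_events + 1):
        copies += rng.choice((-1, 1))
        peak = max(peak, copies)
        if copies <= 1:
            return {"events": event, "peak": peak, "extinct": True}
    # The array survived the whole observation window.
    return {"events": max_events, "peak": peak, "extinct": False}


if __name__ == "__main__":
    runs = [simulate_array(seed=s) for s in range(1000)]
    extinct = sum(r["extinct"] for r in runs)
    reached_100 = sum(r["peak"] >= 100 for r in runs)
    print("arrays extinct within the window:", extinct)
    print("arrays that reached 100 copies or more:", reached_100)
```

Under this toy model, the standard gambler's-ruin result gives the chance that an array starting with k copies reaches N copies before falling to one as (k - 1)/(N - 1). Most arrays are therefore lost while still short, but a minority become very long before they are eventually lost.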
Tandem repeats do not evolve independently of each other. Different elements vary in sequence as they accumulate mutations, but these differences are often much smaller than would be expected from the age of the arrays. This phenomenon, known as concerted evolution, implies that some mechanism exists for transferring sequence information between repeats. This transfer is thought to occur primarily through the same processes that cause length variation, such as unequal crossing over (Fig. 12.18B). Even low rates of these processes lead to substantial homogeneity, because they are typically faster than mutation and may involve multiple repeats.
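A rough sketch of how such turnover homogenizes an array is given here. It assumes a fixed array length and treats each turnover event as one repeat being overwritten by a copy of another, a crude stand-in for the paired duplication and deletion that unequal crossing over produces. The names `simulate_concerted` and `pairwise_identity`, and all parameter values, are hypothetical.

```python
import random
from collections import Counter


def pairwise_identity(array):
    """Probability that two different repeats drawn from the array match."""
    n = len(array)
    if n < 2:
        return 1.0
    counts = Counter(array)
    same_pairs = sum(c * (c - 1) for c in counts.values())
    return same_pairs / (n * (n - 1))


def simulate_concerted(n_repeats=20, generations=5000, mutation_prob=0.001,
                       turnover_prob=0.05, initial=None, seed=None):
    """Evolve an array of repeats under mutation and turnover.

    Each repeat is represented by an integer label for its sequence
    variant. Assumptions: array length is fixed; every mutation creates
    a brand-new variant; at most one turnover event occurs per generation.
    """
    rng = random.Random(seed)
    array = list(initial) if initial is not None else [0] * n_repeats
    n = len(array)
    next_variant = max(array) + 1
    for _ in range(generations):
        # Mutation: each repeat independently may become a new variant.
        for i in range(n):
            if rng.random() < mutation_prob:
                array[i] = next_variant
                next_variant += 1
        # Turnover: one repeat is replaced by a copy of another.
        if rng.random() < turnover_prob:
            source = rng.randrange(n)
            target = rng.randrange(n)
            array[target] = array[source]
    return array


if __name__ == "__main__":
    for turnover in (0.0, 0.05, 0.5):
        final = simulate_concerted(turnover_prob=turnover, seed=1)
        print("turnover", turnover, "identity", round(pairwise_identity(final), 3))
```

Comparing runs with `turnover_prob` set to zero against runs where it is well above the mutation rate gives a feel for why repeats within an array remain more similar than their age alone would suggest.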
Simple sequence repeats tend to accumulate in regions of low recombination (for example, around the centromeres), for reasons similar to those that apply to transposable elements. Weak selection against excess DNA is less effective when recombination rates are low (the Hill–Robertson effect, pp. 536 and 676–677). In addition, when rates of unequal crossing over are low, an array is less likely to drift to extinction and will, on average, be longer.