. CSHL Press .
Comparing the Power of Gene Mapping Using Pedigrees versus Association Studies

What size of effect could be detected by looking for linkage within families, as compared with association studies that depend on linkage disequilibrium in the population as a whole? We will compare two widely used methods, assuming a very simple genetic model. We suppose that each copy of an allele A increases the risk of some rare disease by a factor γ, relative to the alternative allele, a. Thus, the three genotypes aa, Aa, AA have relative risks 1 : γ : γ², respectively.

Allele Sharing: A simple and commonly used method for detecting linkage within families is based on a sample of families that contain two affected siblings. If a marker is not linked to any gene that influences disease risk, then each sibling has a 50% chance of inheriting either marker allele from a heterozygous parent. However, if allele A (say) is associated with increased disease risk, then both siblings are more likely to inherit the same A allele from the heterozygous parent. For example, if one parent is heterozygous and the other is homozygous, then the two affected siblings can have four possible combinations of genotypes (Fig. WN26.1). If the siblings inherit the same allele from the heterozygous parent, then they have the same genotype, and we say that they share alleles. If they inherit different alleles from the heterozygous parent, then they have different genotypes, and we say that they do not share alleles. If carrying A versus a does not affect disease risk, then the average allele sharing is 1/2. (This can be understood from the probability of identity by descent of genes carried by siblings, which is 1/2; see Box 15.3.) If allele A increases disease risk by a factor γ, then the average probability of allele sharing is (1 + γ²)/(1 + γ)². (For example, if γ = 2, the probability of allele sharing is 5/9 = 0.556, and if γ = 4, it is 17/25 = 0.680.) If both parents are heterozygous, then the siblings may share all their alleles (if both are AA or both are aa); they may share half their alleles (if one or both siblings are heterozygous); or they may share no alleles (if one is aa and the other AA). Again, however, the average allele sharing is (1 + γ²)/(1 + γ)². Note that the probability of allele sharing is the same whether a or A is associated with increased risk: This method can detect linkage even when the association varies from family to family.
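The allele-sharing formula can be checked by enumerating the four ways a heterozygous parent can transmit alleles to two affected siblings. The sketch below is only an illustration of the simple multiplicative model described above; the function name sharing_probability is hypothetical, and exact fractions are used so that the result can be compared directly with (1 + γ²)/(1 + γ)².

```python
from fractions import Fraction


def sharing_probability(gamma):
    """Probability that two affected siblings inherit the same allele
    from a heterozygous (Aa) parent, under the multiplicative risk model."""
    gamma = Fraction(gamma)
    # Each of the four transmission patterns is equally likely a priori.
    # Conditioning on both siblings being affected weights each pattern by
    # the product of the siblings' relative risks (gamma for A, 1 for a).
    weights = {
        ("A", "A"): gamma * gamma,
        ("A", "a"): gamma,
        ("a", "A"): gamma,
        ("a", "a"): Fraction(1),
    }
    shared = sum(w for (sib1, sib2), w in weights.items() if sib1 == sib2)
    return shared / sum(weights.values())


if __name__ == "__main__":
    for g in (1, 2, 4):
        p = sharing_probability(g)
        print(f"gamma = {g}: sharing probability = {p} = {float(p):.3f}")
```

With γ = 1 the four patterns are equally weighted and sharing is 1/2; with γ = 2 and γ = 4 the enumeration gives the values 5/9 and 17/25 quoted in the text.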

Transmission of Alleles: Now, suppose that A is associated with disease risk in the population as a whole, either because of linkage disequilibrium or because it actually causes disease. Then, an affected offspring is more likely to have inherited A than a from a heterozygous parent. Because each copy of A increases risk by a factor γ, the ratio of transmission probabilities is γ:1, and the probability that A was transmitted, given that the offspring is affected, is γ/(1 + γ). The transmission disequilibrium test (TDT) looks for alleles that are transmitted in excess to affected offspring. Figure WN26.2 shows that the bias in transmission (upper curve) is much stronger than the bias in the probability of allele sharing (lower curve), especially for modest effects (γ < 2, say).
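To see how much stronger the transmission bias is, the two probabilities can be tabulated side by side. This is a minimal sketch using only the two formulas given in the text; the function names are illustrative. A little algebra on those formulas shows that the excess over 1/2 in allele sharing equals twice the square of the excess in transmission, which is why it is so much smaller when γ is close to 1.

```python
def transmission_probability(gamma):
    """Probability that a heterozygous parent transmitted A to an affected offspring."""
    return gamma / (1 + gamma)


def sharing_probability(gamma):
    """Probability that two affected siblings share the allele from a heterozygous parent."""
    return (1 + gamma**2) / (1 + gamma) ** 2


if __name__ == "__main__":
    for gamma in (1.1, 1.3, 1.5, 2.0, 4.0):
        excess_t = transmission_probability(gamma) - 0.5
        excess_s = sharing_probability(gamma) - 0.5
        # Identity implied by the two formulas: excess_s == 2 * excess_t**2
        print(
            f"gamma = {gamma:.1f}: transmission excess = {excess_t:.4f}, "
            f"sharing excess = {excess_s:.4f}"
        )
```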

Power to Detect Effects: Now, suppose that we score a very large number of markers; for example, we might score five markers for each of 30,000 genes, making 150,000 marker loci in all. Almost all of these will have a probability of transmission or of allele sharing of 1/2, but there will be random variation around this expectation. We choose a threshold for significance that is high enough that there is a very low chance (say, 5%) of mistakenly finding that any one of the very large number of markers has an effect on disease risk. This requires that we choose the threshold so that the chance that any one marker exceeds it, if the marker actually has no effect, is 0.05/150,000 = 3.3 × 10⁻⁷. We now ask, what is the chance that a marker associated with effect γ will give transmission or allele sharing above this threshold? (This is known as the statistical power to detect an effect of size γ.)
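The per-marker threshold is a simple division, and it can be translated into a number of standard deviations. The sketch below assumes a one-sided test and a normal approximation for the test statistic; neither assumption is stated in the text, and the variable names are illustrative.

```python
from statistics import NormalDist

n_markers = 5 * 30_000        # five markers for each of 30,000 genes
familywise_error = 0.05       # acceptable chance of any false positive

# Per-marker significance level needed to keep the overall error at 5%
per_marker_alpha = familywise_error / n_markers

# ASSUMPTION: one-sided test, statistic approximately normal under the null.
# The threshold is the number of standard deviations above expectation
# that a null marker exceeds with probability per_marker_alpha.
z_threshold = -NormalDist().inv_cdf(per_marker_alpha)

if __name__ == "__main__":
    print(f"markers scored:      {n_markers}")
    print(f"per-marker alpha:    {per_marker_alpha:.2e}")
    print(f"threshold (in s.d.): {z_threshold:.2f}")
```

Under these assumptions a marker must deviate from its expectation by roughly five standard deviations to be declared significant.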

Figure 26.4 shows the probability of detecting an association plotted against its effect, γ. To allow a fair comparison between the two methods, we assume that 1000 families containing a pair of affected siblings are sampled. Both parents and both affected siblings are genotyped; note that families can only be informative if at least one parent is heterozygous. We assume that the alleles A, a are at equal frequency, but as long as the disease allele is neither common nor very rare, allele frequency has little effect on power. We can see immediately that the transmission disequilibrium test (left curve), which is based on a population-wide association, is much more powerful than allele sharing (right curve): it can detect effects of γ ~ 1.3, whereas allele sharing can only detect large effects, of at least γ ~ 3.
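A rough version of this power comparison can be sketched with a normal approximation to the binomial. This is not the calculation behind the figure. It assumes that half of all parents are heterozygous (equal allele frequencies, ignoring the effect of sampling through affected offspring), that each heterozygous parent contributes two transmissions to the TDT but only one sharing comparison, and that the test is one-sided. All names are illustrative, and the exact values depend on these assumptions and need not match the plotted curves.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def power(p_alt, n_obs, alpha):
    """Approximate power of a one-sided test of p = 1/2 against p = p_alt,
    based on n_obs independent observations (normal approximation)."""
    z_crit = -STD_NORMAL.inv_cdf(alpha)
    # Observed proportion needed to exceed the significance threshold
    critical = 0.5 + z_crit * 0.5 / sqrt(n_obs)
    se_alt = sqrt(p_alt * (1 - p_alt) / n_obs)
    return 1 - STD_NORMAL.cdf((critical - p_alt) / se_alt)


ALPHA = 0.05 / 150_000              # per-marker significance level
N_FAMILIES = 1000
# ASSUMPTION: half of the 2 parents per family are heterozygous
N_HET_PARENTS = N_FAMILIES * 2 * 0.5
N_TRANSMISSIONS = 2 * N_HET_PARENTS  # one transmission to each affected sibling
N_SHARING = N_HET_PARENTS            # one sharing comparison per sibling pair


def tdt_power(gamma):
    return power(gamma / (1 + gamma), N_TRANSMISSIONS, ALPHA)


def sharing_power(gamma):
    return power((1 + gamma**2) / (1 + gamma) ** 2, N_SHARING, ALPHA)


if __name__ == "__main__":
    for gamma in (1.1, 1.3, 1.5, 2.0, 3.0, 4.0):
        print(
            f"gamma = {gamma:.1f}: TDT power = {tdt_power(gamma):.3f}, "
            f"sharing power = {sharing_power(gamma):.3f}"
        )
```

Even under these crude assumptions the qualitative pattern is the same: the TDT has appreciable power for γ near 1.3, whereas allele sharing has almost none until γ is much larger.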

There are two reasons for this difference. First, in a family of two parents and two offspring, there are four transmission routes that can potentially be informative (from either parent to either offspring), but only one pair of affected siblings that can share alleles. Second, the deviation of transmission from 50% is larger than the deviation in allele sharing, especially when effects are modest (γ < 2, say). That in turn is because the TDT looks at a population-wide association rather than an association that changes from family to family.

A further advantage of tests for association in the population as a whole is that they can map genes much more precisely. If we scored an enormous number of markers for allele sharing, then very many of them would be linked to the causal allele, and so would show a significant excess of sharing. (For the human genome, about 1/350 markers, ~150 in our example, would be within 5 cM of a causal locus, and so would all be detected as significant.) In contrast, linkage disequilibrium extends over much shorter distances, and so fewer markers, in a smaller region of the genome, would be likely to be detected by the TDT.

This calculation is optimistic, because it assumes a very simple null model. Allele sharing might vary among families because markers might have effects only in some environments or in some genetic backgrounds, and transmission probability might vary because a marker could be associated with the disease in some populations but not others. This kind of heterogeneity could inflate the tail of unusually large values, and so generate an excess of false positives.

 
 
 

 