10-01-2023, 07:34 PM
Three common protein isoforms of apolipoprotein E (apoE), encoded by the ε2, ε3, and ε4 alleles of the APOE gene, differ in their association with cardiovascular and Alzheimer's disease risk and with COVID-19.
![[Image: AJHGv67p881fg1.jpg]](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1287893/bin/AJHGv67p881fg1.jpg)
Fig. Locations of polymorphic variants in and around the APOE gene. The genomic location of the 23 DNA variants, identified by sequencing 5.5 kb in 96 individuals, is shown below the exon-intron structure of the APOE gene. An asterisk (*) marks the new variant identified in the Mayan sample from Campeche at position 3701, and x’s show the population distribution of the observed polymorphisms (J = Jackson, C = Campeche, N = North Karelia, and R = Rochester). Variants that result in amino acid substitutions are boxed.
The global ubiquity of the three common apoE isoforms and, in particular, the persistently high relative frequency of the ε3 allele in all human populations, has puzzled investigators and prompted much speculation regarding the evolutionary forces responsible for the observed polymorphism.
Therefore, although we did not find statistically significant evidence of the effects of natural selection on APOE variation, it is not unreasonable to infer that selection has acted. The value of examining this possibility closely is heightened by the variety of phenotypes with which the protein isoforms have been associated.
Precisely when the new variant at site 3937, and its associated haplotypes, began to expand in frequency relative to the ancestral allele is uncertain and is unlikely to be resolved using analytical methods that assume selective neutrality (such as those employed here). The presence of both ε3 sublineages in each of the four populations investigated, in the absence of evidence for recurrent mutation at the sites involved, is consistent with site 3937 having arisen prior to the major population expansions that accompanied the spread of anatomically modern humans <100,000 years ago. Alternatively, strong selective pressure may have facilitated the global distribution of the variant much more recently. Indirect evidence in support of the latter hypothesis is the fact that the basal haplotype of the ε3 clade (haplotype 6) is absent in the European samples from Rochester and North Karelia (hence, if present, likely to be at low frequency in those populations), suggesting that the rise in frequency of the ε3s in Europe may have occurred after the differentiation of the clade into two primary lineages. The low overall level of population heterogeneity (FST = 0.06) is also consistent with a recent rapid expansion of ε3. In either case, the lower relative frequency and patchy geographical distribution of the ε2 haplotypes, as well as their clear derivation from the ε3 clade, suggest that the variant defining these latter alleles (at site 4075) arose subsequently.
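To make the FST figure quoted above more concrete, here is a minimal Python sketch of Wright's fixation index for a single biallelic site, computed as (HT - HS) / HT. It assumes equally sized subpopulations and uses made-up allele frequencies; it is not the estimator or the data used in the paper, and the function names are hypothetical.

```python
from statistics import mean


def heterozygosity(p: float) -> float:
    """Expected heterozygosity 2p(1-p) at a biallelic site."""
    return 2.0 * p * (1.0 - p)


def fst(subpop_freqs: list[float]) -> float:
    """Wright's F_ST = (H_T - H_S) / H_T for one biallelic site.

    Assumption: all subpopulations are weighted equally.
    """
    p_bar = mean(subpop_freqs)                # pooled allele frequency
    h_t = heterozygosity(p_bar)               # total expected heterozygosity
    if h_t == 0.0:                            # site is fixed everywhere
        return 0.0
    h_s = mean([heterozygosity(p) for p in subpop_freqs])  # mean within-group
    return (h_t - h_s) / h_t


if __name__ == "__main__":
    # Hypothetical allele frequencies in four populations (illustrative only).
    example = [0.10, 0.15, 0.20, 0.25]
    print(f"F_ST = {fst(example):.3f}")
```

Values near 0 mean the populations have similar allele frequencies; values near 1 mean they are close to fixed for different alleles.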
A relevant question is whether known phenotypic effects associated with the observed variation contribute to current differences in reproductive fitness. There is strong evidence that the inheritance of an ε4 allele places carriers at a higher risk of succumbing to CAD or AD, at least in European and Asian populations (Davignon et al. 1988; Roses 1996). The deleteriousness of ε4, relative to that of ε3, is consistent with the history of long-term genetic change at the APOE locus, but CAD and AD are both diseases of late adulthood or old age, and increased susceptibility to these conditions would not be expected to result in important differences in reproductive success (for an opposing argument, see Finch and Sapolsky [1999]). The widely expressed protein does play many different roles in the body, including facilitation of lipid absorption, neural growth and regeneration, and immune function (Mahley and Huang 1999), one or more of which could have a direct effect on fertility or could contribute to differential survival and reproductive success. One study has suggested that men carrying at least one ε3 allele have, on average, more children than men with other APOE genotypes (Gerdes et al. 1996a). Alternatively, APOE variation may reflect an adaptation to changing diets (Hanlon and Rubinsztein 1995), such as those which accompanied the transition from subsistence to agricultural economies (Corbo and Scacchi 1999), may play a key role in neurological response to head injury (Friedman et al. 1999), or may mediate susceptibility to lipophilic pathogens (Martin 1999).
Population-level differences in the distribution of APOE variation are relevant to a comprehensive prediction of risk and understanding of disease etiology. Despite the low overall level of polymorphism at the APOE gene, considerable heterogeneity characterizes each of the three common alleles at the sequence level, heterogeneity which helps explain previously perplexing association results. For example, in a 5-year prospective epidemiological study of AD incidence among different ethnic groups, Tang et al. (1998) found that, compared to ε3/ε3 homozygotes, the relative risk (RR) of AD associated with one or more copies of the ε4 allele was significantly increased among whites (RR 2.5) but not among blacks (RR 1.0) or Hispanics (RR 1.1). These results confirmed previous reports suggesting that the association of ε4 with AD is weaker or nonexistent among blacks living in New York City (Tang et al. 1996) and Indiana (Sahota et al. 1997), as well as among black Nigerians (Osuntokun et al. 1995) and East Africans (Sayi et al. 1997). Similar variation is found regarding lipids (Xu et al. 1999). These observations take on new significance in the light of our finding that geographic and/or ethnic differences exist in the distribution of haplotypes within the ε4 class (fig. 3). If, as our results suggest, different types of ε4 alleles are found at different relative frequencies in different geographic regions, this heterogeneity (which may be related to nonneutral forces acting on the locus) can be—indeed, must be—accounted for.
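For readers unfamiliar with the relative risk (RR) values quoted above, the following minimal Python sketch shows how an RR is computed from cohort counts: incidence among carriers divided by incidence among the reference group. The counts are invented for illustration and are not taken from Tang et al. (1998); the function name is hypothetical.

```python
def relative_risk(exposed_cases: int, exposed_total: int,
                  reference_cases: int, reference_total: int) -> float:
    """RR = incidence in the exposed group / incidence in the reference group."""
    if exposed_total <= 0 or reference_total <= 0:
        raise ValueError("group sizes must be positive")
    if reference_cases <= 0:
        raise ValueError("reference incidence is zero; RR is undefined")
    exposed_incidence = exposed_cases / exposed_total
    reference_incidence = reference_cases / reference_total
    return exposed_incidence / reference_incidence


if __name__ == "__main__":
    # Hypothetical cohort: 25 of 100 e4 carriers and 10 of 100 e3/e3
    # homozygotes develop the disease during follow-up (made-up numbers).
    rr = relative_risk(25, 100, 10, 100)
    print(f"RR = {rr:.2f}")
```

An RR of 1.0 means carriers and the reference group develop the disease at the same rate, which is how the values reported for blacks and Hispanics above should be read.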
Similarly, our data provide an invaluable evolutionary context in which to interpret more circumscribed analyses. There has, for instance, recently been much interest in characterizing APOE promoter-region polymorphism and examining the association of such variants with AD and CAD risk. These analyses have met with only limited success. Either the observed variants have turned out not to explain any greater proportion of the observed variance in phenotype than explained by the common allelic variants or positive associations in one population have failed to be replicated in subsequent studies. The most problematic discrepancies in this regard have centered on the role of the −491 variant (Artiga et al. 1998b). Some workers have suggested that variation at this site is a strong determinant of AD risk (Artiga et al. 1998a; Bullido et al. 1998), whereas other researchers have either questioned the importance of the polymorphism relative to other regulatory region variants (Lambert et al. 1998b; Town et al. 1998), or failed to replicate the association altogether (Roks et al. 1998; Song et al. 1998). Our analysis suggests a possible explanation: −491 (site 560 here) appears to be particularly susceptible to recurrent mutation and/or gene conversion, placing it in association with different allelic backgrounds, with different functional effects, in different populations. In this context, it is unsurprising that results have conflicted. On the other hand, the prominent placement of site 832 (−219) in our inferred haplotype network, as a site defining major subtypes of both ε3 and ε4 haplotypes in multiple populations, appears consistent with the association of this site with differences in both AD risk (Lambert et al. 1998a, 1998b) and myocardial infarction (Lambert et al. 2000). The relationship of other variants with unique positions in the APOE gene tree (particularly sites 1163 and 2440 among ε3s and site 1998 among ε4s) clearly merits detailed investigation.
When large samples are typed at these variable sites, and relevant phenotypes are scored, it will be possible to test directly the independence of effects of the variable sites on lipid phenotypes.
from www.ncbi.nlm.nih.gov/pmc/articles/PMC1287893/
Salkhit 625 SNP, Otzi 803 SNP, Mik15 798 SNP, RISE493 1335 SNP, I11456 1024 SNP, I7718 980 SNP, I9041 512S
Target: tipirneni:dante
Chebyshev distance: 0.64%
79.0 IRN_SIS_BA2
12.4 ITA_Daunian
8.6 Poland_Viking.SG
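As a rough illustration of what a "Chebyshev distance" between a target and a fitted mixture means, here is a minimal Python sketch. The Chebyshev distance is the largest absolute difference in any single coordinate. How the admixture tool that produced the output above builds its coordinates and applies the metric is an assumption here, and all vectors below are made up.

```python
def mixture(sources: list[list[float]], weights: list[float]) -> list[float]:
    """Weighted average of source coordinate vectors (weights sum to 1)."""
    dims = len(sources[0])
    return [sum(w * s[i] for s, w in zip(sources, weights)) for i in range(dims)]


def chebyshev(a: list[float], b: list[float]) -> float:
    """Largest absolute coordinate-wise difference between two vectors."""
    return max(abs(x - y) for x, y in zip(a, b))


if __name__ == "__main__":
    # Hypothetical 3-dimensional coordinates for three source populations.
    sources = [[0.10, 0.02, -0.01], [0.12, -0.03, 0.00], [0.13, 0.01, 0.04]]
    weights = [0.790, 0.124, 0.086]   # proportions from the output above
    target = [0.105, 0.012, -0.004]   # hypothetical target coordinates
    fitted = mixture(sources, weights)
    print(f"Chebyshev distance = {chebyshev(target, fitted):.4f}")
```

A smaller distance means the weighted combination of sources reproduces the target's coordinates more closely.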