Botanical Studies (2010) 51: 27-36.
molecular biology
Geographical variation and differential selection modes of paralogs of chloroplast small heat shock protein genes in Machilus kusanoi (Lauraceae)
Jeng-Der CHUNG1, Tsan-Piao LIN2, Yu-Pin CHENG3, and Shih-Ying HWANG4 *
1Division of Silviculture, Taiwan Forestry Research Institute, 53 Nanhai Road, Taipei 100, Taiwan
2Institute of Plant Biology, National Taiwan University, 1 Roosevelt Road, Section 4, Taipei 106, Taiwan
3Division of Forest Biology, Taiwan Forestry Research Institute, 53 Nanhai Road, Taipei 100, Taiwan
4Department of Life Science, National Taiwan Normal University, 88 Tingchow Road, Section 4, Taipei 116, Taiwan
(Received July 3, 2008; Accepted April 22, 2009)
ABSTRACT. The hydrophobicity of amino acid residues in the α-crystallin domain (ACD) of small heat shock proteins (sHSPs) is thought to be important for polypeptide binding and to play key roles in substrate protection. A molecular population genetic approach was applied in an attempt to understand the geographic variations and selective forces exerted on the chloroplast sHSP gene duplicates (CPsHSP-1 and CPsHSP-2) of Machilus kusanoi. In total, 84 individuals from 17 sampling sites were used in this investigation. Five haplotypes, including synonymous and nonsynonymous substitutions, were found for CPsHSP-1. However, only one synonymous substitution was found for CPsHSP-2. The conservation of CPsHSP-2 might be related to the action of purifying selection. In contrast, the wide distribution of several CPsHSP-1 haplotypes in different populations and the significantly positive value of Tajima's D suggested that balancing selection has governed the evolution of CPsHSP-1. The functional novelty of the paralogs of CPsHSPs was inferred from the hydrophobicity profile analysis and functional divergence test. Changes in the hydrophobicity profile were also observed for the allelic variants of CPsHSP-1, and the hydrophobicity shift might have played important roles in expanding the substrate specificity of CPsHSP-1 proteins as molecular chaperones.
Keywords: Chloroplast small heat shock protein; Gene duplication; Hydrophobicity shift; Machilus kusanoi; Paralogs.
introduction
Most new genes are thought to arise following a duplication event (Ohno, 1970). Natural selection can maintain both duplicates through either neofunctionalization or subfunctionalization. The fitness of duplicated genes is the most important factor determining their fixation (Kondrashov and Kondrashov, 2006). Individuals from different populations may carry multiple copies of duplicated genes that enhance population fitness in the process of adaptation to environmental heterogeneity across broad geographic scales. Maintenance of a stable intermediate frequency of both alleles of a locus is believed to be governed by the action of balancing selection (Kreitman and Di Rienzo, 2004). The neutral theory of molecular evolution provides the null hypothesis against which genetic polymorphism is tested for the signature of either balancing or directional selection. The strength of selection at a particular locus can be affected by changes in
the environment experienced (Dykhuizen and Hartl, 1980; Hartl et al., 1985; Dykhuizen et al., 1987). Intraspecific comparisons can identify differences in the evolutionary trajectories of duplicated loci. The understanding of the fates of duplicated genes under selection is strengthened by comparing patterns of variation across the organism's distributional range because the historical relatedness, population size, and mutation rates can also affect the fates of duplicated genes (Walsh, 2003). Using a population genetics approach, Moore and Purugganan (2003) found that selection rather than drift is the key force in the evolution of duplicate gene loci in Arabidopsis.
*Corresponding author: E-mail: hsy9347@ntnu.edu.tw; Tel: +886-2-2932-6234; Fax: +886-2-2931-2904.
Among the wide array of mechanisms that plants have for adapting to stressful environments is the induction of small heat shock proteins (sHSPs) (Vierling, 1991). It is thought that sHSPs function as molecular chaperones, preventing intracellular proteins from misfolding and forming inappropriate aggregations (Sun et al., 2002). The nuclear-encoded chloroplast sHSP (CPsHSP) exhibits a high degree of conservation in three consensus regions, one at the methionine-rich amphipathic α-helix and two located in the α-crystallin domain (ACD), in which the
methionine-rich domain is the most conserved (Chen
and Vierling, 1991; Vierling, 1991; Waters, 1995).
Maintenance of a robust photosynthetic system under heavy metal contamination is reported to be relevant to CPsHSPs (Heckathorn et al., 2004). Recently, the evolutionary and ecological roles of HSPs have been explored (Sørensen et al., 2003). Among eight closely related species from the genus Ceanothus, the expression of CPsHSP was found to be associated with photosynthetic thermal tolerance (Knight and Ackerly, 2001). Interestingly, Barua et al. (2003) reported that polymorphism in the expression levels of CPsHSPs has played a key role in the population fitness of Chenopodium album. Subfunctionalization of CPsHSPs has been investigated in bentgrass (Agrostis stolonifera) (Wang and Luthe, 2003). Recently, extensive silent mutations have been documented in nucleotide sequences of the animal nucleophosmin/nucleoplasmin types of chaperones, indicating strong purifying selection at the protein level (Eirin-Lopez et al., 2006). However, the population evolutionary dynamics of CPsHSP genes have not been investigated in plants; therefore, it is unclear whether balancing or purifying selection is the predominant force shaping the molecular evolution of CPsHSP genes in plant populations.
Previously, two copies of CPsHSP genes were identified in the genus Machilus (Wu et al., 2007). These two copies of CPsHSPs displayed high levels of conservation in the amino acid sequences of the methionine-rich domain but were highly diversified in the region of the ACD. Machilus (Lauraceae) comprises about 100 species of evergreen trees or shrubs distributed mainly in tropical and subtropical areas of Asia (Liu et al., 1994). Machilus kusanoi Hayata is widely distributed in Taiwan, mainly in the lowland river regions. Interspecific comparisons can identify differences in the evolutionary history of loci between species, but might not reveal the changes that are shaping the evolutionary dynamics of each locus throughout the adaptive landscape. The recent copy is postulated to have evolved under positive selection for different chaperonin activities in comparison with the ancient copy due to a hydrophobicity shift in the methionine-rich domain and the ACD of the CPsHSP (Wu et al., 2007). The hydrophobicity of amino acid residues in the ACD is thought to be important for polypeptide binding (Sharma et al., 1998). sHSPs usually form large oligomeric complexes and provide a means to rapidly expose subunits, a process which offers hydrophobic surfaces for the binding of misfolded denatured substrate proteins, thereby protecting them from inappropriate aggregation (Ganea, 2001; Sun et al., 2002).
The frequency of specific alleles and/or phenotypes in a broad geographic region has long been used to infer plant adaptation to climatic variations in an array of taxa. In this study, we present an analysis of DNA sequence variations across the ACD of the CPsHSP-1 and CPsHSP-2 loci of Machilus kusanoi using samples collected from across its distributional range, in an attempt to address several questions related to the evolutionary history of CPsHSP protein polymorphism. By using a nucleotide sequence approach in randomly selected individuals, we hope to determine whether there are amino acid replacement polymorphisms at the nucleotide level within and among populations. Additionally, we were interested in whether patterns of nucleotide diversity at CPsHSP are consistent with balancing selection.
materials and methods
Plant materials and DNA purification
Sequence variation was surveyed from DNA samples of 84 individuals randomly collected from 17 populations in Taiwan encompassing the entire distributional range of Machilus kusanoi. The population code, sample size, longitude, and latitude of each population within Taiwan are shown in Table 1, and collection sites are depicted in Figure 1. Total DNA was extracted from ground leaf powder according to a cetyltrimethylammonium bromide (CTAB) procedure (Doyle and Doyle, 1987). DNA was precipitated with ethanol and, after washing with 70% ethanol, was dissolved in 200 μL TE buffer (pH 8.0) and stored at -20°C. The DNA concentration was determined for each sample using the GeneQuant II RNA/DNA Calculator (Amersham Biosciences).
Figure 1. Map of Taiwan showing the sampling sites of Machilus kusanoi. The shaded area indicates the Shueshan Range (SR) and the Central Mountain Range (CMR). The longitude and latitude of each population are listed in Table 1. The number designates the population code and corresponds to that which appears in Tables 1 and 3.
Primers, PCR amplification, and sequencing
The PCR amplification of partial sequences of CPsHSP was performed with locus-specific primers designed from the published Machilus CPsHSP sequences (Wu et al., 2007). Locus-specific primers for CPsHSP-1 were 5'-AGGATAATGGAGGACCCCTCTACATA-3' and 5'-CTTGATTTTCTCCATCTCAATGTTC-3', and locus-specific primers for CPsHSP-2 were 5'-GACCGGCTGTTCGAGGACGCGTG-3' and 5'-TGCTTTGATCTTATCTTTCTCG-3'. Because CPsHSPs are nuclear-encoded genes, it was necessary to clone the CPsHSP-1/CPsHSP-2 genes. PCR products were first purified with a QIAGEN kit (QIAGEN) and sequenced directly in both directions using a BigDye® Terminator V3.1 Cycle Sequencing Kit and a model ABI 373A automated sequencer (Applied Biosystems). PCR products were also cloned with a yT&A cloning kit (Yeastern Biotech) following the manufacturer's protocol. Amplified DNA of three plasmids screened using colony PCR was purified with a QIAGEN kit (QIAGEN). Heterozygotes were identified by comparing the sequences of the PCR products and cloned sequences. For sequencing, the CPsHSP locus-specific PCR primers and the M13 forward and reverse primers of the cloning vector were used for amplification. All sequence polymorphisms were visually rechecked from the chromatograms.
Sequence alignment, phylogenetic reconstruction and statistical analyses
The deduced amino acid sequences of CPsHSP of Machilus kusanoi were aligned using the program CLUSTAL X (Thompson et al., 1997). Nucleotides were subsequently aligned manually based on the amino acid alignment. The number of haplotypes was measured within each sampling locality. Haplotype diversity (h), nucleotide diversity (π) (Nei, 1987), and Tajima's D (Tajima, 1989) for departure from neutrality on the total number of segregating sites were calculated using DnaSP version 4.0 (Rozas et al., 2003). The DNA sequence divergence and amino acid sequence divergence were estimated based on Kimura's two-parameter (K2P) model for DNA (Kimura, 1980) and the Jones, Taylor, and Thornton (JTT) distance matrix for amino acids (Jones et al., 1992) using MEGA version 3.0 (Kumar et al., 2004).
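As an illustration of the two diversity statistics named above, the following Python sketch computes haplotype diversity (h) and nucleotide diversity (π) in the sense of Nei (1987) for a small aligned sample. The four sequences are invented toy data, not the CPsHSP alignment, and the function names are hypothetical; DnaSP was the program actually used in this study.

```python
from itertools import combinations


def haplotype_diversity(seqs):
    """h = n / (n - 1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    counts = {}
    for seq in seqs:
        counts[seq] = counts.get(seq, 0) + 1
    return n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))


def nucleotide_diversity(seqs):
    """pi = mean number of pairwise differences, divided by alignment length."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return total_diffs / len(pairs) / length


# Toy alignment (assumption): four ungapped sequences of 10 bp, three haplotypes.
sample = ["ACGTACGTAC", "ACGTACGTAC", "ACGTTCGTAC", "ACGTTCGAAC"]
print(haplotype_diversity(sample), nucleotide_diversity(sample))
```

Both functions assume equal-length sequences without gaps or ambiguity codes; real alignments need those cases handled before the counts are taken.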
A Neighbor-joining (NJ) tree was generated with the MEGA 3.0 software (Kumar et al., 2004) using the aligned amino acids of CPsHSP sequences from Machilus as well as CPsHSP amino acid sequences of other species acquired from GenBank. CPsHSP sequences of Funaria hygrometrica were used as the outgroups. Other CPsHSP sequences were used in order to clearly separate the two types of CPsHSPs from M. kusanoi in the gene tree.
Maximum parsimony (MP) analysis of the aligned amino acid sequences was conducted with PAUP* 4.0 (beta version 4.0b10, Swofford, 2002). Heuristic searches with 1000 random entries were performed using the ACCTRAN, MULPARS, and TBR options in PAUP*. Gaps were treated as missing data, and all characters were accorded equal weights. To assess the confidence of the branching patterns, a bootstrap analysis (Felsenstein, 1985) was performed with 1000 pseudoreplicates. The consistency index (CI; Kluge and Farris, 1969) and retention index (RI; Farris, 1989) were also computed using the PAUP* program.
To test for an association between CPsHSP-1 and CPsHSP-2, an analysis was performed using the linkage disequilibrium (LD) exact test between polymorphic sites of the two CPsHSP loci using the Arlequin program (Schneider et al., 2000). Potential recombinations and gene conversions within each locus and between loci were estimated using the RDP, BootScan, GENECONV, MaxChi, Chimaera, and SiScan methods implemented in the RDP2 software package (Martin et al., 2005).
Hydrophobicity profile and functional divergence analyses
Mean hydrophobicity profiles of the comparisons of allelic variants of both types of CPsHSPs were generated using the software BioEdit (Hall, 1999). The maximum-likelihood (ML) analysis (Gu, 2001) was performed with the DIVERGE software (Gu and Velden, 2002), using the phylogenetic tree from Figure 2a. DIVERGE calculates a theta ML value indicative of the level of functional divergence between proteins in different clusters of the tree, and a posterior probability to trace the amino acid positions that are likely to be responsible for the functional divergence between proteins in both clusters (CPsHSP-1 and CPsHSP-2 clusters).
results
Sequence variation, genotype composition of CPsHSPs, and the LD test
All PCR amplifications resulted in a single amplification product, which was then used for subcloning and sequencing. For each individual, at most two alleles for each type of CPsHSP were found from the three cloned sequences. The deduced amino acid sequences of five alleles in CPsHSP-1 and two alleles in CPsHSP-2 surveyed from 17 Machilus kusanoi populations were determined and aligned for analysis. The nucleotide sequences were aligned according to the alignment of amino acid sequences. Within the 285-bp region sequenced for CPsHSP-1, there were five polymorphic sites, three of which resulted in an amino acid replacement (Table 2). Within the 287-bp region sequenced for CPsHSP-2, only one polymorphic site was found, and it was a synonymous substitution. Two synonymous and four nonsynonymous mutations were found when comparing CPsHSP-1A to other alleles of CPsHSP-1 (Table 2). Within CPsHSP-1, the percent K2P divergence for DNA ranged from 0.4% to 1.8% and averaged 1.0%; the percent JTT divergence for amino acids ranged from 1.1% to 3.3% and averaged 2.1%. The K2P and JTT divergences for DNA and amino acids were minimal or none for CPsHSP-2. The percent K2P and JTT divergences between CPsHSP-1 and CPsHSP-2 were 75.1% and 85.1%, respectively.
Figure 2. Phylograms generated by (a) Neighbor-joining (NJ) and (b) maximum parsimony (MP) (50% majority-rule consensus) methods. The gene trees were generated based on the alignment of deduced amino acid sequences of Machilus kusanoi and amino acid sequences acquired from GenBank. The NJ tree was generated according to the amino acid JTT matrix (Jones et al., 1992) using the MEGA program. The MP tree was generated using the computer program PAUP* 4.0 (beta version 4.0b10, Swofford, 2002). Four equally parsimonious trees were found with a tree length of 324, including 83 parsimoniously informative sites, with a consistency index (CI) of 0.8488, and a retention index (RI) of 0.8753. Bootstrap values of >50% are shown in both trees. The corresponding nucleotide sequences of M. kusanoi have been deposited in the EMBL database, and their accession numbers are provided in the tree. GenBank accession numbers of acquired chloroplast small heat shock protein amino acid sequences are also provided in the tree.
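To make the K2P divergence concrete, the sketch below applies Kimura's (1980) formula, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions. It uses the nucleotides of haplotypes A and C at the five polymorphic sites listed in Table 2 and assumes that the remaining sites of the 285-bp region are identical and ungapped. The function names are hypothetical, and MEGA was the program actually used; under these assumptions the result is about 1.8%, in line with the upper end of the range reported above.

```python
from math import log, sqrt

PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")


def count_changes(seq1, seq2):
    """Count transitions and transversions between two aligned sequences."""
    transitions = transversions = 0
    for a, b in zip(seq1, seq2):
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1    # purine<->pyrimidine
    return transitions, transversions


def k2p_from_counts(transitions, transversions, sites):
    """Kimura two-parameter distance from change counts over `sites` positions."""
    p = transitions / sites
    q = transversions / sites
    return -0.5 * log((1 - 2 * p - q) * sqrt(1 - 2 * q))


# Nucleotides at sites 71, 90, 113, 182, and 219 (Table 2).
hap_a = "CTTCA"
hap_c = "TCCTC"
ts, tv = count_changes(hap_a, hap_c)
d_ac = k2p_from_counts(ts, tv, sites=285)  # assumes the other 280 sites match
print(ts, tv, round(100 * d_ac, 2))
```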
All genotypes at the CPsHSP-1 locus were heterozygous (Table 1). CPsHSP-1A, CPsHSP-1C, and CPsHSP-1E were the most common alleles for the CPsHSP-1 locus, and the CPsHSP-1A allele occurred in all populations examined (Table 1). The CPsHSP-1C allele had a lower frequency in comparison with those of CPsHSP-1A and CPsHSP-1E and failed to appear in the Lienhuachih and Jiashien populations. In contrast, CPsHSP-1B and CPsHSP-1D were rare alleles and occurred in some populations only in relatively low frequencies. No allele in the CPsHSP-1 locus was fixed in any population. On the other hand, both homozygotes and heterozygotes were observed for the CPsHSP-2 locus, and the CPsHSP-2a allele occurred throughout the distributional range of M. kusanoi. Further, the CPsHSP-2a allele was fixed in the Yangmingshan and Wulai populations. Relatively speaking, CPsHSP-2a was the most frequent allele for the CPsHSP-2 locus. However, nucleotide variations in CPsHSP-2 did not cause variations in amino acid sequences, so that the same protein was fixed in all M. kusanoi populations. No relationship between latitude and allelic frequency changes was revealed (data not shown). Average pairwise nucleotide diversity values per site (π) were 0.00831 and 0.00150 for CPsHSP-1 and CPsHSP-2, respectively (Table 3). The
level of difference was over 4.5-fold higher in CPsHSP-1 than in CPsHSP-2. The LD test revealed that the two CPsHSP loci were significantly linked (P < 0.0001). However, no recombination between loci or within each locus was found as estimated by the six different methods implemented in the RDP2 software.
Table 2. Nucleotide polymorphism and corresponding amino acid changes in chloroplast small heat shock protein (CPsHSP) genes of Machilus kusanoi.
                   Polymorphic site
                   CPsHSP-1                                    CPsHSP-2
Haplotype          71      90      113     182        219      27
A                  C       T       T       C          A
B                  C       C       T       C          A
C                  T       C       C       T          C
D                  T       C       C       A          C
E                  T       C       C       C          C
a                                                              T
b                                                              C
Synonymous                 GGT                        CCA      CCT
                           GGC                        CCC      CCC
Nonsynonymous      CCA             TTC     GCG
                   CTA             TCC     GAG, GTG
Amino acid         Pro             Phe     Ala
change             Leu             Ser     Glu, Val
The upper three-letter codes for amino acids at each polymorphic site are the amino acids in haplotypes A and B for CPsHSP-1. The bottom three-letter codes for amino acids indicate the corresponding substitutions that occurred in other haplotypes at the CPsHSP-1 locus.
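The distinction between synonymous and nonsynonymous substitutions in Table 2 can be checked by translating each codon pair with the standard genetic code. The sketch below is a minimal illustration of that check; the helper names are hypothetical, and the codon pairs are those listed in Table 2.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(codon):
    """Return the one-letter amino acid encoded by a DNA codon."""
    return CODON_TABLE[codon.upper()]


def classify(codon_before, codon_after):
    """Label a codon change as synonymous or nonsynonymous."""
    same = translate(codon_before) == translate(codon_after)
    return "synonymous" if same else "nonsynonymous"


# Codon pairs taken from Table 2.
table2_pairs = [
    ("GGT", "GGC"), ("CCA", "CCC"), ("CCT", "CCC"),
    ("CCA", "CTA"), ("TTC", "TCC"), ("GCG", "GAG"), ("GCG", "GTG"),
]
for before, after in table2_pairs:
    print(before, after, translate(before), translate(after), classify(before, after))
```

The first three pairs leave the amino acid unchanged, whereas the last four correspond to the Pro/Leu, Phe/Ser, Ala/Glu, and Ala/Val replacements given in the table.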
Phylogenetic relationship between paralogous sequences
The NJ and MP phylogenetic analyses were carried out based on amino acid sequences (Figure 2), and the support for each group of sequences was tested by 1000 bootstrap pseudoreplicates. In these gene trees, two amino acid sequences of Funaria hygrometrica were used as outgroups. Overall, the two gene trees showed similar topologies. In the reconstructed phylogenetic tree, relationships between the two types of CPsHSPs were identical to those obtained using the nucleotide sequence alignment (data not shown). Moreover, the phylogenetic relationship among alleles of CPsHSP-1 was consistent in both the NJ and MP phylogenetic reconstructions. A possible explanation for the pattern of CPsHSP-1 allelic relationships is that two nearly simultaneous duplications of CPsHSP-1A and CPsHSP-1B occurred, and that these alleles are sister to the CPsHSP-1E, CPsHSP-1C, and CPsHSP-1D alleles.
Neutrality tests of CPsHSPs
The frequency distribution of the variants can be examined by Tajima's D test to understand the action of selective forces or the influence of demographic history. Significantly negative values of the test statistic result from a greater proportion of low-frequency variants compared with expectations under neutrality and are usually interpreted as being a result of a selective sweep or a population expansion. On the other hand, a greater proportion of intermediate-frequency variants, causing the test statistic to be more positive than would be expected under neutrality, can be interpreted as a result of balancing selection or population admixture. A highly significant positive value of Tajima's D statistic (P < 0.01, Table 3) was estimated for CPsHSP-1 for all samples examined. Figure 3 shows the analysis of CPsHSP-1 sequences in a sliding window plot for Tajima's D value, illustrating that two regions in the ACD emerged as having significantly and marginally significantly positive Tajima's D values in the sliding windows between nucleotide midpoint numbers 65-105 and 205-225, respectively. The evidence is consistent with balancing selection maintaining alleles in CPsHSP-1. The pattern of evolution of CPsHSP-2 is distinct from that of CPsHSP-1. In CPsHSP-2, Tajima's D statistic was positive but not significant (P > 0.10, Table 3), and even negative values were estimated for some of the populations investigated (Table 3). Two alleles were found which resulted in no change in the amino acid sequences of CPsHSP-2. This result can be most appropriately explained as a signature of purifying selection at the amino acid level.
Figure 4. Hydrophobicity profile analysis. Hydrophobicity profiles were generated for all allelic variants of CPsHSPs using the BioEdit program based on Kyte-Doolittle's mean hydrophobicity method. The lines indicate the changes in hydrophobicity along the amino acid sequences of CPsHSP-1A and CPsHSP-1B (black), CPsHSP-1C (red), CPsHSP-1D (blue), CPsHSP-1E (green), and CPsHSP-2a and CPsHSP-2b (pink).
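The sign behaviour of Tajima's D described above can be reproduced with a short sketch of the statistic as defined by Tajima (1989): the difference between the mean number of pairwise differences and the number of segregating sites scaled by a1, divided by its estimated standard deviation. The two samples below are invented toy data chosen to show an excess of intermediate-frequency variants and an excess of singletons; they are not the CPsHSP data, the function name is hypothetical, and significance testing (as done in DnaSP) is not attempted here.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for equal-length, ungapped, aligned sequences."""
    n = len(seqs)
    # S: number of segregating (polymorphic) sites
    seg_sites = sum(len(set(column)) > 1 for column in zip(*seqs))
    if seg_sites == 0:
        raise ValueError("Tajima's D is undefined without segregating sites")
    # k: mean number of pairwise nucleotide differences
    pairs = list(combinations(seqs, 2))
    k = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    # Constants from Tajima (1989)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n ** 2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (k - seg_sites / a1) / sqrt(variance)


# Toy samples (assumptions): two divergent haplotypes at equal frequency,
# and a sample in which every variant is carried by a single sequence.
balanced = ["AAAAA"] * 4 + ["TTTTT"] * 4
singletons = ["TAAAA", "ATAAA", "AATAA", "AAATA", "AAAAT", "AAAAA", "AAAAA", "AAAAA"]
print(tajimas_d(balanced), tajimas_d(singletons))
```

The first sample gives a positive D and the second a negative D, matching the interpretation of intermediate-frequency and low-frequency variants given in the text.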
Hydrophobicity profile and functional divergence analyses
Hydrophobicity is important for the chaperone activity of sHSP in cells of various organisms (Shearstone and Baneyx, 1999; Liang et al., 2000; Lindner et al., 2000; Chowdary et al., 2004). We therefore calculated the mean hydrophobicity based on Kyte and Doolittle's method (1982) using the BioEdit program, which reveals the pattern of mean hydrophobicity along the aligned amino acid sequences (Figure 4). A clear shift in the mean hydrophobicity was observed across the sliding windows of amino acid sequences not only between the two types of CPsHSPs but also among allelic variants of CPsHSP-1.
Figure 3. Sliding window plot of Tajima's D value. A window size of 50 nucleotides with a step size of 20 nucleotides was used for estimating Tajima's D using DnaSP 4.0 (Rozas et al., 2003). On top of the figure, the blank-boxed region indicates the methionine-rich domain, and the black-boxed region indicates the α-crystallin domain. An asterisk (*) on the point of a measurement indicates significance at P < 0.05. The pound symbol (#) on the point of a measurement indicates significance at P < 0.10.
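A mean hydrophobicity profile of the kind described above can be sketched as a sliding-window average over the Kyte and Doolittle (1982) hydropathy scale. In the example below, the seven-residue peptide and the window size of five are assumptions made for illustration only (the window used in BioEdit is not stated here); the Ala to Val and Ala to Glu replacements mirror two of the amino acid changes listed in Table 2.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobicity_profile(protein, window=5):
    """Mean Kyte-Doolittle hydropathy for each full window along the sequence."""
    scores = [KYTE_DOOLITTLE[aa] for aa in protein]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


# Hypothetical peptide with Ala at the fourth position, and two variants.
reference = "LSGAKDT"
variant_val = reference[:3] + "V" + reference[4:]   # Ala -> Val
variant_glu = reference[:3] + "E" + reference[4:]   # Ala -> Glu
for name, seq in [("Ala", reference), ("Val", variant_val), ("Glu", variant_glu)]:
    print(name, [round(x, 2) for x in hydrophobicity_profile(seq)])
```

Because every window here contains the substituted residue, the Val variant raises and the Glu variant lowers the whole profile relative to the reference, which is the kind of shift that Figure 4 displays along the real sequences.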
The likelihood ratio test of Gu (2001), as implemented in the DIVERGE software package (Gu and Velden, 2002), was used to examine whether proteins from different branches in the phylogenetic tree have different functional constraints, and whether they have functionally diverged. High posterior probabilities of functional divergence were found across the aligned amino acid sequences (data not shown), and the level of functional divergence between proteins was significant (theta ML value = 1.181563, P = 0.009) according to Wang and Gu (2001) when CPsHSP-1 and CPsHSP-2 were compared. This result suggests that these two types of CPsHSP are likely to have diverged in their biochemical functions.
discussion
Signature of balancing selection of CPsHSP-1
Maintenance of divergent alleles for an extended period of time can be caused by either balancing selection or population subdivision (Bamshad and Wooding, 2003). The presence of divergent haplotypes of CPsHSP-1 suggests that either balancing selection or population subdivision has occurred in our study populations. However, the occurrence of two or more intermediate-frequency alleles/haplotypes in different populations can mainly be attributed to balancing selection (Marjoram and Donnelly, 1994; Bamshad and Wooding, 2003). The presence and wide distribution of the three divergent alleles (CPsHSP-1A, CPsHSP-1C, and CPsHSP-1E) at intermediate frequencies in almost all of the populations examined shows that subdivision cannot account for the presence of these divergent lineages. A better explanation is that balancing selection has historically been active in the CPsHSP-1 region of the genome, or at a site in tight linkage disequilibrium with it.
Evidence that balancing selection may have been active in the CPsHSP-1 region is further supported by the test of Tajima's D statistic (Tajima, 1989). Tajima's D looks for a departure from a neutral evolution model by comparing two estimates of diversity, one based on the number of segregating sites (θ) and one based on the average pairwise nucleotide diversity (π), which are differently affected by natural selection. Some types of rare-allele advantage that raise allelic frequencies to an intermediate level may have been involved in balancing selection, thus causing a positive value of D. Under balancing selection, increased heterozygote reproductive success as well as increased homozygote mortality can lead to a heterozygote advantage. No homozygotes were found at the CPsHSP-1 locus in our samples. In spite of this lack of homozygotes, it is unclear how large the selective difference between homozygotes and heterozygotes at the CPsHSP-1 locus is, although it is thought to be relatively small (Nei and Hughes, 1991). However, a theoretical model favoring heterozygote advantages with multiple alleles that lead to balancing selection was proposed (Hedrick, 1997). Overall, the pattern observed in CPsHSP-1 points to strong balancing selection favoring heterozygotes, reflected in the large deficiency of homozygotes compared with neutral expectations.
Natural selection dictates the association of genetic variations and phenotypic traits. Thus, evidence that balancing selection has been active in CPsHSP-1 suggests that some kind of functional divergence likely exists in linkage disequilibrium with the genetic variants that distinguish the haplotypes. The polymorphism of these divergent haplotypes (CPsHSP-1A, CPsHSP-1C, and CPsHSP-1E) and three relatively common protein variants in our samples is consistent with balanced polymorphism, in that segregating variants have persisted in the populations for a prolonged period of time. Further, a lack of recombination within sequences of the CPsHSP-1 locus indicates that the allelic variants found did not evolve by gene conversion but likely occurred as true allelic variants under balancing selection. Recently, an episode of ancient genome-wide duplications in the basal angiosperm lineages including the Lauraceae was found (Cui et al., 2006), indicating the probable ancestral polymorphism of CPsHSP-1 in M. kusanoi. Moreover, since CPsHSP-1 proteins cluster with those of a gymnosperm (Picea glauca) and a eudicot (Euphorbia esula), we presume they have been around for a long time. Although these polymorphisms do not imply that all amino acid polymorphisms are adaptive, the high frequency and their probable persistence suggest that they are fitness oriented.
Signature of purifying selection of CPsHSP-2
A contrasting signature of natural selection at the CPsHSP loci might not always be distinguishable through an examination of DNA sequences. Taken at face value, the positive but nonsignificant value of Tajima's D suggests that CPsHSP-2 may also have been under some kind of balancing selection. This nonsignificant result may be attributable to the low number of polymorphisms observed, which weakens the test. The low number is probably related to the frequent elimination of deleterious alleles that followed divergence from CPsHSP-1 (Wu et al., 2007). However, the positive D value is probably caused by intermediate frequencies of the two haplotypes resulting from one synonymous polymorphic site. In fact, the constraint on CPsHSP-2, in which only a single nucleotide substitution was found and it caused no amino acid change, led to fixation of CPsHSP-2 at the amino acid level in all populations examined and is consistent with the notion that recently duplicated genes evolve under purifying selection (Lynch and Conery, 2000).
Alternatively, in a simulation study (Schierup et al., 2000), balanced polymorphism has been implicated in increasing Tajima's D statistic in linked neutral loci based on empirical evidence from the major histocompatibility complex system and the plant gametophytic self-incompatibility system. It is likely that the strong balancing selection on CPsHSP-1 might have raised the value of Tajima's D statistic in the linked CPsHSP-2 locus. Nevertheless, sequence conservation in CPsHSP-2 possibly indicates that purifying selection is most intense at the amino acid level. We previously reasoned that the major functional derivation of CPsHSP-2 was due to a shifting of the hydrophobicity profile across the amino acid sequences that may have been related to the expanding substrate specificity required, which is promoted by positive selection (Wu et al., 2007). Thus, this example shows that natural selection is able to favor amino acid replacements (positive selection) when one protein first evolves, and that selective constraints then preserve those presumably functionally important copies of the CPsHSP gene. The inference of purifying selection on the CPsHSP-2 gene suggests that its evolution was not a functional redundancy and that it may have achieved fixation due to its fitness advantage (Nowak et al., 1997).
Concerted evolution caused by gene conversion that results in gene homogenization has been demonstrated to be relatively common in tandemly duplicated genes in yeast (Gao and Innan, 2004). Gene conversion resulting in interlocus gene homogenization is reported in highly repeated, tandemly arranged ribosomal RNA genes (Baldwin et al., 1995; Leister, 2004). Infrequent gene conversion has been found to have homogenized some of the class I cytosolic sHSP gene sequences (Waters, 1995). The topology obtained in the phylogenetic tree of a previous study (Wu et al., 2007) and of this study showed that alleles of CPsHSP-1 or of CPsHSP-2 formed different clusters, indicating that they are more closely related between than within species. Interlocus homogenization of CPsHSP-1 and CPsHSP-2 sequences due to the conserved domain in the gene within species is less likely or less frequent because of the high levels of K2P and JTT divergence in the nucleotide and amino acid sequences, even though a significant linkage between CPsHSP-1 and CPsHSP-2 was detected. Further, no recombination or gene conversion event was detected using any of the methods implemented in the RDP2 package. Sequence homogeneity attained both at the nucleotide and amino acid levels in CPsHSP-2 has most probably been maintained by strong purifying selection. In this case, any polymorphism observed represents a balance between the generation of new alleles via mutations and the purging of those alleles via purifying selection.
Functional divergence of CPsHSPs related to hydrophobicity shift
Theoretically, the potential for evolving new gene functions rests on the notion that gene duplication per se is selectively neutral and one of the gene copies is freed from selective constraints (Ohno, 1970). Duplicated genes freed from functional constraints can evolve faster and obtain new functions, which provide safeguards against environmental fluctuations in the geographic landscape. The great diversity found between CPsHSP-1 and CPsHSP-2 at both the nucleotide and amino acid levels may be an indication of differential evolution between these two gene copies. The larger ranges of within-locus diversity of CPsHSP-1 at both the nucleotide and amino acid levels indicate alternative functional fates of the alleles, which might have been influenced by balancing selection and provided different substrate specificities for CPsHSP-1 proteins. This implication is supported by the analysis of the hydrophobicity profile (Figure 4). The possible alternative functional fates of CPsHSP-1 alleles suggest that the novelty of functional divergence of these alleles for different substrate specificities dictated the evolution of CPsHSP-1. From the results of the hydrophobicity profile analyses, it is likely that the majority of changes in the ACD have been directed at shifting the hydrophobicity. According to our data, changes in the hydrophobicity profile of CPsHSP-1 proteins, which elevated hydrophobicity in some regions of the ACD and reduced it in others, might have played important roles in adjusting to the environmental landscape, although stochastic processes cannot be ruled out. The populations examined are each restricted to a small geographic region and thus represent the variation of local environments in Taiwan's rugged topography. On the other hand, the question of whether the genetic variation behind the hydrophobicity changes in CPsHSPs is associated with environmental factors or geographical structuring, such as elevation, requires further investigation.
Acknowledgements. This study was supported by grants from the National Science Council (NSC94-22621-B-034-001 and NSC95-22621-B-034-001), Executive Yuan, Taiwan. The authors are grateful to Ji-Shen Wu, Division of Silviculture, Taiwan Forestry Research Institute, for assistance with sample collection.
LITERATURE CITED
Bamshad, M. and S. Wooding. 2003. Signatures of natural selection in the human genome. Nat. Rev. Genet. 4: 99-111.
Baldwin, B.G., M.J. Sanderson, J.M. Porter, M.F. Wojciechowski, C.S. Campbell, and M.J. Donoghue. 1995. The ITS region of nuclear ribosomal DNA: a valuable source of evidence on angiosperm phylogeny. Ann. Miss. Bot. Gard. 82: 247-277.
Barua, D., S.A. Heckathorn, and C.A. Downs. 2003. Variation in chloroplast small heat-shock protein function is a major determinant of variation in thermotolerance of photosynthetic electron transport among ecotypes of Chenopodium album. Funct. Plant Biol. 30: 1071-1079.
Chen, Q. and E. Vierling. 1991. Analysis of conserved domains identifies a unique structural feature of a chloroplast heat shock protein. Mol. Gen. Genet. 226: 425-431.
Chowdary, T.K., B. Raman, T. Ramakrishna, and C.M. Rao. 2004. Mammalian Hsp22 is a heat-inducible small heat-shock protein with chaperone-like activity. Biochem. J. 381: 379-387.
Cui, L., P.K. Wall, J.H. Leebens-Mack, B.G. Lindsay, D.E. Soltis, J.J. Doyle, P.S. Soltis, J.E. Carlson, K. Arumuganathan, A. Barakat, V.A. Albert, H. Ma, and C.W. de Pamphilis. 2006. Widespread genome duplications throughout the history of flowering plants. Genome Res. 16: 738-749.
Doyle, J.J. and J.L. Doyle. 1987. A rapid DNA isolation procedure for small quantities of fresh leaf material. Phytochem. Bull. 19: 11-15.
Dykhuizen, D.E., A.M. Dean, and D.L. Hartl. 1987. Metabolic flux and fitness. Genetics 115: 25-31.
Dykhuizen, D.E. and D.L. Hartl. 1980. Selective neutrality of 6pgd allozymes in Escherichia coli and the effects of genetic background. Genetics 96: 801-817.
Eirin-Lopez, J.M., L.J. Frehlick, and J. Ausio. 2006. Long-term evolution and functional diversification in the members of the nucleophosmin/nucleoplasmin family of nuclear chaperones. Genetics 173: 1835-1850.
Farris, J.S. 1989. The retention index and the rescaled consistency index. Cladistics 5: 417-419.
Felsenstein, J. 1985. Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39: 783-791.
Ganea, E. 2001. Chaperone-like activity of α-crystallin and other small heat shock proteins. Curr. Prot. Pep. Sci. 2: 205-225.
Gao, L.Z. and H. Innan. 2004. Very low gene duplication rate in the yeast genome. Science 306: 1367-1370.
Gu, X. and K.V. Velden. 2002. DIVERGE: phylogeny-based analysis for functional-structural divergence of a protein family. Bioinformatics 18: 500-501.
Hall, T.A. 1999. BIOEDIT: a user-friendly biological sequence alignment, editor and analysis program for Windows 95/98/NT. Nucl. Acids Symposium Ser. 41: 95-98.
Hartl, D.L., D.E. Dykhuizen, and A.M. Dean. 1985. Limits of adaptation: the evolution of selective neutrality. Genetics 111: 655-674.
Heckathorn, S.A., J.K. Mueller, S. LaGuidice, B. Zhu, T. Barrett, B. Blair, and Y. Dong. 2004. Chloroplast small heat-shock proteins protect photosynthesis during heavy metal stress. Am. J. Bot. 91: 1312-1318.
Hedrick, P.W. 1997. Neutrality or selection? Nature 387: 138.
Jones, D.T., W.R. Taylor, and J.M. Thornton. 1992. A new approach to protein fold recognition. Nature 358: 86-89.
Kimura, M. 1980. A simple method for estimating evolutionary rate of base substitution through comparative studies of nucleotide sequences. J. Mol. Evol. 16: 111-120.
Kluge, A.G. and J.S. Farris. 1969. Quantitative phyletics and the evolution of Anurans. System. Zool. 18: 1-32.
Knight, C.A. and D.D. Ackerly. 2001. Correlated evolution of chloroplast heat shock protein expression in closely related plant species. Am. J. Bot. 88: 411-418.
Kondrashov, F.A. and A.S. Kondrashov. 2006. Role of selection in fixation of gene duplications. J. Theor. Biol. 239: 141-151.
Kreitman, M. and A. Di Rienzo. 2004. Balancing claims for balancing selection. Trends Genet. 20: 300-304.
Kumar, S., K. Tamura, and M. Nei. 2004. MEGA3: integrated software for molecular evolutionary genetics analysis and sequence alignment. Brief. Bioinform. 5: 150-163.
Kyte, J. and R. Doolittle. 1982. A simple method for displaying the hydropathic character of a protein. J. Mol. Biol. 157: 105-132.
Leister, D. 2004. Tandem and segmental gene duplication and recombination in the evolution of plant disease resistance genes. Trends Genet. 20: 116-122.
Liang, J.J., T.X. Sun, and N.J. Akhtar. 2000. Heat-induced conformational change of human lens recombinant αA- and αB-crystallins. Mol. Vis. 6: 10-14.
Lindner, R.A., J.A. Carver, M. Ehrnsperger, J. Buchner, G. Esposito, J. Behlke, G. Lutsch, A. Kotlyarov, and M. Gaestel. 2000. Mouse Hsp25, a small heat shock protein: the role of its C-terminal extension in oligomerization and chaperone action. Eur. J. Biochem. 267: 1923-1932.
Liu, Y.C., F.Y. Lu, and C.H. Ou. 1994. Trees of Taiwan. College of Agriculture, National Chung Hsing University, Taichung, Taiwan.
Lynch, M. and J.S. Conery. 2000. The evolutionary fate and consequences of duplicate genes. Science 290: 1151-1155.
Marjoram, P. and P. Donnelly. 1994. Pairwise comparisons of mitochondrial DNA sequences in subdivided populations and implications for early human evolution. Genetics 136: 673-683.
Martin, D.P., C. Williamson, and D. Posada. 2005. RDP2: recombination detection and analysis from sequence alignments. Bioinformatics 21: 260-262.
Moore, R.C. and M.D. Purugganan. 2003. The early stages of duplicate gene evolution. Proc. Natl. Acad. Sci. USA 100: 15682-15687.
Nei, M. 1987. Molecular Evolutionary Genetics. Columbia University Press, New York, 512 pp.
Nei, M. and A. Hughes. 1991. Polymorphism and evolution of the major histocompatibility complex loci in mammals. In R.K. Selander, A.G. Clark and T.S. Whittam (eds.), Evolution at the Molecular Level, Sinauer Associates, pp. 222-247.
Nowak, M.A., M.C. Boerlijst, J. Cooke, and J. Maynard Smith. 1997. Evolution of genetic redundancy. Nature 388: 167-171.
Ohno, S. 1970. Evolution by gene duplication. Heidelberg, Germany, Springer-Verlag.
Rozas, J., J.C. Sanchez-DelBarrio, X. Messeguer, and R. Rozas. 2003. DnaSP, DNA polymorphism analyses by the coalescent and other methods. Bioinformatics 19: 2496-2497.
Schierup, M.H., D. Charlesworth, and X. Vekemans. 2000. The effect of hitch-hiking on genes linked to a balanced polymorphism in a subdivided population. Genet. Res. 76: 63-73.
Schneider, S., D. Roessli, and L. Excoffier. 2000. ARLEQUIN, Version 2.000: A Software for Population Genetic Data Analysis. Genetics and Biometry Laboratory, University of Geneva, Switzerland.
Sharma, K.K., H. Kaur, G.S. Kumar, and K. Kester. 1998. Interaction of 1,1'bi(4-anilino)naphthalene-5,5'-disulfonic acid with α-crystallin. J. Biol. Chem. 273: 8965-8970.
Shearstone, J. and F. Baneyx. 1999. Biochemical characterization of the small heat shock protein IbpB from Escherichia coli. J. Biol. Chem. 274: 9937-9945.
Sorensen, J.G., T.N. Kristensen, and V. Loeschcke. 2003. The evolutionary and ecological role of heat shock proteins. Ecol. Lett. 6: 1025-1037.
Sun, W., M. Van Montagu, and N. Verbruggen. 2002. Small heat shock proteins and stress tolerance in plants. Biochim. Biophys. Acta 1577: 1-9.
Swofford, D.L. 2002. PAUP* Phylogenetic Analysis Using Parsimony (* and other methods), Ver. 4. Sinauer Associates, Sunderland, MA.
Tajima, F. 1989. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123: 585-595.
Thompson, J.D., T.J. Gibson, F. Plewniak, F. Jeanmougin, and D.G. Higgins. 1997. The Clustal X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25: 4876-4882.
Vierling, E. 1991. The roles of heat shock proteins in plants. Annu. Rev. Plant Physiol. Plant Mol. Biol. 42: 579-620.
Walsh, B. 2003. Population-genetic models of the fates of duplicate genes. Genetica 118: 279-294.
Wang, Y. and X. Gu. 2001. Predicting functional divergence of caspase gene family. Genetics 158: 1311-1320.
Wang, D. and D.S. Luthe. 2003. Heat sensitivity in a bentgrass variant. Failure to accumulate a chloroplast heat shock protein isoform implicated in heat tolerance. Plant Physiol. 133: 319-327.
Waters, E.R. 1995. The molecular evolution of the small heat-shock proteins in plants. Genetics 141: 785-795.
Wu, M.L., T.P. Lin, M.Y. Lin, Y.P. Cheng, and S.Y. Hwang. 2007. Divergent evolution of the chloroplast small heat shock protein in the genera Rhododendron (Ericaceae) and Machilus (Lauraceae). Ann. Bot. 99: 461-475.
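Geographical variation and differences in selection modes of paralogous chloroplast small heat shock protein genes in Machilus kusanoi
Jeng-Der CHUNG1, Tsan-Piao LIN2, Yu-Pin CHENG3, and Shih-Ying HWANG4
1Division of Silviculture, Taiwan Forestry Research Institute, Executive Yuan
2Institute of Plant Biology, National Taiwan University
3Division of Forest Biology, Taiwan Forestry Research Institute, Executive Yuan
4Department of Life Science, National Taiwan Normal University
The hydrophobicity of amino acids in the α-crystallin domain of small heat shock proteins is important for polypeptide binding and plays a role in protecting substrate proteins. We took a population genetic approach in Machilus kusanoi to understand the geographical variation of chloroplast small heat shock protein genes and to explore the role played by selective pressure. For the two paralogous small heat shock protein genes of M. kusanoi (CPsHSP-1 and CPsHSP-2), we found that five haplotypes of CPsHSP-1 arose from synonymous and nonsynonymous substitutions, whereas CPsHSP-2 had only two haplotypes, formed by a synonymous substitution. The sequence conservation of CPsHSP-2 may be related to purifying selection. In contrast, CPsHSP-1 showed different combinations of haplotypes in different populations, and its Tajima's D was significantly positive, implying that CPsHSP-1 has been influenced by balancing selection. Analysis of hydrophobicity variation and tests of functional difference between the two paralogs, CPsHSP-1 and CPsHSP-2, indicated functional divergence between them. The amino acid hydrophobicity of the different CPsHSP-1 haplotypes also varied, which may be related to an expansion of the range of their substrate proteins.
Keywords: Chloroplast small heat shock protein; Gene duplication; Hydrophobicity change; Machilus kusanoi; Paralogous genes.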