In multilocus genetic association studies of complex diseases, a powerful and highly efficient tool for analysing linkage disequilibrium (LD) between markers, haplotype distributions and many chi-square/p values in a large number of samples has long been sought. To obtain meaningful results directly from raw data, we developed a robust and user-friendly software platform with a series of efficient tools for association-study analysis. The platform has been evaluated on several sets of real data.
Linkage disequilibrium (LD) can be applied to map the genes responsible for variation in economically important traits through association mapping. The feasibility and efficacy of association studies depend strongly on the extent of LD, which determines the number and density of markers required in the studied population as well as the experimental design of an association analysis. In this study, we first characterized the extent of LD in a wild population and a cultured mass-selected line of Pacific oyster (Crassostrea gigas). A total of 88 wild and 96 cultured individuals were genotyped with 53 microsatellites to assess the level of genome-wide LD. For syntenic marker pairs, no significant association was observed in the wild population; however, three significant associations occurred in the cultured population, and significant LD extended up to 12.7 cM, indicating that strong artificial selection is a key force behind the substantial increase of genome-wide LD in the cultured population. The difference in LD between wild and cultured populations showed that association studies in Pacific oyster can be achieved with reasonable marker densities at relatively low cost by choosing an appropriate association mapping population. Furthermore, the frequent occurrence of LD between non-syntenic loci and rare alleles encourages the joint application of linkage analysis and LD mapping when mapping genes in oyster. The information on LD in the cultured population is useful for future association mapping in oyster.
In this study, we propose using principal component analysis (PCA) and a regression model to incorporate linkage disequilibrium (LD) into genomic association data analysis. To accommodate LD in genomic data and reduce multiple testing, we suggest performing PCA and extracting the principal component score to capture the variation in the genomic data, after which regression analysis is used to assess the association of the disease with the principal component score. An empirical analysis shows that a genotype-based correlation matrix and a haplotype-based LD matrix produce similar PCA results. The principal component score appears to be more powerful in detecting genetic association because it is measured quantitatively and may capture the effect of multiple loci.
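The two-step idea in the abstract above (PCA on the markers, then regression on the component score) can be illustrated with a minimal sketch. It is not the authors' implementation: the toy genotypes, the quantitative phenotype, the function names and the use of ordinary least squares are all assumptions made for brevity, and a logistic model would be the more usual choice for case/control disease status.

```python
import math


def center_columns(rows):
    """Subtract each column mean so every marker has mean zero."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def first_pc(genotypes, iterations=200):
    """Return (scores, loadings) of the first principal component.

    Uses power iteration on the sample covariance matrix. The all-equal
    start vector is a simplifying assumption that suits positively
    correlated markers such as SNPs in LD.
    """
    x = center_columns(genotypes)
    n, m = len(x), len(x[0])
    cov = [[sum(x[k][i] * x[k][j] for k in range(n)) / (n - 1)
            for j in range(m)] for i in range(m)]
    v = [1.0 / math.sqrt(m)] * m
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            break
        v = [c / norm for c in w]
    scores = [sum(a * b for a, b in zip(row, v)) for row in x]
    return scores, v


def simple_regression(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((a - mean_x) ** 2 for a in xs)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical toy data: 6 individuals x 3 SNPs coded as allele counts (0/1/2).
genotypes = [[0, 0, 0], [1, 1, 0], [2, 2, 1], [0, 0, 1], [1, 1, 2], [2, 2, 2]]
phenotype = [1.0, 2.1, 3.9, 1.5, 3.2, 4.8]

scores, loadings = first_pc(genotypes)
slope, intercept = simple_regression(scores, phenotype)
print(f"PC1 loadings: {[round(c, 3) for c in loadings]}")
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
```

Regressing on one score instead of three separate SNPs replaces three tests with one, which is the reduction in multiple testing the abstract refers to.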
To investigate the distribution characteristics and linkage disequilibrium of T cell immunoglobulin domain and mucin domain protein 4 (TIM4) promoter polymorphisms in asthma patients of the Chinese Han population, the promoter region of TIM4 was re-sequenced by PCR-sequencing, and linkage disequilibrium was analyzed with SHEsis software. Four single nucleotide polymorphisms (SNPs) were detected in the promoter region of TIM4, including two new SNPs (at positions -1609 and -153) and two reported SNPs (rs6874202 and rs6882076). The frequency distribution of rs6882076 differed among races (P<0.05). In addition, linkage disequilibrium was found among the SNPs of the TIM4 promoter region, and GGTG was the predominant haplotype. In asthma patients of the Chinese Han population, the TIM4 promoter region therefore carried four SNPs, which were in linkage disequilibrium.
A novel method for haplotype phasing in families after joint estimation of recombination fraction and linkage disequilibrium (LD) is developed. Results from Monte Carlo computer simulations show that the newly developed EM algorithm is accurate if the true recombination fraction is 0, even for single families of relatively small size. Estimates of recombination fraction and LD were 0.00 (SD 0.00) and 0.19 (SD 0.03) for simulated values of 0.00 and 0.20, respectively. A genome fragmentation phasing strategy was developed and used to phase haplotypes in a sire and 36 progeny genotyped with the 50k Illumina BeadChip by: a) estimating the recombination fraction and LD between consecutive SNPs using family information, b) linkage analysis between fragments, and c) phasing haplotypes in parents and progeny and in following generations. Homozygous SNPs in progeny allowed paternal fragment inheritance to be determined and the SNP sequence of the haplotypes inherited from dams to be deduced. The strategy also allowed genotyping errors to be detected. A total of 613 recombination events were detected after linkage analysis between fragments. Recombination hot and cold spots were identified at the individual (sire) level. Over 90% of SNPs for which both sire and calf were heterozygous became informative after haplotype phasing. The average of regions of identity between half-sibs, comparing their maternally inherited haplotypes (with at least 20 SNPs in common), was 0.11, with a maximum of 0.29 and a minimum of 0.05. A Monte Carlo simulation of BTA1 with the same LD structure and genetic linkage as the cattle family yielded 99.98% and 99.94% correct phases for informative SNPs in sire and calves, respectively.
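To make the role of an EM algorithm in haplotype inference concrete, the sketch below uses a simpler, different technique from the one in the abstract: the classical EM estimate of two-locus haplotype frequencies from unphased genotypes in unrelated individuals. It uses no family information and does not estimate a recombination fraction. The function name, the genotype coding and the toy counts are assumptions.

```python
def em_haplotype_freqs(genotype_counts, iterations=100):
    """Estimate frequencies of haplotypes AB, Ab, aB, ab by EM.

    genotype_counts maps (copies of allele A, copies of allele B) to the
    number of individuals with that unphased two-locus genotype.
    Only double heterozygotes (1, 1) have ambiguous phase.
    """
    n = sum(genotype_counts.values())
    fixed = {"AB": 0.0, "Ab": 0.0, "aB": 0.0, "ab": 0.0}
    n_double_het = genotype_counts.get((1, 1), 0)
    for (g1, g2), count in genotype_counts.items():
        if (g1, g2) == (1, 1):
            continue
        # With at least one homozygous locus the haplotype pair is determined.
        first = [g1 >= 1, g1 == 2]   # allele A present on haplotype 1, 2
        second = [g2 >= 1, g2 == 2]  # allele B present on haplotype 1, 2
        for has_a, has_b in zip(first, second):
            key = ("A" if has_a else "a") + ("B" if has_b else "b")
            fixed[key] += count

    freqs = {k: 0.25 for k in fixed}
    for _ in range(iterations):
        # E-step: probability that a double heterozygote is AB/ab (cis).
        cis = freqs["AB"] * freqs["ab"]
        trans = freqs["Ab"] * freqs["aB"]
        w = cis / (cis + trans) if cis + trans > 0 else 0.5
        # M-step: recount haplotypes with expected phase assignments.
        counts = dict(fixed)
        counts["AB"] += n_double_het * w
        counts["ab"] += n_double_het * w
        counts["Ab"] += n_double_het * (1 - w)
        counts["aB"] += n_double_het * (1 - w)
        freqs = {k: c / (2 * n) for k, c in counts.items()}
    return freqs


# Hypothetical sample: 30 AABB, 30 aabb and 40 double heterozygotes.
freqs = em_haplotype_freqs({(2, 2): 30, (0, 0): 30, (1, 1): 40})
print({k: round(v, 4) for k, v in freqs.items()})
```

In this toy sample the unambiguous individuals carry only AB and ab haplotypes, so the iterations push the double heterozygotes toward the AB/ab phase.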
Quantitative trait loci (QTL) and their additive, dominance and epistatic effects play a critical role in complex trait variation. Detecting multiple interacting QTL is often infeasible because main effects are frequently confounded with interaction effects, and positioning interacting QTL within a small region is even more difficult. We present a variance component approach nested in an empirical Bayesian method, which simultaneously takes into account additive, dominance and epistatic effects due to multiple interacting QTL. The covariance structure used in the variance component approach is based on combined linkage disequilibrium and linkage (LDL) information. In a simulation study with complex epistatic interactions between QTL, the proposed approach made it possible to fine map interacting QTL simultaneously. Combined with LDL information, the method can efficiently detect QTL and their dominance and epistatic effects, making it possible to fine map main-effect and epistatic QTL at the same time.
Bamboos are among the most beautiful and useful plants on Earth. The genetic background and population structure of bamboos are well known, which helps accelerate the artificial domestication of bamboo. Partial sequences of six genes involved in nitrogen use efficiency in 32 bamboo species were analyzed for the occurrence of single nucleotide polymorphisms (SNPs). The nucleotide diversity θw and total nucleotide polymorphism πT of the sequenced DNA regions were 0.05137 and 0.03332, respectively. Both πnonsyn/πsyn and Ka/Ks values were <1. The nucleotide sequences of these six genes were inferred to be relatively conserved, and haplotype diversity was relatively high. Evolutionary neutrality tests showed that the six genes were in line with neutral evolution, and that the NRT2.1 and AMT2.1 gene sequences may have experienced negative selection. The inter-SNP recombination estimate for the NRT2.1 gene in the pooled sample of all 32 bamboo species was the lowest, at 0.0645, whereas the estimates for the AMT gene were all >0.1. Estimation and analysis of linkage disequilibrium in five genes revealed that the degree of SNP linkage disequilibrium decreased rapidly as nucleotide sequence length increased. We inferred the population genetic structure of the 32 bamboo species from the SNP loci of the six genes with frequencies >18%. The 32 species were divided into five categories, which indicated that the combined population of all bamboo species was heterogeneous with obvious multivariate characteristics; red (Group 1) and green (Group 2) were the main groups.
In this article, using likelihood score theory extended to nuisance parameters, we derive a new homogeneity score test for comparing linkage disequilibrium across several strata. Power and sample size formulae are also obtained.
Cultivated barley is known to have a complex population structure and extensive linkage disequilibrium (LD). To conduct robust association mapping (AM) studies of economically important traits in US barley breeding germplasm, population structure and LD decay were examined in a complete panel of US barley breeding germplasm (3840 lines) genotyped with 3072 single nucleotide polymorphisms (SNPs). Nine subpopulations (sp1 to sp9) were identified by the program STRUCTURE and subsequently confirmed by principal component analysis (PCA). Of the nine subpopulations, seven were very similar to the respective subpopulations identified by Hamblin et al. (2010), which were based on half of the germplasm and half of the SNP markers, but two subpopulations were new. One was dominated by six-rowed spring lines from Utah State University (UT), and the other was composed of six-rowed spring lines from multiple breeding programs (USDA-ARS Aberdeen (AB), Busch Agricultural Resources Inc. (BA), UT, and Washington State University (WA)). LD was found to decay across a range from 4.0 to 19.8 cM. This result indicates that the germplasm genotyped with 3072 SNPs would be robust for mapping and possibly identifying the causal polymorphisms contributing to disease resistance and perhaps other traits.
Objectives To formulate an equation for fine mapping of disease loci under complex conditions and to determine the marker-disease distance in a specific case using this equation. Methods Lewontin's linkage disequilibrium (LD) measure D′ was used to formulate an equation for mapping disease genes in the presence of phenocopies, locus heterogeneity, gene-gene and gene-environment interactions, incomplete penetrance, uncertain liability and threshold, incomplete initial LD, natural selection, recurrent mutation, high disease allele frequency and unknown mode of inheritance. This equation was then used to determine the distance between a marker (ε4 within the apolipoprotein E gene, APOE) and Alzheimer's disease (AD) loci using published data. Results An equation was formulated for mapping disease genes under the above conditions. If these conditions are present but ignored, the recombination fraction θ between marker and disease loci will be either overestimated or estimated with little bias. Therefore, an upper limit of θ can be obtained. AD has been found to be associated with the marker allele ε4 in Africans, Asians, and Caucasians. This suggests that the AD-ε4 allelic LD predates the divergence of peoples that occurred 100,000 years ago. With the age of the AD-ε4 allelic LD so estimated, the maximal distance was calculated to be 23.2 kb (mean 5.8 kb). Conclusions (1) A method is developed for LD mapping of susceptibility genes. (2) A mutation within the APOE gene itself, among others, is responsible for the susceptibility to AD, which is supported by recent evidence from studies using transgenic mice.
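For readers unfamiliar with the quantities involved, the following sketch computes the standard pairwise LD coefficient D, Lewontin's normalized D′ and r² from haplotype and allele frequencies, plus the textbook decay of LD under recombination, D_t = D_0(1 − θ)^t. It does not reproduce the mapping equation of the abstract, which handles many additional conditions; function and parameter names are assumptions.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (D, D', r2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A (locus 1) and allele B (locus 2)
    """
    d = p_ab - p_a * p_b
    # D' scales D by its largest attainable magnitude given allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


def ld_after_generations(d0, theta, generations):
    """Textbook decay of LD with recombination fraction theta per generation."""
    return d0 * (1 - theta) ** generations


# Complete LD: only AB and ab haplotypes exist, each at frequency 0.5.
print(ld_measures(0.5, 0.5, 0.5))
# Linkage equilibrium: the haplotype frequency equals the product of allele frequencies.
print(ld_measures(0.25, 0.5, 0.5))
```

The decay relation shows why an old association that is still detectable implies a small θ, which is the intuition behind using the age of an allelic association to bound the marker-disease distance.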
With the completion of the Populus genome sequencing project and the availability of many expressed sequence tag (EST) databases for forest trees, attention is now rapidly shifting towards the study of individual genetic variation in natural populations. The most abundant form of genetic variation in many eukaryotic species is the single nucleotide polymorphism (SNP), which can account for heritable inter-individual differences in complex phenotypes. In forest trees, unlike in humans, linkage disequilibrium (LD) decays rapidly within candidate genes. SNP-based candidate gene association studies are therefore considered a highly effective approach to dissecting complex quantitative traits in forest trees. The present study demonstrates that LD mapping can be used to identify alleles associated with quantitative traits and suggests that this approach could be particularly useful for breeding programs in forest trees. In this review, we describe the fundamentals and the patterns of SNP distribution and frequency, summarize recent advances in SNP discovery and LD, and comment on the application of LD in the dissection of complex quantitative traits in forest trees. We also give an outlook on future SNP-based association analysis of quantitative traits in forest trees.
The inference of genome ancestry and the estimation of molecular relatedness are of great importance for breeding efficiency and association studies. Seventy SSR loci, evenly distributed across the 10 chromosomes, were assayed for polymorphism among 187 commonly used maize (Zea mays L.) inbreds that represent the genetic diversity in China. The 290 alleles identified served as raw data for estimating population structure using the coalescent linked-loci approach, based on the ADMIXTURE model. The population number, K, was inferred to be between five and seven. Specifying five subpopulations (K = 5) led to a distinct decrease in the likelihood value, and specifying K greater than six resulted in only minimal increases. Therefore, the lines were assigned to six subpopulations: PA, BSSS (includes Reid), PB, Lan (Lancaster Sure Crop), LRC (Luda Red Cob, a Chinese landrace, and its derivatives), and SPT (Si-ping-tou, a Chinese landrace, and its derivatives). The Kullback-Leibler distances between pairs of subpopulations were also inferred, as was the n × p (187 × 6) Q matrix, which gives the percentage genetic composition from the six subpopulations and the molecular relatedness of each line. The genome-wide linkage disequilibrium (LD) indicated that association studies of QTLs and/or candidate genes might avoid nonfunctional and spurious associations, as most of the LD blocks were broken among diverse germplasm. The defined population structure provides a clear genetic structure of these lines for breeding practice and establishes a good basis for association analysis.
Most modern wheat cultivars were selected on the basis of yield-related indices measured under optimal fertilizer and irrigation inputs. With climate change, land degradation and salinity caused by sea water encroachment, wheat is increasingly subjected to environmental stress. Moreover, expanding urbanization increasingly encroaches upon prime agricultural land in countries like China, and alternative cropping areas must be found. Some of these areas have moderate constraining factors, such as salinity. It is therefore important to investigate whether current genetic materials and breeding procedures maintain adequate variability to address future problems caused by abiotic stress. In this study, a panel of 307 wheat accessions, including local landraces, exotic cultivars used in Chinese breeding programs and Chinese cultivars released during different periods since 1940, was subjected to a genome-wide association study to dissect the genetic basis of salinity tolerance. Both marker-based and pedigree-based kinship analyses revealed that favorable haplotypes were introduced in some exotic cultivars as well as in a limited number of Chinese landraces from the 1940s. However, improvements in salinity tolerance during modern breeding are not as obvious as those in yield. To broaden genetic diversity for increasing salt tolerance, attention needs to be refocused on local landraces that have high degrees of salinity tolerance and carry rare favorable alleles that have not been exploited in breeding.
Background: Different production systems and climates could lead to genotype-by-environment (G × E) interactions between populations, and the inclusion of G × E interactions is becoming essential in breeding decisions. The objective of this study was to investigate the performance of multi-trait models in genomic prediction in a limited number of environments with G × E interactions. Results: In total, 2,688 and 1,384 individuals with growth and reproduction phenotypes, respectively, from two Yorkshire pig populations with similar genetic backgrounds were genotyped with the PorcineSNP80 panel. Single- and multi-trait models with genomic best linear unbiased prediction (GBLUP) and BayesCπ were implemented to investigate their genomic prediction abilities with 20 replicates of five-fold cross-validation. The between-environment genetic correlations of growth and reproductive traits (ranging from 0.618 to 0.723) indicated the existence of G × E interactions between these two Yorkshire pig populations. For single-trait models, genomic prediction with GBLUP was only 1.1% more accurate on average in the combined population than in single populations, and no significant improvements were obtained by BayesCπ for most traits. In addition, single-trait models with either GBLUP or BayesCπ produced greater bias for the combined population than for single populations. However, multi-trait models with GBLUP and BayesCπ better accommodated G × E interactions, yielding 2.2% to 3.8% and 1.0% to 2.5% higher prediction accuracies for growth and reproductive traits, respectively, compared with single-trait models of single populations and the combined population. The multi-trait models also yielded lower bias and larger gains in the case of a small reference population. The smaller improvement in prediction accuracy and larger bias obtained by the single-trait models in the combined population were mainly due to the low consistency of linkage disequilibrium between the two populations, which also caused the BayesCπ method to always produce the largest standard error in marker effect estimation for the combined population. Conclusions: Our findings confirmed that directly combining populations to enlarge the reference population is not efficient in improving the accuracy of genomic prediction in the presence of G × E interactions, while multi-trait models perform better in a limited number of environments with G × E interactions.
Aim: To complete a comprehensive haplotype analysis of USP26 in both fertile and infertile men. Methods: Two hundred infertile men with severe oligospermia or non-obstructive azoospermia were subjected to sequence analysis of the entire coding sequence of the USP26 gene. Two hundred men with proven fertility were genotyped by primer extension methods. Allele/genotype frequencies, linkage disequilibrium (LD) characteristics and haplotypes of fertile men were compared with those of infertile men. Results: The allele frequencies of five single nucleotide polymorphisms (370-371insACA, 494T>C, 576G>A, ss6202791C>T, 1737G>A) were significantly higher in infertile patients than in control subjects. The major haplotypes in infertile men were TACCGA (28% of the population), TGCCGA (15%), TACCAA (8%), TGCCAA (6%), TATCAA (5%) and CATCAA (5%). The major haplotypes in the control subjects were TACCGA (58% of the population), CACCGA (7%), CATCGA (6%) and TGCCGA (5%). Haplotypes TGCCGA, TATCAA, CATCAA, CATCGC, TACCAA and TGCCAA were over-transmitted in patients with spermatogenic defect, whereas haplotypes TACCGA, CACCGA, and CATCGA were under-transmitted in these patients. Conclusion: Some USP26 alleles and haplotypes are associated with spermatogenic defect in the Han nationality in Taiwan, China.
Association mapping is a useful tool for detecting genes selected during plant domestication based on their linkage disequilibrium (LD). This study was carried out to estimate genetic diversity, population structure and the extent of LD, and thereby to develop an association framework for identifying genetic variations associated with drought and salt tolerance traits. A total of 106 microsatellite marker primer pairs were used in 323 Gossypium hirsutum germplasm accessions, which were grown in a drought shed and a salt pond for evaluation. Polymorphism (PIC = 0.53) was found, and three groups were detected (K = 3) using the second-order likelihood statistic ΔK in STRUCTURE software. LD decay distances were estimated to be 13-15 cM at an r² of 0.20. Significant associations between polymorphic markers and drought and salt tolerance traits were observed using the general linear model (GLM) and the mixed linear model (MLM) at the 0.01 significance level. The results also demonstrated that association mapping that accounts for the population structure and stratification existing in cotton germplasm resources could complement and enhance quantitative trait loci (QTL) information for marker-assisted selection.
Recent advances in high-throughput sequencing technologies have revolutionized the field of population genetics. Data now routinely contain genomic level polymorphism information, and the low cost of DNA sequencing enables researchers to investigate tens of thousands of subjects at a time. This provides an unprecedented opportunity to address fundamental evolutionary questions, while posing challenges on traditional population genetic theories and methods. This review provides an overview of the recent methodological developments in the field of population genetics, specifically methods used to infer ancient population history and investigate natural selection using large-sample, large-scale genetic data. Several open questions are also discussed at the end of the review.
BACKGROUND: Tumor necrosis factor receptor associated factor (TRAF) 6 is an important intracellular adapter protein that plays a pivotal role in activating multiple inflammatory and immune-related processes induced by cytokines. TRAF6 represents a strong candidate susceptibility factor for sepsis. We investigated whether polymorphisms in the TRAF6 gene are associated with the susceptibility to and severity of sepsis. METHODS: A hospital-based case-control study was conducted with 255 patients with sepsis and 260 controls recruited from Zhengzhou, China. Haplotype tagging single nucleotide polymorphisms (htSNPs) were selected from the HapMap database and genotyped using the SNPstream genotyping platform. Associations with the susceptibility to and severity of sepsis were estimated by logistic regression and adjusted for age, sex, smoking, drinking, chronic disease status, APACHE II score and critical illness status. RESULTS: A total of 13 TRAF6 SNPs were tagged by 7 htSNPs. Five htSNPs (rs5030490, rs5030411, rs5030416, rs5030445 and rs3740961) were genotyped in the case-control study. Genotype frequencies of the htSNPs conformed to Hardy-Weinberg equilibrium in both patients and controls. No significant association was found between the 5 htSNPs and the susceptibility to or severity of sepsis. Compared with the main haplotype -11120A/-10688T/-9423A/805G/12967G, no haplotype was significantly associated with the susceptibility to or severity of sepsis. CONCLUSION: TRAF6 gene polymorphisms might not play a major role in mediating the susceptibility to and severity of sepsis in the Chinese population. A larger population-based case-control study is warranted.
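The abstract above reports that genotype frequencies conformed to Hardy-Weinberg equilibrium without saying which test was used. A common choice, sketched here as an assumption, is the one-degree-of-freedom chi-square goodness-of-fit test on the three genotype counts of a biallelic SNP; the function name and the example counts are hypothetical.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg proportions for one biallelic SNP.

    Returns (chi2, p_value) with 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # For 1 df the chi-square upper tail equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Counts that match Hardy-Weinberg proportions exactly.
print(hwe_chi_square(25, 50, 25))
# A complete absence of heterozygotes is a strong departure.
print(hwe_chi_square(50, 0, 50))
```

For SNPs with small expected counts an exact test is usually preferred over the chi-square approximation.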
The present study aimed to analyze the frequencies of human leukocyte antigen (HLA)-A, -B, and -DRB1 alleles and of A-B-DRB1, A-B, A-DRB1 and B-DRB1 haplotypes in inhabitants of Guizhou province, China. All samples were typed at the HLA-A, -B, and -DRB1 loci using the polymerase chain reaction-reverse sequence specific oligonucleotide probe (PCR-rSSOP) method, and HLA polymorphisms were analyzed. A total of 18 HLA-A, 31 HLA-B, and 13 HLA-DRB1 alleles were found in the Guizhou population. The two most frequent alleles at the HLA-A, -B, and -DRB1 loci were A*11 (30.72%) and A*02 (30.65%), B*40 (16.27%) and B*46 (16.27%), and DRB1*09 (15.91%) and DRB1*15 (13.51%), respectively. The most common haplotype was A*02-B*46-DRB1*09 (5.59%) for A-B-DRB1, A*02-B*46 (11.73%) for A-B, B*46-DRB1*09 (7.49%) for B-DRB1, and A*02-DRB1*09 (8.08%) for A-DRB1. Haplotypes with strong linkage disequilibrium (LD) were found not only among the common haplotypes, such as A*33-B*58, B*30-DRB1*07, and B*33-DRB1*03, but also among the rare haplotypes, such as A*01-B*37, B*37-DRB1*10, and A*01-DRB1*10. Guizhou inhabitants shared some characteristics of the Southern Chinese population but also had their own unique features. Overall, HLA polymorphism in the Guizhou population was more consistent with that of the Chengdu population than with that of other populations in China.
Nucleotide diversity (π) and linkage disequilibrium (LD) analysis based on SNP markers can provide a sound basis for choosing an association analysis method. Japanese larch (Larix kaempferi) is an important coniferous timber species for pulping and papermaking, but its high lignin content has significantly restricted its application potential. In this study, the LACCASE gene, which plays an important regulatory role in lignin biosynthesis, was selected as the research target. The full-length cDNA and genomic sequences of the LkLAC8 gene were isolated from the LACCASE expressed sequence tags of the Japanese larch transcriptome database using rapid amplification of cDNA ends-polymerase chain reaction (RACE-PCR). The cDNA was 1940 bp long, with an open reading frame (ORF, 1734 bp) that encoded a protein of 577 amino acids. This protein contains four highly specific Cu2+ binding sites and 11 glycosylation sites, and thus belongs to the LACCASE family. The deduced protein sequence shared 89% identity with PtaLAC from Pinus taeda. Real-time PCR analysis showed that the LkLAC8 transcript was expressed predominantly in mature xylem, at moderate levels in immature xylem, cambium and mature leaves, and at the lowest level in roots. Lastly, the genomic sequences of LkLAC8 in 40 individuals from six naturally distributed populations of Japanese larch were amplified, and a total of 201 SNPs (103 transitions and 98 transversions) were detected; the SNP frequency was 1 per 19 bp. Nucleotide diversity among the six populations ranged from 0.0034 to 0.0053, which suggested that there were no significant differences among the populations. The LD analysis showed that LD decayed rapidly with increasing distance along the LkLAC8 gene. These results imply that LD mapping and candidate gene-based association analysis may be feasible for the marker-assisted breeding of new low-lignin germplasm in Japanese larch.
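Nucleotide diversity π, as reported above, is conventionally the average number of pairwise nucleotide differences per site among aligned sequences. The sketch below computes it for a toy alignment; the function name and sequences are hypothetical, gaps and missing data are not handled, and the study itself presumably used dedicated population genetics software.

```python
from itertools import combinations


def nucleotide_diversity(sequences):
    """Average pairwise differences per site for equal-length aligned sequences."""
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(sequences, 2))
    differences = sum(
        sum(1 for x, y in zip(s, t) if x != y) for s, t in pairs
    )
    return differences / (len(pairs) * length)


# Hypothetical alignment of three 4-bp sequences.
pi = nucleotide_diversity(["AAAA", "AAAT", "AATT"])
print(round(pi, 4))
```

In this toy alignment the three pairs differ at 1, 2 and 1 sites, so π = 4 / (3 × 4) ≈ 0.333.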
Funding (LD analysis software platform study): This work was supported by the Major State Basic Research Development Program of China and the National High Technology Research and Development Program of China.
Funding (Pacific oyster LD study): Supported by the Shandong Seed Project and the National Natural Science Foundation of China (31372524).
文摘Linkage disequilibrium(LD) can be applied for mapping the actual genes responsible for variation of economically important traits through association mapping.The feasibility and efficacy of association studies are strongly dependent on the extent of LD which determines the number and density of markers in the studied population,as well as the experimental design for an association analysis.In this study,we first characterized the extent of LD in a wild population and a cultured mass-selected line of Pacific oyster(Crassostrea gigas).A total of 88 wild and 96 cultured individuals were selected to assess the level of genome-wide LD with 53 microsatellites,respectively.For syntenic marker pairs,no significant association was observed in the wild population;however,three significant associations occurred in the cultured population,and the significant LD extended up to 12.7 c M,indicating that strong artificial selection is a key force for substantial increase of genome-wide LD in cultured population.The difference of LD between wild and cultured populations showed that association studies in Pacific oyster can be achieved with reasonable marker densities at a relatively low cost by choosing an association mapping population.Furthermore,the frequent occurrence of LD between non-syntenic loci and rare alleles encourages the joint application of linkage analysis and LD mapping when mapping genes in oyster.The information on the linkage disequilibrium in the cultured population is useful for future association mapping in oyster.
Abstract: In this study, we propose using principal component analysis (PCA) and a regression model to incorporate linkage disequilibrium (LD) into the analysis of genomic association data. To accommodate LD in genomic data and reduce multiple testing, we suggest performing PCA and extracting the principal component scores to capture the variation in the genomic data, after which regression analysis is used to assess the association of the disease with the principal component scores. An empirical analysis shows that the genotype-based correlation matrix and the haplotype-based LD matrix produce similar PCA results. The principal component score appears to be more powerful in detecting genetic association because it is quantitatively measured and may capture the effect of multiple loci.
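The PCA-then-regression idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `first_pc_scores` and `simple_regression`, the 0/1/2 genotype coding, the use of power iteration for the leading component, and the toy data in the usage note are all assumptions made for illustration, and only the first principal component is extracted.

```python
import math

def first_pc_scores(genotypes, iterations=200):
    """Return first-principal-component scores for an n x m genotype matrix.

    Genotypes are assumed to be coded 0/1/2 (an illustrative assumption).
    """
    n, m = len(genotypes), len(genotypes[0])
    # Standardize each SNP column so that the PCA acts on the
    # genotype correlation matrix.
    cols = []
    for j in range(m):
        col = [row[j] for row in genotypes]
        mean = sum(col) / n
        sd = math.sqrt(sum((g - mean) ** 2 for g in col) / (n - 1))
        if sd > 0:  # monomorphic SNPs carry no variation and are skipped
            cols.append([(g - mean) / sd for g in col])
    k = len(cols)
    # Correlation matrix between the retained SNPs.
    corr = [[sum(a * b for a, b in zip(cols[i], cols[j])) / (n - 1)
             for j in range(k)] for i in range(k)]
    # Power iteration for the leading eigenvector. The uneven start vector
    # lowers the chance of starting orthogonal to that eigenvector.
    v = [float(j + 1) for j in range(k)]
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Project each individual onto the leading eigenvector.
    return [sum(v[j] * cols[j][i] for j in range(k)) for i in range(n)]

def simple_regression(x, y):
    """Ordinary least squares of y on x; returns slope, intercept, R^2."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx, sxy ** 2 / (sxx * syy)
```

With a hypothetical genotype matrix `g` (rows are individuals) and a quantitative phenotype `y`, `simple_regression(first_pc_scores(g), y)` gives one test per marker block instead of one test per SNP, which is the multiple-testing reduction the abstract refers to. The sign of the scores is arbitrary, so only the magnitude of the slope and the R² are meaningful.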
Funding: Supported by the National Natural Science Foundation of China (No. 30672008).
Abstract: To investigate the distribution characteristics and linkage disequilibrium of promoter polymorphisms of T cell immunoglobulin domain and mucin domain protein 4 (TIM4) in asthma patients of the Chinese Han population, the promoter region of TIM4 was re-sequenced by PCR-sequencing, and linkage disequilibrium was analyzed with SHEsis software. Four single nucleotide polymorphisms (SNPs) were detected in the promoter region of TIM4, including two new SNPs (at positions -1609 and -153) and two reported SNPs (rs6874202 and rs6882076). The frequency distribution of rs6882076 differed among races (P<0.05). In addition, the SNPs in the promoter region of TIM4 were in linkage disequilibrium, and GGTG was the predominant haplotype. In summary, four SNPs were found in the promoter region of TIM4 in asthma patients of the Chinese Han population, and they were in linkage disequilibrium.
Funding: Support from a Marie Curie International Reintegration Grant from the European Union, project no. PIRG08-GA-2010-277031 "SelectionForWelfare". LGR and WMR acknowledge support from project AGL2012-39137.
Abstract: A novel method for haplotype phasing in families after joint estimation of recombination fraction and linkage disequilibrium (LD) is developed. Results from Monte Carlo computer simulations show that the newly developed EM algorithm is accurate when the true recombination fraction is 0, even for single families of relatively small size. Estimates of recombination fraction and LD were 0.00 (SD 0.00) and 0.19 (SD 0.03) for simulated values of 0.00 and 0.20, respectively. A genome fragmentation phasing strategy was developed and used to phase haplotypes in a sire and 36 progeny genotyped with the 50k Illumina BeadChip by: a) estimation of the recombination fraction and LD between consecutive SNPs using family information, b) linkage analyses between fragments, c) phasing of haplotypes in parents and progeny and in following generations. Homozygous SNPs in progeny allowed determination of paternal fragment inheritance and deduction of the SNP sequence of haplotypes from dams. The strategy also allowed detection of genotyping errors. A total of 613 recombination events were detected after linkage analysis between fragments. Hot and cold spots were identified at the individual (sire) level. More than 90% of the SNPs for which both sire and calf were heterozygous became informative after the phasing of haplotypes. The average of regions of identity (with at least 20 SNPs in common) between half-sibs, when comparing their maternally inherited haplotypes, was 0.11, with a maximum of 0.29 and a minimum of 0.05. A Monte Carlo simulation of BTA1 with the same LD structure and genetic linkage as the cattle family yielded 99.98% and 99.94% correct phases for informative SNPs in sire and calves, respectively.
Funding: Project supported by the International Pig Improvement Company (PIC) and Sheep Genomics, Australia.
Abstract: Quantitative trait loci (QTL) and their additive, dominance and epistatic effects play a critical role in complex trait variation. Detecting multiple interacting QTL is often infeasible because main effects are frequently confounded by interaction effects. Positioning interacting QTL within a small region is even more difficult. We present a variance component approach nested in an empirical Bayesian method, which simultaneously takes into account additive, dominance and epistatic effects due to multiple interacting QTL. The covariance structure used in the variance component approach is based on combined linkage disequilibrium and linkage (LDL) information. In a simulation study with complex epistatic interactions between QTL, the proposed approach made it possible to fine map interacting QTL simultaneously. The present method combined with LDL information can efficiently detect QTL and their dominance and epistatic effects, making it possible to simultaneously fine map main and epistatic QTL.
Funding: This study was financially supported by the National Natural Science Foundation of China (41301346), the Natural Science Foundation of Fujian Province (2020J01375) and the Natural Science Foundation of Fujian Province (2015N0034).
Abstract: Bamboos are among the most beautiful and useful plants on Earth. The genetic background and population structure of bamboos are well known, which helps accelerate the artificial domestication of bamboo. Partial sequences of six genes involved in nitrogen use efficiency in 32 different bamboo species were analyzed for the occurrence of single nucleotide polymorphisms (SNPs). The nucleotide diversity θw and total nucleotide polymorphism πT of the sequenced DNA regions were 0.05137 and 0.03332, respectively. Both πnonsyn/πsyn and Ka/Ks values were <1. The nucleotide sequences of these six genes were inferred to be relatively conserved, and the haplotype diversity was relatively high. Evolutionary neutrality tests showed that the six genes were in line with neutral evolution and that the NRT2.1 and AMT2.1 gene sequences may have experienced negative selection. The inter-SNP recombination event at the NRT2.1 gene in the pooled sample of all 32 bamboo species was the lowest at 0.0645, whereas the AMT gene recombination events were all >0.1. Estimation and analysis of linkage disequilibrium in five genes revealed that the degree of SNP linkage disequilibrium decreased rapidly with increasing nucleotide sequence length. We inferred the population genetic structure of the 32 bamboo species based on the SNP loci of six genes with frequencies >18%. The 32 bamboo species were divided into five categories, which indicated that the combined population of all bamboo species had obvious multivariate characteristics and was heterogeneous; red (Group 1) and green (Group 2) were the main groups.
Funding: The NNSF (10371015, 10329102) of China and the Science Foundation (20060101) for Young Teachers of Northeast Normal University.
Abstract: In this article, using likelihood score theory extended to nuisance parameters, we derive a new homogeneity score test for comparing linkage disequilibrium across several strata. Power and sample size formulae are also obtained.
Funding: Project supported by the Barley Coordinated Agricultural Project (No. USDA-CSREES-2006-55606-16722) of the USDA National Institute of Food and Agriculture and the Lieberman-Okinow Endowment at the University of Minnesota, USA.
Abstract: Cultivated barley is known to have a complex population structure and extensive linkage disequilibrium (LD). To conduct robust association mapping (AM) studies of economically important traits in US barley breeding germplasm, population structure and LD decay were examined in a complete panel of US barley breeding germplasm (3840 lines) genotyped with 3072 single nucleotide polymorphisms (SNPs). Nine subpopulations (sp1 to sp9) were identified by the program STRUCTURE and subsequently confirmed by principal component analysis (PCA). Of the nine subpopulations, seven were very similar to the respective subpopulations identified by Hamblin et al. (2010), which were based on half of the germplasm and half of the SNP markers, but two subpopulations were new. One was dominated by six-rowed spring lines from Utah State University (UT) and the other was composed of six-rowed spring lines from multiple breeding programs (USDA-ARS Aberdeen (AB), Busch Agricultural Resources Inc. (BA), UT, and Washington State University (WA)). LD was found to decay across a range from 4.0 to 19.8 cM. This result indicates that the germplasm genotyped with 3072 SNPs would be robust for mapping and possibly for identifying the causal polymorphisms contributing to disease resistance and perhaps other traits.
Abstract: Objectives: To formulate an equation for fine mapping of disease loci under complex conditions and to determine the marker-disease distance in a specific case using this equation. Methods: Lewontin's linkage disequilibrium (LD) measure D' was used to formulate an equation for mapping disease genes in the presence of phenocopies, locus heterogeneity, gene-gene and gene-environment interactions, incomplete penetrance, uncertain liability and threshold, incomplete initial LD, natural selection, recurrent mutation, high disease allele frequency and unknown mode of inheritance. This equation was then used to determine the distance between a marker (ε4 within the apolipoprotein E gene, APOE) and Alzheimer's disease (AD) loci using published data. Results: An equation was formulated for mapping disease genes under the above conditions. If these conditions are present but ignored, the recombination fraction θ between marker and disease loci will be either overestimated or estimated with little bias. Therefore, an upper limit of θ can be obtained. AD has been found to be associated with the marker allele ε4 in Africans, Asians, and Caucasians. This suggests that the AD-ε4 allelic LD predates the divergence of peoples that occurred 100,000 years ago. With the age of the AD-ε4 allelic LD so estimated, the maximal distance was calculated to be 23.2 kb (mean 5.8 kb). Conclusions: (1) A method is developed for LD mapping of susceptibility genes. (2) A mutation within the APOE gene itself, among others, is responsible for the susceptibility to AD, which is supported by recent evidence from studies using transgenic mice.
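The abstract above does not reproduce its mapping equation. As a hedged illustration of the general idea only, the textbook decay relation D'_t = D'_0 (1 - θ)^t can be inverted to estimate the recombination fraction θ from the LD remaining after t generations. The function name `recombination_fraction` and the default of complete initial LD (D'_0 = 1) are assumptions of this sketch, not details taken from the paper.

```python
def recombination_fraction(d_prime_now, generations, d_prime_initial=1.0):
    """Invert D'_t = D'_0 * (1 - theta)**t for theta.

    Textbook LD-decay relation under random mating; the defaults and the
    function name are illustrative assumptions, not the paper's equation.
    """
    if not 0.0 < d_prime_now <= d_prime_initial <= 1.0:
        raise ValueError("expected 0 < D'_t <= D'_0 <= 1")
    if generations <= 0:
        raise ValueError("generations must be positive")
    # theta = 1 - (D'_t / D'_0)^(1/t)
    return 1.0 - (d_prime_now / d_prime_initial) ** (1.0 / generations)
```

Turning θ into a physical distance in kb needs a further assumption about the local recombination rate per base pair, which is deliberately left out here.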
Abstract: With completion of the Populus genome sequencing project and the availability of many expressed sequence tag (EST) databases for forest trees, attention is now rapidly shifting towards the study of individual genetic variation in natural populations. The most abundant form of genetic variation in many eukaryotic species is the single nucleotide polymorphism (SNP), which can account for heritable inter-individual differences in complex phenotypes. Unlike in humans, linkage disequilibrium (LD) decays rapidly within candidate genes in forest trees. Thus, SNP-based candidate gene association studies are considered a highly effective approach to dissecting complex quantitative traits in forest trees. The present study demonstrates that LD mapping can be used to identify alleles associated with quantitative traits and suggests that this new approach could be particularly useful for breeding programs in forest trees. In this review, we describe the fundamentals and the patterns of SNP distribution and frequency, summarize recent advances in SNP discovery and LD, and comment on the application of LD in the dissection of complex quantitative traits in forest trees. We also put forward an outlook for future SNP-based association analysis of quantitative traits in forest trees.
Abstract: The inference of genome ancestry and the estimation of molecular relatedness are of great importance for breeding efficiency and association studies. Seventy SSR loci, evenly distributed over 10 chromosomes, were assayed for polymorphism among 187 commonly used maize (Zea mays L.) inbreds that represent the genetic diversity in China. The 290 identified alleles served as raw data for estimating population structure using the coalescent linked loci, based on the ADMIXTURE model. The population number, K, was inferred to be between five and seven. Specifying five subpopulations (K = 5) led to a distinct decrease, and specifying K greater than six resulted in only minimal increases in the likelihood value. Therefore, the population was inferred to comprise six subpopulations: PA, BSSS (includes Reid), PB, Lan (Lancaster Sure Crop), LRC (Luda Red Cob, a Chinese landrace, and its derivatives), and SPT (Si-ping-tou, a Chinese landrace, and its derivatives). The Kullback-Leibler distance between pairs of subpopulations was also inferred, as were n × p (187 × 6) Q matrices, which gave a detailed percentage of the genetic composition of the six subpopulations and the molecular relatedness of each line. The genome-wide linkage disequilibrium (LD) indicated that association studies of QTLs and/or candidate genes might avoid nonfunctional and spurious associations, as most of the LD blocks were broken among diverse germplasm. The defined population structure gives a clear genetic structure of these lines for breeding practice and establishes a good basis for association analysis.
Funding: Financially supported by the National Youth Foundation of China (31901494, 31601306 and 31901869), the National Natural Science Foundation of China (31971890), the Young Elite Scientists Sponsorship Program of the China Association for Science and Technology (2017QNRC001) and the Natural Science Fund of Jiangsu Province, China (BK20161092).
Abstract: Most modern wheat cultivars were selected on the basis of yield-related indices measured under optimal fertilizer and irrigation inputs. With climate change, land degradation and salinity caused by sea water encroachment, wheat is increasingly subjected to environmental stress. Moreover, expanding urbanization increasingly encroaches upon prime agricultural land in countries like China, and alternative cropping areas must be found. Some of these areas have moderate constraining factors, such as salinity. Therefore, it is important to investigate whether current genetic materials and breeding procedures are maintaining adequate variability to address future problems caused by abiotic stress. In this study, a panel of 307 wheat accessions, including local landraces, exotic cultivars used in Chinese breeding programs and Chinese cultivars released during different periods since 1940, were subjected to a genome-wide association study to dissect the genetic basis of salinity tolerance. Both marker-based and pedigree-based kinship analyses revealed that favorable haplotypes were introduced in some exotic cultivars as well as in a limited number of Chinese landraces from the 1940s. However, improvements in salinity tolerance during modern breeding are not as obvious as those in yield. To broaden genetic diversity for increasing salt tolerance, there is a need to refocus attention on local landraces that have high degrees of salinity tolerance and carry rare favorable alleles that have not been exploited in breeding.
Funding: Supported by grants from the earmarked fund for China Agriculture Research System (CARS-35), the Modern Agriculture Science and Technology Key Project of Hebei Province (19226376D), the National Key Research and Development Project (SQ2019YFE00771), the National Natural Science Foundation of China (31671327) and the Major Project of Selection for New Livestock and Poultry Breeds of Zhejiang Province (2016C02054–5).
Abstract: Background: Different production systems and climates could lead to genotype-by-environment (G × E) interactions between populations, and the inclusion of G × E interactions is becoming essential in breeding decisions. The objective of this study was to investigate the performance of multi-trait models in genomic prediction in a limited number of environments with G × E interactions. Results: In total, 2,688 and 1,384 individuals with growth and reproduction phenotypes, respectively, from two Yorkshire pig populations with similar genetic backgrounds were genotyped with the PorcineSNP80 panel. Single- and multi-trait models with genomic best linear unbiased prediction (GBLUP) and BayesCπ were implemented to investigate their genomic prediction abilities with 20 replicates of five-fold cross-validation. Our results regarding between-environment genetic correlations of growth and reproductive traits (ranging from 0.618 to 0.723) indicated the existence of G × E interactions between these two Yorkshire pig populations. For single-trait models, genomic prediction with GBLUP was only 1.1% more accurate on average in the combined population than in single populations, and no significant improvements were obtained by BayesCπ for most traits. In addition, single-trait models with either GBLUP or BayesCπ produced greater bias for the combined population than for single populations. However, multi-trait models with GBLUP and BayesCπ better accommodated G × E interactions, yielding 2.2% to 3.8% and 1.0% to 2.5% higher prediction accuracies for growth and reproductive traits, respectively, compared to those for single-trait models of single populations and the combined population. The multi-trait models also yielded lower bias and larger gains in the case of a small reference population. The smaller improvement in prediction accuracy and larger bias obtained by the single-trait models in the combined population was mainly due to the low consistency of linkage disequilibrium between the two populations, which also caused the BayesCπ method to always produce the largest standard error in marker effect estimation for the combined population. Conclusions: Our findings confirmed that directly combining populations to enlarge the reference population is not efficient in improving the accuracy of genomic prediction in the presence of G × E interactions, while multi-trait models perform better in a limited number of environments with G × E interactions.
Abstract: Aim: To complete a comprehensive haplotype analysis of USP26 in both fertile and infertile men. Methods: Two hundred infertile men with severe oligospermia or non-obstructive azoospermia were subjected to sequence analysis of the entire coding sequence of the USP26 gene. Two hundred men with proven fertility were genotyped by primer extension methods. Allele/genotype frequencies, linkage disequilibrium (LD) characteristics and haplotypes of fertile men were compared with those of infertile men. Results: The allele frequencies of five single nucleotide polymorphisms (370-371insACA, 494T>C, 576G>A, ss6202791C>T, 1737G>A) were significantly higher in infertile patients than in control subjects. The major haplotypes in infertile men were TACCGA (28% of the population), TGCCGA (15%), TACCAA (8%), TGCCAA (6%), TATCAA (5%) and CATCAA (5%). The major haplotypes in the control subjects were TACCGA (58% of the population), CACCGA (7%), CATCGA (6%) and TGCCGA (5%). Haplotypes TGCCGA, TATCAA, CATCAA, CATCGC, TACCAA and TGCCAA were over-transmitted in patients with spermatogenic defect, whereas haplotypes TACCGA, CACCGA, and CATCGA were under-transmitted in these patients. Conclusion: Some USP26 alleles and haplotypes are associated with spermatogenic defect in the Han nationality in Taiwan, China.
Funding: Supported by the National Natural Science Foundation of China (31201246) and the Project of International Science and Technology Cooperation and Exchange from the Ministry of Science and Technology, China (2010DFR30620-3).
Abstract: Association mapping is a useful tool for detecting genes selected during plant domestication based on their linkage disequilibrium (LD). This study was carried out to estimate genetic diversity, population structure and the extent of LD in order to develop an association framework for identifying genetic variations associated with drought and salt tolerance traits. A total of 106 microsatellite marker primer pairs were used on 323 Gossypium hirsutum germplasms, which were grown in a drought shed and a salt pond for evaluation. Polymorphism (PIC = 0.53) was found, and three groups were detected (K = 3) with the second likelihood ΔK using STRUCTURE software. LD decay rates were estimated to be 13-15 cM at an r² threshold of 0.20. Significant associations between polymorphic markers and drought and salt tolerance traits were observed at the 0.01 significance level using the general linear model (GLM) and mixed linear model (MLM). The results also demonstrated that association mapping that accounts for the population structure and stratification existing in cotton germplasm resources could complement and enhance quantitative trait locus (QTL) information for marker-assisted selection.
Abstract: Recent advances in high-throughput sequencing technologies have revolutionized the field of population genetics. Data now routinely contain genome-level polymorphism information, and the low cost of DNA sequencing enables researchers to investigate tens of thousands of subjects at a time. This provides an unprecedented opportunity to address fundamental evolutionary questions, while posing challenges to traditional population genetic theories and methods. This review provides an overview of recent methodological developments in population genetics, specifically methods used to infer ancient population history and investigate natural selection using large-sample, large-scale genetic data. Several open questions are also discussed at the end of the review.
Abstract: BACKGROUND: Tumor necrosis factor receptor associated factor (TRAF) 6 is an important intracellular adapter protein that plays a pivotal role in activating multiple inflammatory and immune-related processes induced by cytokines. TRAF6 is a strong candidate susceptibility factor for sepsis. We investigated whether polymorphisms in the TRAF6 gene are associated with the susceptibility to and severity of sepsis. METHODS: A hospital-based case-control study was conducted with 255 patients with sepsis and 260 controls recruited from Zhengzhou, China. Haplotype tagging single nucleotide polymorphisms (htSNPs) were selected from the HapMap database and genotyped using the SNPstream genotyping platform. Associations with the susceptibility to and severity of sepsis were estimated by logistic regression, adjusted for age, sex, smoking, drinking, chronic disease status, APACHE II score and critical illness status. RESULTS: A total of 13 TRAF6 SNPs were tagged by 7 htSNPs. Five htSNPs (rs5030490, rs5030411, rs5030416, rs5030445 and rs3740961) were genotyped in the case-control study. Genotype frequencies of the htSNPs conformed to Hardy-Weinberg equilibrium in both patients and controls. No significant association was found between the 5 htSNPs and the susceptibility to and severity of sepsis. Compared with the main haplotype -11120A/-10688T/-9423A/805G/12967G, no haplotype was significantly associated with the susceptibility to or severity of sepsis. CONCLUSION: TRAF6 gene polymorphisms might not play a major role in mediating the susceptibility to and severity of sepsis in the Chinese population. A larger population-based case-control study is warranted.
Funding: Supported by the Chinese Marrow Donor Program (CMDP), CMDP Guizhou Registry.
Abstract: The present study aimed to analyze the frequencies of human leukocyte antigen (HLA)-A, -B, and -DRB1 alleles and of A-B-DRB1, A-B, A-DRB1 and B-DRB1 haplotypes in inhabitants of Guizhou province, China. All samples were typed at the HLA-A, -B, and -DRB1 loci using the polymerase chain reaction-reverse sequence specific oligonucleotide probe (PCR-rSSOP) method, and HLA polymorphisms were analyzed. A total of 18 HLA-A, 31 HLA-B, and 13 HLA-DRB1 alleles were found in the Guizhou population. The two most frequent alleles at the HLA-A, -B, and -DRB1 loci were A*11 (30.72%) and A*02 (30.65%), B*40 (16.27%) and B*46 (16.27%), and DRB1*09 (15.91%) and DRB1*15 (13.51%), respectively. The most common haplotype was A*02-B*46-DRB1*09 (5.59%) for A-B-DRB1, A*02-B*46 (11.73%) for A-B, B*46-DRB1*09 (7.49%) for B-DRB1, and A*02-DRB1*09 (8.08%) for A-DRB1. Haplotypes with strong linkage disequilibrium (LD) were found not only among the common haplotypes, such as A*33-B*58, B*30-DRB1*07, and B*33-DRB1*03, but also among the rare haplotypes, such as A*01-B*37, B*37-DRB1*10, and A*01-DRB1*10. Guizhou inhabitants shared some characteristics of the Southern Chinese population but also had their own unique features. Overall, HLA polymorphism in the Guizhou population was more consistent with that of the Chengdu population than with that of other populations in China.
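How strong the LD behind a reported haplotype is can be checked from the published frequencies alone. The sketch below computes the classical coefficient D and Lewontin's D' for one allele pair. The function name `pairwise_ld` is hypothetical, and the worked numbers assume the A*02-B*46 haplotype frequency in the abstract reads 11.73%. The study's own LD statistics may have been computed differently.

```python
def pairwise_ld(p_a, p_b, p_ab):
    """Return (D, D') for one allele pair.

    p_a, p_b: allele frequencies at the two loci; p_ab: frequency of the
    haplotype carrying both alleles. Standard textbook definitions.
    """
    d = p_ab - p_a * p_b  # deviation from independence
    if d >= 0:
        d_max = min(p_a * (1.0 - p_b), (1.0 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1.0 - p_a) * (1.0 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    return d, d_prime

# A*02 (30.65%), B*46 (16.27%), haplotype A*02-B*46 (assumed 11.73%)
d, d_prime = pairwise_ld(0.3065, 0.1627, 0.1173)
```

For these inputs D is about 0.067 and D' about 0.60, so the haplotype is roughly 2.4 times as frequent as independence would predict (0.3065 × 0.1627 ≈ 0.050).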
Funding: Financially supported by the Fundamental Research Funds for the Central Non-profit Research Institution of CAF (RIF2014-06) and the Forestry Industry Research Special Funds for Public Welfare Projects (201504104).
Abstract: Nucleotide diversity (π) and linkage disequilibrium (LD) analysis based on SNP markers can provide a sound basis for choosing an association analysis method. Japanese larch (Larix kaempferi) is an important coniferous timber species for pulping and papermaking, but its high lignin content has significantly restricted its application potential. In this study, the LACCASE gene, which plays an important regulatory role in lignin biosynthesis, was selected as the research target. The full-length cDNA and genomic sequences of the LkLAC8 gene were isolated from the LACCASE expressed sequence tags of the Japanese larch transcriptome database using rapid amplification of cDNA ends-polymerase chain reaction (RACE-PCR). The cDNA was 1940 bp long, with an open reading frame (ORF, 1734 bp) that encoded a protein of 577 amino acids. This protein contains four highly specific Cu2+ binding sites and 11 glycosylation sites and thus belongs to the LACCASE family. The deduced protein sequence shared 89% identity with PtaLAC from Pinus taeda. A real-time PCR analysis showed that the LkLAC8 transcript was expressed predominantly in mature xylem, at moderate levels in the immature xylem, cambium and mature leaves, and at the lowest level in the roots. Lastly, the genomic sequences of LkLAC8 in 40 individuals from six naturally distributed populations of Japanese larch were amplified, and a total of 201 SNPs (103 transitions and 98 transversions) were detected; the SNP frequency was 1 per 19 bp. Nucleotide diversity among the six populations ranged from 0.0034 to 0.0053, which suggested that there were no significant differences among the populations. The LD analysis showed that the LD level decayed rapidly with increasing length of the LkLAC8 gene. These results imply that LD mapping and candidate gene-based association analysis may be feasible for the marker-assisted breeding of new germplasms with low lignin in Japanese larch.