Arrays and beyond: Evaluation of marker technologies for chicken genomics
by Johannes Geibel
Date of Examination: 2021-11-12
Date of issue: 2022-01-04
Advisor: Prof. Dr. Henner Simianer
Referee: Prof. Dr. Henner Simianer
Referee: Apl. Prof. Dr. Steffen Weigend
Referee: Prof. Dr. Gudrun A. Brockmann
Referee: Prof. Dr. Timothy Mathes Beissinger
Files in this item
Name: Promotion_JohannesGeibel_publish.pdf
Size: 4.82Mb
Format: PDF
Abstract
English
A key question in livestock research is how the phenotypic diversity of livestock is shaped by its genomic diversity. Genomic diversity is assessed through genomic markers, whose use and definition are strongly technology-driven and therefore change over time. In recent years, single nucleotide polymorphisms (SNPs) have become the main marker class, and SNP arrays have been the genotyping technology of choice owing to their early availability. Arrays are, however, currently being partially displaced by whole-genome sequencing (WGS) for SNP calling. Further, structural variants (SVs) are increasingly moving into the focus of researchers. In this context, the thesis aims to evaluate the value of SNP markers in various ways, with its main focus on chickens as a diverse livestock species of major agricultural value.
In Chapter 1, the current knowledge of genomic variation, marker technologies, and their use in livestock sciences, especially in chickens, is reviewed.
Chapters 2 and 3 then address a systematic error of SNP arrays, the SNP ascertainment bias. SNP ascertainment bias is a systematic shift of the allele frequency spectrum of SNP arrays towards more common SNPs, caused by the pre-selection of SNPs in a limited number of individuals from few populations. Chapter 2 aims to assess the magnitude of the bias for a standard chicken SNP array and to identify the steps of array design that created it. To this end, we remodeled the design process of the chicken array based on (pooled) WGS data of various chicken populations. This revealed a sequential reduction of rare alleles during the design process, caused mainly by the initial limitation of the discovery set and a later within-population selection of common SNPs while aiming for equidistant spacing. Increasing the discovery set had the largest impact on limiting ascertainment bias. Other steps, such as validation of the SNPs in a broader set of populations, did not show relevant effects.
To date, correction methods for ascertainment bias are often infeasible in studies. Chapter 3 therefore proposes imputation of the array data to WGS level as an in silico correction of the allele frequency spectrum. The study revealed that imputation strongly reduces the effects of ascertainment bias, even when a very sparse reference panel is used. However, it also became obvious that the reference panel then has the same effect as the discovery panel during array design. It is therefore crucial to select the samples for the reference panel evenly across the intended range of populations.
SVs are harder to call and genotype than SNPs. The question therefore arises whether the effects of SVs are captured by SNP-based studies through strong linkage disequilibrium (LD) between SNPs and SVs. This is assessed in Chapter 4 for three commercial chicken breeds, based on WGS data. The study showed that LD between deletions and SNPs was at the same level as LD between SNPs and other SNPs, indicating that deletion effects are captured by SNP marker panels as well as SNP effects are. LD between SNPs and other SVs was strongly reduced. The main factor behind this reduction was that these SVs differed locally from SNPs in minor allele frequency. However, a reduction of homozygous variant calls for non-deletion SVs compared to the Hardy-Weinberg expectation may indicate problems with the SV genotypers used.
In the last chapter (Chapter 5), the impact of ascertainment bias and the possibilities to deal with it in chicken genomics (and, more generally, in livestock genomics) are discussed. Further, the potential of including SVs in studies is evaluated. The chapter also discusses what is necessary to combine the information of different genomic data sets to leverage the value of analyses. Finally, it gives an outlook on what additional information will become available in the near future based on recent technological advances.
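The mechanism behind SNP ascertainment bias can be illustrated with a small simulation. The following is a toy sketch, not the remodeling pipeline used in the thesis: it assumes a hypothetical Beta(0.5, 2.0) distribution for population allele frequencies and a single "discovery" step in which a SNP is kept only if a small panel of diploid individuals is polymorphic for it. The function name `simulate_ascertainment` and all parameters are illustrative assumptions.

```python
import random


def simulate_ascertainment(n_snps=5000, discovery_individuals=4, seed=1):
    """Return (all_mafs, ascertained_mafs) for a toy array-design step."""
    rng_freq = random.Random(seed)       # draws population allele frequencies
    rng_panel = random.Random(seed + 1)  # draws the discovery panel
    n_chrom = 2 * discovery_individuals  # diploid individuals
    all_mafs, ascertained_mafs = [], []
    for _ in range(n_snps):
        # Assumed spectrum: Beta(0.5, 2.0) puts most mass on rare alleles.
        p = rng_freq.betavariate(0.5, 2.0)
        maf = min(p, 1.0 - p)
        all_mafs.append(maf)
        # Count alternative alleles among the discovery chromosomes.
        alt = sum(rng_panel.random() < p for _ in range(n_chrom))
        # A SNP is "discovered" only if the panel is polymorphic for it.
        if 0 < alt < n_chrom:
            ascertained_mafs.append(maf)
    return all_mafs, ascertained_mafs


def mean(values):
    return sum(values) / len(values)


if __name__ == "__main__":
    for size in (4, 50):
        all_mafs, kept = simulate_ascertainment(discovery_individuals=size)
        print(size, len(kept), round(mean(all_mafs), 3), round(mean(kept), 3))
```

In this toy model, the SNPs retained by a small discovery panel have a higher mean minor allele frequency than the full set, and enlarging the panel retains more rare SNPs, which mirrors the direction of the effect described for the discovery set above.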
Keywords: SNP ascertainment bias; chickens; population genetics; imputation; single nucleotide polymorphisms; structural variants; linkage disequilibrium; whole-genome sequencing; SNP arrays
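Two quantities mentioned in the abstract, LD between a SNP and an SV and the Hardy-Weinberg expectation for genotype counts, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the analysis code of the thesis: variants are assumed biallelic, haplotypes are assumed phased and coded 0/1, and the function names `r_squared` and `hwe_expected_counts` are hypothetical.

```python
def r_squared(hap_a, hap_b):
    """Squared correlation (r^2) between two biallelic loci.

    hap_a and hap_b are equally long lists of 0/1 alleles, one entry per
    phased haplotype (for example a SNP and a deletion).
    """
    n = len(hap_a)
    p_a = sum(hap_a) / n
    p_b = sum(hap_b) / n
    p_ab = sum(a * b for a, b in zip(hap_a, hap_b)) / n
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return float("nan")  # undefined if either locus is monomorphic
    return d * d / denom


def hwe_expected_counts(n_hom_ref, n_het, n_hom_alt):
    """Expected genotype counts under Hardy-Weinberg equilibrium."""
    n = n_hom_ref + n_het + n_hom_alt
    q = (2 * n_hom_alt + n_het) / (2 * n)  # alternative allele frequency
    p = 1 - q
    return p * p * n, 2 * p * q * n, q * q * n


if __name__ == "__main__":
    print(r_squared([0, 0, 1, 1], [0, 0, 1, 1]))
    print(hwe_expected_counts(50, 50, 0))
```

Comparing the observed number of homozygous variant calls with the third value returned by `hwe_expected_counts` gives a simple check of the kind of deficit that the abstract reports for non-deletion SVs.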