Understanding the Effective Number of Independent Chromosome Segments as a Tuning Parameter for Genomic Prediction

by Nicolas Frioni Garcia
Doctoral thesis
Date of Examination: 2024-04-23
Date of issue: 2025-04-16
Advisor: Prof. Dr. Henner Simianer
Referee: Prof. Dr. Jens Tetens
Referee: Prof. Dr. Thomas Kneib
Referee: Dr. Malena Erbe
Persistent Address: http://dx.doi.org/10.53846/goediss-11210

Files in this item

Name: Thesis_PhD_Frioni.pdf
Size: 3.38 MB
Format: PDF

Abstract

The effective number of independently segregating chromosome segments (𝑀𝑒) is a key parameter in animal selection. It is relevant because it enters the methods used to derive the accuracy of genomic prediction. However, 𝑀𝑒 remains controversial: many methods are available for its estimation, yet they all produce different values. The aim of this thesis is to gain a better understanding of the parameter 𝑀𝑒, its nature, and the factors that influence it, and to propose a method for its estimation and validation.
The 1st chapter presents the background for the study. It opens with a brief history of livestock practices and how they both enabled and demanded progress in the field of animal breeding. This is followed by a discussion of the significance of 𝑀𝑒 in genomic prediction, a concise review of haplotype recovery from genomic datasets, techniques for detecting identity by descent (𝐼𝐵𝐷), and an introduction to the processes of meiosis and recombination.
The 2nd chapter presents the scientific article “Phasing quality assessment in a brown layer population through family- and population-based software”, published in the journal BMC Genetics. The paper compared two phasing software options, FImpute and Beagle. Ten replicated samples of a layer population comprising 888 individuals were simulated using a real 580k SNP dataset and a pedigree of 12 generations. Chromosomes 1, 7, and 20 were analyzed. The percentage of SNPs equally phased between true and phased haplotypes (𝐸𝑞𝑝) was measured, along with the proportion of individuals phased completely correctly, the number of incorrectly phased SNPs (breakpoints, 𝐵𝑘𝑝), and the length of inverted haplotype segments. Results were obtained for three subsets: individuals with (i) no parents or offspring genotyped, (ii) only one parent genotyped, and (iii) both parents genotyped. Phasing was performed with Beagle (v3.3 and v4.1) and FImpute v2.2 (with and without pedigree).
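The phasing metrics named above can be illustrated with a small sketch. It is not the code used in the article: the function name `phasing_metrics` is hypothetical, and it assumes that only heterozygous SNPs are compared, that the majority orientation of the inferred haplotypes counts as correct (haplotype labels are arbitrary), and that a breakpoint is a change of orientation between consecutive heterozygous SNPs.

```python
from itertools import groupby


def phasing_metrics(true_pair, phased_pair):
    """Return (eqp_percent, n_breakpoints, inverted_run_lengths) for one individual.

    true_pair and phased_pair each hold two equal-length lists of 0/1 alleles.
    Homozygous sites carry no phase information and are skipped (assumption).
    """
    t1, t2 = true_pair
    p1, _ = phased_pair
    # True where the first inferred haplotype carries the allele of the first true haplotype.
    agree = [p == a for a, b, p in zip(t1, t2, p1) if a != b]
    if not agree:
        return 100.0, 0, []
    # Haplotype labels are arbitrary, so the majority orientation is taken as correct.
    majority = sum(agree) * 2 >= len(agree)
    eqp = 100.0 * sum(x == majority for x in agree) / len(agree)
    # A breakpoint is a change of orientation between consecutive heterozygous SNPs.
    breakpoints = sum(x != y for x, y in zip(agree, agree[1:]))
    # Lengths (in heterozygous SNPs) of runs phased against the majority orientation.
    inverted = [len(list(run)) for key, run in groupby(agree) if key != majority]
    return eqp, breakpoints, inverted


# Toy example: eight heterozygous SNPs plus one homozygous SNP; SNPs 4 and 5 are switched.
true_pair = ([0, 1, 0, 1, 0, 1, 0, 1, 1], [1, 0, 1, 0, 1, 0, 1, 0, 1])
phased_pair = ([0, 1, 0, 0, 1, 1, 0, 1, 1], [1, 0, 1, 1, 0, 0, 1, 0, 1])
print(phasing_metrics(true_pair, phased_pair))  # (75.0, 2, [2])
```

In the toy example, one inverted segment of two SNPs produces two breakpoints and lowers 𝐸𝑞𝑝 to 75%, which shows why few breakpoints can still go along with long switched segments.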
𝐸𝑞𝑝 values ranged from 88% to 100%, with the best results from Beagle v4.1 and from FImpute with pedigree information and at least one parent genotyped. FImpute haplotypes showed a higher number of 𝐵𝑘𝑝 than Beagle haplotypes, so the switched haplotype segments were longer for Beagle. We concluded that, for the dataset in this study, Beagle v4.1 or FImpute with pedigree information and at least one parent genotyped were the best options for obtaining high-quality phased haplotypes.
Having reviewed the phasing and imputation steps needed for accurate haplotype recovery, we move on to the assessment of 𝑀𝑒 in the 3rd chapter. Two new methods for obtaining 𝑀𝑒 are presented, one for random mating populations and one for full and half sib populations. The populations for this study were created in silico with the R package MoBPS. Full and half sib populations were created from a common base pedigree of 200 individuals (50% of each sex) over 6 generations of random mating. Three in silico random mating populations were created, consisting of 50, 100, and 200 individuals, respectively, each over 50 generations. Both equations were validated following Visscher et al. (2006) through the 𝐼𝐵𝐷 share variance (𝐼𝐵𝐷𝑠𝑣). For the full sib populations, 𝑀𝑒 values ranged from 24 to 27.8. Half sib populations had 𝑀𝑒 values from 21.9 to 26.9 and 𝐼𝐵𝐷𝑠𝑣 values from 0.00233 to 0.00285. For random mating populations, 𝑀𝑒 values ranged from 133 to 207 for 50 individuals, 147 to 195 for 100 individuals, and 147 to 176 for 200 individuals, with 𝐼𝐵𝐷𝑠𝑣 decreasing as population size increased. The validations were only partially successful, which emphasizes the need for further research. 𝑀𝑒 was estimated with precise information on chromosome segment origin, and 𝐼𝐵𝐷𝑠𝑣 was used as a measure of genetic contrast or similarity. The results showed higher 𝐼𝐵𝐷𝑠𝑣 values for full sibs than for half sibs, indicating higher average relatedness.
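As a rough plausibility check on the half sib figures above, the following sketch assumes that the genome behaves like 𝑀𝑒 independent segments and that, at each segment, two half sibs share the allele of their common parent with probability 1/2. Under this simplification the per-segment sharing coefficient is 1/2 or 0, its variance is 1/16, and the variance of the genome-wide average is 1/(16·𝑀𝑒). This is an assumption made for the sketch, not necessarily the equation proposed in the thesis, and both function names are hypothetical.

```python
def ibdsv_half_sibs(me):
    """Variance of genome-wide IBD sharing between half sibs for `me` independent segments."""
    # Per segment: sharing is 0.5 with probability 1/2 and 0.0 otherwise (mean 0.25).
    per_segment_variance = 0.5 * (0.5 - 0.25) ** 2 + 0.5 * (0.0 - 0.25) ** 2  # 1/16
    # Averaging over `me` independent segments divides the variance by `me`.
    return per_segment_variance / me


def me_from_ibdsv_half_sibs(ibdsv):
    """Invert the relation above: Me = 1 / (16 * IBDsv)."""
    return 1.0 / (16.0 * ibdsv)


for me in (21.9, 26.9):
    print(f"Me = {me}: IBDsv = {ibdsv_half_sibs(me):.5f}")
for ibdsv in (0.00233, 0.00285):
    print(f"IBDsv = {ibdsv}: Me = {me_from_ibdsv_half_sibs(ibdsv):.1f}")
```

With these inputs, 𝑀𝑒 = 21.9 corresponds to a variance of about 0.00285 and 𝑀𝑒 = 26.9 to about 0.00232, which is numerically close to the half sib ranges reported above. Whether the thesis uses exactly this relation cannot be determined from the abstract.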
For random mating populations, the smallest population had the highest 𝐼𝐵𝐷𝑠𝑣 values due to increased stratification. The proposed equations provided reliable 𝑀𝑒 estimates for full and half sibs but revealed challenges in larger populations due to dynamic changes. Future research should refine these methods and apply them to real population data.
The last step towards understanding 𝑀𝑒 is covered in the 4th chapter, which aimed to obtain 𝑀𝑒 from commonly available SNP marker data. The methods proposed in the 3rd chapter rely on 𝐼𝐵𝐷 information; however, 𝐼𝐵𝐷 status is not readily available for commercial livestock populations. One possibility is to recover the 𝐼𝐵𝐷 status with software; another is to apply the methods of the 3rd chapter to SNP data directly, through identity by state (𝐼𝐵𝑆) status. The data for this study consisted of populations created in silico under scenarios combining three levels of marker density and two mating schemes. Two populations, with marker densities of 1’000 and 5’000 SNPs/M, underwent random mating for 50 generations; the third population had a marker density of 10’000 SNPs/M and underwent 40 generations of random mating followed by 10 generations of phenotypic selection of the top 50% of individuals. The populations had an 𝑁𝑒 of 200 and a genome of 10 chromosomes of 1 M each, with a recombination rate of 1/M and Poisson-distributed crossing-over events. Results were compared among the true 𝑀𝑒 (and relevant parameters), 𝑀𝑒 obtained from 𝐼𝐵𝑆 (genomic relationship matrix), and 𝑀𝑒 obtained from 𝐼𝐵𝐷 recovered with the software RefinedIBD (R-IBD). For true 𝐼𝐵𝐷, 𝑀𝑒 values ranged from 144 to 174 in random mating populations and from 115 to 149 in phenotypically selected populations. The 𝐼𝐵𝐷 share variance (𝐼𝐵𝐷𝑠𝑣) was higher in selected populations.
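The genome model described above (10 chromosomes of 1 M each, a recombination rate of 1/M, Poisson-distributed crossing-over events) can be sketched as follows. The thesis used the R package MoBPS; this stand-alone Python sketch is not MoBPS code. It additionally assumes that crossover positions are uniform along the chromosome (no interference), and all names in it are hypothetical.

```python
import math
import random

N_CHROMOSOMES = 10          # from the abstract
CHROMOSOME_LENGTH_M = 1.0   # chromosome length in Morgan, from the abstract
RATE_PER_MORGAN = 1.0       # expected crossovers per Morgan, from the abstract


def poisson_draw(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


def simulate_meiosis(rng):
    """Return, per chromosome, the sorted crossover positions (in Morgan) of one meiosis."""
    crossovers = []
    for _ in range(N_CHROMOSOMES):
        n = poisson_draw(RATE_PER_MORGAN * CHROMOSOME_LENGTH_M, rng)
        # Assumption of this sketch: positions are uniform, i.e. no crossover interference.
        positions = sorted(rng.uniform(0.0, CHROMOSOME_LENGTH_M) for _ in range(n))
        crossovers.append(positions)
    return crossovers


rng = random.Random(2024)
meioses = [simulate_meiosis(rng) for _ in range(2000)]
mean_per_meiosis = sum(len(chrom) for m in meioses for chrom in m) / len(meioses)
print(f"mean crossovers per meiosis: {mean_per_meiosis:.2f} (theoretical mean: 10)")
```

Each meiosis yields on average one crossover per chromosome, so about ten per genome; these crossovers are what break chromosomes into the segments whose effective number 𝑀𝑒 summarizes.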
Genomic relationship matrices produced 𝑀𝑒 values close to those from true 𝐼𝐵𝐷, ranging from 145 to 201 in random mating populations, with lower values of around 97.3 in selected populations. The R-IBD software provided accurate 𝐼𝐵𝐷𝑠𝑣 for random mating populations but showed discrepancies in selected populations. The study concludes that genomic relationship matrices and 𝐼𝐵𝑆 are practical alternatives to true 𝐼𝐵𝐷 for genomic prediction, especially in large-scale evaluations. Further research is suggested to explore different SNP densities in selected populations and to improve the estimation of 𝐼𝐵𝐷 variance at single loci. Combining R-IBD for 𝐼𝐵𝐷𝑠𝑣 and genomic relationship matrices for genomic relationships may provide the most effective method for accurate genomic predictions.
The 5th chapter presents the general discussion of the thesis, structured in four parts. The 1st part analyzes the relevance of high-quality phasing software for obtaining accurate estimates of 𝑀𝑒. The 2nd part discusses the relevance of 𝑀𝑒 in genomic prediction and the potential impact of the newly proposed methods. The 3rd part discusses the potential use of 𝐼𝐵𝑆 instead of 𝐼𝐵𝐷 to obtain 𝑀𝑒. The last part presents the potential use of genomic prediction in areas outside livestock science. The thesis closes with the outlook and conclusions.
Keywords: Genetics; Genomic prediction
 
