Modern breeding informatics approaches to optimise breeding of rhizomania resistance in sugar beet (Beta vulgaris L.) and its genomic prediction
Doctoral thesis
Date of Examination: 2023-09-29
Date of issue: 2024-04-26
Advisor: Prof. Dr. Armin Schmitt
Referee: Prof. Dr. Armin Schmitt
Referee: Prof. Dr. Stefan Scholten
Referee: Prof. Dr. Mehmet Gültas
Sponsor: KWS Saat SE & Co. KGaA
Files in this item
Name: PhD_thesis_ThomasLange.pdf
Size: 10.6Mb
Format: PDF
Abstract
Rhizomania, a viral disease caused by the Beet necrotic yellow vein virus (BNYVV), is considered the most destructive disease of sugar beet (Beta vulgaris L.). Since no plant protection measure against rhizomania is currently available, plant breeding remains the only defence strategy. The objective of this thesis is to support future breeding programmes for rhizomania resistance, which was pursued in two stages. Firstly, an existing greenhouse biotest was optimised to screen the resistance level of sugar beet genotypes. It is common practice to measure the resistance of sugar beet genotypes against rhizomania by determining the concentration of BNYVV in the roots of young sugar beet plants via enzyme-linked immunosorbent assay (ELISA). However, the results presented in this thesis indicate that raw ELISA values are not normally distributed. By using a serial dilution and a logistic regression model, it was possible to estimate the relative concentration of BNYVV in a genotype, and the transformed data appeared to follow a normal distribution. Using these normally distributed virus concentrations, the experimental design of the greenhouse biotest was further optimised.
Secondly, genomic prediction was performed using single nucleotide polymorphism (SNP) markers in a population in which the majority of individuals were resistant to rhizomania at all known resistance genes. In a first step, the data set was filtered: SNPs with a minor allele frequency (MAF) below 0.05 and redundant SNPs were removed. Because the majority of individuals were resistant at all known resistance genes, SNPs in high linkage disequilibrium with these genes were removed during MAF filtering. Nevertheless, satisfactory prediction accuracies were obtained, indicating the involvement of additional genes.
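The filtering step described above can be sketched as follows. This is a minimal illustration, not the thesis pipeline: it assumes diploid, biallelic SNPs coded as 0, 1 or 2 copies of the alternate allele, no missing genotypes, and it interprets "redundant" as an identical genotype column, which may differ from the definition used in the thesis. The function names and the toy matrix are hypothetical.

```python
from typing import List, Sequence


def minor_allele_frequency(genotypes: Sequence[int]) -> float:
    """MAF of one biallelic SNP coded as 0/1/2 copies of the alternate allele."""
    alt_freq = sum(genotypes) / (2 * len(genotypes))  # diploid: two alleles each
    return min(alt_freq, 1.0 - alt_freq)


def filter_snps(matrix: List[List[int]], maf_threshold: float = 0.05) -> List[int]:
    """Return indices of SNP columns that pass the MAF filter and are not duplicates.

    Rows are individuals, columns are SNPs. "Redundant" is assumed here to mean
    an identical genotype column; the thesis may define it differently.
    """
    kept: List[int] = []
    seen = set()
    for j in range(len(matrix[0])):
        column = tuple(row[j] for row in matrix)
        if minor_allele_frequency(column) < maf_threshold:
            continue  # allele too rare (or SNP monomorphic) in this population
        if column in seen:
            continue  # same genotype pattern as an SNP that was already kept
        seen.add(column)
        kept.append(j)
    return kept


# Toy data (hypothetical): 4 individuals x 4 SNPs.
toy_genotypes = [
    [0, 0, 0, 2],
    [1, 0, 1, 2],
    [2, 0, 2, 1],
    [1, 0, 1, 0],
]
kept_columns = filter_snps(toy_genotypes)
print(kept_columns)
```

In this toy matrix, column 1 is monomorphic and column 2 duplicates column 0, so only columns 0 and 3 remain. The same mechanism explains why markers linked to a resistance gene that is nearly fixed in the population drop out during MAF filtering.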
For genomic prediction, linear models such as best linear unbiased prediction, BayesA, BayesB, and BayesC, as well as machine learning methods such as support vector regression and random forest, were used. All models performed similarly except BayesC, which performed significantly worse. In addition, machine-learning-based feature selection was used to reduce the number of genomic markers in the prediction model. Feature selection reduced the number of genomic markers to only around 0.1-0.6% of the available SNPs while achieving similar prediction accuracy. Six genomic markers were consistently selected during feature selection, and they clustered around four genomic positions, three of which had not previously been associated with rhizomania resistance. Furthermore, some of these genomic markers showed strong interactions. It can therefore be assumed that they would not have been identified by a genome-wide association study that examines each SNP individually, i.e. using a simple linear regression model.
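The last point, that interacting markers can escape a single-SNP scan, can be illustrated with a deliberately extreme toy example. The data below are invented for illustration and are not taken from the thesis: two markers coded 0/1 (an assumption, e.g. homozygous lines) determine the phenotype purely through their interaction, so neither marker is correlated with the phenotype on its own.

```python
from math import sqrt
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data: the phenotype is 1 only when the two markers differ.
snp_a = [0, 0, 1, 1]
snp_b = [0, 1, 0, 1]
phenotype = [0, 1, 1, 0]

# Single-SNP view: each marker is tested on its own.
r_a = pearson(snp_a, phenotype)
r_b = pearson(snp_b, phenotype)

# Interaction view: product of the mean-centred marker codes.
mean_a = sum(snp_a) / len(snp_a)
mean_b = sum(snp_b) / len(snp_b)
interaction = [(a - mean_a) * (b - mean_b) for a, b in zip(snp_a, snp_b)]
r_ab = pearson(interaction, phenotype)

print(r_a, r_b, r_ab)
```

Here both marginal correlations are zero, while the interaction term is perfectly (negatively) correlated with the phenotype. Real marker effects are rarely this clean, but the sketch shows why methods that evaluate markers jointly, such as random forest, may pick up markers that a one-SNP-at-a-time linear regression would miss.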
Keywords: Breeding informatics; Rhizomania; Resistance breeding; Genomic prediction