Over the last decade, interest in detecting genes or genomic regions targeted by selection has been growing. Identifying signatures of selection can provide valuable insights into the genes or genomic regions that are or have been under selection pressure, which in turn leads to a better understanding of genotype-phenotype relationships. The main focus of this thesis is the detection of selection signatures in various chicken breeds. A common strategy for detecting selection signatures is to compare samples from several populations and search for genomic regions with outstanding genetic differentiation. This strategy uses inter-population statistics. In each chapter of this dissertation (chapters 2, 3 and 4), one or two inter-population statistics for selection signature detection are investigated.
Two data sets were used in this thesis. The first comprised a total of 96 individuals from three commercial layer breeds (White Leghorn, White Rock and Rhode Island Red) and 14 non-commercial fancy breeds (including G. g. gallus and G. g. spadiceus), genotyped with three different 600K SNP chips. The second consisted of pool sequences (10 individuals per pool) from 43 chicken breeds: 3 commercial breeds (White Leghorn, White Rock and Rhode Island Red) and 40 non-commercial breeds (including G. g. gallus, G. g. spadiceus and G. varius).
In our first approach, described in chapter 2, Wright's fixation index (FST) was used as an index of genetic differentiation between populations to detect selection signatures in the first data set. This chapter focuses on the detection of selection signatures between different chicken groups based on SNP-wise FST calculation. After removing SNPs that overlapped between the three 600K SNP arrays, a total of 1,139,073 SNPs remained. After removing SNPs with a minor allele frequency below 5% and SNPs at unknown locations, a total of 1 million SNPs were available for FST calculation. FST values were averaged in overlapping windows, and the window averages were then compared to detect selection signatures. Two sets of comparisons were made. First, we compared commercial egg layers with non-commercial fancy breeds; second, among the commercial egg layers, we compared white egg layers with brown egg layers. The comparison between non-commercial and commercial breeds resulted in the detection of 630 selection signatures, while 656 selection signatures were detected in the comparison between commercial egg-layer breeds. Annotation of the selection signature regions revealed various genes corresponding to production traits for which layer breeds have been selected. NCOA1, SREBF2 and RALGAPA1 were among the detected genes; these are associated with reproductive traits, broodiness and egg production. Several of the detected genes were associated with growth and carcass traits, including POMC, PRKAB2, SPP1, IGF2, CAPN1, TGFb2 and IGFBP2. These genes are good candidates for further studies. Our approach in chapter 2 demonstrates that including different populations with specific breeding histories provides a unique opportunity for a better understanding of farm animal selection.
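The SNP-wise FST calculation and window averaging described for chapter 2 can be sketched as follows. This is a minimal illustration, not the procedure used in the thesis: the abstract does not state which FST estimator or window size was applied, so the Nei-style estimator for two equally weighted groups, the function names, the toy allele frequencies and the window parameters (4 SNPs, step of 2) are all assumptions.

```python
def snp_fst(p1, p2):
    """FST for one biallelic SNP from allele frequencies in two groups.

    Assumed estimator: (HT - HS) / HT with equal group weights.
    """
    p_bar = (p1 + p2) / 2.0
    h_t = 2.0 * p_bar * (1.0 - p_bar)  # expected heterozygosity, pooled
    if h_t == 0.0:
        return 0.0  # monomorphic in both groups: no differentiation
    h_s = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    return (h_t - h_s) / h_t


def windowed_mean(values, size, step):
    """Mean over overlapping windows of `size` consecutive SNPs, moved by `step`."""
    means = []
    for start in range(0, len(values) - size + 1, step):
        window = values[start:start + size]
        means.append(sum(window) / size)
    return means


# Hypothetical allele frequencies at six consecutive SNPs in two groups.
freqs_a = [0.10, 0.50, 0.95, 1.00, 0.40, 0.45]
freqs_b = [0.12, 0.50, 0.05, 0.00, 0.42, 0.40]

fst = [snp_fst(a, b) for a, b in zip(freqs_a, freqs_b)]
window_fst = windowed_mean(fst, size=4, step=2)  # 50% overlap between windows
```

Windows whose mean FST lies in the extreme upper tail of the genome-wide distribution would then be treated as candidate selection signatures; the choice of threshold is a separate decision not covered by this sketch.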
In the study described in chapter 3, our aim was to detect selection signatures using haplotype frequencies while accounting for the hierarchical structure of the populations. We used a subset of the first data set with a total of 74 individuals from three commercial layer breeds (White Leghorn, White Rock and Rhode Island Red) and two non-commercial fancy breeds (G. g. gallus and G. g. spadiceus). For this purpose, we used the statistical methods FLK and hapFLK. FLK measures variation in the inbreeding coefficient and uses a population kinship matrix to incorporate hierarchical structure. hapFLK uses a similar statistic, but with haplotype frequencies instead of allele frequencies. FLK and hapFLK were calculated in all layer breeds, using the non-commercial fancy individuals to estimate the ancestral genetic distance. Both statistics were applied to three groups: all layers, white layers and brown layers. A total of 107 and 41 regions were detected as selection signatures in the FLK and hapFLK analyses, respectively. Annotation of the selection signature regions revealed various genes and QTL corresponding to production traits for which layer breeds have been selected. A number of the detected genes were associated with growth and carcass traits, including IGF-1R, AGRP and STAT5B. We also annotated a gene of particular interest, SOX10, which is associated with the dark brown mutational phenotype in chickens. The analysis in chapter 3 allowed a direct comparison between FST, FLK and hapFLK. There is a large overlap between the regions identified as under selection in the FST study and those identified in the current analysis using FLK and hapFLK. QTL associated with meat production, as well as both IGF-1R and STAT5B, were located in regions that were similar between brown layers. These results showed a large degree of agreement with the FST results discussed in chapter 2.
We demonstrated that using haplotype frequencies and accounting for hierarchical structure can improve the power of detection in our data set.
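For orientation, the FLK statistic mentioned above is commonly written in the following form. The notation is an assumption introduced here and is not taken from the thesis: p is the vector of allele frequencies at one SNP across the n populations, F is the population kinship matrix, 1 is a vector of ones, and p̂₀ is the estimated ancestral allele frequency.

```latex
\hat{p}_0 = \frac{\mathbf{1}^{\top} F^{-1} \mathbf{p}}{\mathbf{1}^{\top} F^{-1} \mathbf{1}},
\qquad
T_{\mathrm{FLK}} = \left(\mathbf{p} - \hat{p}_0 \mathbf{1}\right)^{\top}
\left[\hat{p}_0 \left(1 - \hat{p}_0\right) F\right]^{-1}
\left(\mathbf{p} - \hat{p}_0 \mathbf{1}\right)
```

In this formulation, weighting the deviations by the inverse of F is what accounts for unequal drift and shared ancestry among populations; hapFLK applies the same idea to haplotype frequencies rather than single-SNP allele frequencies.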
The approach discussed in chapter 4 of this dissertation uses SNPs extracted from the pool sequence data (the second data set) to detect selection signatures. Over 30 million SNPs in 43 pools, consisting of 3 commercial breeds (White Leghorn, White Rock and Rhode Island Red) and 40 non-commercial breeds (including G. g. gallus, G. g. spadiceus and G. varius), were used in this study. After filtering for mapping quality and sequencing depth, 22 million SNPs remained. The breeds were studied for selection signatures in three contrasts (skin color, egg shell color and toe number). FST was calculated between the two groups of each contrast, whereas heterozygosity (HE) was obtained for each group. Both measures (FST and HE) were subsequently summarized in 40 kb windows with an overlap of 50% within each contrast. Summarized FST and HE values were compared between overlapping windows to detect selection signatures; combining the two measures was intended to improve the power and reliability of detection. A total of eight regions (across all contrasts) were detected as selection signatures using both the FST and HE methods. Annotation of the selection signature regions revealed one gene (BCO2) and three QTL corresponding to skin color and egg shell color, respectively. In this study we demonstrated that the use of sequence data with a larger number of populations, and a combination of methods with different statistical backgrounds (i.e. FST and HE), can improve the power of detection.
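The window summarization used in chapter 4 (40 kb windows with 50% overlap) can be sketched as follows for heterozygosity. This is an illustrative sketch under stated assumptions, not the thesis pipeline: HE is taken as 2p(1 - p) per SNP, the function names are hypothetical, and the positions and pool allele frequencies are made up for the example.

```python
WINDOW_BP = 40_000          # window size stated in the text
STEP_BP = WINDOW_BP // 2    # 50% overlap between consecutive windows


def expected_het(p):
    """Expected heterozygosity at a biallelic SNP (assumed form: 2p(1 - p))."""
    return 2.0 * p * (1.0 - p)


def window_means(positions, values, window_bp=WINDOW_BP, step_bp=STEP_BP):
    """Return (window_start, mean_value, n_snps) for each non-empty window."""
    out = []
    if not positions:
        return out
    last = max(positions)
    start = 0
    while start <= last:
        end = start + window_bp
        inside = [v for pos, v in zip(positions, values) if start <= pos < end]
        if inside:
            out.append((start, sum(inside) / len(inside), len(inside)))
        start += step_bp
    return out


# Hypothetical SNP positions (bp) and pool allele frequencies on one chromosome.
positions = [5_000, 15_000, 25_000, 45_000, 70_000]
freqs = [0.5, 0.1, 0.0, 1.0, 0.5]

he = [expected_het(p) for p in freqs]
he_windows = window_means(positions, he)
```

The same `window_means` helper could be applied to per-SNP FST values, so that windows with both high FST between groups and low HE within one group can be flagged together.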
In conclusion, the results of the studies discussed in this dissertation showed that regions potentially under selection can be identified by including various populations and using high-resolution genome scans (based on dense marker or pool sequence data). Our study provides a comparison between different inter-population methods for selection signature detection (FST, FLK and hapFLK) and between genome scans of different resolution (pool sequence and high-density chip). We demonstrated that combining an inter-population method with an intra-population statistic (heterozygosity) can improve the power of detection. We identified several putative selection signature regions containing genes that correspond to the production traits for which chicken breeds have been selected. The regions identified are good candidates for further studies, both for commercial purposes and for biodiversity research. We believe that our study contributes to a better understanding of farm animal selection, particularly in chickens.