Animals and phenotypes
A total of 11,279 meat-type chickens were obtained from seven lines of two breeding companies in China. The W1, W2, W3, and W4 were four commercial white-feathered pure lines in Yunnan Province, China. Body weight exceeded 2.4 kg at 40 days of age in lines W1, W2, and W4. Body weight exceeded 1.6 kg at 40 days of age in line W3. The feed conversion ratio was under 1.69 between 28 and 40 days of age, and the age at first egg was 175–180 d. The Y1, Y2, and Y3 were three commercial yellow-feathered pure lines in Jiangsu Province, China. The feed conversion ratio was ~ 3.13, which was considered medium growth rate meat-type chickens.
Line W2 was selected for the egg production more than eight generations. We used the individuals of three generations (3159 chickens: 2532 females; 627 males), including the 6th, 7th and 8th generations. All other lines were used for individual data of one generation. All chickens had pedigree, genomic information and were housed in identical individual cages. There were 11,279 individuals in all lines, of which 9279 were recorded for egg numbers. Blood samples were collected from the brachial veins of each chicken using the standard procedure of the breeding program, which was approved by the Animal Welfare and Ethics Committee of the Institute of Animal Sciences, Chinese Academy of Agricultural Sciences (IAS-CAAS, Beijing, China).
Staged egg numbers statistics
We made a detailed division of the laying period using 627 hens from the 7th generation of line W2. We recorded the age at first egg (AFE), and egg production for each hen from first egg to 354 days of age. We divided the egg-laying period into four stages according to the laying curve in Additional file 1: Fig. S1 and counted the egg numbers (EN) at each stage. Four stages, including the stage of rapidly increasing egg production (from the onset of laying eggs to 195 days of age; EN1); the stage of peak egg production (from 196 to 227 days of age; EN2); the stage at which the laying rate was between 60% and 75% (from 228 to 307 days of age; EN3); and the stage of decreasing egg production (from 308 to 354 days of age; EN4). Moreover, we counted the total egg number from first egg to 354 d (TEN).
Genotyping and quality control
Genomic DNA was extracted from blood samples using the phenol-chloroform method. All the birds were genotyped using a customized 55 K SNP chicken array obtained from Beijing Compass Biotechnology Co., Ltd. (Beijing, China) [9]. We updated the SNP positions using the newest release from the University of California-Santa Cruz (GRCg6a/galGal6 genome version). The following quality control criteria were applied to the target panel: individual call rate ≥ 90%, SNP call rate ≥ 90%, and minor allele frequency ≥ 0.05. Ultimately, 37,847–45,346 SNPs and 11,279 birds remained for further analyses.
Population structure testing
Twenty individuals randomly selected from each line for structure using ADMIXTURE software (version1.3.0) [10]. A principal component analysis (PCA) plot was constructed using the “scatterplot3d” package in R (version 4.0.2).
Genome-wide association study
GWAS was performed using 9778 individuals with egg number phenotypes from the all lines. The GWAS for egg number traits was performed using the univariate linear mixed model implemented in GEMMA software, version 0.98.1 [11]. The statistical model was as follows:
$$y= W\alpha + x\beta +u+\varepsilon; u\sim MV{N}_n\left(0,\lambda {\tau}^{-1}K\right), \varepsilon \sim MV{N}_n\left(0,{\tau}^{-1}{\mathrm{I}}_n\right)$$
(1)
where y is the vector of phenotypic values; W is the vector of covariates, including a column of 1 s; α is the vector of the corresponding coefficients, including the intercept; x is the vector of marker genotypes; β is the effect size of the marker; u is the vector of random polygenic effects; ε is the vector of errors; τ−1 is the variance of the residual errors; λ is the ratio between the two variance components; K is the centered relatedness matrix estimated from SNPs, and Ιn is the identity matrix. MVNn is the n-dimensional multivariate normal distribution. The Wald test was used to select SNPs associated with egg production traits.
We used the parameters of plink software --indep-pairwise 25 5 0.2 to infer effective independent tests. Considering the over-conservation of the 5% Bonferroni correction method, we adjusted the threshold of genome-wide significant P-values based on the number of independent SNPs [12]. Manhattan and Q-Q plots were constructed for each trait using the “qqman” package in R (version 4.0.2). Boxplots were produced by the “ggplot2” package in R (version 4.0.2).
Meta-analyses
Meta-analyses of the seven lines were performed using standard method by Metal software (version 2011-03-25). Meta-analyses can combine either (a) test statistics and standard errors or (b) P-values across multiple GWAS for a single trait (taking sample size and direction of effect into account) [13]. The detailed calculation formula is introduced in reference [13]. Meta-analysis results in little or no loss of efficiency compared to analysis of a combined dataset including data from all individual studies.
Association study on GGA Z
Due to the particularity of GGA Z, we used “-method threshold” function of SNPTEST v2.5.6 software [14], a software for sex chromosome association study, for GGA Z association study. For chickens, female genotypes are encoded as 0/1 in SNPTAST. In addition to association test statistics, SNPTEST can output expected genotype and allele counts for diploid and haploid samples. Computation of allele frequencies and info statistics also take into account ploidy.
Selective sweep analysis
Selective sweep analysis was performed on all hens from the line W1 and 782 hens from the 8th generation of line W2, which had genomic information obtained by 55 K SNP chips. Line W2 was selected for the egg number more than eight generations, Line W1 was not selected for egg number. We used the “--weir-fst-pop” function of the Vcftools v0.1.14 software to calculate the population differentiation index (Fst), which were a window length of 40 kb and a step size of 2 kb [15].
Bayesian analysis
Two Bayesian approaches, Bayes B and Bayes R, were used to obtain the estimated marker effect explained by each SNP. Bayes B and Bayes R analyses were performed using the BGLR and hibayes package in R, respectively. The number of iterations after the burn-in phase and that of the burn-in period were 20,000 and 10,000, respectively. The purpose of the Bayesian analysis was to estimate whether the QTL we mapped had significantly larger effects. The Bayesian analysis’ result was the effect size of the SNPs, while the result of GWAS was the P value and effect size of SNPs associated with phenotype. Their results could be verified by each other.
Genome prediction models
The genome prediction models, genomic best line prediction (GBLUP) and single-step GBLUP (ssGBLUP), were utilized in this study and were as follows:
$$y= Xb+ Z\alpha +e$$
(2)
Where y is the vector of phenotypic value, X and Z are the association matrices of fixed and additive genetic effects, b is the vector of the fixed effect, α is the vector of the random additive genetic effect, and e is the vector of the random residual error. For GBLUP, when an α obeyed the following normal distribution α ~N(0, G \({\upsigma}_{\mathrm{u}}^2\)), G was the consanguinity matrix (G matrix). In ssGBLUP, the H matrix replaces the G matrix.
H matrix construction:
$${H}^{-1}={A}^{-1}+\left[\begin{array}{cc}0& 0\\ {}0& {G}^{-1}-{A}_{22}^{-1}\end{array}\right]$$
(3)
Where H is an inter-individual relational matrix using genome-wide markers and pedigrees, A is the pedigree-based relatedness matrix, G is the genome-based relationship matrix, and A22 is the individual genealogical relation matrix of sequencing individuals. The relative weights of G and A22 in the H matrix were set as Gω = 0.95 × G + 0.05 × A22 . The level of the G matrix was corrected to the A22 (matrix)
$${G}^{\ast }=a+b\times G$$
(4)
G∗ represents the adjusted G matrix, and a and b are as follows:
$$\mathrm{Avg}\left(\operatorname{diag}(G)\right)\times b+a=\mathrm{Avg}\left(\operatorname{diag}\left({A}_{22}\right)\right)$$
(5)
$$\mathrm{Avg}\left(\mathrm{offdiag}(G)\right)\times b+a=\mathrm{Avg}\left(\mathrm{offdiag}\left({A}_{22}\right)\right)$$
(6)
The formula of the H matrix after merging is:
$${H}^{-1}={A}^{-1}+\left[\begin{array}{cc}0& 0\\ {}0& {\left[0.95\times \left(a+b\times G\right)+0.05\times {A}_{22}\right]}^{-1}-{A}_{22}^{-1}\end{array}\right]$$
(7)
The animal models of the two traits are as follows:
$$\left[\begin{array}{c}{y}_1\\ {}{y}_2\end{array}\right]=\left[\begin{array}{cc}{X}_1& 0\\ {}0& {X}_2\end{array}\right]\left[\begin{array}{c}{b}_1\\ {}{b}_2\end{array}\right]+\left[\begin{array}{cc}{Z}_1& 0\\ {}0& {Z}_2\end{array}\right]\left[\begin{array}{c}{\alpha}_1\\ {}{\alpha}_2\end{array}\right]+\left[\begin{array}{c}{e}_1\\ {}{e}_2\end{array}\right]$$
(8)
Where y, X, b, Z, α, and e are the same formula (2).
The genetic parameters of line W2 for egg production traits were calculated using the ssGBLUP model, and the heritability of the seven lines was estimated by the GBLUP model. GBLUP and ssGBLUP were calculated using ASReml v4.1 software [16].