Phenotypic and genomic relationships between vulva score categories and reproductive performance in first-parity sows

Background One of the biggest challenges in the swine industry is to increase female reproductive efficiency. Recently, vulva score categories (VSC), assessed prior to puberty, has been proposed as an indicator trait of efficient reproductive performance in sows. The objective of this study was to validate the use of VSC as an indicator trait for reproductive performance, and to perform genetic and genomic analyses for VSC. Methods The phenotypic relationship of VSC, using a three-point scale: small (VSC-S), medium (VSC-M), and large (VSC-L), on reproductive performance was evaluated on three farms. VSC was measured at 15 weeks of age, for farms 1 and 2, and at 14 weeks of age for farm 3 on 3981 Yorkshire gilts, in which 1083 had genotypes (~ 50 K SNPs). Genetic parameters for VSC with reproductive traits were estimated using ssGBLUP. A Genome-wide association study (GWAS) for VSC was performed using BayesB. Results For the phenotypic analysis of VSC across datasets, differences in performance were identified there was a significant effect (P ≤ 0.05) for the interaction between Farm and VSC for total number dead (TND), and a trend (P < 0.10) for total number born (TNB). There were significant (P ≤ 0.05) pre-defined contrasts of VSC-S versus VSC-M + L on TNB, number born alive (NBA), TND, number of stillborn (NSB), and number of mummies (MUM). Heritability estimates for VSC as a categorical trait (VSCc) and a quantitative trait (VSCq) were 0.40 ± 0.02 and 0.83 ± 0.02, respectively, for across farm, 0.13 ± 0.07 and 0.20 ± 0.10, respectively, for Farm1, 0.07 ± 0.07 and 0.09 ± 0.09, respectively, for Farm2, and 0.20 ± 0.03 and 0.34 ± 0.05, respectively, for Farm3. For across farms, favorable genetic correlations estimates were found for TNB (0.28 ± 0.19) and NBA (0.26 ± 0.17). Within farms, moderate genetic correlations between VSC with reproductive traits were found for TNB (0.61 ± 0.47) and MUM (0.69 ± 0.47) for farm 1, for number of services until first farrow (NS; 0.69 ± 0.38) and unique service with successful first farrow (SFS; − 0.71 ± 0.38) for farm 3. Multiple genomic regions associated with VSCc were identified. Of these, a QTL located on chromosome 3 at 33–34 Mb accounted for about 7.1% of the genetic variance for VSCc and VSCq. This region harbors the gene PRM1 that has been associated with early embryonic development in pigs. Conclusions The results support potential of VSC for improved reproductive efficiency on first-parity performance, but the results might depend on the interaction between environmental factors and VSC, as well as potentially additive genetics.


Background
The efficiency of reproductive performance in the swine industry is critical to maximizing productivity. However, genetic selection for reproductive traits in sows (e.g., litter size traits) is challenging because of their low heritability [1]. Additional difficulties for genetic selection for these traits include the fact that these traits are sexdependent and expressed later in life. To overcome these limitations, one strategy could be the identification of an indicator trait, which should have: high heritability, have high favorable genetic correlation (r G ) with reproductive traits, be easy and cheap to measure, and be expressed early in life. Recently, the use of vulva size as an indicator trait for reproductive traits has been explored [2,3]. Pre-pubertal gilts with larger vulva width at~15 weeks of age had greater follicular activity and reached puberty at a younger age compared to those with smaller vulva width [2]. Romoser et al. [3] reported favorable phenotypic relationship between vulva width scores and litter size in sows, suggesting that vulva score categories (VSC) could be used as a proxy for reproductive performance. Gilts classified as having large VSC had higher first farrowing rates (84.4% vs. 64.7%) and number of piglets born at first parity (12.4 vs. 11.8) compared to gilts classified as having small VSC [2]. On the genetic side, knowledge is scant regarding this novel trait. Knauer et al. [4] explored the genetics of vulva width in gilts of approximately 162 days of age and reported a moderate heritability estimated (h 2 = 0.57). Corredor et al. [5] reported genetic parameters and QTL for vulva size traits in Landrace and Yorkshire gilts. Heritability estimates in Yorkshire gilts ranging from 0.31 (vulva width) to 0.55 (vulva height), and major QTL for VS on chromosomes 1 (87-91 Mb and 282-287 Mb), and 5 (67 Mb) were observed, explaining up to 6.9% of the genetic variance [5]. However, results from Romoser et al. [3] have not been validated in an independent dataset and no studies, to the best of our knowledge, have evaluated both genetic and phenotypic relationships between VSC and reproductive traits. Therefore, the objectives of this study were to 1) validate the phenotypic relationship of VSC on reproductive performance, validating the work of Romoser et al. [3] using an additional datasets, 2) estimate genetic parameters for VSC and reproductive traits; and 3) perform GWAS for VSC in three populations of first-parity gilts.

Animals and phenotype data
A total of 3981 (Farm1 + 2 + 3) Yorkshire gilts from three farms located in Colorado, USA, were used in this study, with 746, 722, and 2513 from farms 1 (Farm1), 2 (Farm2), and 3 (Farm3). All animals were from the same genetic source and were reared under the same controlled conditions. At 15 weeks of age in Farm1 and Farm2, and at 14 weeks of age in Farm3, all gilts were assigned a VSC, following methodology described in Romoser et al. [3]. The VSC of gilts were visually categorized by the same trained person in the three farms using a three-point scale: small (VSC-S), medium (VSC-M), and large (VSC-L). No additional information was available for these farms. The frequencies of the VSC for each farm were 21, 547, and 178, respectively, for Farm1; 12, 533, and 177, respectively, for Farm2; and 532, 1569, and 412, respectively, for Farm3 (Table 1).
First-parity performance traits included number of piglets born alive (NBA), number of stillborn piglets (NSB), and number of mummified piglets (MUM). Total number of born dead piglets (TND) was calculated as NSB + MUM, and total number of born piglets (TNB) was calculated as NBA + TND. Prior to statistical analyses, data on TND, NSB, and MUM were transformed in order to meet the assumptions of statistical inference [6] using ln(y + 1), where y represents the observed phenotype of the trait. In addition, farrowing traits included: age at first service (AFS) in days, number of services until first farrow (NS), and unique service with successful first farrow (SFS, success = 1, fail = 0). In this study, a single service was defined as any service events within 14 days. Therefore, a second service event was any one occurring 14 days after the first event. The summary statistics of these traits is shown on Table 2. A 9-generation pedigree including 9080 individuals was available.

Genotype data
Genotype data were available for 193, 202, and 688 animals for Farm1, Farm2, and Farm3, respectively (Table  1). DNA was isolated from tail or ear tissue using the ReliaPrep 96/KingFisher tissue kits (Promega, Madison, WI, USA). Individuals were genotyped using a custom Affymetrix/Thermo Fisher Axiom® genotyping array containing 51,467 evenly spaced SNPs. Markers without known position, located on sexual chromosomes, with minor allele frequencies below 0.01, and with a call rate below 0.8 were excluded. After this genotype quality control performed on the genotypic information from all farms combined, a total of 6778 SNPs were excluded. The final number of SNPs that remained in the data set were 44,689 SNPs. The remaining missing SNP genotypes were imputed chromosome-wise across all farms genotype information combined using a Hidden Markov Model based algorithm implemented in Eagle v.2.4.1 software [7]. A previous study using other animals from the same population has shown an imputation accuracy of missing genotypes of over 95% using this SNP chip [8].

Phenotypic analysis of vulva score categories on reproductive traits
The phenotypic relationship between VSC with reproductive traits (for TNB, NBA, TND, and MUM) was evaluated using the single-step BLUP (ssGBLUP) procedure [9] in the following model: Where y ijk is the observed phenotype (i.e., reproductive traits); μ is the general mean; Farm i is the i th level of the fixed-effect of farm; VSC j is the j th level of the fixedeffect of vulva score category; (Farm × VSC) ij is the interaction term between Farm i and VSC j ; a k is the animal random effect of the k th animal, assuming a k $ Nð0; H σ 2 a Þ, where H is the additive genetic relationship matrix including genotyped and non-genotyped animals [10]; and e ijk is the random error term associated with y ijk , assuming e i jk ∼Nð0; Iσ 2 e Þ, where I is the identity matrix. For TNB, NBA, TND, NSB, and MUM, the effect of VSC was estimated using a similar model, with the addition of a random effect of contemporary group (CG, combination of year and week of farrow), assuming CG l $ Nð0; Iσ 2 CG Þ. For NS, the random effect of CG was included in the model as a combination of year and week of service. For the analysis of NS and SFS, the effect of AFS was included as a covariate in order to account for the age of the gilt at time of insemination. In addition to testing the overall effect of VSC on reproductive traits, an additional contrast was evaluated following Romoser et al. [3], in which we tested the difference between VSC-S and the average of VSC-M and VSC-L, as well as of another contrast comparing the average of VSC-S and VSC-M with VSC-L. The threshold for significant and trending effects were P-value< 0.05 and P-value< 0.10, respectively. In addition, we also evaluated the effect of random effect of service sire on these models. However, due to the high number of missing data and lack of effect (i.e., < 1% of the variation explained by this effect), this strategy was not further pursued. All analyses were performed in ASReml v4.0 [11].  Genetic parameters and efficiency of correlated response to selection Genetic parameters for VSC were estimated using the following animal model: Where y ij is the observed phenotype; μ is the overall mean; Farm i is the i th level of the fixed-effect of farm; a j is the animal random effect of the j th animal, assuming a j $ Nð0; Hσ 2 a Þ, where H is the additive genetic relationship matrix including genotyped and non-genotyped animals [10]; and e i is the random error term associated with y ij , assuming e i j ∼Nð0; Iσ 2 e Þ . In addition to this model, we had also evaluated the random effects of week of VSC measurement and common-environment (i.e., litter effect) in the model, but the variance estimates for these effects were close to zero (data not shown), and hence, these effects were not included in the final model. Genetic parameters for AFS, TNB, NBA, TND, NSB, MUM, NS, and SFS were estimated including the appropriate random CG effect, as previously described in model (1). In addition to the model described above for all three farms, these analyses were also performed for each farm separately. Genetic parameters were estimated for VSC as a categorical (VSC c ) and as a continuous (VSC q ) trait. Heritabilities were estimated using a probit mixed model for VSC c and SFS, and using a general mixed linear model for all other traits. Genetic and phenotypic correlations were estimated within farm between VSC and reproductive traits, and genetic correlations between farms for VSC. Due to software limitations, these were estimated for VSC q . The efficiency (E) of correlated response to selection was estimated as: Where r G is the estimated genetic correlation between VSC and the reproductive trait of interest; h VSC is the square root of the heritability estimate for VSC; and h Trait of interest is the square root of the heritability estimate for a reproductive trait of interest.

Genome-wide association analysis
Genome-wide association studies (GWAS) was performed for VSC for each farm and across farms, using Bayesian genomic prediction methods [12] using the following model: Where y i is the observed phenotype; μ is the overall mean; m ij is the genotype at the j th SNP for the animal i; a j is the allele substitution effect for the j th SNP, and e i is the error term associated with y i , assuming e i ∼Nð0; I σ 2 e Þ. Additionally, data on the three farms were analyzed simultaneously, and in this analysis, the fixed effect of farm was included in the model. The estimates of additive genetic and residual variances obtained from the genetic parameter estimation were used as priors in BayesC analysis, assuming all SNPs with an effect (i.e., π = 0). Then, BayesCπ was performed to estimate the proportion of SNPs with zero effect (π). Afterwards, analyses were performed using BayesB, with a π = 0.999. Analyses were carried out using 50,000 iterations using Gibbs sampling, and a burn-in of 5000 cycles. Analyses were performed in GenSel version 4.4 [13].
Putative candidate genes within identified QTL regions and in the neighboring upstream and downstream 3-Mb regions were identified based on the Sscrofa11.1 genome assembly, using the BioMart tool from the Ensembl Genome Browser (https://www.ensembl.org/index.html). The 3-Mb neighboring regions of each side of the identified regions were investigated to account for the resolution of the QTL mapping method used in this study [14]. QTL regions explaining at least 1% of the total genetic variance accounted for by the markers (TGVM) were discussed in this study, including the identification of candidate genes within these QTL.

Phenotypic analysis of vulva score categories on reproductive traits
The phenotypic relationship of VSC on reproduction performance is shown in Table 3. For the phenotypic analysis of VSC across datasets (Table 3), there was a significant effect (P ≤ 0.05) for the interaction between Farm and VSC for TND, and a trend (P < 0.10) for TNB. Although this interaction was not significant (P ≥ 0.11) for NBA, NSB, and MUM, we observed a significant (P ≤ 0.05) pre-defined contrast of VSC-S versus M + L for these traits, as well as for TNB, and TND.
The relationships found for TND were more complex. For Farm1, the relationship was the same as for NBA, with lower (P < 0.05) number of piglets in VSC-S gilts (0.35 ± 0.05) compared to VSC-M and VSC-L (0.82 ± 0.02). For Farm2, VSC-M had the fewer (P < 0.05) piglets (0.55 ± 0.01) than VSC-S and VSC-L (0.72 ± 0.05), whereas no relationships were found for Farm3 (P < 0.05). For NSB, the same relationships found for TND were found, with greater performance found in VSC-S gilts on Farm1, and lower performance found in VSC-S gilts of Farm2 compared to VSC-M and VSC-L (P < 0.05 for both farms). No relationships were found (P > 0.05) between VSC and NSB on Farm3. Finally, VSC-S gilts showed fewer (P < 0.05) MUM (0.15 ± 0.04) than VSC-M and VSC-L (0.41 ± 0.01) on Farm1, whereas no relationships (P > 0.05) were found for the other farms. These results show that the overall relationship between VSC and reproductive performance depends on the environment (i.e., farm) in which these gilts are raised in.
We also found overall relationships between VSC and reproductive performance. There were relationships between VSC with NS (P = 0.08), and SFS (P = 0.05), with VSC-S showing overall greater performance than VSC-M and VSC-L gilts. VSC-S required fewer (P < 0.05) NS (1.06 ± 0.04) than VSC-L gilts (1.08 ± 0.01) and had greater (P < 0.05) SFS (0.94 ± 0.03) than VSC-L gilts (0.92 ± 0.01). Although we found a trend effect of Farm×VSC on TNB (P = 0.06) and significant (P < 0.01) pre-defined contrast (S vs. M + L), there was a trend (P = 0.08) for the main effect of VSC, with VSC-M  Means lacking the same superscript within a trait are significantly different at P < 0.05 based on the main effect of VSC (11.65 ± 0.09) having lower performance (P < 0.05) than VSC-S (11.72 ± 0.36) and VSC-L (11.72 ± 0.13). Finally, there was an effect of Farm for all traits analyzed (P < 0.01), indicating that the environment is a major contributor for the variability of the reproductive data in this study.
Estimates of phenotypic (r P ) and genetic (r G ) correlations between VSC q with reproductive traits for each farm are presented in Table 5. For across farms, estimates of r G were low, ranging from 0.28 ± 0.19 (TNB) to − 0.30 ± 0.13 (SFS). Favorable r G estimates were found for TNB (0.28 ± 0.19) and NBA (0.26 ± 0.17), whereas unfavorable estimates were found for AFS (0. 25  For the remaining traits, estimates were low, ranging from − 0.15 ± 0.14 (MUM) and 0.14 ± 0.39 (TND). Estimates of r P were low between VSC q with reproductive traits across all datasets. For Farm1, these ranged from − 0.07 ± 0.04 (SFS) to 0.04 ± 0.05 (AFS), for Farm2 from − 0.03 ± 0.04 (AFS) to 0.04 ± 0.04 (NSB), whereas for Farm3 these ranged from − 0.05 ± 0.02 (SFS) to 0.06 ± 0.02 (NBA). The efficiency of correlated response to selection when selecting for increased VSC, would be 1.80 and 2.37 for TNB and NBA, respectively, for across farms, 1.58 and 0.37 for TNB and NBA, respectively, for

Genome-wide association analysis
Results from GWAS were similar for VSC q and VSC c . Thus, only VSC c results are presented in Fig. 1 and Table 6. A total of 20 unique genomic regions (quantitative trait loci; QTL) explaining more than 1% of the total

Discussion
In this study, we investigated the relationship between VSC, assigned to 14-and 15-week old gilts, with subsequent reproductive performance. We aimed to corroborate, at both genetic and phenotypic levels, the previous findings from Graves et al. [2] and Romoser et al. [3]. Graves et al. [2] discovered a relationship (r P = − 0.28, P = 0.01) between prepubertal vulva width and age at first estrus. These authors suggested that VSC measurements in gilts between 95 and 115 days of age could be used as a proxy for ovarian development and onset of puberty. Following this reasoning, Romoser et al. [3] determined that VSC-L gilts were more likely to achieve parity 1 compared to VSC-S (84.4% vs. 64.7%, respectively; P = 0.02), and presented greater TNB than VSC-S (12.4 vs. 11.8, respectively; P = 0.02). In addition to validating these results, we sought to explore the genomic basis of VSC in gilts.

Phenotypic analysis of vulva score categories on reproductive traits
In general, significant phenotypic relationships between VSC and the traits evaluated were observed. However, the direction of the relationship (i.e. positive or negative) between VSC with these traits were not consistent. In Romoser et al. [3], there was a consistent relationship between VSC and reproductive traits, in which the greater was the VSC, the better was performance. Differently than in our study, these authors evaluated these relationships on the same farm. Furthermore, we fitted the random animal effect in the model used for these analyses, which was not the case for Romoser et al. [3]. This additional effect could have helped showing differences between both studies. Nonetheless, within a farm, results were overall reasonable. For example, in Farm1, there was a favorable relationship with NBA. Furthermore, the relationships between VSC and TNB were numerically positive, which is in accordance with the increased NBA observed in this farm. However, gilts with VSC-M and VSC-L had larger litter size but also larger TND, NSB and MUM, indicating that a greater VSC would increase overall litter size, in the expense of having dead piglets. For Farm2, sows with VSC-M and VSC-L had lower TNB but also lower TND and NSB. Furthermore, the relationships between VSC and MUM were favorably negative, which is in accordance with the decrease in TND. For Farm3, sows with VSC-M and VSC-L had Table 5 Estimates of phenotypic (r P ) and genetic (r G ) correlations between vulva score categories a and reproductive traits b for each dataset and across datasets  higher TNB and NBA. Furthermore, the relationships between VSC and TND, NSB, and MUM were favorable. Gilts with VSC-L had lower TND, NSB and MUM. In general, the positive favorable relationships between VSC with TNB, and NBA for Farm1 and Farm3, and the negative favorable relationships between VSC with TND, NSB, and MUM for Farm2 and Farm3, were consistent with those observed in the study of Romoser et al. [3].
The reasons for differences in VSC results between farms are unclear. In addition to the non-genetic factors that are inherited from each farm that do not allow us to separate them in our statistical analyses (i.e., confounded effects), two other processes could have resulted in these differences. First, the distributions of VSC-S in Farm1 and Farm2 were very different than in Farm3, with 2.8%, 1.7% and 21.2% in Farm1, Farm2, and Farm3, respectively. With this, the very low frequencies in Farm1 and Farm2 increased the standard error (SE) of the estimates, decreasing the statistical power to identify differences in performance based on VSC. This was not the case for Farm3. When ignoring the large SE, numerically, favorable phenotypic relationships were identified for TNB in Farm1, for MUM, NS, and SFS in Farm2, and for TND, NSB, and MUM in Farm3. Second, a different genetic makeup between farms could explain this. Animals in the three farms were from the same breed (Yorkshire) and were sourced from the same genetic source. In order to investigate the population structure of the three farms, we performed a principal component analysis (Fig. 2) with the use of the base prcomp function in R [15] . Although there were four visible clusters based on this analysis, we can see that the same clusters were formed across populations, indicating that they do share the same within population variation, but not between population variation. This could indicate that the differences in results should not be due to different genetic makeup of the three populations. With similar genetics and potential different environmental effects (which based on this data may not be additive), we could hypothesize that a possible explanation for the different results could be due to genotype-by-environmental interaction (i.e., G×E). If this is the case, proposing the use of VSC in pre-pubertal gilts as a selection tool for farrowing performance must be taken carefully, as the ideal environment must be obtained in order to identify this effect.

Genetic parameters and efficiency of correlated response to selection
The heritability estimates across farms dataset and for each farm showed that VSC is highly heritable, and therefore, selection for this trait is possible. Heritability estimates for VSC c and VSC q estimates from the across farms dataset are similar than the ones reported for vulva width in crossbred Landrace × Large White gilts by Knauer et al. [4], with 0.57 ± 0.09, and for vulva size measurements by Corredor et al. [5], with 0.46 ± 0.10, 0.55 ± 0.10, and 0.31 ± 0.09 in Yorkshire population for vulva area, height, and width, respectively. However, these authors evaluated objective continuous VS measurements, whereas in the present study we only had categorical VS (VSC), which limited our power to properly link the observed variance with genetic variability in the population. In addition, the age difference during measurements between studies, with 23 and 14.5 weeks of age for Corredor et al. [5] and the current study, respectively, could be an added reason for the differences. Also, heritability estimates for VSC q were estimated different than VSC c , however, they are not directly comparable since they are in different scales. Heritability estimates for VSC q are in the scale of the observed data, while estimates of heritability for VSC c were obtained with a non-linear (threshold) model and, therefore, the estimates are in the latent scale.
The estimates of genetic correlation for VSC q between farms were overall high. The high estimate close to unity for Farm1 and Farm2 (0.97 ± 0.25) indicates that the animals with high genetic potential for VSC in one farm would have high genetic potential in the other farm. In fact, the estimates of genetic correlation between VSC q and the reproductive traits for Farm1 and Farm2 had a similar direction and magnitude for TNB, TND, and NSB. This was also the case for Farm2 and Farm3, that had an estimate of genetic correlation close to unity (0.98 ± 0.16) and coincided with the direction of negative genetic correlations between VSC q with AFS, and MUM, and positive genetic correlations between VSC q with TNB, TND, and NSB. Between Farm1 and Farm3, we found a lower genetic correlation estimate (0.77 ± 0.22) for VSC q compared to those for the other farms. This more moderate genetic correlation coincides with some greater differences in the direction of genetic correlations between VSC q and the traits, such as for AFS and MUM. This moderate genetic correlation would indicate re-ranking of animals between Farm1 and Farm3, suggesting the presence of G×E.
In general, the heritability estimates for litter size traits TNB, NBA, TND, NSB, and MUM were low, and in accordance with recent reports in the literature [16][17][18][19][20], and thus, properly representing data from comparable studies. However, scarce literature is available for the genetic basis of other fertility related traits, such as AFS, NS, and SFS. Holm et al. [21], in a study with Landrace, reported heritability estimates for AFS and return rate on gilts (binary trait based on whether the gilt was reinseminated after the first service), with 0.37 ± 0.01 and 0.03 ± 0.01, respectively. In our study, the heritability estimates for AFS in Farm1 was the same (0.37 ± 0.12) as in Holm et al. [21], whereas estimates for the other farms and across farms were numerically lower, with 0.27 ± 0.11, 0.14 ± 0.04, and 0.08 ± 0.03 for Farm2, Farm3, and across farms, respectively. In general, selection for improved AFS, NS, and SFS is feasible, and could result in more reproductively efficient sows.
Estimates of genetic correlations were positive between VSC q with TNB, NBA, TND, and NSB across farms and for all three farms. These estimates indicated that a higher VSC corresponds genetically to larger TNB and NBA, further supporting the idea of using VS as a selection tool to increase NBA [3] but also high NSB and TND. However, the magnitude of these estimates differed across and between farms. Stronger favorable correlations between VSC with TNB and NBA were obtained in across farms, Farm1, and Farm3, which also  had greater heritability estimate for VSC, compared to Farm2. In contrast, for the unfavorable correlations (TND and NSB), numerically, Farm2 had greater estimates. Even though there were unfavorable genetic correlation for VSC, the stronger favorable genetic correlations for TNB and NBA indicates that, overall, there is an overall benefit in selecting for increased VSC in Farm1 and Farm2. There were contrasting genetic correlation estimates between VSC with AFS, MUM, NS, and SFS, depending on the dataset analyzed. For across farms and Farm1, all directions were unfavorable, whereas for Farm2, all directions were favorable, and for Farm3, unfavorable for NS and SFS, and favorable for AFS and MUM. The reasons for these differences are unknown. Finally, in general, many of these estimates had moderate to large standard errors, and hence, the value of the estimates should be taken carefully.
Given the favorable r G estimates, the correlated response to selection for TNB and NBA would be limited. Although overall results indicate superior response to correlated response to selection for these traits using VSC q , the within farm analysis indicate that a greater efficiency would only be possible based on the results for Farm1. However, although there seems to be a limited impact of VSC on reproductive traits at the genetic level, there was a clear impact of VSC on reproductive traits at the phenotypic level (Farm1 and Farm3), indicating that there is great potential in using VSC for culling criterion. Regardless on this limitation, a combined phenotypic culling and genetic selection strategy could potentially be used to optimize selection for increased NBA in purebred populations. Selecting for VSC instead of a reproductive trait would be beneficial to anticipate selection, since VSC can be measured in gilts prior to insemination. In addition, this would allow for a higher intensity of selection because a larger population of gilts would be available prior to insemination. This advantage of having an early-age indicator of future reproductive performance could also increase response to selection by reduction the generation interval in female pigs. Thus, the use of VSC for both phenotypic and genetic purposes may be beneficial to the swine industry.
Finally, as seen for the phenotypic relationship between VSC with reproductive performance, results were somewhat different among the different datasets analyzed, even if for most cases the direction of correlations were similar across them. But, this additional inconsistency in results between farms suggest that 1) relationships within dataset might be real (similar results within datasets), which supports the hypothesis of 2) occurrence of G×E due to the non-genetic differences previously discussed in this study.

Genome-wide association study
Results from genomic analyses differed among farms, which is in accordance with all other results presented in this study, further suggesting that non-genetic effects may be playing a role in the expression of VSC phenotypes between the three farms. Although there were no genomic regions identified for Farm2, Farm1 and Farm3 had associations with greater %TGVM for VSC than when the whole dataset was used for analysis (i.e., Farm1 + 2 + 3). However, analysis using Farm1 + 2 + 3 noted additional regions not identified when analyses were performed for each farm separately. This difference in results could be due to the much larger sample size used for the Farm1 + 2 + 3 (1083 observations compared to 193, 202, and 688, for Farm1, Farm2, and Farm3, respectively) which should have improved the statistical power to identify these additional QTL. Nonetheless, it is important to note that none of the Farm1 QTL were identified using Farm1 + 2 + 3. Given that the sample size of both farms was similar, the QTL identified in both Farm1 + 2 + 3 and Farm3 analyses could potentially indicate general QTL for VSC, whereas those identified in Farm1 or Farm3 and not in Farm1 + 2 + 3 could represent the potential occurrence of GxE in this study.
While Corredor et al. [5] investigated the genomic basis of quantitative vulva measurements in an independent dataset with Yorkshire and Landrace gilts, this is the first study to investigate the genomic basis of vulva qualitative assessments. Three QTL identified using data from Farm3 coincided with Corredor et al. [5]. In our study, the QTL identified on SSC 1 (85 Mb) is in close proximity to the one reported by Corredor et al. [5] on SSC 1 (87-91 Mb) as well as the one on SSC 10 (1 and 24-25 Mb) is in close proximity to the one reported on SSC 10 (8-19 Mb) by these authors for vulva area and height in Landrace gilts. For Farm1 + 2 + 3, some of the uniquely identified QTL also coincided with those reported by Corredor et al. [5]. The QTL on SSC 7 (103-104 Mb) is close to the one reported by these authors on SSC 7 (107-110 Mb) for vulva area and height in Landrace gilts.
Candidate genes related to reproductive development and performance were proposed for the identified regions that explained more than 3%TGVM, including the regions on SSC 1 (85 Mb and 160 Mb), 3 (33-34 Mb), 5 (91-99 Mb), 7 (103-104 Mb), 15 (52 Mb), and 18 (14 Mb). Within the QTL region on SSC 1 (85 Mb) serine protease 35 (PRSS35) is located, a gene that has been identified as a novel mouse ovary gene [22,23]. Miyakoshi et al. [22] determined, using real-time polymerase chain reaction, that PRSS35 was highly expressed at the time of ovulation and remained elevated in the developing corpus luteum. Wahlberg et al. [23] performed a study to identify new proteases that are involved in ovulation, using a microarray analysis of gene expression. Wahlberg et al. [23] found that PRSS35 was highly expressed in the theca layers of developing follicles, and it was also expressed in the forming and regressing corpus luteum. Taken together, Miyakoshi et al. [22] and Wahlberg et al. [23] results suggested that PRSS35 may be involved in ovulation in mice. In a study in humans, Li et al. [24] assessed the expression of PRSS35 and observed that expression in cumulus cells of fertilized oocytes were significantly higher than those in cumulus cells of unfertilized oocytes. Li et al. [24] concluded that PRSS35 may be correlated with oocyte fertility potential.
The QTL on SSC 1 (160 Mb) harbors serpin family B member 11 (SERPINB11). Yang et al. [25] investigated the expression of SERPINB11 in mice uteri during early pregnancy and suggested that SERPINB11 is involved in embryo implantation and decidualization. Similarly, Yang et al. [26] investigated the expression of SERP INB11 in mice testis and suggested that SERPINB11 might be involved in spermatogenesis. Likewise, Lim et al. [27] evaluated the expression profile of this gene across various tissues in chickens and observed high abundance of SERPINB11 expression in the chicken oviduct, specifically in the luminal and glandular epithelia.
The gene protamine 1 (PRM1) is located within the QTL region on SSC 3 (33-34 Mb), which has been associated with sperm quality and embryonic early development in humans and pigs [28][29][30][31][32]. Depa-Martynów et al. [32] investigated the relationship between PRM1 mRNA expression, among other genes, with embryonic development and sperm capacitation in humans. These authors concluded that PRM1 mRNA expression could be used for estimating quality of spermatozoa in humans. However, this relationship has not been demonstrated in pigs [31].
Within the QTL region SSC 5 (91-99 Mb) we found KIT ligand (KITLG). In humans, KITLG has been associated with male infertility [33] and oocyte growth and follicular development [34,35]. In porcine, expression of KITLG in the porcine ovary of prepuberal and mature animals by in situ hybridization showed that this gene is expressed in the granulosa cell layer and in the endothelial tissue and throughout the corpus luteum [36]. Brankin et al. [36] suggested that in the mature animal KITLG have a role in maintaining progesterone secretion by the corpus luteum.
The gene thyroid stimulating hormone receptor (TSHR) is located within the QTL region on SSC 7 (103-104 Mb). TSHR is a vital element in the pituitary thyroid axis of all vertebrates. TSHR commands to intracellular processes required for the synthesis, storage, and secretion of thyroid hormones, the main regulators of cellular metabolism [37,38]. Karlsson et al. [39] investigating a domestic related mutation in the TSHR, found that it modulates photoperiodic response in chickens. These authors suggested that TSHR plays a key role in the signal transduction of seasonal reproduction. Rodríguez-Castelán et al. [40] explored the distribution of TSHR in reproductive organs of female rabbits. They found a presence of TSHR in the primordial, primary, secondary, tertiary, and Graafian follicles of virgin rabbits, as well as in the corpora lutea, corpora albicans, and wall of hemorrhagic cysts of pregnant rabbits. These wide presence of TSHR in female reproductive organs could suggests varied effects of TSHR in the reproduction of rabbits.
Neuregulin 1 (NRG1) resides within the QTL region on SSC 15 (52 Mb). Studies in mice and chickens have concluded that NRG1 exerts an important regulatory role in oocyte meiotic maturation [41][42][43]. Jeon et al. [42] revealed that relative expression of NRG1 mRNA increased in the oviducts of chicks treated with a synthetic non-steroidal estrogen. Furthermore, these authors suggested that NRG1 is a novel estrogenresponsive gene closely correlated with the development of the oviduct of chicks.
The genes aldo-keto reductase family 1 member B (AKR1B1) and stimulated by retinoic acid 8 (STRA8) are located within the QTL region on SSC 18 (14 Mb). Multiple gene expression studies in humans and cattle demonstrated that AKR1B1 is strongly associated with prostaglandin production, which is an important regulator of female reproductive function [44][45][46][47][48]. In pigs, AKR1B1 functions in prostaglandin metabolism during the estrous cycle and pregnancy [49]. Studies in mice showed that STRA8 expression is required for meiotic initiation in both female and male germ cells [50][51][52][53]. This gene has been explored in a study with transgenic pigs, observing the expression of this gene in testicular tissue [52]. These authors concluded that the expression of STRA8 in transgenic pigs, from mouse STRA8 promoter, could be useful as an animal model to study male germ cell manipulation and development.
In general, the genomic regions identified in this study for VSC include relevant genes for reproduction-related traits. Most of the genomic regions identified to be associated with VSC were associated with follicular and/or embryonic development. Furthermore, the genetic and phenotypic associations discovered between VSC and reproductive traits in this study additionally corroborate with the biologically relevant findings from our genomic analyses for VSC. Additional research to validate the use of VSC in different environments, using additional parities and genetic lines to continues assessment as an indicator trait of reproductive performance are warranted.

Conclusion
In this study, phenotypic analyses support that VSC is associated with improved reproductive performance of sows, with large VSC gilts having greater NBA than small VSC gilts. VSC had moderate heritability among farms, showing that selection for VSC is possible. However, the genetic correlation for VSC between farm indicated the presence of G×E. For each farm, VSC was positively genetically correlated with TNB and NBA indicating that selection for greater VSC could result in increased litter size. Several genomic regions associated with VSC were identified, locating relevant candidate genes with reproductive function. These results support phenotypic relationship between VSC with TNB and NBA but that environmental factors could influence this relationship.