A Capra hircus chromosome 19 locus linked to milk production influences mammary conformation

Background Economically important milk production traits including milk volume, milk fat and protein yield vary considerably across dairy goats in New Zealand. A significant portion of the variation is attributable to genetic variation. Discovery of genetic markers linked to milk production traits can be utilised to drive selection of high-performance animals. A previously reported genome wide association study across dairy goats in New Zealand identified a quantitative trait locus (QTL) located on chromosome 19. The most significantly associated single nucleotide polymorphism (SNP) marker for this locus is located at position 26,610,610 (SNP marker rs268292132). This locus is associated with multiple milk production traits including fat, protein and volume. The predicted effect of selection for the beneficial haplotype would result in an average production increase of 2.2 kg fat, 1.9 kg protein and 73.6 kg milk yield. An outstanding question was whether selection for the beneficial allele would co-select for any negative pleiotropic effects. An adverse relationship between milk production and udder health traits has been reported at this locus. Therefore, a genome wide association study was undertaken looking for loci associated with udder traits. Results The QTL and production associated marker rs268292132 was identified in this study to also be associated with several goat udder traits including udder depth (UD), fore udder attachment (FUA) and rear udder attachment (RUA). Our study replicates the negative relationship between production and udder traits with the high production allele at position 19:26,610,610 (SNP marker rs268292132) associated with an adverse change in UD, FUA and RUA. Conclusions Our study has confirmed the negative relationship between udder traits and production traits in the NZ goat population. We have found that the frequency of the high production allele is relatively high in the NZ goat population, indicating that its effect on udder conformation is not significantly detrimental on animal health. It will however be important to monitor udder conformation as the chromosome 19 locus is progressively implemented for marker assisted selection. It will also be of interest to determine if the gene underlying the production QTL has a direct effect on mammary gland morphology or whether the changes observed are a consequence of the increased milk volume. Supplementary Information The online version contains supplementary material available at 10.1186/s40104-021-00667-y.


Background
Genetic gain focused on production for the dairy goat population in New Zealand is limited largely to within farm selection and some buck transfers between farms. This has resulted in considerable unrealised marker driven selection potential which led to the search for and discovery of quantitative trait loci (QTL) for milk production traits in a subset of the population. In particular, the most significant QTL identified was located on chromosome 19 associated with all three production traits including fat yield, protein yield and milk volume [1]. A QTL at the same location was also reported in the French [2,3] and British dairy goat populations [4]. A pleiotropic effect for this QTL has been described on a number of udder conformation traits including udder attachment and udder depth [3,4]. It appears that udders in animals with higher milk production determined by the chromosome 19 locus are characterised by greater udder depth with apparent weaker and narrower udder attachment. It is important to evaluate potential negative pleiotropic effects before large scale genetic selection.
There are examples where the potential introduction of alleles may have or have resulted in the introduction of negative characteristics. For example, in cattle the nonsynonymous A > C mutation in a splice enhancer region of the bovine Diacylglycerol O-Acyltransferase 1 (DGAT1) gene dramatically reduced milk fat content but caused scouring (non-bloody watery diarrhoea), impaired growth rate and flattened intestinal microvilli [5]. Furthermore, the non-synonymous C636R mutation in bovine Mannose Receptor C Type 2 (MRC2) gene increased muscularity in meat breeds but caused Crooked Tail Syndrome [6,7].
Goat udder traits and teat morphology play important roles in milk quality (somatic cell count), milking ability/ ease of milking [8] and overall milk production [9]. Goat udder circumference [10,11] and udder depth [12] have been positively correlated with milk volume. Goats with greater udder depth also had a higher somatic cell count, a negative indicator of milk quality, udder health and marker of mastitis [13]. Teat size and shape was likewise indicative of mammary gland health with shorter and narrower teats associated with a lower somatic cell count [13,14]. These factors influence the animal's productive lifespan or longevity [15]. Extensive selection pressure for milk production traits has been shown to exert a negative effect on udder and teat traits [3,4,16]. It is thereby important to consider these traits in animal selection. Udder health and conformation traits have not been assessed at scale in the New Zealand goat herd. This paper reports for the first time a genome wide investigation for goat udder traits of the New Zealand goat population using Illumina GoatSNP50 BeadChip (Illumina Inc., San Diego, CA, USA) and identifies caprine SNP marker rs268292132 as a viable selection tool for future marker assisted breeding programmes.

Animals and phenotypic measurements
A total of 1058 animals from 2 farms (439 animals from farm A, 619 animals from farm B) with mixed age (mean age: 4.72 years, range: 2-9 years) and of primiparous or multiparous status (mean parity: 3.72 years, range: 1-8) were scored for udder traits in February and March 2020. The New Zealand goat herd is a mixed breed population comprised of approximately 85% Saanen, 5% Toggenburg, 5% Nubian/Anglo-Nubian, 4.5% Alpine and 0.5% Sable [17]. The udder traits measured included udder depth (UD), fore udder attachment (FUA), rear udder attachment (RUA), udder furrow (UF), teat placement (TP) and teat angle (TA). Udder traits were scored on a point system from 1 to 9 described in Table 1 with definition of traits as used in McLaren et al. [18]. Additionally, the following milk production traits, milk volume, fat yield and protein yield were measured for a subset of 336 animals out of the 1058-animal cohort (141 animals from farm A, 195 animals from farm B) as part of standard herd testing procedures using Fourier transform infrared spectroscopy. Animals were of mixed age (mean age: 6.49, range: 6-9) and primiparous or multiparous status (mean parity: 5.49, range: 5-8). The animals were from the cohort described in Scholtens et al [1]. Milk production traits were modelled by extrapolation to a 305-day lactation length period, restricted to animals with lactation length > 200 days.

Genotypes
Skin samples from the 1058 phenotyped animals were collected and genotyped with the Illumina GoatSNP50 BeadChip with 53,347 SNP markers (Illumina Inc., San Diego, CA, USA) [19]. Markers with ambiguous or missing genotypes in more than 5% of goats, or with minor allele frequency < 0.01 were omitted. After filtering, 51,477 markers were retained.

Genome-wide association analysis
Genome wide association mapping with an applied mixed linear model accounting for differences in population structure was performed using genome-wide complex trait analysis tool (GCTA) v1.93.2 [20,21]. The following mixed linear model was used in our study: y is the phenotype of interest, a is the mean term, b is the fixed (additive) effect of the SNP, x is the SNP genotype, g− is the effect of population structure estimated by calculating genetic relationship matrices (GRM) between all animals [21]. Mixed linear model association analysis, leave one chromosome out (MLMA-LOCO) approach was used where the chromosome on which the significant candidate SNP is located are excluded from the GRM calculations to avoid double fitting of candidate variants. e is the residual effect.
A SNP was considered significantly associated to a trait of interest at the whole genome level if their -log 10 P-values exceeded -log 10 (5 × 10 − 8 ). SNP marker trait association plots against the ARS1 goat reference [22] were constructed using the ggplot R package [23].

Calculation of allelic effect sizes for marker rs268292132
Allelic effect sizes for marker rs268292132 were calculated for production and udder traits in the 1058 animal population using a linear regression model adjusted for age and farm origin. A one-way analysis of variance (ANOVA) test between the means (α = 0.05) was performed to characterise allelic effect differences between rs268292132 genotype groups presented in Fig. 2. The Shapiro-Wilk normality assumption test was performed on all data presented. Multiple pairwise comparisons adjusted for false discovery rate were utilised to highlight statistically significant differences in the data presented.

Whole genome sequencing for variant identification
Whole genome sequencing was performed on 302 goats consisting of 48 animals in the cohort used for GWAS, 45 additional animals originating from farms A and B and 209 animals distributed across four other farms. The animals were selected to capture the genetic diversity of each farm. Skin samples were collected from these animals and sequenced on the Illumina HiSeq XTen platform (Illumina Inc., San Diego, CA, USA). For each animal, 700-1000 million paired end reads with read pair lengths of 150 nucleotides were generated. An average sequencing depth of 35 × was achieved. Read sequence quality was assessed using FastQC (FastQC v0.11.7). Reads were mapped to the ARS1 goat reference genome assembly using Burrows Wheeler aligner [24]. Duplicate read pairs were marked using picard MarkDuplicates (Picard v2.1.0., https://broadinstitute.github.io/ picard/). Reads aligned around small insertions/deletions were realigned using the Genome Analysis Toolkit (GATK v 3.8, RealignerTargetCreator and IndelRealigner tools) [25,26]. Probabilities for single nucleotide and indel variants were discovered individually in each alignment using GATK's HaplotypeCaller tool [27] and genotyped jointly across the entire cohort with GATK's GenotypeGVCFs tool. Pairwise linkage disequilibrium R 2 values between variants were calculated using PLINK software [28]. The Ensembl Variant Effect Predictor (VEP v90.3) software (https://www.ensembl.org/vep) was used to annotate variants with their calculated variant consequences based on the ARS1 reference General Feature Format (GFF) file (http://ftp.ensembl.org/pub/ release-100/gff3/capra_hircus/Capra_hircus.ARS1.100. gff3.gz).

A chromosome 19 QTL associated with goat udder health traits in the NZ goat population
We report a single genome wide significant QTL for several udder health traits including udder depth, fore udder attachment and rear udder attachment (Table 2 & Fig. 1). The QTL extend from chromosome 19 positions 24 -29 Mb. A QTL signal was also detected for teat angle and udder furrow at the same locus but did not reach genome wide significance. No QTL association was observed for the teat placement phenotype. Fitting the most significant QTL SNP (rs268292132) as a covariate in the association analysis resulted in the ablation of the entire QTL signal, indicative of a single haplotype responsible for the signal.
An overlapping locus was previously identified to be associated with milk production traits in the French [3], British [4] and New Zealand goat populations [1]. The SNP most significantly associated with milk production in the New Zealand study (rs268292132) was also the most significantly associated SNP with the above udder traits in this study ( Table 2 & Fig. 1).

Discussion
A genome wide association study with an applied mixed model to account for differences in population structure was performed in 1058 dairy goats in New Zealand. The study highlighted a QTL located on chromosome 19 (24)(25)(26)(27)(28)(29) Mb) associated with udder depth, fore udder attachment and rear udder attachment. This QTL was also identified in a previous study of the dairy goat population in New Zealand to be associated with milk production traits; fat yield, protein yield and milk volume (as described in Scholtens et al. [1]). Our findings are in agreement with trait effects of an overlapping QTL previously reported in French dairy goats for milk fat content, milk protein content, udder depth and rear udder attachment [2,3] and British dairy goats for milk volume, udder depth and udder attachment [4]. The pleiotropic effect of this locus on both production and udder phenotypes combined with a high density of genes within the region (22.5 genes/Mb) has made it difficult  Linkage disequilibrium values were calculated between the variant of interest and the most significant QTL marker rs268292132 (chr19:26,610,610). SIFT predicts whether an amino acid substitution affects protein function based on sequence homology and the amino acid properties. SIFT scores range from 0.0 (deleterious) to 1.0 (tolerated) to nominate candidate genes for the observed traits.
Martin and colleagues suggested several fatty acid and lipid metabolism pathway genes including phospholipase D2 (PLD2), gamma-glutamyltransferase 6 (GGT6) and the arachiodonate lipoxygenase (ALOX) family of genes [3]. Mucha and colleagues also noted a potential role for protein stability and lipid homeostasis gene, asialoglycoprotein receptor 2 (ASGR2) in udder formation [4]. Our analysis from 302 whole genome sequenced animals showed no protein-altering variants for PLD2, GGT6, ASGR1/ASGR2 and ALOX genes. Whole genome sequence analysis revealed 340 variants within the 5 Mb QTL with linkage disequilibrium R 2 > 0.8 with the most significant SNP marker rs268292132. After removing markers not predicted to change protein sequence, two non-synonmous variants and one inframe deletion variant were of interest. We propose two candidate non-synonymous variants including a valine to isoleucine variant (V222I) in the proteasome 20S subunit 6 (PSM6B) gene, and a serine to isoleucine variant (S267I/ S277I/S281I/S166I) in the sex hormone binding globulin (SHBG) gene. SIFT scores were predicted at 0.3 and 0.01 respectively suggesting a likely deleterious effect for the SHBG mutation. Genetic variation in human PSM6B was associated with LDL cholesterol and total cholesterol levels in a meta-analysis of 617,303 individuals of multi-ethnic backgrounds with 32 million genotyped and imputed variants [29]. PSMB6 encodes a core component of the 20S proteasome complex involved in intracellular protein degradation. However, its role in lipid metabolism is unclear. In the same GWAS study, genetic variants in SHBG were associated with triglyceride levels [29]. SHBG regulates the plasma metabolic clearance rate of steroid hormones such as oestrogen [30][31][32], although like PSMB6 a strong functional link with lipid metabolism is not yet determined. Additionally, a GAG glutamate inframe deletion (E89del) was detected in the candidate gene Sentrin-specific protease 3 (SENP3). Limited knowledge exists in the literature on the mechanistic role of SENP3 in milk production or morphological udder development. Given its widespread roles in protein stability, chromatin organisation, transcription, DNA repair, autophagy, protein localisation and homeostasis, a number of potential mechanisms maybe of interest [33]. Attributing causality to either of these mutations will require further genotyping of these variants within a study population that contain sufficient meioses to separate out causative haplotypes and appropriate functionality testing. Despite directing the majority of our focus to variants predicted to impact protein sequence as a direct link to functionality, we acknowledge that non-coding and regulatory variants are also of interest. Indeed our whole genome analysis have revealed two splice region variants, one 5′ UTR variant and five 3′ UTR variants within the QTL that exists in high linkage disequilibrium with the most significant QTL SNP marker. Future studies involving these variants in eQTL analyses or chromatin accessibility (See figure on previous page.) Fig. 2 Allelic effect sizes for marker rs268292132 genotype groups. Production traits, Milk Fat (A), Milk Protein (B), Milk volume (C) and udder traits, Fore udder attachment (D), Rear Udder Attachment (E), Teat Angle (F), Teat Placement (G), Udder Depth (H), Udder Furrow (I) were plotted as least squared means with 95% confidence intervals. Asterisks indicate pairwise significant differences between genotype groups (P-value < 0.5, multiple pairwise comparisons adjusted for false discovery rate; α = 0.05) analyses such as ChIP-seq or ATAC-seq or experimental CRISPR-cas9 variant models could provide additional insight into biological causality. Moreover, future investigation and characterisation of large copy number variants in our dataset could additionally reveal candidate variants not detectable by SNP chip methods. Selection on the chromosome 19 QTL for milk production traits will hold economic value in dairy industries. Markers capturing the QTL effect may be utilised in marker-assisted breeding programmes to drive selection of the next generation of high-performance animals. Employing selection with the most significant marker, rs268292132 (chr19:26,610,610) was predicted to increase fat yield, protein yield and milk volume in our study population. However, a potential adverse relationship between production traits and udder conformation has been indicated for this locus [3,4]. Our study replicated the observations of a shift in the negative direction for udder depth, fore udder attachment and rear udder attachment scores. It is unclear whether the negative correlation between production and udder traits are a consequence of increased milk production and udder volume retention giving rise to a perceived increase in udder depth with weak and narrow udder attachment, or a true biological deterioration of udder morphology independent of the production increase. In the future, it will be interesting to determine whether the gene underlying the QTL has a direct effect on mammary gland morphology. Both alleles at the rs268292132 locus are common in the dairy goat population, with the highmilk production allele approaching fixation on some farms after long-term phenotypic selection for both milk production and conformation traits. These observations imply that the magnitude of the udder effects may be well-balanced with the increase in milk production. A comprehensive analysis of udder shape deterioration versus the production increase across the wider population will be required to optimise selection and breeding programmes for marker rs268292132 as these traits affect production [9], milk quality (somatic cell count) [13], milking ability [8] and total number of days in production [15].
The overarching goal of this research is development of a marker-assisted breeding programme that can be implemented to accelerate productivity gain. This study identified use of the rs268292132 SNP as a viable selection marker for marker assisted breeding programmes to increase production.

Conclusion
Genome wide association analyses across the dairy goat population in New Zealand have discovered a 24-29 Mb QTL on caprine chromosome 19 that has a pleiotropic effect on milk production and udder conformation traits.
A negative correlation between production and udder conformation traits was observed in the New Zealand goat population which corroborates reports in French and British dairy goat herds. Close monitoring of udder health is recommended for future selection programmes involving this locus. Marker assisted breeding programmes with markers effectively identifying this QTL can be utilised to select the next generation of highperformance animals to accelerate productivity gain.