A post-GWAS confirming the genetic effects and functional polymorphisms of AGPAT3 gene on milk fatty acids in dairy cattle

People are paying more attention to the healthy and balanced diet with the improvement of their living standards. Milk fatty acids (FAs) have been reported that they were related to some atherosclerosis and coronary heart diseases in human. In our previous genome-wide association study (GWAS) on milk FAs in dairy cattle, 83 genome-wide significant single nucleotide polymorphisms (SNPs) were detected. Among them, two SNPs, ARS-BFGL-NGS-109493 and BTA-56389-no-rs associated with C18index (P = 0.0459), were located in the upstream of 1-acylglycerol-3-phosphate O-acyltransferase 3 (AGPAT3) gene. AGPAT3 is involved in glycerol-lipid, glycerol-phospholipid metabolism and phospholipase D signaling pathways. Hence, it was inferred as a candidate gene for milk FAs. The aim of this study was to further confirm the genetic effects of the AGPAT3 gene on milk FA traits in dairy cattle. Through re-sequencing the complete coding region, and 3000 bp of 5′ and 3′ regulatory regions of the AGPAT3 gene, a total of 17 SNPs were identified, including four in 5′ regulatory region, one in 5′ untranslated region (UTR), three in introns, one in 3′ UTR, and eight in 3′ regulatory region. By the linkage disequilibrium (LD) analysis with Haploview4.1 software, two haplotype blocks were observed that were formed by four and 12 identified SNPs, respectively. Using SAS9.2, we performed single locus-based and haplotype-based association analysis on 24 milk FAs in 1065 Chinese Holstein cows, and discovered that all the SNPs and the haplotype blocks were significantly associated with C6:0, C8:0 and C10:0 (P < 0.0001–0.0384). Further, with Genomatix, we predicted that four SNPs in 5′ regulatory region (g.146702957G > A, g.146704373A > G, g.146704618A > G and g.146704699G > A) changed the transcription factor binding sites (TFBSs) for transcription factors SMARCA3, REX1, VMYB, BRACH, NKX26, ZBED4, SP1, USF1, ARNT and FOXA1. Out of them, two SNPs were validated to impact transcriptional activity by performing luciferase assay that the alleles A of both SNPs, g.146704373A > G and g.146704618A > G, increased the transcriptional activities of AGPAT3 promoter compared with alleles G (P = 0.0004). In conclusion, our findings first demonstrated the significant genetic associations of the AGPAT3 gene with milk FAs in dairy cattle, and two potential causal mutations were detected.

Genome-wide association study (GWAS) is a commonly used strategy to identify potential genetic variants underlying important complex traits in human and domestic animals. So far, some candidate genes and QTL regions for milk production traits have been detected with GWA studies in dairy cattle, such as DLGAP1, AP2B1, SCD, BTA11 (1.59-3.37 Mb), and BTA3 (70.34-73.69 Mb) [8][9][10][11][12][13]. In our previous GWAS for milk FAs in Chinese Holstein cows, 83 genome-wide significant single nucleotide polymorphisms (SNPs) were detected in total [12], in which, two SNPs (ARS-BFGL-NGS-109493 and BTA-56389-no-rs) associated with C18index (P = 0.0459), were located in the upstream of 1-acylglycerol-3-phosphate Oacyltransferase 3 (AGPAT3) gene. In addition, we performed a joint GWAS for milk FAs in combined Chinese and Danish Holstein populations and found that a chromosome-wide significant QTL region of 146. .31 Mb on BTA1 was associated with C18:0 [13]. The AGPAT3 gene was nearby this region with approximately 400 kb. 1-acylglycerol-sn-glycero 3-phosphate acyltransferase (AGPAT), encoded by the AGPAT3 gene, is one of the isoforms of AGPATs [14] and is involved in the glycerolipid (ko00561) and glycerophospholipid metabolisms (ko00564), and phospholipase D signaling pathway (ko04072). Mammalian AGPAT catalyzed the acylation of lysophosphatidic acid to form the phosphatidic acid that was the precursor of all glycerplipids [14]. Therefore, it was implied that the AGPAT3 gene was a promising candidate gene for milk FA traits in dairy cattle. The purpose of the present study was to further detect whether the AGPAT3 gene had significant genetic effects on milk FAs in a Chinese Holstein cow population.

SNP identification and genotyping
Based on the genomic sequence of bovine AGPAT3 gene (Gene ID: 506607), 14 pairs of primers (Table S1) were designed by the Primer 3 version 4.0 (http://bioinfo.ut. ee/primer3-0.4.0/) and were synthesized in the Beijing Genomics Institute (Beijing, China) to amplify all the exons with partial adjacent intron region, and 3000 bp of 5′ and 3′ regulatory regions. As previously descripted [15], two DNA pools were constructed and the polymerase chain reaction (PCR) amplifications were performed with each DNA pool as template. To identify potential polymorphisms, the PCR amplification products were bidirectionally sequenced with an ABI3730XL DNA analyzer (Applied Biosystems, Foster, CA, USA). Then, the identified SNPs were genotyped for the 1065 cows by the matrix-assisted laser-desorption/ionization time of flight mass spectrometry (MALDI-TOF MS, Sequenom MassARRAY, Bioyong Technologies Inc., HK).

Linkage disequilibrium (LD) and association analyses
We estimated the LD among the identified SNPs of AGPAT3 gene with Haploview 4.1 (Broad Institute, Cambridge, MA, USA).
For association analysis, the 1065 cows were traced back to three-generation pedigrees to construct the kinship matrix (A-matrix) by SAS 9.2 (SAS institute, Cary, NC, USA), so that 3335 individuals were totally included. Single-locus and haplotype-based associations with 24 kind of milk FAs were performed by the following mixed animal model with SAS 9.2: Here, Y ijklm is the phenotypic value of each milk fatty acid trait; μ is the overall mean; G i is the fixed effect corresponding to the genotype or haplotype combination of individual i; h j (j = 1-23) and l k (k = 1-4) were the fixed effect of farm j and stage of lactation ll, respectively; a l is the random polygenic effect; M m (m = 1-293) is the fixed effect of age at calving m; b is the regression coefficient of covariate M; and e ijklm is the random residual. Further, we calculated the additive effect (a), dominant effect (d), and allele substitution effect (α) according to a [17]. Here, AA, AB and BB were the least square means of milk FAs corresponding to the genotypes, and p and q were the frequencies of allele A and B, respectively.

Recombinant plasmid construction and luciferase assay
To detect the allele-specific effects of the SNPs g.146702957G > A, g.146704373A > G, g.146704618A > G, and g.146704699G > A on the transcriptional activity of AGPAT3 gene, eight luciferase reporter gene fragments (G and A of g. 146702957G > A; A and G of g.146704373A > G; A and G of g.146704618A > G; and G and A of g.146704699G > A) corresponding to the eight alleles of the four SNPs (Fig. 1a) were designed and synthesized (Genewiz, Suzhou, China). The eight fragments with the KpnI and Nhel restriction sites at the 5′ and 3′ termini, respectively, were cloned into the pGL4.14 luciferase assay vector (Promega, Madison, USA). In addition, all the plasmids were purified by the Endo-free Plasmid DNA Mini Kit II (OMEGA, omega bio-tek, Norcross, Georgia, USA), and were sequenced to confirm the integrity of each construct's insertion.
The human embryonic kidney (HEK) 293 T cells were cultured with Dulbecco′s modified Eagle′s medium (Gibco, Life Technologies) and 10% fetal bovine serum (Gibco) at 37°C in a humidified incubator containing 5% CO 2 . Before transfection, about 2 × 10 5 cells were seeded in each 24-well plate. For eight luciferase reporter gene fragments of g. 146702957G > A, g.146704373A > G, g.146704618A > G and g.146704699G > A, 500 ng constructed plasmid was co-transfected along with 10 ng pRL-TK Renilla luciferase reporter vector (Promega) into each well. All the experiments were performed in three replicates for each construct. Approximate 48 h after transfection, the cells were harvested and the activity of both firefly and Renilla luciferases were measured with a Dual-Luciferase Reporter Assay System (Promega) on a Modulus microplate multimode reader (Turner Biosystems, CA, USA). The average statistic of three replicates were calculated as the normalized luciferase data (Firefly/Renilla).

Estimation of LD among the identified SNPs of AGPAT3
We used the haploview 4.1 to estimate the LD among these 17 SNPs, and observed two haplotype blocks

Prediction of TFBSs changing caused by the SNPs in 5′ regulatory region
By performing Genomatix software suite v3.9, it was predicted that that four SNPs in the 5′ regulatory region of AGPAT3 gene, g.146702957G > A, g.146704373A > G, g.146704618A > G and g.146704699G > A altered the binding sites of some transcription factors ( Table 4). The allele A of g.146702957G > A created a TFBS for SMAR CA3 (SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily a, member 3) and REX1 (REX1 transcription factor; zinc finger protein 42), respectively, and the allele G created a TFBS for VMYB (v-Myb, variant of AMV v-myb). The alleles A and G of g.146704373A > G created a TFBS for BRACH (Brachyury) and NKX26 (NK2 homeobox 6, Csx2), respectively. The allele G of g.146704618A > G created two TFBSs for ZBED4 (Zinc finger, BED-type containing 4; GC-box binding sites) and SP1 (Stimulating protein 1, ubiquitous zinc finger transcription factor). The allele G of g.146704699G > A created two TFBSs for USF1 (Upstream stimulating factor 1) and ARNT (AhR nuclear translocator homodimers), and the allele A created a TFBS for FOXA1 (Forkhead box protein A1, hepatocyte nuclear factor 3-alpha (HNF-3-alpha)).

Exploring for luciferase activity altered by the SNPs in 5′ regulatory region
To validate the TFBS prediction results, the luciferase assay was further performed for the four SNPs (g.146702957G > A, g.146704373A > G, g.146704618A > G and g.146704699G > A) (Fig. 1b). We observed that the luciferase activities of six constructs containing g.146704373A > G, g.146704618A > G, and g.146704699G > A, were significantly higher than that of the pGL4.14 empty vector (P < 0.0007) and blank control (P < 0.0008), while g.146702957G > A did not (P > 0.05). Further, the luciferase activity of alleles A of g.146704373A > G and g.146704618A > G were significantly higher than that of their alleles G (P = 0.0004; Fig. 1b). The luciferase activity of allele G of g.146704699G > A was higher than that of allele A, while not significant (P > 0.05). These results indicated that the transcriptional activity of the AGPAT3 gene significantly altered by g.146704373A > G and g.146704618A > G might be the reasons strongly impacted on FAs.

Discussion
This study was a follow-up investigation for our previous GWAS on milk FAs in Chinese Holstein [12]. AGPAT3 is involved in pathways related to lipid metabolism (ko00561, ko00564 and ko04072). In human, docosapentaenoic acid as the substrate of AGPAT3 protein transfers a fatty acid in sn-2 position of lysophosphatic acid, a step in the phospholipid biosynthesis pathway [19]. Here, we detected that the AGPAT3 gene mainly impacted the medium-chain milk FAs in dairy cattle.
Mammalian AGPAT catalyzed the acylation of lysophosphatidic acid to form the phosphatidic acid, which was the precursor of all glycerplipids [14]. For the AGPAT families, AGPAT1 had significant association with milk FA CLA [20], and AGPAT6 was strongly associated with C14:0, C16:0, C10:1, C12:1, C14:1 and C16:1 [21]. In our previous GWA studies [12,13], AGPAT3 gene was identified as a candidate for two milk FAs, C18index and C18:0. In this study, using an independent Chinese Holstein population that was different from the previous GWA studies, we also observed that AGPAT3 showed a significant genetic effect on C18index and C18:0. In addition, our results revealed that the AGPAT3 had strong associations with C6:0, C8:0 and C10:0. Overall, the previous GWASs and this study suggested that AGPAT3 gene had significantly genetic effects on milk FAs.
Sequences-specific binding of transcription factors to the regulatory regions on the DNA is a key regulatory mechanism that affects gene expression and hence heritable phenotypic variation [22,23]. Eukaryotic regulatory sequences, including enhancers and promoters, are typically between a hundred and several thousand base pairs in length, and can harbor many TFBSs [24]. It is essential to understand the evolution dynamics of transcription factor binding for understanding the evolution of gene regulation [25]. In this study, by prediction, g.146704373A > G changed the bindings of transcription factors (TFs) BRACH and NKX26, and g.146704618A >  Note: LSM least square mean. SE standard error. P indicates the significances of the association analysis between the haplotype block and milk fatty acid traits. P is the raw value. * : P < 0.05. ** : P < 0.01. Different letter (small letters: P < 0.05; capital letters: P < 0.01) superscripts indicate significant differences among the haplotype combinations. The number in the brackets represents the number of cows for the corresponding haplotype combination G altered the bindings of TFs ZBED4 and SP1. Further, we used the luciferase assay to verify that the alleles A of g.146704373A > G and g.146704618A > G strongly increased the transcription activity of the AGPAT3 gene than the alleles G. Previous studies showed that BRACH as a regulatory factor directly activated downstream mesoderm-specific genes to exert its mesoderminducing effects [26], and NKX26 restrained the transcription activity of Cx40 through the F151L missense mutation to impact the heart development [27]. ZBED4 could act as a co-repressor of nuclear hormone receptors (NHRs) by its LXXLL motifys in cones [28]. Through interfering with the recruitment of SP1 to ZNF132 promoter region, methylation of SP1-binding site can inhibit ZNF132 transcriptional expression to impact the tumor in the development of esophageal squamous cell carcinoma [29]. These reports have indicated that the TFs BRACH, NKX26, ZBED4 and SP1 could activate or repress the expression of their target genes. Based on our association analysis, the cows with the AA genotypes of g.146704373A > G and g.146704618A > G of AGPAT3, yielded significantly lower contents of C6:0 and C8:0 than those with GG genotypes. According to above, we deduced that the BRACH as a TF might activate AGPAT3 gene transcription activity by binding to the allele A of g.146704373A > G thereby reducing the contents of C6:0 and C8:0, while, the transcription factors NKX26, ZBED4 and SP1 might have the contrary effects. Nowadays, genomic selection is the main implication for dairy cattle breeding, where the genomic chips are used. Among the SNP markers in these chips, most of them were collected from the current SNP database and almost evenly distributed across the whole genome. Hence, g.146704373A > G and g.146704618A > G of AGPAT3 as the potentially causal mutations could be put into the SNP chip instead of used in marker selection to increase selection efficiency in some specific dairy cattle populations to improve the contents of milk FAs.

Conclusion
In conclusion, through a post-GWAS approach, our study firstly indicated there were significant genetic associations between the AGPAT3 gene and milk FAs in dairy cattle. Further, we found that two SNPs in 5′ regulatory region (g.146704373A > G and g.146704618A > G) changed the transcriptional activity of AGPAT3 implying their potential causal function. These findings provided important molecular information for dairy cattle breeding.
Additional file 1: Table S1. PCR primers information of AGPAT3 gene Additional file 2: Table S2. Additive (a), dominant (d) and allele substitution (α) effects of 17 SNPs on milk fatty acid traits of AGPAT3 gene in Chinese Holstein cows