Review on the development of genotyping methods for assessing farm animal diversity

Advances in molecular biotechnology have introduced new generations of molecular markers for use in the genetic improvement of farm animals. Consequently, more accurate genetic information can be obtained to better understand existing animal genetic resources. This review briefly summarizes the development of genetic markers, including both the classical genetic markers and the more advanced DNA-based molecular markers. It aims to clarify the characteristics of the different genetic markers and their use in assessing the genetic diversity of animal genetic resources.


Introduction
The development of every species under its particular natural ecosystem, environmental, and socio-economic conditions has led to each having its own specific genetic characteristics. Collectively, these characteristics constitute the Earth's species diversity. Mankind can learn and make use of these special genetic resources to develop animal production for human food needs. However, sufficient genetic markers for evaluating the population structure and other aspects of available animal genetic resources are necessary to assess genetic diversity.
In earlier studies, morphological markers and ecogeographical factors were used to represent diversity; chromosomal karyotyping was developed later. With the rapid development of modern biotechnology, biochemical markers, such as proteins and isozymes, were utilized. Since the 1980s, many different types of DNA molecular markers have been explored, e.g. Restriction Fragment Length Polymorphism (RFLP), Random Amplified Polymorphic DNA (RAPD), Amplified Fragment Length Polymorphism (AFLP), Single-Strand Conformation Polymorphism (SSCP) and Microsatellite DNA. All of these DNA-based markers have specific advantages and have played significant roles in the evaluation of genetic diversity in farm animals. In addition, with biotechnological and computer innovations, novel strategies such as whole-genome SNP chips and DNA barcoding have emerged. At present, DNA molecular marker techniques are widely applied in the fields of germplasm identification, phylogenetics, and genetic structural analysis. They overcome the limitations of morphological, cytological, and biochemical markers, namely the small number of such markers and the fact that they can be environmentally influenced. The expansion in DNA information will facilitate the study of genome-wide diversity; such information is much more precise for the assessment of genetic diversity than that provided by previous markers. The following is a brief summary of the principles and advancements of the primary genetic markers involved in assessments of Animal Genetic Resources (AnGR).

Conventional methods applied to AnGR assessments

Morphological markers
Morphological markers normally refer to external animal characteristics (i.e. coat color, body shape, skin structure, and anatomical characteristics) [1,2], which can be obtained by direct visual observation and measurement. They are used in the identification, classification, and characterization of genetic evolution of different species or populations. However, an animal's phenotype is determined by both its genetic background and the environment it experiences. The evaluation of farm animal genetic resources through morphological markers is based on subjective judgments and descriptions, and the conclusions reached are often not completely accurate. Furthermore, the measurement and identification of animal morphological traits usually take a long time, and it is not easy to remove the effects of environmental factors. Consequently, the application of morphological markers is limited in the evaluation of quantitative traits. However, they are still effective for the assessment of qualitative traits, for which it is easy to characterize phenotypic differences between individuals through direct observation and measurement.

Cytological markers
Cytological markers have been used for the assessment of farm animal genetic resources [3,4] based on the numbers and morphology of animal chromosomes. Cytological markers include chromosome karyotypes, bandings, repeats, deletions, translocations, and inversions. Chromosomes are the carriers of genetic material, and chromosome mutations are crucial sources of genetic variation [5]. These mutations can be used as markers to determine the specific location of a gene on the chromosome and its position relative to other genes. For instance, researchers can trace the origins and evolutionary history of livestock [6], and assess the genetic diversity of domesticated animals by comparing chromosome number and structure between domesticated animals and their wild ancestors [7].

Biochemical markers
Biochemical markers, e.g. blood type and isozymes, represent biochemical traits and can be analyzed by protein electrophoresis. In 1967, Buvanendran et al. investigated the genetic variation within species and the phylogenetic relationships between species using differences in the amino acid composition of isozymes and soluble proteins [8]. However, proteins and isozymes are not genetic material but products of gene expression, and they are affected by the environment and by differences in individual growth, which limits the breadth of their application [9]. On the other hand, protein electrophoresis is a rapid, economical, and straightforward technique and provides a more detailed representation of polymorphisms than morphological or cytological markers; thus, it is still widely used in elucidating the origin and classification of species [10].

Molecular markers (DNA based markers)
With the development of molecular biotechnology, molecular markers have made rapid progress. Molecular markers are based on nucleotide sequence mutations within an individual's genome; they are the most reliable markers available. They can be used for investigating genetic variations at the DNA level between different populations and individuals, and their advantage is that genetic variations can be found rapidly and directly. Molecular markers have developed quickly, and they are becoming more and more informative. Up to now, various types of molecular markers have been utilized to evaluate DNA polymorphisms, e.g. RFLPs. Polymerase chain reaction (PCR) [11] can exponentially amplify a fragment of DNA in vitro, and since its invention a series of techniques have emerged in combination with PCR, e.g. PCR-RFLP, AFLP, simple sequence repeats (SSRs), and Single Nucleotide Polymorphisms (SNPs). In this review, we mainly focus on introducing several important DNA-based markers and their various applications in characterizing animal genetic resources.

RFLP markers
RFLP is a method established by Grodzicker et al. in 1974; it is used to identify DNA polymorphisms among different individuals [12]. Its basic principle is as follows: first, genomic DNA from different individuals is digested into DNA fragments of varying size using known restriction enzymes. Second, the digested fragments are separated by electrophoresis. Finally, the separated fragments are hybridized with radioactive or chemiluminescent homologous probes and exposed to an X-ray film; the different fragments are visible by autoradiography. The molecular basis of RFLP is that nucleotide base substitutions, insertions, deletions, duplications, and inversions within the genome can remove existing restriction sites or create new ones.
RFLP was the first DNA-based marker for constructing genetic linkage maps; it is also one of the most widely used markers in AnGR assessments and breeding program development. By combining this method with PCR (PCR-RFLP), Jiang and Gibson [13] detected four new genetic polymorphisms in the leptin gene of different pig breeds. The main advantages of RFLPs include: 1) high reliability, because the markers are generated from specific sites via known restriction enzymes and the results are constant over time and location. 2) Co-dominance, which means investigators are able to distinguish heterozygotes from homozygotes. 3) Selective neutrality, which refers to a situation in which different alleles of a certain gene confer equal fitness. The disadvantages of RFLPs are as follows: 1) the method is labor-intensive and time-consuming. 2) RFLPs can only detect specific mutations at enzyme cut sites, which limits the identification of whole-genome variation in animals. 3) The polymorphism of RFLP markers is relatively low and must be detected by radioisotope, which limits its application.
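
The molecular basis of RFLP described above can be illustrated with a minimal in-silico digest. The sketch below is only an illustration under stated assumptions: the two allele sequences are made up, the enzyme is taken to be EcoRI (recognition site GAATTC, cutting after the first G), and the function name digest is hypothetical. It shows how a single base substitution that removes a restriction site changes the fragment lengths that would be seen after electrophoresis.

```python
import re

ECORI_SITE = "GAATTC"  # EcoRI recognition sequence; the enzyme cuts between G and AATTC


def digest(sequence, site=ECORI_SITE, cut_offset=1):
    """Return the fragment lengths obtained by cutting a linear sequence at every site."""
    cut_positions = [m.start() + cut_offset for m in re.finditer(site, sequence)]
    bounds = [0] + cut_positions + [len(sequence)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Toy alleles (made up for illustration). Allele B carries one substitution
# (GAATTC -> GAGTTC) that removes the second restriction site.
allele_a = "ATGC" * 5 + "GAATTC" + "CCGTA" * 6 + "GAATTC" + "TTAGC" * 4
allele_b = "ATGC" * 5 + "GAATTC" + "CCGTA" * 6 + "GAGTTC" + "TTAGC" * 4

fragments_a = sorted(digest(allele_a), reverse=True)
fragments_b = sorted(digest(allele_b), reverse=True)

# A heterozygote carries both alleles, so it shows the union of both band patterns.
heterozygote_bands = sorted(set(fragments_a) | set(fragments_b), reverse=True)

print("Allele A fragments:", fragments_a)
print("Allele B fragments:", fragments_b)
print("Heterozygote bands:", heterozygote_bands)
```

Because the heterozygote pattern differs from both homozygote patterns, this toy example also illustrates why RFLPs are scored as co-dominant markers.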

RAPD markers
RAPD was developed by U.S. scientists in 1990 [14,15]. It amplifies the target genomic DNA with short, arbitrary primers (commonly 10 bp) in a PCR reaction, and can be used to produce relatively complicated DNA profiles for detecting amplified fragment length polymorphisms between organisms. Since the arbitrary primers complement different parts of the genomic DNA, PCR products will differ in number and size (polymorphism).
RAPD-PCR fingerprints have been successfully used in defining genetic diversity among different species. For example, the RAPD method was used to generate specific fingerprint patterns of ten different species: wild boar, pig, horse, buffalo, beef, venison, dog, cat, rabbit, and kangaroo [16].
RAPD markers have several obvious features, as summarized in the literature: 1) no prior sequence knowledge is necessary for designing the primers, which can then be used with different templates. 2) The amount of DNA required is very small because it will be amplified by PCR. 3) RAPDs are simple, quick, and cost-effective compared to RFLPs [17,18]. However, RAPDs also have some disadvantages: 1) the repeatability and reliability of RAPD polymorphic profiles are poor [19]. 2) Some non-specific, and therefore non-reproducible, binding of primers occurs. 3) RAPDs are dominant genetic markers, which cannot be used to distinguish homozygous from heterozygous genotypes in F2 populations.
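
The consequence of dominant scoring can be sketched in a few lines. The example below is an illustration only: the allele labels A (amplifiable) and a (non-amplifying), the function names, and the band counts are hypothetical. The allele-frequency estimate additionally assumes Hardy-Weinberg equilibrium, which the text above does not state; under that assumption the frequency of band-absent individuals equals the square of the non-amplifying allele frequency.

```python
import math


def band_present(genotype):
    """Dominant marker: a band appears if at least one amplifiable ('A') allele is present."""
    return "A" in genotype


def null_allele_frequency(n_band_absent, n_total):
    """Estimate the frequency q of the non-amplifying allele 'a'.

    Assumes Hardy-Weinberg equilibrium, so that freq(aa) = q ** 2.
    """
    return math.sqrt(n_band_absent / n_total)


# AA and Aa give the same phenotype (band present); only aa is distinguishable.
phenotypes = {genotype: band_present(genotype) for genotype in ("AA", "Aa", "aa")}

# Hypothetical survey: 16 of 100 animals show no band.
q = null_allele_frequency(16, 100)

print(phenotypes)
print("Estimated frequency of the non-amplifying allele:", round(q, 3))
```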

AFLP markers
AFLP was developed by Zabeau and Vos in 1993; it is a combination of the RFLP and PCR techniques [20]. The AFLP procedure is as follows: first, the genomic DNA is digested with a restriction enzyme, and then the digested fragments are ligated to synthetic adaptors and amplified with selective primers that are complementary to the adaptor sequences. The amplified fragments are then separated by electrophoresis and visualized using autoradiography [21]. AFLPs overcome the drawbacks of the labor-intensive, time-consuming RFLP method and solve the reliability problem caused by non-specific amplifications in RAPDs. Hoda et al. used AFLPs to assess genetic diversity and relationships among different breeds of sheep. They analyzed 93 unrelated individuals from three local Albanian sheep breeds using AFLP markers. The results indicated high diversity in Albanian sheep breeds [22].
AFLPs are notable for their genetic stability; they provide an effective, rapid, and economical tool for detecting a large number of polymorphic genetic markers that can be genotyped automatically [23,24]. However, AFLPs are dominant bi-allelic markers [23], and are unable to distinguish homozygous dominant from heterozygous individuals [25]. The AFLP method is well suited to population genetics and genome typing, and it is consequently widely applied to detect genetic polymorphisms and to evaluate and characterize animal genetic resources [26][27][28][29].

Microsatellite DNA markers
Microsatellite DNA, also known as simple sequence repeats (SSRs) or short tandem repeats (STRs), consists of common repeated sequences within eukaryotic genomes. Generally, microsatellites consist of motifs of 1-6 base pairs (bp) tandemly repeated several times (e.g. CACACACACACACACA) [30,31]. The flanking regions of the repeated sequences at microsatellite loci are mostly conserved, whereas the number of motif repeats is highly variable between different species and even between different individuals of the same species. Specific primers can therefore be designed based on the conserved flanking sequences to amplify the core repeat sequences by PCR; genetic polymorphisms can then be detected via electrophoresis [31].
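
The structure of a microsatellite locus can be sketched with a short pattern search. The example below is a minimal illustration, not a production SSR finder: the function name find_microsatellites, the minimum of five repeats, and the two toy alleles with made-up flanking sequences are all assumptions. It locates tandem repeats of 1-6 bp motifs and shows that two alleles with the same flanks but different repeat numbers yield PCR products of different lengths.

```python
import re


def find_microsatellites(sequence, min_repeats=5, max_motif_len=6):
    """Return (start, motif, repeat_count) for tandem repeats of short motifs.

    The lazy quantifier makes the regex prefer the shortest motif, so
    CACACACA is reported as motif 'CA' rather than 'CACA'.
    """
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif_len, min_repeats - 1))
    hits = []
    for match in pattern.finditer(sequence):
        motif = match.group(1)
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


# Toy alleles (made up): identical conserved flanks, different numbers of CA repeats.
LEFT_FLANK, RIGHT_FLANK = "GGTTCGAT", "GTCCATGA"
allele_short = LEFT_FLANK + "CA" * 8 + RIGHT_FLANK
allele_long = LEFT_FLANK + "CA" * 12 + RIGHT_FLANK

hits_short = find_microsatellites(allele_short)
hits_long = find_microsatellites(allele_long)

print(hits_short, len(allele_short))
print(hits_long, len(allele_long))
```

Primers placed in the flanks would amplify both alleles, and the length difference between the products is what electrophoresis resolves.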
SSRs have the same advantages as RFLPs while avoiding the radioisotopes essential for RFLPs; they have higher repeatability and stability than RAPDs; and, unlike AFLP markers, SSRs are co-dominant markers that are able to distinguish homozygotes from heterozygotes. Until recently, microsatellites were the markers most widely used for studying genetic diversity and for mapping quantitative trait loci for production and functional traits in farm animals [32][33][34]; they have also been used for marker-assisted selection practices [35].
The advantages and disadvantages of SSR markers have been reported by many authors [36][37][38][39][40]. Their advantages are as follows: low quantities of template DNA are required (10-100 ng), they are highly polymorphic and co-dominant, they offer high accuracy and reproducibility, different microsatellites can be multiplexed in PCR, and they are amenable to automation. Their disadvantages include the following: they are time-consuming and expensive to develop; heterozygotes may be misclassified as homozygotes when null alleles occur because of mutations in the primer annealing sites; stutter bands may complicate accurate scoring of polymorphisms; the underlying mutation model is largely unknown; and microsatellite markers help to identify neutral biodiversity but do not provide information on functional trait biodiversity. Despite these disadvantages, microsatellite markers are still popular nuclear DNA markers for the investigation of genetic variation among and within species.

New approaches for AnGR assessments
In addition to the classical markers discussed above, with the development of modern molecular techniques and the completion of the Human Genome Project (HGP), some new markers have emerged and are being used in the evaluation of farm animal genetic resources; these include high-density SNP arrays, whole-genome sequencing, and DNA barcoding.
SNP markers and whole-genome sequencing
SNP, a novel molecular marker technology, was first proposed by Lander in 1996; it refers to a sequence polymorphism caused by a single nucleotide mutation at a specific locus in the DNA sequence. This sort of polymorphism includes single base transitions, transversions, insertions, and deletions [41], and the minor allele frequency should be 1% or greater [42]. Of all the SNP mutation types, transitions are the most common (approx. 2/3) [43]. Currently, SNP markers are one of the preferred genotyping approaches, because they are abundant in the genome, genetically stable, and amenable to high-throughput automated analysis [42].
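
The two criteria mentioned above, the substitution type and the 1% minor allele frequency threshold, can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and the genotype counts are hypothetical, genotypes are written as two-letter strings for diploid animals, and only bi-allelic substitutions are handled.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_type(ref, alt):
    """Classify a single-base substitution as a transition or a transversion."""
    if ref == alt:
        raise ValueError("ref and alt must differ")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def minor_allele_frequency(genotypes):
    """Return the frequency of the rarer allele among diploid genotypes such as 'AG'."""
    counts = Counter("".join(genotypes))
    if len(counts) < 2:
        return 0.0  # monomorphic locus
    return min(counts.values()) / sum(counts.values())


# Hypothetical sample of 100 animals genotyped at one A/G locus.
genotypes = ["AA"] * 90 + ["AG"] * 9 + ["GG"] * 1
maf = minor_allele_frequency(genotypes)

print(substitution_type("A", "G"))            # purine <-> purine
print(substitution_type("A", "C"))            # purine <-> pyrimidine
print("MAF:", maf, "passes 1% threshold:", maf >= 0.01)
```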
The fundamental principle of SNP genotyping is to hybridize the DNA fragments under study with high-density DNA probe arrays (also called SNP chips); the SNP allele is then called according to the hybridization results. SNPs are bi-allelic markers, indicating a specific polymorphism with only two alleles in a population [44]. SNPs are distributed in both coding and non-coding regions of genomes, and they are vital players in the processes of population genetic variation and species evolution [45].
Currently, SNP investigations are usually carried out with DNA chip technology. A group of associated SNP loci located in a certain region of the chromosome can form one SNP haplotype. SNPs are a third-generation molecular marker technology, coming after RFLPs and SSRs [46]; they have been successfully used to investigate genetic variation among different species and breeds [47][48][49].
Compared with previous markers, SNPs have the following advantages: 1) they are numerous and widely distributed throughout the entire genome [50]. 2) They show high genetic stability, excellent repeatability, and high accuracy. 3) They allow for fast, high-throughput genotyping [51]. 4) They are co-dominant, which makes it convenient to distinguish heterozygotes from homozygotes.
Because of their extensive distribution and abundant variations, SNPs play an important role in farm animal population structure, genetic differentiation, origin, and evolution research. For example, linkage disequilibrium (LD) among different SNPs can be utilized for association analysis. Furthermore, we can gain information concerning animal population diversity and population evolution (origins, differentiation, and migrations) via SNP haplotypes among different populations.
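
As an illustration of how linkage disequilibrium between two SNPs can be quantified, the sketch below computes the commonly used r² statistic from phased haplotypes. It is a minimal example under stated assumptions: the function name ld_r_squared, the '0'/'1' allele coding, and the two toy haplotype samples are hypothetical and not taken from the studies cited above.

```python
def ld_r_squared(haplotypes):
    """Compute r^2 between two bi-allelic loci from phased haplotypes.

    Each haplotype is a two-character string; the first character is the
    allele at locus 1 and the second the allele at locus 2, coded '0' or '1'.
    """
    n = len(haplotypes)
    p_a = sum(h[0] == "1" for h in haplotypes) / n   # allele '1' frequency at locus 1
    p_b = sum(h[1] == "1" for h in haplotypes) / n   # allele '1' frequency at locus 2
    p_ab = sum(h == "11" for h in haplotypes) / n    # frequency of the '11' haplotype
    d = p_ab - p_a * p_b                             # disequilibrium coefficient D
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        raise ValueError("both loci must be polymorphic")
    return d * d / denominator


# Toy samples: alleles always travel together versus all combinations equally common.
complete_ld = ["11"] * 6 + ["00"] * 4
no_ld = ["11", "10", "01", "00"] * 5

r2_complete = ld_r_squared(complete_ld)
r2_none = ld_r_squared(no_ld)

print(round(r2_complete, 6), round(r2_none, 6))
```

Values of r² near 1 indicate that one SNP is a good proxy for the other, which is the property association analysis relies on.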
One disadvantage of SNP markers is that a single bi-allelic SNP provides less information than a highly polymorphic microsatellite, but this can be compensated for by employing a higher number of markers (SNP chips) and whole-genome sequencing [52,53].
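
The difference in per-locus information can be made concrete with expected heterozygosity (gene diversity), computed as one minus the sum of squared allele frequencies. This measure is used here as an assumption for illustration; the text above does not name a specific statistic, and the allele frequencies below are made up.

```python
def expected_heterozygosity(allele_frequencies):
    """Return 1 - sum(p_i ** 2) for allele frequencies that sum to 1."""
    if abs(sum(allele_frequencies) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_frequencies)


# A bi-allelic SNP is most informative when both alleles are equally common.
snp_he = expected_heterozygosity([0.5, 0.5])

# Hypothetical microsatellite with ten equally frequent alleles.
ssr_he = expected_heterozygosity([0.1] * 10)

print("SNP:", round(snp_he, 3), "microsatellite:", round(ssr_he, 3))
```

A single SNP cannot exceed 0.5 on this measure, whereas a multi-allelic microsatellite can approach 1, which is why larger numbers of SNPs are needed to reach comparable resolution.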
With the improvement of sequencing technology, whole-genome and gene sequencing have become available for characterizing genetic diversity among farm animals. Sequencing is the most straightforward method and provides more complete information on the genetic variation among different populations because it can detect all the variations within the genome. Currently, the main challenge with whole-genome sequencing is setting up a high-throughput data analysis platform to extract useful information for the conservation and utilization of farm animals.

DNA barcoding markers
Barcoding is an automatic scanning and identification technology that emerged from practical computer technologies. Biological taxonomists have applied this principle to species classification in the form of the DNA barcode. A DNA barcode is a short DNA sequence from a standardized region of the genome used for identifying species. The intent of DNA barcoding is to use large-scale screening of one or more reference genes in order to (i) assign unknown individuals to species, and (ii) enhance the discovery of new species [54,55].
Tautz et al. [56] were the first researchers to use DNA sequences in systematic biological taxonomy (also called DNA taxonomy). Subsequently, Hebert et al. [54] proposed the concept of DNA barcoding and suggested the use of a single mtDNA gene, mitochondrial cytochrome c oxidase I (COI), as a common sequence in animal DNA barcoding studies. Researchers can compile a public library of DNA barcodes linked to named specimens, which can provide a new master key for identifying species diversity [57].
Compared with time-consuming and inefficient traditional morphological classification [58], DNA barcoding has a high accuracy of 97.9% [59], and provides a new, quick, and convenient identification strategy for animal genetic diversity [54]. However, as with the other markers mentioned, the DNA barcoding technique also has some disadvantages: 1) the genome fragments are very difficult to obtain, are relatively conserved, and do not always contain enough variation. Some organisms cannot be identified with COI because of the low evolutionary rate of COI sequences in some species. 2) COI is an mtDNA sequence of maternal origin, which could bias estimates of species diversity [60,61]. These disadvantages can be compensated for by using one or more nuclear gene barcodes together to make a standardized analysis of AnGR.
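
The idea of assigning an unknown individual to a species from a barcode library can be sketched as a nearest-reference search. The example below is a simplified illustration, not a description of any published barcoding pipeline: the sequences and species labels are made up, the sequences are assumed to be already aligned and of equal length, the uncorrected p-distance is used, and the 3% distance threshold is a hypothetical parameter.

```python
def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)


def assign_species(query, reference_library, max_distance=0.03):
    """Return (species or None, distance) for the closest reference barcode.

    The query is left unassigned (None) if even the closest reference is
    further away than max_distance (a hypothetical threshold).
    """
    best_species, best_distance = min(
        ((name, p_distance(query, ref)) for name, ref in reference_library.items()),
        key=lambda item: item[1],
    )
    if best_distance > max_distance:
        return None, best_distance
    return best_species, best_distance


# Toy reference library (made-up 40 bp "barcodes", not real COI sequences).
ref_a = "ACGT" * 10
ref_b = "TTTTTTTT" + ref_a[8:]
library = {"species_A": ref_a, "species_B": ref_b}

query = ref_a[:-1] + "A"  # one substitution relative to species_A
assignment = assign_species(query, library)
unassigned = assign_species("G" * 40, library)

print(assignment)
print(unassigned)
```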

Summary
Farm animals are extremely important to humans, supplying some 30% of our total food requirements [62]. The accurate evaluation of animal genetic resources is the basis for their conservation and utilization. From the first demonstration of RFLPs to current whole-genome sequencing, many methods have been developed and tested at the DNA sequence level, providing a large number of markers and opening up new opportunities for evaluating diversity in farm animal genetic resources. Currently, SSR and SNP markers are valuable tools for evaluating germplasm diversity because of their high EMI (effective marker index) and high QND (qualitative nature of data), respectively [63]. With the development of new markers, more accurate genetic evaluation is possible. The development of molecular markers will continue in the near future and provide a better understanding of animal genetic resources.