Identification of circulating miRNA biomarkers based on global quantitative real-time PCR profiling

MicroRNAs (miRNAs) are small noncoding RNAs (18-25 nucleotides) that regulate gene expression at the post-transcriptional level. Recent studies have demonstrated the presence of miRNAs in the blood circulation. Deregulation of miRNAs in serum or plasma has been associated with many diseases, including cancers and cardiovascular diseases, suggesting the possible use of miRNAs as diagnostic biomarkers. However, detecting the small amounts of miRNA found in serum or plasma requires a method with high sensitivity and accuracy. This review therefore describes polymerase chain reaction (PCR)-based methods for measuring circulating miRNAs. Briefly, the procedure involves four major steps: (1) sample collection and preparation; (2) global miRNA profiling using quantitative real-time PCR (qRT-PCR); (3) data normalization and analysis; and (4) selection and validation of miRNA biomarkers. In conclusion, qRT-PCR is a promising method for profiling circulating miRNAs as biomarkers.


Background
MicroRNAs (miRNAs) are a class of noncoding RNAs, 18 to 25 nucleotides in length, that regulate gene expression through messenger RNA degradation or translational repression and are involved in various biological processes, such as proliferation, differentiation, development, and apoptosis [1,2]. Recently, the presence of miRNAs in the blood circulation has been reported [3]. Deregulation of circulating miRNAs has been associated with a variety of human diseases, including cancer [4,5] and cardiovascular diseases [6,7], indicating that miRNAs could be used as biomarkers for cancer and other diseases.
Several methods, such as northern blot [8], bead-based flow cytometry [9], microarray [10,11], quantitative real-time PCR (qRT-PCR) [12-14], and deep sequencing [15,16], have been developed to measure miRNA expression [17]. Of these methods, qRT-PCR is superior in sensitivity, specificity and reproducibility. While other methods, such as microarray, require a larger amount of RNA sample (usually more than 1 μg), qRT-PCR requires far less RNA input: even a single cell can be used for profiling [18,19]. Since the expression levels of circulating miRNAs are very low, this sensitivity makes qRT-PCR well suited to analyzing circulating miRNA profiles. In addition, approximately 1,900 mature miRNAs have been found in the human genome (miRBase 18, released on November 3, 2011) [20], and because qRT-PCR is easily adapted to 384-well plates, high-throughput screening of this many targets is feasible. Here, we describe a procedure for the identification of circulating miRNA biomarkers by qRT-PCR profiling that is composed of four steps: (1) sample collection and preparation; (2) global miRNA profiling using qRT-PCR; (3) data normalization and analysis; (4) selection and validation of miRNA biomarker(s).
Step 1: Sample collection and preparation
Blood samples can be collected after obtaining the approval of the relevant ethics committees and the informed consent of donors. All information collected from blood donors, including gender, age, disease grade, and symptoms, should be recorded. In general, at least tens or hundreds of blood samples should be collected from both pathological and healthy control groups in order to acquire statistically significant data. To reduce costs at the initial screening step, a pooled sample derived from a number of individual specimens (for example, a mixture of 10 to 20 specimens) can be used. Subsequently, the candidate miRNA biomarkers can be further validated with a larger number of samples (> 100) to obtain reliable results [21].
Both serum and plasma are appropriate for the detection of circulating miRNA. However, serum may be preferable to plasma for the following reasons. First, serum is easier to obtain from clinical sample repositories than plasma. Second, plasma is more likely to be contaminated with platelets and erythrocytes [22]. Finally, some anticoagulants used in plasma collection, such as heparin, inhibit reverse transcription and/or PCR, whereas ethylenediaminetetraacetic acid (EDTA) and citrate are acceptable [23]. Hemolysis during sample collection should be avoided, since its products interfere with circulating miRNA quantification. To isolate serum/plasma, blood samples are centrifuged at 3,000 × g for 10 min at 4°C or room temperature. The serum/plasma can be centrifuged once more at 15,000 × g to remove cell debris [24]. Serum/plasma can be subjected to RNA purification immediately after centrifugation or stored at -80°C, and these procedures should be kept consistent throughout the study to reduce technical variation.
The purification of miRNAs from serum/plasma is difficult because these samples contain very small amounts of miRNA. In addition, serum/plasma contains numerous inhibitors that can contaminate the purified RNA and interfere with subsequent enzymatic reactions. The efficiency of circulating miRNA purification can be monitored by using a heterologous spike-in RNA, such as synthetic Caenorhabditis elegans miRNA (cel-miRNA), which is added after the sample has been mixed with the denaturing reagent.
Two specialized types of reagents have been developed for the purification of circulating RNA. The first type is Trizol LS reagent (Invitrogen) or Tri-Reagent BD (Molecular Research Center). After incubation with these denaturing reagents and extraction with phenol/chloroform, inhibiting factors in serum/plasma are removed effectively. Total RNA is then precipitated with ethanol or isopropanol. Higher concentrations of RNA can be obtained using less RNase-free water, which is an advantage of this procedure. However, technical variation usually exists due to the slight loss of RNA during washing and dissolving steps. Moreover, the operation is laborious for processing large numbers of clinical samples mainly due to the precipitation procedure. In contrast, kits using a column-binding strategy, such as miRNeasy (QIAGEN) and mirVana PARIS kits (Ambion), may provide better reproducibility and easier operation. By properly regulating the RNA affinity with ethanol on a solid support such as silica or glass-fiber, small RNAs < 200 nucleotides can bind to the column and then be eluted with RNase-free water. The mirVana PARIS kit is preferred since it applies an equal volume of denaturing reagent with serum/plasma sample, significantly reducing the volume of reagent used per sample and enhancing the efficiency of RNA purification.
Generally, the concentration of total RNA purified from serum/plasma can be measured with a spectrophotometer, for example the NanoDrop (NanoDrop Technologies), and is usually < 50 ng/μL [25].
Step 2: Global miRNA profiling using qRT-PCR
Following RNA purification, cDNA synthesis and subsequent miRNA profiling can be carried out by qRT-PCR. Despite the remarkable sensitivity and specificity of qRT-PCR, analyzing miRNA profiles with this method is challenging for the following reasons: (1) miRNAs are too short to provide enough sequence for primer design; (2) many miRNAs are highly conserved in sequence; (3) miRNAs exist in multiple forms, including the primary transcript (pri-miRNA), the miRNA precursor (pre-miRNA), and the mature miRNA, so outstanding specificity is required to recognize only the mature miRNAs. At present, there are two qRT-PCR methods for miRNA expression analysis: the stem-loop method and the poly(A) method.
The stem-loop method, which uses an elaborately designed stem-loop reverse transcription (RT) primer to synthesize the cDNA of miRNAs, is the most extensively used approach for miRNA quantification (Table 1) [12,18,19,26]. The stem-loop RT primer is an oligonucleotide that forms a stem-loop structure containing a universal reverse primer sequence on the loop and several (usually six) specific bases at the 3' end that are complementary to the 3' end of a specific mature miRNA (Figure 1) [12]. After RT, the cDNA of the miRNA can be quantitatively amplified with specific forward and universal reverse primers. The stem-loop method is able to discriminate mature miRNAs from genomic DNA, pri-miRNA and pre-miRNA, presumably due to the effect of base stacking and spatial restriction of the stem-loop RT primer [12]. In addition, a dual-labeled TaqMan hydrolysis probe that is specific for each miRNA is used to ensure the high specificity of miRNA detection (Figure 1A). Initially, individual stem-loop RT primers were used for miRNA detection. Later, pooled stem-loop RT primers were applied for cDNA synthesis of hundreds of miRNAs simultaneously, facilitating high-throughput miRNA profiling [18,19,26]. For instance, using Megaplex Pools (Applied Biosystems) to detect hundreds of miRNAs on 384-well plates with a compatible thermal cycler, such as the Applied Biosystems 7900 HT, many studies have identified promising biomarkers associated with cancers and many other diseases [24,25,27-30] (Table 2). However, since the TaqMan probes designed for this method are distinct for each miRNA, they are likely to be costly for global miRNA profiling [14]. Furthermore, the preamplification step in the stem-loop method might bias the miRNA quantification [19].
Table 1 (fragment): Stem-loop | specific | universal | LNA UPL probe | [32]; Poly(A) | specific | universal | SYBR | [55-57]; Poly(A) | LNA specific | LNA specific | SYBR | [58]
To overcome the disadvantage of high cost while retaining the high specificity of the stem-loop method, some researchers used a probe from the Universal ProbeLibrary (UPL) for miRNA expression studies (Figure 1B) [31,32]. UPL probes are a class of probes produced by Roche Diagnostics, each of which contains 8 to 9 nucleotides, some of them modified as locked nucleic acids (LNA). By integrating the binding sequence of an elaborately selected UPL probe into the conventional stem-loop RT primer, the UPL probe can be used with specificity comparable to that of the TaqMan probe, so that a single probe allows the economical detection of various miRNAs [31].
Figure 1. Schematic representation of the stem-loop method for the study of miRNA expression. The reverse transcription of miRNA can be performed using a stem-loop RT primer that contains a universal reverse primer sequence on the loop and several specific bases at the 3' end that are complementary to the 3' end of the mature miRNA. The cDNA is amplified with specific forward and universal reverse primers. The amplification products can be detected by either (A) a specific TaqMan probe or (B) a universal UPL (Universal ProbeLibrary) probe.
Table 2. Summary of circulating miRNA biomarkers identified through qRT-PCR profiling.
The poly(A) method is another means of miRNA expression analysis [14,33,34]. In this method, poly(A) polymerase adds a poly(A) tail to the 3' end of each mature miRNA. Tailed miRNAs are then subjected to RT using a universal RT primer containing 2 to 3 degenerate nucleotides at the 3' end followed by an oligo(dT) and a universal reverse primer sequence (Figure 2). The synthesized cDNA is amplified with specific forward and universal reverse primers (Figure 2A). Instead of a hydrolysis probe, SYBR Green I is used to quantitatively detect the amplified products, which makes detection more cost-effective than with the stem-loop/TaqMan probe method. However, the oligo(dT) RT primer can potentially recognize all polyadenylated RNAs. In addition, since SYBR Green I dye binds to all double-stranded DNA, including amplification products of the target miRNA, contaminating genomic DNA, and primer-dimers, the poly(A) method is not as specific as the stem-loop method. Nevertheless, the specificity of the poly(A) method can be monitored by observing the melting curve (or dissociation curve). Because of its cost-effectiveness, many companies, such as SBI (System Biosciences), SAB (SABiosciences), and Invitrogen, have developed kits using the poly(A) method for miRNA profiling (Table 1).
For example, circulating miRNA profiling has been performed on plasma collected from patients with colorectal cancer by using the Quanti-Mir kit (SBI), indicating miR-17-3p and miR-92 as potential biomarkers [21] (Table 2). A modified poly(A) method, the miRCURY platform, has been developed recently (Figure 2B, Table 1); it uses enhanced LNA-specific forward and reverse primers instead of a universal reverse primer during amplification, achieving superior sensitivity and linearity [35].
Generally, the stem-loop method has been coupled with hydrolysis probes and the poly(A) method with SYBR Green I detection; however, these combinations are flexible. For example, SYBR can be used in the stem-loop method [31], and the UPL probe in the poly(A) method [32]. Moreover, the LNA base can be replaced by an extra artificial tail at the 5' end of specific forward/reverse primers in the poly(A) method, achieving considerably higher efficiency than with LNA primers [36]. Overall, both the stem-loop and poly(A) methods as well as their derivatives can be used in circulating miRNA profiling. Currently, miRNA kits from Applied Biosystems that use stem-loop RT primers and [...]
Figure 2. Schematic representation of the poly(A) method used to study miRNA expression. After polyadenylation, the reverse transcription of miRNA is carried out using a universal RT primer that contains degenerate nucleotides at the 3' end followed by an oligo(dT) and universal reverse primer sequence. The cDNA is amplified with either (A) specific forward and universal reverse primers or (B) LNA-integrated specific forward and specific reverse primers. The amplification products are detected by SYBR Green I.
Several controls are required for the analysis of circulating miRNAs. A no-template control (NTC) can detect the presence of primer-dimers and other non-specific amplification products. In addition, a no-RT control (NRC) monitors contamination of the sample with genomic DNA. An inter-run calibrator (IRC), usually a DNA fragment, is a positive control used to calibrate the deviation between qRT-PCR runs/plates. It is of great importance to use an IRC in a profiling assay when samples are measured in different qRT-PCR runs/plates. Reference controls, including spike-in RNA controls, are essential for data normalization, as discussed below.
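The role of the inter-run calibrator can be illustrated with a minimal sketch. It assumes that the same IRC is measured on every plate and that a constant Cq offset per plate is an adequate correction; this is a simplification chosen for illustration, not a procedure specified in the text. The function name `calibrate_plates`, the assay names, and all Cq values are hypothetical.

```python
from statistics import mean


def calibrate_plates(plates, irc_name="IRC"):
    """Shift each plate's Cq values so that its IRC Cq equals the mean IRC Cq.

    plates maps a plate id to a dict of {assay name: Cq}. Every plate must
    contain the inter-run calibrator under the key irc_name.
    """
    irc_mean = mean(wells[irc_name] for wells in plates.values())
    calibrated = {}
    for plate_id, wells in plates.items():
        # Positive offset: this plate reads later than average, so subtract it.
        offset = wells[irc_name] - irc_mean
        calibrated[plate_id] = {assay: cq - offset for assay, cq in wells.items()}
    return calibrated


# Hypothetical Cq values for two plates measured in separate runs.
plates = {
    "plate_1": {"IRC": 20.0, "miR-A": 28.0},
    "plate_2": {"IRC": 21.0, "miR-A": 29.5},
}
calibrated = calibrate_plates(plates)
```

After calibration both plates report the same Cq for the IRC, so the remaining difference between the miR-A values reflects the samples rather than the run.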
Step 3: Data normalization and analysis
Data normalization is another major challenge for the analysis of circulating miRNA because there are no verified housekeeping genes in serum/plasma that can be used for normalization. To date, three normalization strategies have been proposed for circulating miRNA quantification.
The first strategy uses a gene reported in existing data to be stably expressed as a reference control, such as miR-16 [37] or the small nuclear/nucleolar RNAs RNU6 [21,29], RNU44 and RNU48 [38]. Although these reference genes are among the best available normalization controls, they cannot ensure constant expression under all experimental conditions [38,39]. Spiked-in RNAs, such as cel-miR-39, cel-miR-54, and cel-miR-238, are a class of heterologous RNA that can not only monitor the efficiency of RNA purification and RT, but also be used as normalization controls [3,28,30]. By adding the same amount of spiked-in RNA to an equal volume of serum/plasma, a stable reference control is obtained. However, combining the spiked-in RNAs with an equal mass of circulating total RNA may result in better accuracy for normalization.
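As a minimal sketch of spike-in normalization, the snippet below subtracts the spike-in Cq from each miRNA Cq within one sample. Averaging the three spike-ins named above into a single reference is an assumption made for illustration, as are the function name `normalize_to_spike_ins` and all Cq values.

```python
from statistics import mean

# Spike-in controls named in the text; using their mean Cq is an assumption.
SPIKE_INS = ("cel-miR-39", "cel-miR-54", "cel-miR-238")


def normalize_to_spike_ins(sample_cq, spike_ins=SPIKE_INS):
    """Return delta-Cq (miRNA Cq minus mean spike-in Cq) for one sample."""
    reference = mean(sample_cq[name] for name in spike_ins)
    return {
        name: cq - reference
        for name, cq in sample_cq.items()
        if name not in spike_ins
    }


# Hypothetical Cq values for a single serum sample.
sample = {
    "cel-miR-39": 22.0,
    "cel-miR-54": 23.0,
    "cel-miR-238": 24.0,
    "miR-17-3p": 27.0,
    "miR-92": 30.0,
}
delta_cq = normalize_to_spike_ins(sample)
```

A lower delta-Cq corresponds to a higher level of the miRNA relative to the spiked-in reference.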
The second strategy of data normalization is to identify suitable normalizers for each study through the systematic evaluation of the expression level of a set of housekeeping genes. Several housekeeping genes are first selected and subjected to qRT-PCR assay using a number of samples. An algorithm named geNorm is then used to calculate the ratio of the expression level of one housekeeping gene to another [38]; the algorithm assumes that, for two ideal reference genes, this ratio is constant across all samples. For each gene, an M-value is defined as the average, over all other candidate genes, of the standard deviation of the log-transformed pairwise expression ratio across samples. The gene with the lowest M-value is considered to have the most stable expression level. geNorm can thus identify a good combination of stably expressed housekeeping genes, and the geometric average of their expression levels can be used as a normalization factor for qRT-PCR data analysis.
The geNorm program was first used to identify appropriate reference controls for 13 types of human tissue [38]. Later, many studies used this algorithm to successfully identify reliable reference controls for miRNA quantification. For example, let-7a and miR-16 were identified as reference controls for breast cancer [40]. Similarly, miR-191 and miR-103 were chosen as reference controls for lung cancer and 13 distinct human solid tissues [41], and circulating miR-22*, miR-26a and miR-221 for hepatitis B virus-infected serum samples [25]. NormFinder is another algorithm used for the identification of suitable normalizers [42]. The geNorm and NormFinder programs have been integrated in the qRT-PCR data analysis software qBaseplus [43,44] or GenEx [45].
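The M-value calculation can be sketched in a few lines. This is a simplified illustration, not the geNorm program itself: it works directly on Cq values, assumes 100% amplification efficiency (so that a difference in Cq equals a log2 expression ratio), and omits geNorm's stepwise exclusion of the least stable gene. The gene names and Cq values are hypothetical.

```python
from statistics import mean, stdev


def genorm_m_values(cq):
    """Compute a geNorm-style M-value for each candidate reference gene.

    cq maps a gene name to a list of Cq values, one per sample, with the
    samples in the same order for every gene.
    """
    genes = list(cq)
    m_values = {}
    for j in genes:
        pairwise_sd = []
        for k in genes:
            if k == j:
                continue
            # Cq_k - Cq_j is the log2 ratio of gene j to gene k per sample.
            log2_ratios = [ck - cj for cj, ck in zip(cq[j], cq[k])]
            pairwise_sd.append(stdev(log2_ratios))
        m_values[j] = mean(pairwise_sd)
    return m_values


# Hypothetical Cq values for three candidate genes in four samples.
cq = {
    "ref_A": [20.0, 21.0, 22.0, 23.0],
    "ref_B": [25.0, 26.0, 27.0, 28.0],
    "ref_C": [30.0, 29.0, 33.0, 31.0],
}
m = genorm_m_values(cq)
most_stable = min(m, key=m.get)
```

Here ref_A and ref_B differ by a constant 5 cycles in every sample, so their ratio is constant and they share the lowest M-value, whereas ref_C varies independently and scores highest.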
It is important to identify a set of miRNAs as reference controls; however, this procedure is labor-intensive because of the numerous miRNAs in a number of samples that need to be measured. Additionally, there may be few relatively stable miRNAs existing in serum/plasma that can be chosen for conducting such a systematic evaluation, which limits its use in circulating miRNA quantification studies.
The third strategy is the newly reported global mean normalization method. Instead of using a single reference control or a set of them, the global mean method uses the average expression level of all miRNAs detected in a sample as a normalization factor [46]. This strategy assumes that the mean expression level of all miRNAs in a sample, from either the healthy control or the patient, is constant when the same total RNA input is used. Mestdagh et al. [46] demonstrated that the global mean method was better at reducing technical variation and preserving biological variation than normalization with endogenous small RNA controls, such as small nuclear/nucleolar RNAs. This method is well suited to normalizing genome-wide miRNA profiles without the need to select a specific reference control. However, data obtained by this method should be viewed cautiously when many of the miRNAs analyzed differ greatly in expression between samples, since the global mean itself then shifts and no longer serves as a constant reference. Moreover, the global mean normalization method is not suitable for biomarker validation (discussed in the next section) because only a few candidate miRNAs are evaluated at that stage. The global mean normalization program is integrated in the qBaseplus and GenEx analysis software.
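A minimal sketch of global mean normalization for one sample follows. The detection cutoff of Cq 35 is an assumption for illustration, as are the function name `global_mean_normalize` and the Cq values.

```python
from statistics import mean


def global_mean_normalize(sample_cq, cq_cutoff=35.0):
    """Normalize one sample to the mean Cq of all detected miRNAs.

    miRNAs with Cq above cq_cutoff are treated as undetected and dropped
    (the cutoff value is an assumption, not prescribed by the method).
    """
    detected = {name: cq for name, cq in sample_cq.items() if cq <= cq_cutoff}
    normalization_factor = mean(list(detected.values()))
    return {name: cq - normalization_factor for name, cq in detected.items()}


# Hypothetical Cq values; miR-d falls above the cutoff and is excluded.
sample = {"miR-a": 24.0, "miR-b": 28.0, "miR-c": 32.0, "miR-d": 38.0}
normalized = global_mean_normalize(sample)
```

Because the normalization factor is recomputed per sample, the approach only makes sense when a large panel of miRNAs is measured, which is why it fits the profiling step rather than the validation step.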
As described above, none of the three normalization strategies provides an ideal solution for profiling circulating miRNAs. Here, we propose a comprehensive strategy based on the Cq value. Cq is the abbreviation of "quantification cycle", the term recommended by the Minimum Information for Publication of Quantitative Real-Time PCR Experiments (MIQE) guidelines in place of the previously used terms threshold cycle (Ct), crossing point (Cp) and take-off point (TOP) [47,48]. First, omit the miRNA(s) whose Cq values differ by > 5 (a 32-fold change) between the healthy control and the patient sample. Second, calculate the global mean Cq value of the remaining miRNAs in each sample (the healthy control or the patient), which is used as a normalization factor (NF). Third, normalize all miRNAs, including those omitted before, with the formula $\Delta Cq = Cq_{\text{miRNA}} - Cq_{\text{NF}}$. Finally, calculate the fold-change between the healthy control and the patient with the formula $2^{-\Delta\Delta Cq}$, where $\Delta\Delta Cq = \Delta Cq_{\text{patient}} - \Delta Cq_{\text{control}}$ [49,50]. The miRNAs with the smallest fold-changes might also serve as excellent reference controls in the case studied.
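The four steps of this strategy can be sketched as follows for one control and one patient sample (or pool). The function name `fold_changes`, the miRNA names, and all Cq values are hypothetical, and the fold-change is expressed as patient relative to control.

```python
from statistics import mean


def fold_changes(control_cq, patient_cq, max_cq_gap=5.0):
    """Fold-change (patient vs. control) using a trimmed global mean as NF."""
    shared = [name for name in control_cq if name in patient_cq]
    # Step 1: set aside miRNAs whose Cq differs by more than max_cq_gap.
    stable = [n for n in shared if abs(patient_cq[n] - control_cq[n]) <= max_cq_gap]
    # Step 2: the normalization factor is the mean Cq of the remaining miRNAs.
    nf_control = mean(control_cq[n] for n in stable)
    nf_patient = mean(patient_cq[n] for n in stable)
    result = {}
    for n in shared:
        # Step 3: delta-Cq for every miRNA, including those set aside above.
        dcq_control = control_cq[n] - nf_control
        dcq_patient = patient_cq[n] - nf_patient
        # Step 4: fold-change = 2^(-delta-delta-Cq).
        result[n] = 2.0 ** (-(dcq_patient - dcq_control))
    return result


# Hypothetical Cq values; miR-d differs by 7 cycles and is left out of the NF.
control = {"miR-a": 25.0, "miR-b": 27.0, "miR-c": 29.0, "miR-d": 33.0}
patient = {"miR-a": 25.0, "miR-b": 28.0, "miR-c": 28.0, "miR-d": 26.0}
fc = fold_changes(control, patient)
```

In this example the strongly changed miR-d does not distort the normalization factor, yet it still receives a fold-change (128-fold up in the patient), while miR-a, with a fold-change of 1, would be a candidate reference control.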
Step 4: Selection and validation of miRNA biomarkers
In this step, large numbers of individual samples (50 to 200) are used to validate the miRNAs selected in the profiling step. Around 10 to 20 miRNAs with the greatest fold-changes can be selected as candidate biomarkers, each of which needs to be measured in all samples using qRT-PCR assays. Only the miRNAs that are detected in all samples can be considered candidate biomarkers. To identify ideal reference controls for normalizing candidate biomarkers, approximately ten miRNAs with the smallest fold-changes are also measured in all samples. The geNorm or NormFinder algorithms can be used to determine the most stable miRNAs, which then serve as the normalization controls. The Cq value of each candidate miRNA marker is then normalized and transformed to the relative expression level using the formula $2^{-\Delta Cq}$ [24]. A box plot can then be made to present the relative expression level of each miRNA in all normal or patient samples, with the median line indicating whether the miRNA is differentially expressed between the healthy and the diseased samples [21,24]. The miRNA(s) with the most significant difference in expression can then act as promising circulating biomarkers.
Significant effort is needed to identify a miRNA biomarker with clinical value, for the following reasons. First, the selected miRNA must be measured on a more extensive scale to determine its clinical applicability. Correlation between the expression level of candidate miRNA markers and other clinical factors, such as age, gender, and severity of disease, needs to be evaluated. For example, a miRNA whose expression level is constant across the different stages of disease may be a good biomarker for the diagnosis of early-stage disease. However, if the expression level varies during the development of the disease, the miRNA can serve as a useful indicator for disease classification and prognosis [28].
Furthermore, the candidate miRNAs should be evaluated in pathological tissue, as the abnormal miRNA expression level found in the circulation might be derived from the release of deregulated miRNA from the tissue in the form of microvesicles [51] or Argonaute 2 protein complexes [52]. If the expression level in the circulation is similar to that in the diseased tissue, this provides solid support for the use of the candidate miRNA as a biomarker. Further analyses require blood samples obtained from patients before and after treatment, for example, surgical removal of a carcinoma, to determine the utility of the miRNA biomarker for prognosis.
Finally, it is necessary to detect the candidate miRNA in other relevant diseases since many diseases may have a similar miRNA expression pattern. For example, miR-885-5p is elevated in liver-associated diseases encompassing hepatocellular carcinoma, liver cirrhosis, and chronic hepatitis B [24]. In this case, a set of miRNA biomarkers may provide better specificity for the diagnosis of each disease.
For the analysis of potential biomarkers of disease, it is necessary to perform appropriate statistical analyses of the data obtained from the qRT-PCR assay. Generally, the Mann-Whitney test is used to compare healthy control and patient data, and the Kruskal-Wallis test to compare more than two sets of samples. The Wilcoxon test, however, is appropriate for the analysis of samples collected from the same patient over different periods of treatment. Results with a p-value < 0.05 are considered statistically significant. The receiver operating characteristic (ROC) curve can further determine whether the candidate miRNA biomarker is sufficiently specific for discriminating a certain type of diseased sample from others (Table 2) [21,30].
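The link between the Mann-Whitney test and the ROC curve can be sketched briefly: the area under the ROC curve (AUC) equals the Mann-Whitney U statistic divided by the number of patient-control pairs. The snippet below computes only U and the AUC; the p-value would in practice come from a statistics package. The function names and the relative expression values are hypothetical.

```python
def mann_whitney_u(group_a, group_b):
    """U statistic for group_a: pairs (a, b) with a > b, ties counted as 0.5."""
    u = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


def roc_auc(patient_levels, control_levels):
    """AUC as the fraction of patient-control pairs ranked correctly."""
    pairs = len(patient_levels) * len(control_levels)
    return mann_whitney_u(patient_levels, control_levels) / pairs


# Hypothetical relative expression levels of one candidate miRNA.
patients = [4.1, 3.2, 5.0, 2.8]
controls = [1.0, 1.5, 3.0, 0.9]
auc = roc_auc(patients, controls)
```

An AUC of 0.5 means the miRNA does not separate the groups at all, while a value close to 1 means that nearly every patient has a higher level than nearly every control.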

Future directions
Although many circulating miRNAs have already been identified as potential biomarkers, analyses including a larger number of clinical samples are needed to validate these selected miRNAs as effective biomarkers. Moreover, the accuracy of circulating miRNA profiling depends heavily on the sample type, sample processing, profiling method, and normalization strategy, so the procedure should be standardized to make the data obtained from different labs comparable. The highest number of circulating miRNAs detected in human serum is 368 [28]; however, because of the sensitivity limits of current techniques, it is unknown whether more miRNAs exist in serum. Given that many miRNAs with a Cq value of > 35 in qRT-PCR assays are frequently excluded in the normalization step, there are presumably more than 368 miRNAs in human serum. Thus, the development of more sensitive and specific techniques to increase the range of detectable miRNAs in serum could reveal more potential biomarkers. To date, most studies on circulating miRNA profiles have focused on human specimens, whereas miRNA profiling in animals is in its early stages. One example is miRNA profiling performed on cow milk using Solexa sequencing [53], which identified seven promising biomarkers that can be used to discriminate milk quality. Further application of qRT-PCR in animal miRNA profiling may identify more useful biomarkers for the diagnosis of animal diseases and for quality control. Moreover, a significant reduction in the operating cost of qRT-PCR will promote its use in animal studies.

Conclusion
The use of qRT-PCR is by far the most sensitive and specific means for the evaluation of miRNA profiles. Using a four-step operation of sample preparation, profiling, normalization and validation, promising miRNA biomarkers can be elucidated for clinical diagnosis and prognosis. It is possible that the detection of circulating miRNA biomarkers can be included in future routine clinical examinations for the diagnosis of early stages of diseases, such as cancer and cardiovascular disease.