Repertoire of novel sequence signatures for the detection of Candidatus Liberibacter asiaticus by quantitative real-time PCR

Background Huanglongbing (HLB) or citrus greening is a devastating disease of citrus. The gram-negative bacterium Candidatus Liberibacter asiaticus (Las), which belongs to the α-proteobacteria, is responsible for HLB in North America as well as in Asia. Currently, there is no cure for this disease. Early detection and quarantine of Las-infected trees are important management strategies used to prevent HLB from invading HLB-free citrus producing regions. Quantitative real-time PCR (qRT-PCR) based molecular diagnostic assays have been routinely used in the detection and diagnosis of Las. Oligonucleotide primer pairs based on conserved genes or regions, including 16S rDNA and the β-operon, have been widely employed in the detection of Las by qRT-PCR. The availability of the whole genome sequence of Las now allows the design of primers beyond the conserved regions for the specific detection of Las. Results We took a complementary approach by systematically screening the genes in a genome-wide fashion to identify the unique signatures that are present only in Las, using an exhaustive sequence-based similarity search against the nucleotide sequence database. Our search resulted in 34 probable unique signatures. Furthermore, by designing primer pairs specific to the identified signatures, we showed that most of our primer sets are able to detect Las by qRT-PCR in infected plant and psyllid materials collected from the USA and China. Overall, 18 of the 34 primer pairs were found to be highly specific to Las, with no cross-reactivity to the closely related species Ca. L. americanus (Lam) and Ca. L. africanus (Laf). Conclusions We have designed qRT-PCR primers based on Las-specific genes. Among them, 18 are suitable for the detection of Las in Las-infected plant and psyllid samples. The repertoire of primers that we have developed and characterized in this study enhances the qRT-PCR based molecular diagnosis of HLB.


Background
Huanglongbing (HLB) or citrus greening is the most devastating disease of citrus, threatening the citrus industry worldwide and leading to a massive reduction in fruit production as well as the death of infected trees [1]. The causal agents of HLB are three closely related gram-negative, phloem-limited α-proteobacteria, all Candidatus Liberibacter species [2,3]. The heat-tolerant Ca. L. asiaticus (Las) is the most widespread in Asia as well as in the USA, whereas Ca. L. americanus (Lam) is mostly limited to South America [2-4]. Ca. L. africanus (Laf) is heat sensitive and localized to the African continent. All three Liberibacter species are currently uncultured and are known to reside in the sieve tubes of the plant phloem [5] or in the gut of the phloem-feeding psyllids [6]. Psyllids are the natural vectors that transmit the bacteria between plants [1,6]. The Asian citrus psyllid, Diaphorina citri Kuwayama (Homoptera: Psyllidae), is responsible for transmitting Las and Lam in Asia and America, while the African citrus psyllid, Trioza erytreae Del Guercio (Homoptera: Psyllidae), is the natural vector of Laf in Africa [7]. The characteristic symptoms of infected plants include yellow shoots and foliar blotchy mottle, along with poor flowering and stunting [1]. HLB also results in poorly colored, unpleasant tasting, reduced size fruit that shows staining of the vascular columella and seed abortion [1]. The fruit may remain partially green, which is why HLB is also called citrus greening [1]. Chronically infected trees are sparsely foliated, display extensive twig or limb die-back, and eventually die within three to five years [1]. Moreover, the disorders induced in diseased plants vary with cultivar, tree maturity, time of infection, stage of disease and other abiotic or biotic agents that affect the tree [1].
HLB symptoms also share certain similarities with nutrient deficiency [1], citrus stubborn disease caused by Spiroplasma citri [8] and an HLB-like disease caused by a phytoplasma [9,10]. Early diagnosis and differentiation of Las infections from the defects and agents mentioned above are thus critical to reducing the spread and devastation of this disease locally and via international trade, as well as to minimizing the economic impact of potential false positive diagnoses.
Importantly, HLB and the Asian citrus psyllid (D. citri) are expanding to new citrus production areas. Currently, the Asian citrus psyllid has been found in Florida, Texas, California, Arizona, Hawaii, Louisiana, Georgia, and Alabama in the USA, as well as in parts of South and Central America, Mexico, and the Caribbean. Meanwhile, HLB has not only been identified in Florida, Louisiana, South Carolina, Georgia, Texas and California in the USA; it has also been discovered in Cuba, Belize, Jamaica, Mexico, and other countries in the Caribbean [11]. While HLB and D. citri have been found in different producing areas, the number of infected trees and the psyllid vector population vary dramatically among regions. Thus, different HLB management strategies are recommended for different regions, according to the local severity of HLB and the occurrence of psyllid vectors.
Currently, no efficient management strategy is available to control HLB. For recently Las-infected citrus producing areas such as California, prevention and eradication of HLB are the most efficient and cost-effective approaches. Additionally, Las-infected trees are most often asymptomatic during the early stage of infection. Thus, accurate early detection of Las in citrus plants and psyllids is critical for enacting containment measures in non-endemic citrus producing areas. For citrus producing areas without HLB, such as the Mediterranean region, accurate detection is critical for the success of quarantine measures against Ca. Liberibacter.
Here we took a complementary approach to identify the genes that are unique to Las by bioinformatic analysis, with the goal of expanding the arsenal of tools for Las detection. The advancement in the genome sequencing of Las [29] provides an opportunity to design primers based on species-specific sequences for the detection of Las. We designed oligonucleotide primer pairs specific to the identified unique genic signatures. We further validated their specificity and selectivity against closely related species and demonstrated their application to Las-infected plant tissues and insect vectors by qRT-PCR.

Results and discussion
Recently, the whole genome of Las has been sequenced [29,30]. This allows for systematic screening of unique Las genes in a genome-wide fashion. The availability of the genome sequences of the closely related species Lam [31], L. crescens (Lcr) [32] and Ca. L. solanacearum (Lso) [33] further helps in the identification of unique regions by minimizing cross-species reactions, thereby making the diagnostic identification of Las more distinct.

Bioinformatic analysis
Several high-throughput applications have been developed recently to design diagnostic primers using whole genome sequence information, including KPATH, Insignia, TOFI, and TOPSI [34-40]. Among them, KPATH, Insignia, and TOPSI have the potential to be used for the design of real-time PCR primers for qRT-PCR based assays for Las, whereas TOFI is used to design signatures for microarray-based assays. These methods can be broadly categorized into alignment-free and alignment-based approaches. The alignment-free approach uses both coding and non-coding regions of the genome and is useful for genomes with less accurate sequence information, but it generally results in high false positive rates because it does not involve pre-screening of the selected genomic loci for their discriminatory ability [37]. The alignment-based approach does involve pre-screening of the selected genomic loci for their discriminatory ability [34]. However, this approach does not consider the annotation of genic and non-genic regions, but rather aligns larger regions of the genome, and is hence prone to losing shorter discriminatory sequence regions. Additionally, the discriminatory ability of the selected regions is screened bioinformatically against only a limited number of closely related species, which leaves more opportunities for false positives. We therefore took a complementary bioinformatics approach by pre-screening shorter genic regions against the nucleotide sequence database (nt) at NCBI to identify all the possible unique genic regions in the Las genome. Natural selection acts more strongly on genic regions; hence, the use of discriminatory sequences from these regions results in fewer false positives, because these regions are under selection pressure [41]. Additionally, pre-screening against nt is more effective because it contains the largest pool of well-annotated nucleotide sequences from different organisms. We envisioned that these two steps would result in more specific detection of the target organism with fewer false positives, and therefore included both in our bioinformatics approach.
There are ~1100 genes assigned to the Las genome. Manually searching each of these sequences against the nt database using the BLAST program [42,43] is therefore a laborious and time-consuming procedure. Hence, we automated this sequence similarity search step by developing a standalone PERL script (Additional file 1). This script performed the similarity searches for each of the Las genes against the specified database with hard-coded parameters for the BLAST program. Manual analysis of the resulting BLAST output files is likewise laborious and time consuming; we therefore automated this step by developing a second PERL script (Additional file 2). This script automatically parsed all the BLAST output files and returned the Las sequences for which no hits were found in other organisms. We refer to these sequences as probable unique sequences, because virtually no identical sequences are found in other organisms (Figure 1).
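The parsing step described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' PERL scripts (Additional files 1 and 2): it assumes the BLAST results are available in the standard 12-column tabular format, and the function name `genes_without_hits`, the `self_subjects` argument and the way self-hits are excluded are hypothetical.

```python
from typing import Iterable, List, Set


def genes_without_hits(gene_ids: Iterable[str],
                       blast_tabular_lines: Iterable[str],
                       max_evalue: float,
                       self_subjects: Set[str]) -> List[str]:
    """Return the genes that have no hit at or below max_evalue in other organisms.

    blast_tabular_lines: rows in BLAST tabular format (12 tab-separated
    columns; column 1 = query id, column 2 = subject id, column 11 = E-value).
    self_subjects: subject ids belonging to the query organism itself, which
    are ignored (an assumption about how self-hits are handled).
    """
    genes_with_foreign_hit = set()
    for line in blast_tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        if subject in self_subjects:
            continue  # a hit against the query genome says nothing about uniqueness
        if evalue <= max_evalue:
            genes_with_foreign_hit.add(query)
    # Preserve the input order of the gene list.
    return [g for g in gene_ids if g not in genes_with_foreign_hit]
```

A gene is reported only if none of its rows passes the threshold against a foreign subject, which mirrors the "no hits found in other organisms" criterion in the text.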
We first performed the sequence similarity searches against the nt database using a stringent E-value threshold of ≤ 1 × 10⁻³ (Figure 1). This search resulted in ~200 sequences that are unique to Las. This set is too large to validate experimentally; therefore, to further reduce the number of unique sequences, we performed a second sequence similarity search with a relaxed E-value threshold of ≤ 1. This search resulted in 38 unique sequences. Because hits with E-values up to 1 are counted as matches, the threshold of ≤ 1 excludes sequences with even slight similarity to other organisms. Therefore, the resulting 38 sequences are considered unique to Las and constitute promising candidates for qRT-PCR based detection (Figure 1).
Figure 1 Pictorial representation of the bioinformatics strategy employed to extract the unique genic regions from the Las genome. The input and output of each step are shown in oval or square boxes. Actions taken are noted to the left of each arrow, while the information used is indicated to the right.
We further searched the 38 unique sequences of Las against the phylogenetically closely related Lso, Lam, and Lcr. Because these organisms are closely related, we used the stringent E-value threshold of ≤ 1 × 10⁻³ for this similarity search. To achieve this E-value, the sequences need to be highly similar among Las, Lso, Lam, and Lcr. This close species filter therefore potentially eliminates all the Las sequence targets that could lead to false positive results in qRT-PCR based molecular diagnostic assays. Consequently, we eliminated four conserved sequences from the list of 38 unique sequences, resulting in a total of 34 potential sequence signatures. We could not apply this close species filter against Laf because its genome is yet to be sequenced.
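The order of the three filters and the direction of each threshold can be made explicit with a short sketch. It is an illustration only, assuming that the best (lowest) non-self E-value per gene has already been extracted from the BLAST output; the names `is_unique` and `filter_cascade` and the dictionary keys are hypothetical.

```python
from typing import Dict, List, Optional

# Best (lowest) E-value of any non-self hit per gene; None means no hit at all.
BestHit = Dict[str, Optional[float]]


def is_unique(best_evalue: Optional[float], threshold: float) -> bool:
    """A gene passes a filter when it has no hit at or below the threshold."""
    return best_evalue is None or best_evalue > threshold


def filter_cascade(genes: List[str],
                   nt_best: BestHit,
                   relatives_best: BestHit) -> Dict[str, List[str]]:
    """Apply the three filters in the order described in the text."""
    # Stage 1: stringent threshold against nt (only strong hits disqualify a gene).
    stage1 = [g for g in genes if is_unique(nt_best.get(g), 1e-3)]
    # Stage 2: relaxed threshold against nt (weak hits now disqualify as well).
    stage2 = [g for g in stage1 if is_unique(nt_best.get(g), 1.0)]
    # Stage 3: stringent threshold against the closely related genomes.
    stage3 = [g for g in stage2 if is_unique(relatives_best.get(g), 1e-3)]
    return {"nt_E<=1e-3": stage1, "nt_E<=1": stage2, "close_relatives": stage3}
```

Raising the E-value cut-off from 1 × 10⁻³ to 1 counts more hits as matches, which is why the candidate list shrinks from stage 1 to stage 2.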
Five (~15%) of the 34 unique gene sequences, namely CLIBASIA_05545, CLIBASIA_05555, CLIBASIA_05560, CLIBASIA_05575 and CLIBASIA_05605, are in the prophage region of the Las genome. All five are located upstream of the genomic locus CLIBASIA_05610, which encodes a phage terminase. There are possibly 30 genes that represent the complete prophage genome within the Las genome [25,44], of which 16 open reading frames (ORFs) are upstream of the phage terminase, while the remaining 13 ORFs are downstream. The prophage genes CLIBASIA_05610 (primer pair 766F and 766R) and CLIBASIA_05538 (primer pair LJ900F and LJ900R) have been targeted in previous studies by both conventional PCR and qRT-PCR based assays [25,44].
We further analyzed the genomic orientation of the 34 unique genes. This analysis revealed that 15 (~44%) of them are oriented on the sense strand, while the remaining 19 (~56%) are on the anti-sense strand (Additional file 3: Figure S1). The sequence length of these unique genes ranges from 93 to 2595 base pairs (bp) (Additional file 4: Table S1).

Designing of Las-specific primers and experimental validation of the specificity and sensitivity of qRT-PCR assay to detect Las
Based on the genome sequence of Las strain psy62, we designed 34 qRT-PCR primer pairs that specifically target the 34 unique sequences identified in our bioinformatic analyses (Additional file 4: Table S1). We designed the melting temperature (Tm) of each of these primers to range from 59°C to 65°C with an optimum of 62°C. The GC content of the primers ranged from 35% to 65% with an optimum of 50%. The PCR amplicon sizes for the primer sets range from 84 to 185 bp (Additional file 4: Table S1).
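The GC content and amplicon size windows stated above are simple to check programmatically. The sketch below is a hedged illustration: the function names and the example sequences used to exercise it are hypothetical, and the Tm window is deliberately not checked because Tm depends on the thermodynamic model used by the primer design tool, which is not reproduced here.

```python
from typing import Tuple


def gc_percent(primer: str) -> float:
    """Percentage of G and C bases in a primer sequence."""
    seq = primer.upper()
    if not seq:
        raise ValueError("empty primer sequence")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def within_design_window(forward: str,
                         reverse: str,
                         amplicon_len: int,
                         gc_range: Tuple[float, float] = (35.0, 65.0),
                         amplicon_range: Tuple[int, int] = (84, 185)) -> bool:
    """Check both primers and the amplicon against the stated design windows."""
    gc_ok = all(gc_range[0] <= gc_percent(p) <= gc_range[1]
                for p in (forward, reverse))
    size_ok = amplicon_range[0] <= amplicon_len <= amplicon_range[1]
    return gc_ok and size_ok
```

The default ranges are the ones quoted in the text (35% to 65% GC, 84 to 185 bp amplicons); both are exposed as parameters so that other windows can be tested.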
In addition to the novel primers designed in this work, we also used a set of control primers that have previously been used in qRT-PCR based detection of Las. These known primers include 16S rDNA pairs specific to the three different Candidatus Liberibacter species (HLBasf/r: Las, HLBamf/r: Lam and HLBaf/r: Laf) [23], the β-operon (CQULA04f/r: β-operon) [26], the intragenic repeat regions of the prophage sequence (LJ900f/r: Prophage) [25], and a primer pair specific to the plant cytochrome oxidase gene (COXf/r: COX) [23] as a positive endogenous control.
We performed qRT-PCR assays to test the specificity of the designed primers using total DNA extracted from Las-infected citrus plants as a template. To further validate the specificity of these primers, we also included total DNA from plants infected with the phylogenetically closely related species Lam and Laf in our test. Additionally, DNA extracted from a healthy citrus plant was used as a negative control, whereas water served as a no-template control. The results of the qRT-PCR assays are listed in Table 1.
Most of our novel custom-designed primer pairs targeting the unique gene sequences were indeed found to be highly specific to Las, as assessed by qRT-PCR assays (Table 1). Among the 34 primer pairs, 29 produced amplicons only when Las-infected citrus plant DNA was used as a template, with average CT values ranging from 19.48 to 27.47. Two primer pairs, P13 and P15, did not produce any amplicons under the standard conditions tested. The other three primer pairs, P19, P27 and P28, produced amplicons when either Las- or Laf-infected plant DNA was used as a template, indicating that P19, P27 and P28 could be used to detect both Las and Laf. We were unable to filter for cross-reactivity of P19, P27 and P28 in the bioinformatic analysis because the Laf genome sequence is currently unavailable. With the exception of these three primer sets, none of the primer sets produced any amplicons with DNA of Lam, Laf, or healthy citrus, or with water, as template, which further confirms the specificity of these primers to Las.
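The three outcomes described above (no amplification, Las-specific, cross-reactive) amount to a simple decision rule on the per-template amplification calls. A minimal sketch follows; the function name, the template labels and the wording of the returned categories are hypothetical, and the rule assumes that each template has already been scored as amplified or not.

```python
from typing import Dict


def classify_primer_pair(amplified: Dict[str, bool], target: str = "Las") -> str:
    """Classify one primer pair from its amplified/not-amplified calls.

    amplified maps a template label (e.g. "Las", "Lam", "Laf", "healthy",
    "water") to True when an amplicon was produced with that template.
    """
    if not amplified.get(target, False):
        return "no target amplification"
    # Any amplification on a non-target template counts as cross-reactivity.
    off_targets = sorted(t for t, hit in amplified.items() if hit and t != target)
    if off_targets:
        return "cross-reactive: " + ", ".join(off_targets)
    return "target-specific"
```

For example, a pair that amplifies with both Las and Laf templates, as reported for P19, P27 and P28, is returned as `cross-reactive: Laf`.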
We further evaluated the specificity of these primer sets using DNA templates from various citrus-associated fungal and bacterial pathogens, including Colletotrichum acutatum KLA-207, Elsinoe fawcettii, Xanthomonas axonopodis pv. citrumelo 1381, and X. citri subsp. citri strains 306, Aw, and A*. Only two primer sets, P20 and P21, showed nonspecific amplification with template DNA extracted from the fungal pathogen C. acutatum KLA-207 (Table 1). C. acutatum causes citrus blossom blight, post-bloom fruit drop and anthracnose symptoms that are phenotypically distinguishable from citrus HLB. P20 and P21 were not filtered out by the bioinformatic analysis because the C. acutatum genome sequence was unavailable in the database. Because of the complexity of the natural microbial community and the limited number of sequences available in the current nucleotide sequence database, it is impossible to completely filter out all the potential false positives bioinformatically. However, false positives could be identified experimentally by combining different sets of primer pairs in a consensus approach [37]. We eliminated these two primer sets from further evaluation in this study.
The melting temperature analysis of the amplicons produced by our novel primer sets with Las as a template indicated that each amplicon was a single species. This suggests that there is no off-target amplification for our primer pairs on the Las genome. Overall, the experimental validation of the 34 novel primer sets revealed that 27 (~80%) of the targets are indeed specific to the Las genome (Table 1). This demonstrates the value of the bioinformatics strategy employed here for identifying suitable target regions for the detection of the bacterium by qRT-PCR based methods. These 27 novel primer pairs were selected for further characterization.
To test the sensitivity of our novel primers, serial dilutions of Las-infected psyllid DNA were used as templates in the qRT-PCR assay. This serial dilution assay indicated that most of our novel primer pairs were able to detect Las up to a 10⁴-fold dilution of the initial template DNA concentration, which is comparable to the primer set targeting Las 16S rDNA (Table 1). However, lower sensitivity was observed for primer pairs P9, P12, P14 and P22, which were eliminated from further study. The remaining 23 primer pairs were able to detect Las up to a 10⁴-fold dilution, with a correlation coefficient (R² > 0.94) between the CT values and dilutions (Table 1). This demonstrates the high sensitivity of these 23 primer pairs in the detection of Las.
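The relationship between CT and dilution summarized by R² can be computed with an ordinary least-squares fit of CT against the log10 of the relative template amount. The sketch below is an illustration under stated assumptions: the function name is hypothetical, and the amplification efficiency derived from the slope, E = 10^(-1/slope) - 1, is a standard qPCR convention rather than a quantity reported in this study.

```python
import math
from typing import List, Tuple


def standard_curve(dilution_factors: List[float],
                   ct_values: List[float]) -> Tuple[float, float, float, float]:
    """Fit CT = slope * log10(relative amount) + intercept.

    dilution_factors: fold dilutions, e.g. [1, 10, 100, 1000, 10000].
    Returns (slope, intercept, r_squared, efficiency).
    """
    # The relative template amount is 1/d, so log10(amount) = -log10(d).
    x = [-math.log10(d) for d in dilution_factors]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(ct_values) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in ct_values)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    # Assumed convention: efficiency of 1.0 corresponds to perfect doubling per cycle.
    efficiency = 10 ** (-1.0 / slope) - 1.0
    return slope, intercept, r_squared, efficiency
```

A slope near -3.32 corresponds to a doubling of product per cycle; the R² returned here is the quantity compared against the 0.94 cut-off quoted in the text.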

qRT-PCR detection of Las from plant and psyllid DNA samples isolated from diverse locations in USA and China
To further demonstrate the applicability of the 23 primer pairs to the detection of Las in infected biological material, we performed qRT-PCR on various Las-infected plant and psyllid DNA samples. Because nucleotide sequences of Las isolates may vary among geographic locations, and such changes in the unique target genes could affect detection, we collected Las-infected plant DNA samples (Table 2) not only from the USA but also from China, where HLB was reported more than 100 years ago [1]. We tested the 23 primer pairs on 17 Las-infected plant DNA samples. Of these 17, 12 were collected from different locations in Florida, USA (Figure 2, Table 2), and the remaining five were collected from different locations in China (Table 2). Additionally, Las-infected psyllid DNA samples collected from five different locations in Florida, USA, were included in the qRT-PCR assays (Table 3, Figure 2).
All 23 primer pairs detected Las in all 12 Florida HLB-diseased plant samples (Table 2) and all 5 psyllid DNA samples (Table 3) by qRT-PCR, which further validates the applicability of our novel primers (Figure 2). However, 4 of the 23 primer pairs (P1, P7, P8 and P10) failed to produce amplicons with the infected plant DNA samples from Jiangxi and Guangdong Provinces, China (Table 2). Primer pair P3 produced no amplicon with the Jiangxi sample and produced a nonspecific amplicon with the Guangdong sample (with an altered PCR product size, data not shown). Interestingly, all five of these primer pairs target genes located in the prophage region of the Las genome (Additional file 3). These primers (P1, P3, P7, P8 and P10), which are based on prophage genes, could detect Las from Florida but not from Jiangxi and Guangdong Provinces, China. This is consistent with a previous report [44] that the prophage was detected in only 15.8% of the 120 HLB-diseased citrus samples acquired in Guangdong Province, China, but in 97.4% of the 39 Las-positive citrus samples acquired in Yunnan Province, China. This suggests that those prophage genes are not universally present in all strains of Las. Alternatively, the prophage sequences may be highly variable among the strains tested.

Conclusions
We have successfully designed 18 novel primer pairs that are specific to Las. These primers add to the arsenal of tools for qRT-PCR based detection of Las in infected plants and psyllids. The 18 primer pairs developed in this study have sensitivity comparable to that of the commonly used primers based on 16S rDNA and the β-operon. Moreover, these primers successfully differentiate Las from Lam, Laf and other common microbes associated with citrus.

Bioinformatics
The nucleotide sequences of Las (accession number NC_012985) [29,45], Lso (accession number NC_014774) [33] and Lcr (accession number NC_019907), together with the comprehensive nucleotide (nt) database (26 July 2012), were downloaded from the NCBI ftp server (ftp.ncbi.nih.gov). Stand-alone BLAST [42,43] was used to search the Las genes against the nt, Lso and Lcr databases using a custom-made PERL script (script 1; Additional file 1), varying the E-value with all other parameters kept at their default values. The output files of the BLAST searches were further parsed using a second custom-made PERL script (script 2; Additional file 2).

Plant and psyllid materials and extraction of DNA
Las-infected citrus leaf samples with typical visible symptoms were collected from 2-year-old infected sweet orange (Citrus sinensis) plants maintained at the Citrus Research and Education Center (CREC), Lake Alfred, Florida, USA. As a negative control, leaves from healthy citrus plants were collected from pathogen-free seedlings grown in the healthy plant greenhouse maintained at CREC. The Laf- and Lam-infected samples were obtained from South Africa and Brazil, respectively. Total DNA was extracted from citrus leaves using a previously described protocol [46]. Briefly, the leaves were washed under tap water and surface sterilized in 35% bleach (2% active chlorine) and 70% (v/v) ethanol for 2 min each. The sterilized leaves were then rinsed three times in sterile water. The midribs were separated from the leaf samples and cut into small pieces. Approximately 100 mg of midrib pieces from each sample was used to extract DNA with the Wizard® Genomic DNA Purification Kit (Promega, Madison, WI, USA). The extracted DNA was suspended in 100 μl H2O. Las-infected psyllids (Diaphorina citri) were maintained on confirmed Las-infected sweet orange plants at the CREC. In this work, 16 psyllids (around 20 mg) were pooled and total DNA was extracted using a DNeasy Blood & Tissue Kit (Qiagen, Valencia, CA). The extracted DNA was suspended in 100 μl H2O. The quality and quantity of the extracted DNA were determined using a NanoDrop™ 1000 spectrophotometer (NanoDrop Technologies, Inc., Wilmington, DE).

Quantitative real-time polymerase chain reaction (qRT-PCR)
Gene-specific primers were designed using PrimerQuest℠ from Integrated DNA Technologies (IDT), Coralville, Iowa (Additional file 4: Table S1). qRT-PCR experiments were performed on an ABI PRISM 7500 FAST Real-time PCR System (Applied Biosystems, Foster City, CA, US) in a 96-well plate using an absolute quantification protocol. The reaction mixture in each well contained 12.5 μL of 2x FAST SYBR® Green PCR Master Mix reagent (Applied Biosystems), 2 μL of DNA template (~30 ng), and 0.625 μL of 10 μM of each gene-specific primer in a final volume of 25 μL. The standard thermal profile was followed for all amplifications: 95°C for 20 min followed by 40 cycles of 95°C for 3 sec and 50°C for 30 sec. All assays were performed in triplicate.
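The final concentrations implied by the reaction mixture above follow from simple dilution arithmetic. The sketch below assumes that the forward and reverse primers are each added as a separate 0.625 μL aliquot and that the remaining volume is made up with water; neither detail is stated explicitly in the protocol, and the function names are hypothetical.

```python
def final_concentration(stock_conc: float, added_ul: float, final_ul: float) -> float:
    """Concentration after dilution (C1 * V1 = C2 * V2), in the units of stock_conc."""
    return stock_conc * added_ul / final_ul


def remaining_volume_ul(final_ul: float, *component_ul: float) -> float:
    """Volume left to fill (assumed here to be water) after all listed components."""
    return final_ul - sum(component_ul)


# Values taken from the reaction mixture described in the text.
primer_um = final_concentration(10.0, 0.625, 25.0)        # final primer concentration, uM
water_ul = remaining_volume_ul(25.0, 12.5, 2.0, 0.625, 0.625)  # fill volume, uL
template_ng_per_ul = 30.0 / 25.0                           # ~30 ng template in 25 uL
```

Under these assumptions each primer is present at 0.25 μM, the template at roughly 1.2 ng/μL, and 9.25 μL of water completes the 25 μL reaction.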
Melting curve analysis was performed using ABI PRISM 7500 FAST Real-time PCR System Software version SDS v1.4 21 CFR Part 11 Module (Applied Biosystems®) to characterize the amplicons produced in a PCR reaction.