Inter- and Intra-subtype genotypic differences that differentiate Mycobacterium avium subspecies paratuberculosis strains

Background Mycobacterium avium subspecies paratuberculosis (Map) is the aetiological agent of Johne’s disease or paratuberculosis and is included within the Mycobacterium avium complex (MAC). Map strains are of two major types often referred to as ‘Sheep’ or ‘S-type’ and ‘Cattle’ or ‘C-type’. With the advent of more discriminatory typing techniques it has been possible to further classify the S-type strains into two groups referred to as Type I and Type III. This study was undertaken to genotype a large panel of S-type small ruminant isolates from different hosts and geographical origins and to compare them with a large panel of well documented C-type isolates to assess the genetic diversity of these strain types. Methods used included Mycobacterial Interspersed Repetitive Units - Variable-Number Tandem Repeat analysis (MIRU-VNTR), analysis of Large Sequence Polymorphisms by PCR (LSP analysis), Single Nucleotide Polymorphism (SNP) analysis of gyr genes, Pulsed-Field Gel Electrophoresis (PFGE) and Restriction Fragment Length Polymorphism analysis coupled with hybridization to IS900 (IS900-RFLP) analysis. Results The presence of LSPA4 and absence of LSPA20 was confirmed in all 24 Map S-type strains analysed. SNPs within the gyr genes divided the S-type strains into types I and III. Twenty four PFGE multiplex profiles and eleven different IS900-RFLP profiles were identified among the S-type isolates, some of them not previously published. Both PFGE and IS900-RFLP segregated the S-type strains into types I and III and the results concurred with those of the gyr SNP analysis. Nine MIRU-VNTR genotypes were identified in these isolates. MIRU-VNTR analysis differentiated Map strains from other members of Mycobacterium avium Complex, and Map S-type from C-type but not type I from III. Pigmented Map isolates were found of type I or III. Conclusion This is the largest panel of S-type strains investigated to date. The S-type strains could be further divided into two subtypes, I and III by some of the typing techniques (IS900-RFLP, PFGE and SNP analysis of the gyr genes). MIRU-VNTR did not divide the strains into the subtypes I and III but did detect genetic differences between isolates within each of the subtypes. Pigmentation is not exclusively associated with type I strains.

to the sheep lineage have been referred to as 'Sheep or S-type' and those of the cattle lineage 'Cattle or C-type' according to the species from which they were first isolated. As the technologies for molecular typing advanced and more genotyping studies were undertaken, greater genetic diversity was detected within both the S-and Ctype strains. Pulsed-field gel electrophoresis (PFGE) revealed three strain types designated Types I, II and III [7,8]. Type II is synonymous with C-type and types I and III comprise the S-type. In this paper we will use the term S-type to describe collectively type I and III strains and have designated the types I and III as subtypes. S-type strains have not been characterized to the same extent as C-type strains due to the difficulty in culturing the strains in vitro resulting in a limited number of strains available for such studies. Here we undertook the first comprehensive genotyping study of a large representative panel of S-type strains using various typing methods that have been applied to Map strains, individually or in combinations, to draw a portrait of S-type strains. We studied both inter and intra-subtype genotypic strain differences using restriction fragment length polymorphism analysis coupled with hybridization to IS900 (IS900 RFLP), PFGE and various PCRs based on variable-number tandem repeat (VNTR) loci and mycobacterial interspersed repetitive units (MIRUs) [9,10] MIRU-VNTR typing [11], the presence or absence of large sequence polymorphisms (LSPs) [12] and the gyrA and B genes [13]. Our panel of S-type strains comprised strains from different geographic origins with different restriction enzyme profiles and includes pigmented strains. We also incorporated typing data obtained for additional Map C-type isolates to represent the all diversity of the genotypes described and Mycobacterium. avium subsp. avium (Maa) Mycobacterium. avium subsp. silvaticum (Mas) and Mycobacterium avium subsp. hominissuis (Mah) for comparison.

PFGE analysis
PFGE analysis was carried out using SnaBI and SpeI according to the published standardized procedure of Stevenson et al. [8] with the following modifications. Plugs were prepared to yield a density of 1.2 × 10 10 cells ml -1 and the incubation time in lysis buffer was increased to 48 hr. The concentration of lysozyme was increased to 4 mg ml -1 . Incubation with proteinase K was carried out for a total of seven days and the enzyme was refreshed after four days. Restriction of plug DNA by SpeI was performed with 10U overnight after which the enzyme was refreshed and incubated for a further 6 hr. The parameters for electrophoresis of SpeI restriction fragments were changed to separate fragments of between 20 and 250Kb as determined by the CHEF MAPPER and electrophoresis was performed for 40 hr. Gel images were captured using an Alphaimager 2200 (Alpha Innotech). Profiles were analysed using Bionumerics ™ software version 6.5 (Applied Maths, Belgium).

SNP analysis of gyrA and gyrB genes
Primers (Additional file 2: Table S2) were designed for both gyrA (GenBank accession no. 2720426 [Genome number: NC_002944.2]) and gyrB genes (GenBank accession no. 2717659 [Genome number: NC_002944.2]). The PCR mixture was composed as follows using the GoTaq Flexi DNA polymerase (Promega). Two microliters of DNA solution was added to a final volume of 50 μl containing 0.2 μl of GoTaq Flexi DNA polymerase (5 U/μl), 2 mM (each) dATP, dCTP, dGTP, and dTTP (Promega); 10 μl of 5x PCR buffer supplied by the manufacturer; 1 μM of each primers; and 1.5 mM of MgCl 2 . The reactions were carried out using a TC-512 thermal cycler (Techne). PCR conditions were as follows: 1 cycle of 5 min at 94°C; 30 cycles of 30 s at 94°C, 30 s at 58°C, and 30 s at 72°C; and 1 cycle of 7 min at 72°C. PCR products were visualized by electrophoresis using 1.5% agarose gels (agarose electrophoresis grade; Invitrogen), purified using NucleoSpin W Extract II (Macherey-Nagel) and sequenced by GenomExpress (Grenoble, France). Sequence analysis and SNP detection were performed by using the Bionumerics ™ software version 6.5 (Applied Maths, Belgium).

LSP analysis
Primers were used according to Semret et al. [12] and described in Additional file 2: Table S2. The PCR mixture comprised 2 μl of DNA solution added to a final volume of 50 μl containing 0.2 μl of GoTaq Flexi DNA polymerase (5 U/μl Promega), 2 mM (each) dATP, dCTP, dGTP, and dTTP (Promega); 10 μl of 5x PCR buffer supplied by the manufacturer; 1 μM of each primers; 1 μL of dimethyl sulfoxide (Sigma) and 1.5 mM of MgCl 2 . The reactions were carried out using a TC-512 thermal cycler (Techne). PCR conditions were as follows: 1 cycle of 5 min at 94°C; 30 cycles of 30 s at 94°C, 30 s at 55°C, and 30 s at 72°C; and 1 cycle of 7 min at 72°C. To detect presence or absence of each LSP, PCR products were analyzed by electrophoresis using 1.5% agarose gels.

MIRU-VNTR analysis
DNA in agarose plugs prepared for PFGE analysis was used for MIRU-VNTR analysis according to Stevenson et al. [23]. Small pieces of agarose plug, approximately 2 mm thick, were washed in TE buffer (pH 8) to remove residual EDTA in the storage buffer. One hundred microlitres of TE buffer were added to the agarose and the sample boiled for 10 min to melt the agarose. Five microlitres were used for PCR and MIRU-VNTR analysis interrogating eight polymorphic loci was performed as described by Thibault et al. [11]. The allelic diversity (h) at a locus was calculated by using Nei's index (see Additional file 3: Table S4 where x i is the frequency of the i th allele at the locus, and n the number of isolates [24].

Calculation of the discriminatory power
The Simpson Discrimination Index (DI) described by Hunter and Gaston [25] was used as a numerical index for the discriminatory power of each typing method PFGE, IS900-RFLP and MIRU-VNTR and combinations of the typing methods (see Table 2 and Additional file 3: Table  S4). The DI was calculated using the following formula: Where N is the total number of isolates in the typing scheme, s is the total number of distinct patterns discriminated by each typing method and strategy, and n j is the number of isolates belonging to the jth pattern.

LSP analysis
The existence of two major Map lineages was previously supported by the distribution of two LSPs specific for Map as described by Semret et al. [12,26,27]: the region corresponding to LSP A 20 is specifically absent from strains of the sheep lineage whereas that corresponding to LSP A 4 is specifically absent from strains of the cattle lineage. The distribution of these LSPs was thus investigated across our representative panel of Map S-type strains from various origins. As shown in Figure 1, analysis by PCR supports the association of the LSP A 20 region with C-type strains whereas the LSP A 4 region is present in all S-type strains. Presence of the LSP A 4 region was not related to PFGE subtype I versus III, of the country of origin and pigmentation status (Table 1).

SNP analysis
Since SNPs found in gyrA and B genes have been reported to be subtype (I, II, III)-specific, the panel of Map S-type strains was subjected to SNP analysis and compared to C type K-10 strain. As shown in Table 3, consensus sequences obtained matched those previously published and distinguished types I, II and III of Map.

PFGE typing
PFGE analysis results were obtained for 15 S-type and 24 C-type strains (Figure 2A and 2B). The sequenced K10 type II strain was also included. SnaB1 or SpeI analyses segregated strains according to the two sheep and cattle lineages and at the subtype level I, II and III. With SnaBI and SpeI individually, 5 different profiles were obtained for the 5 type I strains and 9 different profiles for the 10 type III strains. The type II strains exhibited 15 different SnaBI profiles, with profile [2] being the most frequent (8 strains) and 14 different SpeI profiles with profile [1] being the most frequent (11 strains). The DI of the subtype I and subtype III were respectively 1 and 0.956 for SnaB1 and 1 and 0.978 for SpeI and that of C-type (Type II) was 0.895 for SnaBI and 0.801 for SpeI (see Table 2 and Additional file 3: Table S4). DI of 0.96 and 0.924 for SnaBI and SpeI respectively was achieved for the 39 Map strains presented in Figure 2A and 2B. The combination of both enzymes gave 39 unique multiplex profiles (see Table 1 and Additional file 1: Table S1).

IS900-RFLP typing
IS900-RFLP typing clearly separated the strains into three groups that correlate with the PFGE subtypes I, II and III ( Figure 3). Ten strains of S-type, subtype I cluster into two groups of profiles S1 (n = 2) and S2 (n = 8). The 14 strains of S-type, subtype III display more polymorphism with 9 profiles, including 6 new ones. Profiles previously described included I1 (n = 1), I2 (n = 1) and I10 (n = 2). The new profiles were called A (n = 3), B (n = 2), C (n = 2), D, E and F (n = 1 each) (indicated in the Additional file 4: Figure S1). The strains of C-type were well distinguished from S-type and were not highly polymorphic. In this panel of strains the most widely distributed profile R01 was found for 21 strains, then R09 (n = 2) and R34 (n = 2) and 10 profiles were identified in only one isolate, R04, R10, R11, R13, R20, R24, R27, R37, C18 and C20. With this Map panel of strains the discrimination index (DI) of RFLP was shown very variable depending on the type and the subtype of the strains. The DI of the subtype I was very low (0.356), for the subtype III high (0.934) and that of C-type (Type II) was low (0.644) ( Table 2). A DI of 0.856 was achieved for the 59 Map strains presented in Figure 3.

MIRU-VNTR typing
The result of MIRU-VNTR typing of the S-type strains is shown in Table 1. MIRU-VNTR data from 148 C-type (type II) strains previously described [11,18,19] were included in the analysis (see Additional file 1: Table S1). MIRU-VNTR using the eight markers described previously [11] could differentiate between S-and C-type strains but not between the subtypes I and III.

Discussion
In comparison to Map C-type strains, investigation of the epidemiology and genetics of S-type strains has been hampered due to difficulties in their isolation and their extremely slow growth-rate in laboratory culture [28,29]. Indeed, the isolation and maintenance of Map S-type strains continues to be a challenge for laboratories worldwide and relative to Map C-type strains a paltry number are available for study. Nowadays representative genome sequences are available for both C-and S-type subtype III Map strains [30,31]. This has facilitated the identification of specific genetic elements that can be used to identify isolates and discriminate between types and, in some cases subtypes of strains [14,16,22,[32][33][34].
In this study we assembled a panel of S-type strains from different geographic origins and host species and undertook extensive molecular typing to improve our knowledge on the genetic diversity of these strains and their phylogenetic relationship with respect to Map Ctype strains and other members of MAC. This is the largest panel of S-type strains investigated to date. Additionally, the study also permitted identification of the most efficient typing techniques for S-type strains.
The results of the study coupled with previous results on genotypic and phenotypic characterization of Map strains concur with the division of this subspecies into  I   I   I   I   I   II   II   II   II   II   II   II   II   II   II   II   II   II   II   II   II   II   II   II   II   II   II   III   III   III   III   III   III   III   III   III   III   II I   I   I   I   I   III   III   III   III   III   III   III   III   III   III II   II   II   II   II   II   II   II   II   II   II   II   II   II   K10  1  Bovine  C  II  PCR219 1  I   I   I   I   I   I   I   I   I   I   III   III   III   III   III   III   III   III   III   III   III   III   III   III   II   II   II   II   II   II   II   II   II   II   II   II   II   II   II   II   II   II   II   II   II   II   II   II   II   II   II   II   II   II   II   II   II  two major lineages comprising S-type and C-type strains. However, the results of IS900-RFLP, PFGE and SNP analysis of the gyr genes clearly divide Map strains into three subtypes, Type II or C strains, Type I and Type III strains. But from the data available on these strains, the two subtypes do not seem to be associated with a particular phenotype and may just reflect regional genetic differences. Type I was first proposed to describe a group of ovine pigmented Map strains with distinctive PFGE profiles [8]. However, as more ovine strains were typed by PFGE, it became apparent that there was another cluster of non-pigmented ovine Map strains that were designated Type III strains [7]. The pigmented phenotype consequently became associated with the Type I strains. However, in this study we included two pigmented strains originating from different geographic locations, which were typed as type III by SNP analysis of the gyr genes, IS900 RFLP and PFGE. The pigmentation phenotype is not therefore restricted to type I and there is no other obvious phenotype currently known to differentiate between types I and III. MIRU-VNTR, despite being highly discriminatory between strains did not separate the S-type strains into the two types I and III. There is therefore an argument for simplifying the current nomenclature for Map strain grouping [14]. Due to the historical nomenclature, to the absence of other comprehensive studies including all strain types and typing methods, to the inability of several techniques to distinguish between Type I and III and to the genetic and phenotypic similarities found between them in previous studies, we propose that S-and Ctype nomenclature could be used to denote the two major groups or lineages and the Type I and III used to distinguish subtypes within S-type strains as we have done in this paper. In agreement with previous studies both PFGE and IS900-RFLP revealed little heterogeneity between isolates of the S subtype I. By comparison, this study shows that strains of S subtype III are more polymorphic. Diverse genotypes clustered within S subtype III have been identified circulating in small regional areas in Spain or even in the same farm [34], making more evident the higher heterogeneity of these strains. Interestingly, as far as we know no evidence of S subtype I strains has been found in Spain, a country with a significant sample of S-type strains in our panel and in previous works [8,16]. For molecular epidemiology (i.e. strain tracking), of the typing techniques used MIRU-VNTR would be the preferred technique for studying S-type strains. This technique gave a high discriminatory index with the eight loci employed in this study and could segregate the different members of MAC and the Map S-and C-type strains, although it has limitations in that it cannot differentiate between the subtypes I and III. For detecting genetic variability between S-type strains the number of loci used could be reduced to 3 (292, X3 and 25). The greatest genetic variation occurred at locus 292 with Stype strains typically having a much higher number of repeats than C-type strains (up to 11 were detected in this study). No more than 4 repeats at locus 292 were detected in C-type strains. The locus 292 locus is flanked by loci MAP2920c and MAP2921c referenced as acetyltransferase and quinone oxidoreductase, respectively. There has been only one other report of MIRU-VNTR typing of S-type strains [22]. In the latter study MIRU-VNTR loci 3 and 7 were thought to be of special importance for identifying subtype III strains but only two subtype III strains were typed. In our study all 14 subtype III and 10 subtype I strains had the same, onerepeat unit alleles at each of these two loci, as found in the two strains typed previously [22]. Although uncommon, a few C-type strains in this study were also found with a single copy at these loci so this is even not unique to S-type strains. All Mah, Maa and Mas strains tested in this study also had one repeat unit at locus 3 and all Maa and 61% of Mah strains had a single copy at locus 7. The discriminatory power of MIRU-VNTR to differentiate between the subtypes I and III could be improved by identifying additional loci. Although MIRU-VNTR cannot distinguish between the subtypes I and III, currently it is the only PCR-based typing technique to reveal significant genetic diversity between S-type strains useful for epidemiological investigations. The technique requires only a small amount of DNA and can therefore be carried out on single colonies as well as cell pellets from liquid culture systems. LSP analysis rapidly differentiates the Stype from C-type strains by the absence of LSP A 20 and presence of LSP A 4 but provides no information regarding genetic diversity within S-type strains. SNP analysis of the (See figure on previous page.) Figure 4 Minimum spanning tree based on MIRU-VNTR genotypes among Mycobacterium avium subsp. paratuberculosis of types S and C, Mycobacterium avium subsp. avium, Mycobacterium avium subsp. hominissuis, and Mycobacterium avium subsp. silvaticum. 135 strains were isolated from cattle (sky blue), 23 strains from sheep (orange), 17 strains from goat (dark blue), 63 strains from pigs (light green), 17 strains from birds (yellow), 17 strains from humans (white), 6 strains from deer (purple), 5 strains from other sources (red), 4 strains from wood pigeons (brown), and 2 different vaccine strains (316 F from France and United Kingdom) (light blue). Each genotype is displayed as a pie chart, the size of which is proportional to the number of strains, with color-coded distribution of the strain origins. The number of loci differing between the genotypes is indicated by the style of the connecting lines: thick and short, 1 difference; intermediate, 2 differences; thin and long: 3 differences.
gyr genes is more complex requiring sequencing of the PCR product to differentiate between S-and C-types and between subtypes I and III [13]. However, the S subtype information would be of limited value for epidemiological studies and tracing the source of infection. Furthermore, as we become better at isolating S-type strains and type more strains it is likely that further S subtypes will become apparent. PFGE and IS900-RFLP both give good discrimination between the Map strain types and subtypes but require larger amounts of high quality DNA, which necessitates in vitro growth of the strains and therefore is not ideal for S-type strains.

Conclusions
This is the largest panel of S-type strains investigated to date. The S-type strains can be further divided into two types, I and III, by some (IS900-RFLP, PFGE and SNP analysis of the gyr genes) but not all (not by MIRU-VNTR typing) of the typing techniques. Pigmentation is not exclusively associated with S subtype I strains. Therefore, a simplified nomenclature is proposed designating types I and III as subtypes of S-type strains. The epidemiological and phylogenetic significance of S type subdivision into I and III subtypes needs, however, to be further clarified. Molecular typing using IS900-RFLP, PFGE and MIRU-VNTR demonstrates that S-type strains are genetically diverse, subtype III being the most heterogeneous group. Due to the scarcity of Stype strains in culture, typing techniques have been largely optimized using C-type strains. Further genomic sequencing of S-type strains should reveal variable genetic loci unique to S-type strains that could be exploited to further improve discrimination of S-type strains. Genome sequence data of isolates belonging to subtypes I and III should ultimately clarify the phylogeny and provide a framework to classify different phenotypic, pathogenic and epidemiological characteristics of Map strains.

Additional files
Additional file 1: Table S1. Description of the strains used in this study indicating their origins and details of their genotypes and phenotypes data.
Additional file 2: Table S2. The table shows the primers sequences used in this study.