Genotyping analysis of Helicobacter pylori using multiple-locus variable-number tandem-repeats analysis in five regions of China and Japan

Background H. pylori (Helicobacter pylori) is the major causative agent of chronic active gastritis. H. pylori populations show high genomic variability among isolates, and polymorphism in genomic repeat units has played an important part in their evolution. Long-term colonization of the stomach leads to different clinical outcomes, which may be related to the high degree of genetic variation of H. pylori. A variety of molecular typing tools have been developed to assess genetic relatedness among H. pylori isolates. However, there is still no standard genotyping system for this bacterium. MLVA (multiple-locus variable-number tandem-repeat analysis) is useful for phylogenetic analysis and is widely used in bacterial genotyping, but it has rarely been applied to H. pylori. This article is the first application of the MLVA method to investigate H. pylori from different districts and ethnic groups of China. Results MLVA of 12 VNTR loci with high discriminatory power, selected from 30 candidates, was performed on a collection of 202 H. pylori strains originating from five regions of China and Japan. A phylogenetic tree was constructed from the MLVA profiles. The 12 VNTR loci were highly polymorphic, and the results demonstrated very close relationships between genotypes and ethnic groups. Conclusions This study used MLVA to provide a new perspective on the ethnic groups and distribution characteristics of H. pylori.


Background
Helicobacter pylori (H. pylori) is a spiral-shaped, Gram-negative bacterium that infects half the world's population and is the major cause of chronic gastritis, peptic ulcers and gastric malignancies, including gastric non-cardia adenocarcinoma and mucosa-associated lymphoid tissue lymphoma [1,2]. Most infected individuals present with no clinical symptoms, but approximately 10-20% will develop peptic ulcers and 1% will develop gastric cancer [3,4], differences that could be associated with the diversity of H. pylori.
H. pylori exhibits exceptionally high rates of DNA point mutation and of intra- and inter-genomic recombination. Recently, many molecular typing tools have been developed to investigate genetic relatedness among H. pylori isolates. However, these methods have limitations, including low discriminatory power or results that cannot be compared between laboratories [5,6].
In 1999, MLVA was proposed as a general approach for providing accurate, portable data appropriate for the epidemiological investigation of bacterial pathogens [7][8][9][10][11]. However, there is little information on H. pylori populations obtained using MLVA, and whether this method is applicable to H. pylori is still uncertain.
H. pylori infections in China are common and extensively distributed, with an average infection rate of about 58%. In this study, 12 VNTR loci of the H. pylori genome were identified and used to analyze 202 strains of H. pylori which originated from different regions of China and Japan.

Results
Multi-VNTR loci for H. pylori genome
PCR products amplified from the reference strains 26695, HPAG1 and J99 were identical in size to the published sequences. For loci VNTR-2576 and VNTR-614, sequencing of the PCR products was consistent with our electrophoresis results. The exact number of tandem repeats at each locus could therefore be determined from the sizes of the PCR products.
In this study, 30 candidate VNTR loci were selected from the H. pylori database. Of these, 12 loci that varied in repeat number among isolates were retained; they were distributed throughout the H. pylori genome (Table 1). The other 18 loci showed no variation and were excluded from the subsequent analysis. The main characteristics of the 12 VNTR loci, including the diversity index of each locus, are listed in Table 2.
Clustering trend of the strains from different regions and ethnic groups
An MLVA system for the molecular typing of H. pylori strains was developed. On the basis of the 12 VNTR loci, the profile of each isolate was obtained (Figure 1). The clinical H. pylori strains were divided into 127 MLVA types (MTs), which have not been described previously. According to cluster analysis, most strains from the same focus had the same or similar MTs (Figure 1), although some strains from the same focus were dispersed in the cluster tree. As shown in Figure 1, 86.7% (13/15) of the Tokyo isolates had very similar MTs and could be clustered into Group A. One of the remaining Tokyo isolates belonged to Group C, and the other was scattered. Of the Southern and Eastern Chinese isolates, 74.4% (32/43) were clustered into Group B, including subgroups B1, B2 and B3, and the remaining strains were related to Groups A, C and D. Of the isolates from Northern China, 60.7% were clustered into two major branches, Groups C1 (37.5%, 21/56) and C2 (23.2%, 13/56), and the other strains were scattered. Of the Western China isolates, 86.0% (37/43) were clustered into Group D. The strains Tibet 1, 14, 23 and 43 were related to Group A, and Tibet 37 and Tibet 35 were related to Groups B2 and C2.
A minimum spanning tree was constructed for strains from different ethnic groups: 43 Tibetan, 33 Mongolian, 15 Yamato and 27 Han (Figure 2). The strains tended to cluster into four main subgroups. However, there were some exceptions: the Han strains Hangzhou-12 and Hangzhou-21 (associated with gastritis and peptic ulcer) were related to the Tibetan group, Tibetan strains 1 and 43 (gastritis) were related to the Mongolian group, and Mongolian strain 16 (gastric cancer) was related to the Japanese group.
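To make the idea of a minimum spanning tree over MLVA profiles concrete, the following is a minimal sketch in Python. It assumes a plain categorical (mismatch-count) distance between repeat-number profiles and uses Prim's algorithm; it is not the BioNumerics implementation used in this study, and the strain names and three-locus profiles are made up for illustration.

```python
def categorical_distance(profile_a, profile_b):
    """Number of loci at which two repeat-number profiles differ."""
    return sum(a != b for a, b in zip(profile_a, profile_b))


def minimum_spanning_tree(profiles):
    """Prim's algorithm on the complete graph of strains.

    profiles: dict mapping strain name -> tuple of repeat numbers.
    Returns a list of (strain_in_tree, new_strain, distance) edges.
    """
    names = sorted(profiles)
    in_tree = {names[0]}
    edges = []
    while len(in_tree) < len(names):
        # Cheapest edge from the tree to a strain not yet in it;
        # ties are broken by strain name through tuple comparison.
        dist, u, v = min(
            (categorical_distance(profiles[u], profiles[v]), u, v)
            for u in in_tree
            for v in names
            if v not in in_tree
        )
        edges.append((u, v, dist))
        in_tree.add(v)
    return edges


# Hypothetical three-locus profiles, for illustration only.
example_profiles = {
    "strain-A": (2, 3, 1),
    "strain-B": (2, 3, 2),
    "strain-C": (5, 6, 2),
}
example_tree = minimum_spanning_tree(example_profiles)
```

In this toy example the tree links strain-A to strain-B (one differing locus) and strain-B to strain-C (two differing loci), so strains with similar profiles end up adjacent, which is the property the ethnic-group clustering relies on.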

Correlation between H. pylori MTs and the related diseases
Among the 202 samples, 14.9%, 55.9%, 25.2% and 4.0% of patients presented with non-ulcer dyspepsia (NUD), gastritis (G), peptic ulcer (PU) and gastric cancer (GC), respectively. And in our study there's no significant

Discussion
Recently, many bacterial genomes have been fully sequenced, and analysis of the sequenced genomes has revealed the presence of variable proportions of repeats, including tandem repeats. Short repeat motifs undergo frequent variation in the number of repeated units. MLVA is an appropriate method for bacterial typing and identification, for determining genetic diversity, and for tracing back highly monomorphic species [12][13][14]. MLVA typing has been reported to have a high-quality species identification capability and high discriminatory power. The method has been used in the analysis of many bacteria [15][16][17][18], but little research has been carried out on H. pylori. Therefore, this study established an H. pylori MLVA system and applied it to type clinical strains. The H. pylori genome contains a number of repeat sequences whose repeat numbers diverge among strains. The 12 loci identified were distributed throughout the genome, varied among isolates and were able to type H. pylori successfully.
The H. pylori MTs clustered by ethnic group, consistent with previous reports [19,20]. The Han strains were selected from Southern China and showed little relationship to Mongolian strains from Northern China or Tibetan strains from Western China. This suggests an apparent clustering tendency by region and ethnic group, but there were some exceptions. These may arise because, unlike other Asian countries with relatively homogeneous populations, China has a heterogeneous population comprising various ethnic groups, so there may be more opportunity for DNA transfer between strains of different genotypes in China than in other countries. Tibet, by contrast, is a relatively closed region, and H. pylori strains from this area clustered well.
The H. pylori genome shows a high degree of genetic diversity among strains [21,22], but weakly clonal groupings associated with different diseases have been detected, and these could be superimposed on a pattern of free recombination. The relationship between particular H. pylori genotypes and related diseases remains uncertain.
MLVA is a useful molecular tool for epidemiological investigations and for recognizing laboratory cross-contamination [23][24][25]. VNTR analysis provides multiple independent characteristics for phylogenetic analysis. Studies have indicated that MLVA is sufficient to resolve closely related isolates. In contrast, combining loci with lower variability values is suitable for establishing clear phylogenetic patterns among strains that have evolved over a longer time period. Theoretically, the greater the number of loci used, the higher the discriminatory power that can be achieved, and the subtler the phylogenetic relationships among bacterial strains that can be established.

Figure 1 Cluster analysis (N-J) was performed using the categorical distance coefficient and the Ward method. From left to right, the columns show the 12 VNTR loci, the strain ID, the geographic origin (location) and the H. pylori-related disease. NC, SC, EC and WC in the 'Region' column stand for Northern, Southern, Eastern and Western China, respectively. NUD and G denote non-ulcer dyspepsia and gastritis; PU (peptic ulcer) comprises duodenal and gastric ulcer; GC denotes gastric cancer. The branch color code reflects the focus of origin, and columns of the same color indicate the same geographic origin (location). Isolates from different regions showed a certain clustering tendency: Tokyo isolates clustered into Group A, Southern and Eastern China isolates into Group B, Northern China isolates into two major branches (Groups C1 and C2) and Western China isolates into Group D. There was no significant relationship between MTs and H. pylori-related diseases.

In the present study, an MLVA scheme was established and applied to examine the clonal relationships among H. pylori isolates from China and Japan. The loci used provided high discriminatory power and successfully separated isolates from different geographical areas. This was particularly evident for H. pylori from Tibet, a relatively closed region, which clustered more tightly than strains from other ethnic groups. The data will aid in the development of a genomic polymorphism database of H. pylori. We have established a preliminary MLVA profile, but more information is required for a comprehensive one.
China is a large country with 56 ethnic groups and a large population. Further studies are therefore required that include isolates from more regions and from several more time periods.

Conclusions
This study indicates that the MLVA method, based on 12 VNTR loci, is sufficient to resolve closely related isolates for the purpose of H. pylori genotyping. The MLVA methodology provided a new perspective on the ethnic groups and distribution characteristics of H. pylori.

H. pylori strains and DNA preparation
A total of 202 H. pylori strains were included in this study; background information on the strains is listed in Table 3. The 187 clinical strains were isolated from various regions of China between 1998 and 2010; an additional 15 strains were a gift from the Institute of Medical Science, University of Tokyo, Japan, in 2008. Patients ranged from 12 to 75 years old (mean age 44 years). All patients reporting symptoms of gastritis (G), peptic ulcer (PU) or gastric cancer (GC) underwent upper gastrointestinal endoscopy for both visual examination and biopsy collection. The strains were isolated from gastric biopsies of selected patients who had not received non-steroidal anti-inflammatory drugs, proton pump inhibitors or antibiotics during the previous 2 months. Endoscopy revealed that, of the 202 patients, 172 had G, PU or GC and 30 had non-ulcer dyspepsia (NUD). The study was approved by the ethics review board at Third Military Medical University, and written informed consent was obtained from all patients before biopsy collection and participation.
Bacteria were isolated and cultured on Skirrow medium with 5% fresh sheep blood at 37°C for 24-72 h in a microaerobic environment. H. pylori was identified by appropriate biochemical tests and by amplification of 16S rDNA using species-specific primers. Genomic DNA was extracted using genomic DNA isolation kits (Omega Biotek Inc).

Selection and identification of the VNTR loci of H. pylori
VNTR loci were selected from the MLVA database http://minisatellites.u-psud.fr/ASPSamp/base_ms/bact.php and evaluated by estimating the size of PCR products on agarose gels. Loci with a repeat unit ≥ 10 bp, repeat-unit consistency ≥ 90% and a minimum of two alleles among three H. pylori reference strains (26695, HPAG1, J99) were selected for this research. The locations, copy numbers and sizes of the loci and the gene(s) involved are listed in Table 1.
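The three selection criteria for candidate loci can be written as a simple filter. The sketch below is only an illustration of those thresholds; the `CandidateLocus` record, its field names and the example values are hypothetical and do not correspond to the database schema or to the loci in Table 1.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class CandidateLocus:
    name: str                      # hypothetical locus label
    repeat_unit_bp: int            # length of one repeat unit, in bp
    unit_consistency_pct: float    # consistency between repeat units, in %
    ref_copy_numbers: Tuple[float, float, float]  # copies in 26695, HPAG1, J99


def passes_selection(locus: CandidateLocus) -> bool:
    """Apply the three criteria: unit >= 10 bp, consistency >= 90%,
    and at least two distinct alleles among the three reference strains."""
    distinct_alleles = len(set(locus.ref_copy_numbers))
    return (
        locus.repeat_unit_bp >= 10
        and locus.unit_consistency_pct >= 90.0
        and distinct_alleles >= 2
    )


# Made-up candidates for illustration only.
candidates = [
    CandidateLocus("locus-A", 12, 95.0, (3, 4, 3)),   # meets all three criteria
    CandidateLocus("locus-B", 8, 100.0, (2, 5, 6)),   # repeat unit shorter than 10 bp
    CandidateLocus("locus-C", 15, 92.0, (2, 2, 2)),   # identical in all references
]
selected = [c.name for c in candidates if passes_selection(c)]
```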

PCR amplification
A PCR reaction mixture (30 μl) containing 10 ng of DNA template, 0.5 μM of each primer, 1 unit of Taq DNA polymerase, 200 μM of dNTPs and 10 × PCR buffer (500 mM KCl, 100 mM Tris-HCl (pH 8.3), 25 mM MgCl2) was used. Amplification was carried out in a DNA thermocycler (MJ Research PTC-225) with denaturation at 94°C for 8 min, followed by 30 cycles of denaturation at 94°C for 45 s, annealing at 52°C for 45 s and elongation at 72°C for 1 min [26]. A 10-min elongation at 72°C was performed after the last cycle to ensure complete extension of the amplicons.
Five μl of the PCR products were run on standard 3% agarose gels in 0.5× TBE buffer at 8-10 V/cm. Gel lengths of 10 to 40 cm were used according to PCR product size and repeat unit size. Strains in which alleles had been precisely measured by re-sequencing or by direct comparison with a sequenced reference strain were used as references (in this study, DNA from 26695, HPAG1 and J99 was used for this purpose). Multiple interspersed negative controls containing no DNA were included each time PCR was performed.

Data analysis
The numbers of repeat units at the 12 VNTR loci were entered into BioNumerics version 5.1 software (Applied-Maths, Sint-Martens-Latem, Belgium), and gel images were obtained using the BioNumerics software package version 6.0 (Applied-Maths, Sint-Martens-Latem, Belgium) or using UVB gel image analysis. The number of repeat units at each locus was deduced from the amplicon size, the flanking sequence length and the repeat unit size. Data from agarose gel electrophoresis and UVB gel image analysis, obtained by capillary electrophoresis machines, were imported into BioNumerics by creating a virtual gel image. Gel image data were converted into character data sets. Neighbor-joining (N-J) cluster analysis was carried out using the categorical similarity coefficient and the Ward method. A minimum spanning tree was inferred using the character data from the cluster analysis. The polymorphism of each locus was represented by Nei's diversity index [27], calculated as $\mathrm{DI} = 1 - \sum_i p_i^2$, where $p_i$ is the frequency of allele $i$ at the locus.
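The two calculations described above, deducing a repeat number from an amplicon size and computing Nei's diversity index from allele frequencies, can be sketched as follows. This is a minimal illustration, not the BioNumerics procedure: the function names are hypothetical, rounding to the nearest whole repeat is an assumption made because gel-based size estimates are approximate, and missing amplifications are assumed to be recorded as `None` and skipped.

```python
from collections import Counter


def repeat_count(amplicon_bp, flanking_bp, unit_bp):
    """Repeat number = (amplicon size - total flanking length) / unit size,
    rounded to the nearest whole repeat (assumption, see lead-in)."""
    return round((amplicon_bp - flanking_bp) / unit_bp)


def nei_diversity(alleles):
    """Nei's diversity index, DI = 1 - sum(p_i ** 2), for one locus.

    alleles: repeat numbers observed across strains; None marks a strain
    with no PCR product at this locus and is ignored.
    """
    observed = [a for a in alleles if a is not None]
    if not observed:
        raise ValueError("no alleles observed at this locus")
    n = len(observed)
    counts = Counter(observed)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())
```

For example, a hypothetical 260 bp amplicon with 200 bp of flanking sequence and a 15 bp repeat unit corresponds to 4 repeats, and a locus at which two alleles are equally frequent has DI = 0.5, while a locus with a single allele has DI = 0.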

Reproducibility and stability of 12 VNTR loci via in-vitro passage
Twenty clinical strain genomes from China and Japan were amplified, and multiple DNA samples from each strain yielded PCR products of identical size at all loci. Chongqing26 and Tibet36 each yielded no product at one locus, possibly because of mutations or poor-quality DNA samples.
The stability of the VNTR loci was investigated in a long-term experiment in which the 20 test H. pylori isolates were sub-cultured on fresh Skirrow medium 30 times by serial passage at two- or three-day intervals. DNA from the strains at each passage was extracted and subjected to MLVA. The VNTR analysis showed no difference in tandem repeat numbers (data not shown).