Novel Tn4371-ICE like element in Ralstonia pickettii and Genome mining for comparative elements

Background Integrative Conjugative Elements (ICEs) are important factors in the plasticity of microbial genomes. An element related to the ICE Tn4371 was discovered during a bioinformatic search of the Ralstonia pickettii 12J genome. This element was analysed and further searches were carried out for additional elements. A PCR method was designed to detect and characterise new elements of this type based on this scaffold, and a culture collection of fifty-eight Ralstonia pickettii and Ralstonia insidiosa strains was analysed for the presence of the element. Results Comparative sequence analysis of bacterial genomes has revealed the presence of a number of uncharacterised Tn4371-like ICEs in the genomes of several β- and γ-Proteobacteria. These elements vary in size, GC content and putative function, and have a mosaic-like structure of plasmid- and phage-like sequences which is typical of Tn4371-like ICEs. These elements were found after a thorough search of the GenBank database. The elements, which are found in Ralstonia, Delftia, Acidovorax, Bordetella, Comamonas, Congregibacter, Shewanella, Pseudomonas, Stenotrophomonas, Thioalkalivibrio sp. HL-EbGR7, Polaromonas, Burkholderia and Diaphorobacter sp., share a common scaffold. A PCR method was designed (based on the Tn4371-like element detected in the Ralstonia pickettii 12J genome) to detect and characterise new elements of this type. Conclusion All elements found in this study possess a common scaffold of core genes but contain different accessory genes. A new uniform nomenclature is suggested for ICEs of the Tn4371 family. Two novel Tn4371-like ICEs were discovered and characterised, using the novel PCR method described, in two different isolates of Ralstonia pickettii from laboratory purified water.


Background
Integrative Conjugative Elements (ICEs) carry functional modules involved in their conjugative transfer, chromosomal integration and control of expression of ICE genes [1]. ICEs are maintained in their host via site-specific integration and establishment at a unique site or sites in their host [2][3][4][5][6][7]. ICEs have been discovered in the genomes of various low G+C Gram-positive bacteria, various α-, β- and γ-Proteobacteria, and Bacteroides species [8]. The first ICE found was Tn916 from Bacteroides species [8].
One of the best models of ICEs is a family of elements called the R391/SXT family that are found in γ-Proteobacteria. These elements are of interest because over 25 have been found to date in organisms spread across the world. They share a common core scaffold of genes related to integration, excision, transfer and regulation. Different elements can possess different fitness determinants such as antibiotic resistances, heavy metal resistances, and error-prone DNA repair systems [9].
Tn4371 is a 55-kb ICE, which allows its host to degrade biphenyl and 4-chlorobiphenyl. It was isolated after mating between Cupriavidus oxalaticus (Ralstonia oxalatica) A5 carrying the broad-host-range conjugative plasmid RP4 and Cupriavidus metallidurans (Ralstonia metallidurans) CH34. Selection was applied for transconjugants that expressed the heavy metal resistances from CH34 and grew with biphenyl as a sole source of carbon and energy [10]. The transconjugants carried an RP4 plasmid with a 55-kb insert near its tetracycline resistance operon. The insert was shown to transpose to other locations and hence was called Tn4371 [10][11][12]. Tn4371 has been sequenced [13] and closely related elements have been found in the genome sequences of a number of bacteria including Ralstonia solanacearum GMI1000, a phytopathogen from French Guiana [14], Cupriavidus metallidurans CH34, a heavy metal resistant bacterium from Belgium [15], Erwinia chrysanthemi 3937, a phytopathogen [16] and Azotobacter vinelandii AvOP, a nitrogen-fixing bacterium isolated from soil in the USA [13,17]. None of these other elements possessed the biphenyl and 4-chlorobiphenyl degradation genes.
The Tn4371-like ICEs characterised to date are mosaic in structure, consisting of Ti-RP4-like transfer systems, an integrase region, plasmid maintenance genes and accessory genes [13]. All the characterised elements integrate into sites on the bacterial genomes with a conserved 5'-TTTTTCAT-3' sequence, termed the attB site [11]. Tn4371 transposition most likely involves a site-specific integration/excision process, since the ends of the element can be detected covalently linked as a transfer intermediate [11,13]. Integration is catalysed by a tyrosine-based site-specific recombinase related to bacteriophage and ICE family integrases [18].
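As an illustration of how candidate attB sites might be located in a genome sequence, the following minimal sketch scans both strands for the conserved 8-bp sequence given above. The function names are hypothetical and are not part of the methods used in this study; a match to such a short motif indicates only a possible integration site.

```python
# Minimal sketch (assumption: exact matching of the 8-bp attB sequence).
ATTB = "TTTTTCAT"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_attb_sites(seq: str, motif: str = ATTB) -> list:
    """Return sorted (0-based start, strand) pairs for every motif occurrence.

    '+' means the motif itself was found; '-' means its reverse complement
    was found, i.e. the site lies on the opposite strand.
    """
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            # Step by one so that overlapping occurrences are also reported.
            start = seq.find(pattern, start + 1)
    return sorted(hits)
```

For example, `find_attb_sites("GGTTTTTCATGG")` reports one forward-strand hit starting at position 2.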
We now report the discovery and comparative analysis of a number of novel uncharacterised Tn4371-like ICEs from several different bacterial species. These elements are also mosaics of plasmid and other genes and possess a common scaffold with apparent hotspots containing insertions of different, presumably adaptive, genes. Using sequences from the common scaffold, a PCR method was developed to discover and characterise new Tn4371-like ICEs in different bacteria. Here we report on the use of this method to discover and characterise two new Tn4371-like ICEs in Ralstonia pickettii strains isolated from a purified water system. Furthermore, we propose a uniform nomenclature for newly discovered ICEs of the Tn4371 family.

Bioinformatic analysis of Tn4371-like ICEs
Using bioinformatic analysis tools, the genome databases were searched for elements similar to Tn4371, using the original Tn4371 sequence as a probe. The method was similar to that used to detect novel members of the R391/SXT family of ICEs in Enterobacteriaceae [22]. In this study, novel unreported ICEs closely related to Tn4371 were discovered in the genome sequences of several different bacteria, including the β-proteobacteria Delftia acidovorans SPH-1 (two elements) and, with a single element each, Comamonas testosteroni KF-1, Acidovorax avenae subsp. citrulli AAC00-1, Bordetella petrii DSM12804, Acidovorax sp. JS42, Polaromonas naphthalenivorans CJ2 plasmid pPNAP01, Burkholderia pseudomallei MSHR346 and Diaphorobacter sp. TPSY [Table 1]. Novel elements were also found in the γ-proteobacteria Congregibacter litoralis KT71, Shewanella sp. ANA-3, Pseudomonas aeruginosa 2192, Pseudomonas aeruginosa PA7, Pseudomonas aeruginosa PACS171b, Pseudomonas aeruginosa UCBPP-PA14, Stenotrophomonas maltophilia K279a and Thioalkalivibrio sp. HL-EbGR7 [Table 2]. The element in Bordetella petrii DSM12804 was previously identified but not analysed by Lechner et al. [24]. The elements found in Delftia acidovorans SPH-1, Comamonas testosteroni KF-1 and Bordetella petrii DSM12804 were also partially characterised, along with further information on the elements in Cupriavidus metallidurans CH34, by Van Houdt et al. [25]. Geographically, these bacteria were found in different locations in both Europe and the Americas and were isolated from many different environments including activated sludge, polluted water and clinical situations [Tables 1 and 2]. All elements contained different inserts [containing accessory genes] in the core backbone except for those found in Delftia acidovorans SPH-1 and Comamonas testosteroni KF-1. The size of the newly discovered elements varied from 42 to 70 kb and the GC content from 59 to 65% [Tables 1 and 2].

Characterisation of Tn4371-like ICEs in whole genome sequences
The core structure conserved amongst known Tn4371-like ICEs is presented in Fig. 1. At the attR end of the elements a putative int gene [that bears similarities to tyrosine-based site-specific recombinases historically called phage-like integrases [20], possessing the R-H-R-Y tetrad] is found [Additional file 1]. A phylogenetic study was carried out on all available Tn4371-like int genes and tyrosine recombinases from phages and other ICEs. The phylogenetic tree can be seen in Additional file 2. These Tn4371-like int genes grouped with the int genes of ICE Hin1056, an ICE from Haemophilus influenzae, and of phages related to the P22 phage. The int gene was found in all characterised elements and was followed by non-conserved ORFs which differed from element to element. These ORFs include putative DNA helicases and nucleases, proteins with β-lactamase domains, proteins similar to RadC DNA repair proteins, putative reductases, transposases of insertion sequences, putative ubiquitin-activating enzymes, putative transcriptional regulators and many different hypothetical proteins whose functions are unknown [Fig. 1, Additional file 3]. These ORFs were found in differing arrangements in each of the different elements. Polaromonas naphthalenivorans CJ2 plasmid pPNAP01 contained biphenyl degradation genes in this area of the element; these genes are similar to those found in the original Tn4371 element but are found in a different part of the element. Pseudomonas aeruginosa PACS171b and the second Delftia acidovorans SPH-1 element have an arsenate resistance system located in this region. This system is related to the ars system, and has the genes arsH, arsC, arsB and arsA in the operon in this bacterium. The function of arsH is unknown; however, it is necessary for resistance to arsenic in the Yersinia enterocolitica virulence plasmid pYV [27]. The arsC gene encodes a soluble arsenate reductase which reduces intracellular arsenate to arsenite for efflux from the cell [28]. The arsA gene codes for a unique ATPase which binds to the ArsB membrane protein, forming an anion-transporting arsenite pump [28]. The arsD gene encodes an inducer-independent regulatory protein which controls the upper level of operon expression [29]. The second Delftia acidovorans SPH-1 element has genes related to the Mer (Mercury Resistance) operon: merR, merT, merP and merA. The merR gene controls regulation of the operon, merT and merP mediate transport of the mercury ions and merA mediates reduction of the mercury ions [30]. This region also contains a predicted czc [Cd/Zn/Co] efflux system [31,32]. Czc mediates inducible resistance to Co2+, Zn2+ and Cd2+; the protein products of genes czcA, czcB and czcC form a membrane-bound protein complex catalysing an energy-dependent efflux of these three metal ions [33].
Figure 1 Common core scaffold of Tn4371-like ICEs (in blue) and above inserted genes present in R. pickettii ICE Tn4371 6033 (in yellow).
Following the integrase gene, two conserved genes [ORF00013 and ORF00014 in Tn4371] were present in most elements except those in C. litoralis KT71 and Shewanella sp. ANA-3. These ORFs are related to proteins encoded by genes located near the transfer origin of Escherichia coli F plasmid [Q9WTE4 and Q9S4W2].
Although the function of the first protein is unknown, the second shows similarity to ParB-like nucleases, initially identified as a critical element in the faithful partitioning of plasmid DNA during cell division in the absence of selection pressure [34,35]. Subsequently, a number of similar proteins have been identified in prokaryotes and archaea which carry out the function of segregation of genomic DNA during cell division. ParB homologs are present in almost all eubacterial chromosomes [36]. [13]. The ParA partition protein of the type Ib family [45] and its associated ParB protein were also found, but in all cases the ParB was truncated. Rep and Par proteins have been proposed to act as a stabilisation system for the maintenance of mobile elements in bacterial genomes [19,36], similar to the toxin-antitoxin system encoded by ORFs s044 and s045 of the SXT-ICE [46]. Qiu et al. found that the P. aeruginosa ICE PAPI-1 contains a homologue of the plasmid and chromosome partitioning gene soj (parA). They demonstrated that deletion of the soj homologue from PAPI-1 resulted in complete loss of PAPI-1 from P. aeruginosa. The mechanism by which the Soj protein promotes PAPI-1 maintenance remains to be elucidated [47]. Genes similar to soj have been found in ICE Hin1056 and ICEA [20,48]. This region was followed by an ORF encoding a conserved hypothetical protein [ORF00040 in Tn4371] whose function is unknown [Fig. 1].
This sequence is followed by a region containing transfer-like proteins, the first being a putative conjugation protein TraF related to the pilus assembly proteins of IncP plasmids. This TraF protein is a protease that acts upon the pilus assembly protein TrbC [49]. [52]. A. avenae subsp. citrulli AAC00-1 contained insertion sequences and homologues of general metabolism proteins whose exact functions are unknown. D. acidovorans SPH-1 and C. testosteroni KF-1 contain a predicted czc [Cd/Zn/Co] efflux system [31,32] in their variable regions. The novel element in Acidovorax sp. JS42 contains genes that show similarity to a multidrug resistance pump and insertion sequences [InterPro Scan] in this region. In the variable region in B. petrii DSM 12804 there are various proteins that are putatively involved in degradation; however, their exact function is unknown. Burkholderia pseudomallei MSHR346 has genes that are putatively involved in xenobiotic metabolism; again, their exact function is unknown. Polaromonas naphthalenivorans CJ2 plasmid pPNAP01 contains a putative antibiotic resistance pump and metabolism proteins whose roles have not been identified. Diaphorobacter sp. TPSY contains a predicted czc [Cd/Zn/Co] efflux system similar to those in D. acidovorans SPH-1 and C. testosteroni KF-1. The second D. acidovorans SPH-1 element contains a copper resistance system, Cop, related to that of Pseudomonas syringae. The genes in this system are laid out in the following order: copSR copABFCD. copSR is a two-component signal transduction system, which is required for the copper-inducible expression of copper resistance [53]. CopA and CopC are abundant periplasmic copper binding proteins, and CopB is associated with copper accumulation in the outer membrane. No specific function for CopD has been determined yet [54]. CopF is involved in the cytoplasmic detoxification of copper ions [55]. [59].
Stenotrophomonas maltophilia K279a had a putative Major Facilitator Superfamily (MFS) efflux pump; such pumps usually function as specific exporters for certain classes of antimicrobial agents. This pump is related to the emrAB system from E. coli [60]. P. aeruginosa UCBPP-PA14 has a predicted czc [Cd/Zn/Co] efflux system similar to those in D. acidovorans SPH-1 and C. testosteroni KF-1. P. aeruginosa PACS171b contains a homolog of UspA, the Universal Stress Protein. The UspA protein is important for survival during cellular growth arrest in E. coli, but the exact physiological role of the protein is unknown [61]. Thioalkalivibrio sp. HL-EbGR7 has a set of genes with approximately 88% aa identity to the putative KdpFABC system in P. aeruginosa PA7. This variability suggests that this region may be a hotspot for insertion or recombination, where insertion clearly does not disrupt or affect the expression of neighbouring genes. The variation in predicted gene function and size, and the lack of homology between elements, suggest that this region contributes a number of different adaptive traits to hosts containing these ICEs.
Following this variable region is encoded a putative transcriptional regulator protein TraR and a homologue of the type IV coupling protein TraG [similar to those in IncP plasmids]. TraG is responsible for DNA transfer during conjugation and is a putative DNA binding protein [62].
Interestingly the gene order of this region and the order of genes preceding it are also suggestive of an insertion [of the variable region just discussed] into a primordial transfer module.
The putative DNA binding gene traG is followed by a group of genes encoding proteins [TrbBCDEJLFGI] with similarity to the mating-pair formation [mpf] apparatus or type IV secretion system closely related to those of IncP and Ti plasmids. This system presumably mediates the DNA transfer of the ICE to recipient cells [63,64]. These genes show similarity to those required for conjugative transfer of the Agrobacterium Ti plasmid, pNGR234a and RP4, except that two genes, trbK and trbH, found on these plasmids are missing [65]. In the Tn4371-like elements the gene order was trbBCDEJLFGI in all the characterised elements found in this study, similar to the molecular organisation in ICEMlSym R7A [[19], Fig. 1]. The TrbB, TrbC, TrbE, TrbG, and TrbL proteins are involved in the creation of the mpf apparatus, TrbC is involved in pilus formation and TrbE displays ATPase activity [65].
The novel ICEs detected in this study are integrated into various locations in the genomes of the host bacteria where they were discovered. In Acidovorax sp. JS42, other partial copies of Tn4371-like elements were also found in addition to the full element reported here. Two elements were discovered and characterised in D. acidovorans SPH-1. A further partial element was found in B. petrii; this, however, lacked the int Tn4371 gene. This situation is similar to that found in R. metallidurans CH34 and indicates that duplication or multiple insertions of the elements occur in bacteria. Near-complete copies of Tn4371-like elements were also found in Burkholderia ambifaria AMMD and Burkholderia multivorans ATCC17616; both were found to lack the Tn4371-like integrase gene, suggesting that the elements may no longer be mobile. New elements were also found in Ralstonia solanacearum MolK2, along with a second element in Diaphorobacter sp. TPSY. These share similarities with Tn4371-like elements in the stabilisation and transfer regions, but they have a different integrase region not related to the int Tn4371 gene.
All of the elements reported here [ Table 1 and 2] appear to share a common scaffold or backbone that is approximately 24 kb in size containing a 1.5 kb integrase gene; an 8.5 kb replication/stability gene cluster and a 14 kb conjugal transfer/mating pair formation cluster [ Fig. 1]. A visual representation of this can be seen in Figs. 2, 3, 4 and 5 where the various sequences were aligned for comparison, the core scaffold identified and 'adaptive' genes highlighted which vary from element to element.
Bioinformatic comparisons were performed between the genes that make up the core scaffold region of the ICE. These ranged from the highly conserved traG gene, with 84 to 96% aa identity, the trbE gene, with 76 to 94% aa identity, and the parA gene, with 90 to 97% aa identity, to the less conserved traR gene, with 53 to 84% aa identity. On average the genes that we ascribed to the core showed > 75% aa identity and were also related by gene order. All gene numbers and a basic description of the genes are included in Additional file 3.
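To make the identity figures above concrete, the sketch below shows one way a percentage identity can be derived from a pairwise alignment. It is an illustrative assumption rather than the blastp calculation used in this study: identical columns are divided by the total number of alignment columns, which is only one of several conventions in use, and the function name is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns holding the same residue in both rows.

    Both inputs must be rows of the same pairwise alignment, so they have
    equal length, with '-' marking gaps. Gap columns count as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    identical = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * identical / len(aligned_a)
```

With the toy alignment `MKT-A` against `MKTLA`, four of the five columns are identical, giving 80%.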

Defining the Tn4371 family of ICEs and nomenclature
These elements have been classed as ICEs because we believe this is the best terminology currently available. They meet the criteria for ICEs: they have integration and transfer modules, possess an excisionase gene, and have genes and a gene layout (rdfS, rlxS and the trb genes) similar to other ICEs, namely ICEMlSym R7A. The original element can also excise from the bacterial chromosome and form a circular intermediate [9]; however, the element has not been shown to transfer between different bacteria, which could be due to the original element lacking the trbD gene [13].
Although the elements identified in this study are not identical, they share a similar core backbone that, in our view, warrants their inclusion in the Tn4371 ICE family. All encode a related integrase and related maintenance and transfer genes, and the order of homologous genes is similar once the variable inserted regions, which differ from element to element, are disregarded. We propose that any ICE that encodes an integrase gene closely related to int Tn4371, defined as over 70% protein homology, and that has similar maintenance and transfer genes be considered part of the Tn4371 family of ICEs.
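The proposed family criterion can be expressed as a simple rule. The sketch below is illustrative only: the record type and field names are hypothetical, and whether an element carries "similar" maintenance and transfer genes is assumed to have been decided beforehand by sequence comparison.

```python
from dataclasses import dataclass

# Threshold taken from the proposal above: over 70% protein homology to int Tn4371.
INT_HOMOLOGY_THRESHOLD = 70.0


@dataclass
class CandidateElement:
    """Hypothetical summary record for one candidate ICE."""
    name: str
    int_homology_pct: float       # integrase homology to int Tn4371, in percent
    has_maintenance_genes: bool   # similar maintenance genes present
    has_transfer_genes: bool      # similar transfer genes present


def is_tn4371_family(element: CandidateElement) -> bool:
    """Apply the three-part criterion; the threshold is strict ('over 70%')."""
    return (
        element.int_homology_pct > INT_HOMOLOGY_THRESHOLD
        and element.has_maintenance_genes
        and element.has_transfer_genes
    )
```

Under this rule an element with a near-complete backbone but no related integrase, such as those described in Burkholderia ambifaria AMMD, would not qualify.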
Given the number of Tn4371-like elements discovered in this study, it seems sensible to name newly described ICEs of the Tn4371 family with a uniform nomenclature. We propose adapting the system used for naming transposons described by Roberts et al. [66]. This is a website-based system (http://www.ucl.ac.uk/eastman/tn/) which assigns Tn numbers in sequence, e.g. Tn6033, Tn6034, etc. The elements were then called ICE Tn4371 6033, ICE Tn4371 6034, etc. to indicate that they are ICEs of the Tn4371 family. The names assigned to the elements discovered in this study are listed in Tables 1 and 2. This system was chosen because other systems, such as that used by Burrus et al. [8] for naming members of the SXT/R391 family of ICEs, are not regulated and can differ between laboratories, leading to confusion.
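The naming convention can be illustrated with a small helper that builds and parses element names. This is a sketch under stated assumptions: the function names are hypothetical, the spacing follows the names as printed in this text, and Tn numbers are assumed to have been assigned by the registry rather than generated locally.

```python
import re

FAMILY = "Tn4371"
# Matches names written as 'ICE <family> <Tn number>', e.g. 'ICE Tn4371 6033'.
_NAME_RE = re.compile(r"^ICE\s+(Tn\d+)\s+(\d+)$")


def ice_name(tn_number: int, family: str = FAMILY) -> str:
    """Build an element name from a registry-assigned Tn number."""
    if tn_number <= 0:
        raise ValueError("Tn number must be a positive integer")
    return f"ICE {family} {tn_number}"


def parse_ice_name(name: str) -> tuple:
    """Split an element name into (family, Tn number)."""
    match = _NAME_RE.match(name.strip())
    if match is None:
        raise ValueError(f"not a recognised ICE name: {name!r}")
    return match.group(1), int(match.group(2))
```

For instance, `ice_name(6033)` gives `ICE Tn4371 6033`, and parsing that string returns the family and the Tn number separately.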

Tn4371-like ICE detection and molecular characterisation
Following the discovery of the widespread nature of Tn4371-like ICEs in the genomes of many new organisms, PCR primers were designed to amplify important genes of the core scaffold to aid in the rapid identification of new Tn4371-like elements. We tested this on a culture collection of fifty-eight Ralstonia pickettii and Ralstonia insidiosa strains from various environments and geographic locations. The PCR primers were based on conserved consensus sequences of core genes identified from all the elements identified in this study and those reported previously.
The results in Fig. 6 Table 3]. Sequencing revealed that the ULM001 int gene showed 85% and 99% nucleotide identity to the Tn4371 int gene and ICE Tn4371 6033 int gene, respectively. The RepAF and RepAR primers also amplified the repA gene and the parA gene in ULM001, ULM003 and ULM006. Sequencing these amplicons revealed that in ULM001 the repA and parB genes were present and showed 88% and 99% nucleotide identity to the RepA and ParA genes from Tn4371 and ICE Tn4371 6033 respectively.
Figure 2 Use of the Artemis comparison tool to analyse Tn4371-like ICE sequences of Tn4371, R. pickettii 12J, both elements from D. acidovorans SPH-1 and C. testosteroni KF-1. All ICEs analysed shared extensive sequence homology and general gene order. Arrows on top delimit the functional regions whose order is well conserved in all Tn4371-like ICEs.
A traG Tn4371 homolog was also detected in ULM001, ULM003 and ULM006 following PCR amplification. Sequencing revealed that the ULM001 traG Tn4371 gene showed 91% and 89% nucleotide identity to traG from Tn4371 and ICE Tn4371 6033 respectively. TrbIF and TrbIR primers were used to amplify the trbI gene in ULM001 and ULM003, while no amplification occurred in ULM006. Sequencing showed that the ULM001 amplicon was a homolog, which had 88% and 99% nucleotide identity to the trbI gene from Tn4371 and ICE Tn4371 6033 respectively. The absence of a trbI gene amplicon in ULM006 may indicate a deleted gene or truncated element in this strain. The use of these primer sets has thus revealed the presence of two new elements, which can then be further characterised. The ICEs detected in this study from Ralstonia pickettii were named ICE Tn4371 6043 and ICE Tn4371 6044 using the nomenclature system described above. A general map of the elements can be seen in Fig. 6.

The attL and attR region of Tn4371 ICEs
Analysis of hosts harbouring Tn4371-like elements indicated that integration occurred at an 8-bp attB site, generating attL and attR element-chromosome junctions [[11], Fig. 7a]. An alignment of the first and last 200 bp of the elements analysed in this study with Tn4371-like elements from previous studies showed that the attL site had a sequence of TTTTC/TA/GT and attR had a sequence of TTTTC/TA/GT for some bacteria, while others had no direct repeats. These alignments can be seen in Additional file 4. The exact sequence of the direct repeat for each element is presented in Table 4. The absence of direct repeats in some of these elements may mean that they are no longer mobile. Tn4371 has been shown to excise from the RP4 plasmid in Ralstonia eutropha forming a circular extrachromosomal intermediate [ [10], Fig. 7a Fig. 7b.], indicating that a circular extrachromosomal form of the element is present in these cells, while no PCR product was obtained from ULM006 [Fig. 7b]. The sequencing of the attP region of ICE Tn4371 6043 gave an attL region of TTTTTCAT and an attR region of TACTTTTT. This rapid amplification across the circular attP junction can also be utilised for the rapid identification of Tn4371-like elements. It is possible that the PCR may have picked up tandem copies of the element if those happened to be intermediates in "transposition".
Figure 3 Use of the Artemis comparison tool to analyse Tn4371-like ICE sequences of Tn4371, P. aeruginosa 2192, P. aeruginosa PA7, P. aeruginosa UCBPP-PA14 and P. aeruginosa PACS171b. All ICEs analysed shared extensive sequence homology and general gene order. Arrows on top delimit the functional regions whose order is well conserved in all Tn4371-like ICEs.

Conclusion
Tn4371-like ICEs are found in a wide range of γ-proteobacteria and β-proteobacteria from both clinical and environmental sources. These types of bacteria are known for their large metabolic repertoires and these elements could potentially be a source of acquisition of adaptive functions for these organisms. The discovery of the Tn4371-like ICEs in the P. aeruginosa strains, S. maltophilia K279a and B. pseudomallei MSHR346 is the first report of these elements in human pathogens. This, along with the discovery of putative antibiotic resistance genes in their genomes, indicates that these elements may have an impact in clinical situations. The discovery and characterisation of novel Tn4371-like elements as reported here adds significantly to the repertoire of such elements and helps define their core scaffold. It is clear that these elements are highly adaptable and may contribute significantly to the metabolic capabilities of their host. This study increases the knowledge available about these elements adding data on eighteen new elements to the
Figure 5 Use of the Artemis comparison tool to analyse Tn4371-like ICE sequences of Tn4371, A. avenae subsp. citrulli AAC00-1, Acidovorax sp. JS42, B. petrii DSM12804, Diaphorobacter sp. TPSY and P. naphthalenivorans CJ2 plasmid pPNAP01. All ICEs analysed shared extensive sequence homology and general gene order. Arrows on top delimit the functional regions whose order is well conserved in all Tn4371-like ICEs.
Figure 6 Amplification of genes of the putative Tn4371-like ICE ICE Tn4371 6043 in Ralstonia pickettii strain ULM001 (a laboratory purified water isolate). A scheme of the amplified genes is shown above the 0.7% agarose gel of the PCR products generated with the primers listed in Table 2.

Bacterial strains and growth conditions
The strains used in this study are shown in Table 3.
A) Schematic representation of Tn4371 excision and insertion into the R. pickettii chromosome

Bioinformatic Analysis of the Tn4371-like ICEs in genomes
All analysed DNA sequences were retrieved from the GenBank database http://www.ncbi.nlm.nih.gov.
DNA and protein sequences similar to Tn4371 [[13], AJ536756] were detected within the NCBI non-redundant nucleotide and protein databases http://www.ncbi.nlm.nih.gov via blastp and blastn analysis using the original Tn4371 sequence as a probe [69]. Assembly and comparison with other Tn4371-like sequences was performed with the Artemis Comparison Tool [ACT] [[70], http://www.sanger.ac.uk/Software/ACT]. The complete DNA sequences were also manually annotated to verify the deposited sequence. The similarity of proteins encoded by the element was determined as % aa identity over the entire protein to its Tn4371 equivalent via blastp. Unknown ORFs were analysed using InterProScan [[71], http://www.ebi.ac.uk/InterProScan/] to locate motifs or domains where similarity with known proteins was low or absent. Size and total % GC content were determined using the GC-Profile program [[72], http://tubic.tju.edu.cn/GC-Profile/]. Phylogenetic and molecular evolutionary analyses were conducted using genetic-distance-based neighbour-joining algorithms within MEGA version 4.0 [[73], http://www.megasoftware.net/].
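For readers unfamiliar with the total % GC figure reported for each element, the minimal sketch below shows the underlying arithmetic. It is not the GC-Profile program used in this study; the function name is hypothetical, and the choice to ignore ambiguous bases such as N is an assumption made for the example.

```python
def gc_content(seq: str) -> float:
    """Return the percentage of G and C among the unambiguous bases of seq.

    Characters other than A, C, G and T (for example N or gap symbols)
    are ignored rather than counted in the denominator.
    """
    bases = [base for base in seq.upper() if base in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for base in bases if base in "GC")
    return 100.0 * gc / len(bases)
```

Applied to a full element sequence, a value in the range of 59 to 65 would be consistent with the elements reported in Tables 1 and 2.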

Nucleotide sequence accession numbers
The DNA sequences described in this article have been assigned the accession numbers listed in Table 3.

Authors' contributions
MRP was responsible for conception of the study, experimental design, data collection, and analysis and preparation of the manuscript. JTP and CCA participated in experimental design, data analysis and preparation of the manuscript. All authors read and approved the final manuscript.