Evidence of diversity and recombination in Arsenophonus symbionts of the Bemisia tabaci species complex

Background: Maternally inherited bacterial symbionts infecting arthropods have major implications for host ecology and evolution. Among them, the genus Arsenophonus is particularly characterized by a large host spectrum and a wide range of symbiotic relationships (from mutualism to parasitism), making it a good model to study the evolution of host-symbiont associations. However, few data are available on the diversity and distribution of Arsenophonus within host lineages. Here, we present a survey of Arsenophonus diversity in whitefly species (Hemiptera), in particular the Bemisia tabaci species complex. This polyphagous insect pest is composed of genetic groups that differ in many ecological aspects. They harbor specific bacterial communities, among them several lineages of Arsenophonus, enabling a study of the evolutionary history of these bacteria at a fine host taxonomic level, in association with host geographical range and ecology.
Results: Among 152 individuals, our analysis identified 19 allelic profiles and 6 phylogenetic groups, demonstrating this bacterium's high diversity. These groups, based on Arsenophonus phylogeny, correlated with B. tabaci genetic groups with two exceptions reflecting horizontal transfers. None of the three genes analyzed provided evidence of intragenic recombination, but intergenic recombination events were detected. A mutation inducing a STOP codon on one gene in a strain infecting one B. tabaci genetic group was also found. Phylogenetic analyses of the three concatenated loci revealed the existence of two clades of Arsenophonus. One, composed of strains found in other Hemiptera, could be the ancestral clade in whiteflies. The other, which regroups strains found in Hymenoptera and Diptera, may have been acquired more recently by whiteflies through lateral transfers.
Conclusions: This analysis of the genus Arsenophonus revealed a diversity within the B. tabaci species complex which resembles that reported on the larger scale of insect taxonomy. We also provide evidence for recombination events within the Arsenophonus genome and horizontal transmission of strains among insect taxa. This work provides further insight into the evolution of the Arsenophonus genome, the infection dynamics of this bacterium and its influence on its insect host's ecology.


Background
Many arthropods live in symbiosis with one or more endosymbiotic bacteria, establishing a wide diversity of symbiotic associations ranging from mutualism to parasitism [1,2]. When arthropod hosts feed on imbalanced diets, such as plant sap or vertebrate blood, mutualistic bacterial symbionts play a central role in their biology by providing essential nutrients that are lacking or limited [3], leading to obligatory cooperative insect-microbial relationships.
Arthropods also harbor facultative symbionts acquired more recently, leading to complex associations with shorter epidemiological and evolutionary dynamics [4,5]. These are mainly vertically transmitted but, depending on the host-symbiont association, horizontal transfers may occur within and between species on different evolutionary time scales [6-9]. An extremely diverse group of bacterial taxa is involved in facultative symbiosis, with a wide range of both hosts and phenotypes. Some facultative endosymbiotic bacteria confer direct fitness benefits such as protection against natural enemies [10,11], host-plant specialization [12] or thermal tolerance [13]. Others, like the alphaproteobacterium Wolbachia and the Bacteroidetes Cardinium, manipulate host reproduction to enable their spread and maintenance in host populations despite deleterious effects (for review see Stouthamer et al. [14]).
Among the symbiotic bacteria, the gammaproteobacterial genus Arsenophonus has distinctive features with regard to lineage diversity, host spectrum and the symbiotic relationships established with its host. It thus constitutes a good model to study the evolutionary processes shaping symbiotic associations. The diversity of Arsenophonus host species is particularly large, including insects, other arthropods (such as ticks) and plants [15]. This can be explained by the symbiont's transmission routes, since this vertically transmitted bacterium can also be acquired by horizontal transfer within and among species [16,17]. Moreover, some strains can be cultivated in cell-free culture [18]. Arsenophonus-host relationships range from parasitism to mutualism, with the induction of various phenotypes such as reproductive manipulation (male-killing) [19], phytopathogenicity [20] or obligatory mutualism [21,22]. However, in most reported symbiotic associations, the impact of this symbiont on the host phenotype remains unknown. Based on rRNA gene analysis, phylogenetic studies have revealed an extremely high diversity of bacterial lineages forming a monophyletic group [15]. In addition, the Arsenophonus phylogeny encompasses several other host-specific sub-clusters with lower divergence, associated with ticks, plants, triatomine bugs, whiteflies, several genera of hippoboscids and ants, but shows no co-speciation pattern within clades. Besides these bacterial lineages that cluster according to host taxonomy, a number of closely related Arsenophonus strains infect unrelated host species. Moreover, the same host species sometimes harbors several Arsenophonus lineages, a pattern that is probably due to the ability of Arsenophonus to be horizontally transferred, as recently demonstrated in hymenopteran parasitoids of the family Pteromalidae [17].
Previous studies have shown that whitefly species can host different strains of several bacteria [15,23,24], and they thus appear to be particularly relevant to investigating Arsenophonus diversity and evolution. However, we cannot disregard the fact that rRNA-based phylogeny suffers inconsistencies as a result of intragenomic heterogeneity among the 8 to 10 estimated rRNA copies in the Arsenophonus genome [25]. Moreover, biased phylogeny can also result from homologous recombination, which appears more frequently in symbiotic bacteria than expected based on their intracellular lifestyle and vertical transmission [26,27]. The availability of the complete sequence of the Arsenophonus genome now provides the opportunity to perform a more accurate exploration of the evolutionary history and ecological spread of this pervasive symbiotic bacterium on different host-taxonomical scales.
Among the whiteflies, the Bemisia tabaci (Homoptera, Aleyrodidae) species complex has emerged as a focus of attention for several reasons, chief among them being the ongoing species radiation and the high prevalence of a wide diversity of endosymbiotic bacteria, including several lineages of Arsenophonus [28]. The whitefly B. tabaci is a worldwide polyphagous pest of vegetables and ornamental crops, previously thought to be a single species composed of several well-differentiated genetic groups or biotypes. Recently, however, some of these groups have been recognized as true species, so that B. tabaci is now considered a complex of 24 cryptic species which barely interbreed and form different phylogenetic clades [29]. The biological data needed to draw clear boundaries among species and to identify the cause of such genetic differentiation are still lacking. This phloem-feeding insect harbors a primary symbiont, Portiera aleyrodidarum, required for supplementing its specialized diet. B. tabaci also hosts up to six vertically transmitted secondary symbionts, some of which are phylogenetically highly distant [23]. For each of these symbionts, the phenotypic consequences of infection in B. tabaci remain poorly identified, if at all [30]. Nevertheless, in other insect species, some of these bacteria are known to manipulate host reproduction, while others increase resistance to natural enemies [4,10,14,31]. Moreover, the symbionts are thought to play a major role in the viral transmission capacities of the pest [32,33]. Interestingly, multiple bacterial infections are common in B. tabaci, and the endosymbiotic community is correlated with the B. tabaci genetic groups on different scales of differentiation [28,34,35]. This raises the question of these endosymbionts' role in B. tabaci biology and species radiation.
Within the 24 well-differentiated mtDNA groups recognized as true species by De Barro et al. [29] and that regroup all previously described biotypes, Arsenophonus has been found in AsiaII3 (ZHJ1 biotype), AsiaII7 (Cv biotype), Indian Ocean (Ms biotype), Mediterranean [Q and Africa Silver Leafing (ASL) biotypes which probably form true species] and the Sub-Saharan Africa species [Africa non-Silver Leafing (AnSL) biotype] [28,34-38]. For all other species or groups, there is either no data or they have proven to be free from infection. For example, among the putative species of the Africa/Middle East/Asia Minor clade, which contains the most invasive species, Arsenophonus appears well established in the Ms, Q and ASL groups, whereas the invasive B group has been shown to be uninfected despite extensive symbiont screening [28,34,39]. The prevalence varies considerably within and among populations and genetic groups infected by Arsenophonus. For example, Q is composed of three COI-differentiated groups, Q1, Q2 and Q3 [28]. To date, these three cytotypes have not shown the same geographical distribution and show different endosymbiotic bacterial community compositions [28,40]. The subgroup Q1, found in Europe, is not infected by Arsenophonus but harbors three other bacteria [28]. In contrast, Q2, observed in the Middle East, and Q3, reported only in Africa, show high prevalence of Arsenophonus in co-infection with Rickettsia [28,34,41]. Ms individuals are highly infected by Arsenophonus with a high level of co-infection by Cardinium [37]. All of these groups (B, Q, ASL, Ms and AnSL) show quite different geographical ranges. Ms has been detected on the islands in the southwestern part of the Indian Ocean, Tanzania and Uganda, living in sympatry with B [42]. ASL and AnSL have been reported only in Africa [28,35,43-46]. In contrast, the invasive B and Q groups are spread all over the world. Q has been found in Africa, America, Europe, Asia and the Middle East [28,34,47,48].
However, this situation is constantly in flux, because commercial trade is responsible for recurrent introduction/invasion processes of B. tabaci giving rise to new sympatric situations. Moreover, potential horizontal transfers of symbionts and interbreeding can generate new nucleo-cytoplasmic combinations and thus rapid evolution of symbiont diversity.
Patterns of Arsenophonus infection in B. tabaci within the high-level Africa/Middle East/Asia Minor groups make this clade a good candidate to study, on fine taxonomic and time scales, the spread of this bacterium, its ability to be horizontally transferred and finally, its evolutionary history, including genetic diversity generated by recombination events. In the present paper, we explore the prevalence and diversity of Arsenophonus strains in this clade using an MLST approach to avoid the disadvantages of the rRNA approach. In parallel we also studied, as an outgroup, the Sub-Saharan AnSL species (S biotype), considered the basal group of this species complex, and two other whitefly species found at the sampling sites, Trialeurodes vaporariorum and Bemisia afer.

Insect sampling
Individuals from different species of Bemisia tabaci and two other Aleyrodidae species were collected from 2001 to 2010 from various locations and host plants in Africa and Europe and stored in 96% ethanol (Table 1, Figure 1).

DNA extraction and PCR amplification
Arsenophonus detection and identification of B. tabaci genetic groups
Insects were sexed and DNA was extracted as previously described by Delatte et al. [49]. All samples were screened for Arsenophonus infection using the specific primers Ars-23S1/Ars-23S2 targeting the 23S rRNA gene [50] (Table 2). To check the quality of the extracted DNA, all samples were also tested for the presence of the primary symbiont P. aleyrodidarum using specific primers for the 16S rRNA genes described by Zchori-Fein and Brown [23]. Insects were used in the analysis only when positive signals were recorded in both PCRs. B. tabaci genetic groups were identified by a PCR-RFLP (restriction fragment length polymorphism) test based on the mitochondrial marker COI (Cytochrome Oxidase 1) gene as described by Gnankine et al. [35] for Q, ASL and AnSL individuals. A set of 10 microsatellite markers was used to identify Ms according to Delatte et al. [42]. Moreover, a portion of the COI gene was sequenced for five individuals from each of the different B. tabaci genetic groups, using the protocol described by Thierry et al. [37] and Gnankine et al. [35] (Figure S1 in Additional file 1).

Study of Arsenophonus diversity
PCRs targeting three different genes of Arsenophonus were carried out on positive samples with two sets of primers designed specifically for this study (ftsK: ftskFor1/Rev1, ftskFor2/Rev2; yaeT: YaeTF496/ YaeTR496, see Table 2) and one set from the literature (fbaA: FbaAf/FbaAr) [17]. For the Q group, amplifications failed for some individuals and the primer FbaArLM (Table 2) was then used instead of FbaAr. These two primers are adjacent and their use permits

Phylogenetic analyses
Multiple sequences were aligned using the MUSCLE algorithm [51] implemented in CLC DNA Workbench 6.0 (CLC Bio). Phylogenetic analyses were performed using maximum-likelihood (ML) and Bayesian inferences for each locus separately and for the concatenated data set.
JModelTest v.0.1.1 was used to carry out statistical selection of best-fit models of nucleotide substitution [52] using the Akaike Information Criterion (AIC). A corrected version of the AIC (AICc) was used for each data set because the sample size (n) was small relative to the number of parameters (n/K < 40). This approach suggested the following models: HKY for fbaA, GTR for ftsK, HKY+I for yaeT and GTR+I for the concatenated data set. Under the selected models, the parameters were optimized and ML analyses were performed with Phyml v.3.0 [53]. The robustness of nodes was assessed with 100 bootstrap replicates for each data set.
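The small-sample correction mentioned above can be illustrated with a short sketch. It uses the standard definitions AIC = 2K - 2 lnL and AICc = AIC + 2K(K+1)/(n-K-1); the log-likelihoods, parameter counts and sample size below are hypothetical values chosen for illustration and are not taken from this study, and the function names are our own rather than part of jModelTest.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2K - 2 lnL."""
    return 2 * n_params - 2 * log_likelihood


def aicc(log_likelihood, n_params, sample_size):
    """AIC with the small-sample correction term 2K(K+1)/(n-K-1)."""
    denom = sample_size - n_params - 1
    if denom <= 0:
        raise ValueError("sample size too small for the AICc correction")
    return aic(log_likelihood, n_params) + 2 * n_params * (n_params + 1) / denom


def needs_correction(sample_size, n_params, threshold=40):
    """Rule of thumb used in the text: prefer AICc when n/K < 40."""
    return sample_size / n_params < threshold


# Hypothetical candidates: model name -> (lnL, number of free parameters K).
candidates = {"HKY": (-1250.0, 5), "GTR": (-1246.5, 9)}
sample_size = 300  # hypothetical n (e.g. alignment length)

scores = {name: aicc(lnl, k, sample_size) for name, (lnl, k) in candidates.items()}
best_model = min(scores, key=scores.get)  # lowest AICc is preferred
print(best_model, scores)
```

In this toy comparison the richer model has a slightly better likelihood but is penalized more heavily for its extra parameters, which is the trade-off the criterion is designed to capture.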
Bayesian analyses were performed as implemented in MrBayes v.3.1.2 [54]. According to the BIC (Bayesian information criterion) estimated with jModelTest, the selected models were the same as for ML inferences. For the concatenated data set, the same models were used for each gene partition. Analyses were initiated from random starting trees. Two separate Markov chain Monte Carlo (MCMC) runs, each composed of four chains, were run for 5 million generations with a "stoprule" option to end the run before the fixed number of generations when the convergence diagnostic falls below 0.01. Thus, the number of generations was 3,000,000 for fbaA, 600,000 for ftsK, 2,100,000 for yaeT and 1,000,000 for the concatenated data set. A burn-in of 25% of the generations sampled was discarded and posterior probabilities were computed from the remaining trees. All runs converged, with PSRF values of 1.
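The burn-in step described above can be sketched as follows. This is a minimal illustration, not the MrBayes implementation: each sampled tree is represented, as a simplifying assumption, by the set of clades it contains, and the posterior probability of a clade is taken as its frequency among the trees retained after discarding the first 25% of samples. The toy sample and function names are hypothetical.

```python
from typing import FrozenSet, List, Set

Clade = FrozenSet[str]


def discard_burn_in(samples: list, fraction: float = 0.25) -> list:
    """Drop the first `fraction` of the sampled states (the burn-in)."""
    if not 0 <= fraction < 1:
        raise ValueError("fraction must be in [0, 1)")
    n_discard = int(len(samples) * fraction)
    return samples[n_discard:]


def clade_posterior(trees: List[Set[Clade]], clade: Clade,
                    burn_in: float = 0.25) -> float:
    """Frequency of `clade` among the trees kept after burn-in."""
    kept = discard_burn_in(trees, burn_in)
    if not kept:
        raise ValueError("no trees left after burn-in")
    return sum(clade in tree for tree in kept) / len(kept)


# Toy MCMC sample of eight trees over hypothetical taxa A, B and C.
ab = frozenset({"A", "B"})
ac = frozenset({"A", "C"})
trees = [{ac}, {ac}, {ab}, {ab}, {ab}, {ac}, {ab}, {ab}]

kept_trees = discard_burn_in(trees)        # first two samples are dropped
support_ab = clade_posterior(trees, ab)    # clade frequency after burn-in
print(len(kept_trees), round(support_ab, 3))
```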
In addition, Arsenophonus strains identified in the present study were used to infer phylogeny on a larger scale with the Arsenophonus sequences from various insect species obtained from Duron et al. [17]. The GTR+G model was used for both methods (ML and Bayesian inferences) and the number of generations was 360,000 for the Bayesian analysis.

Recombination analysis
The multiple sequence alignments used in the phylogenetic analysis were also used to identify putative recombinant regions with methods available in the RDP3 computer analysis package [55]. The multiple sequence alignments were analyzed by seven methods: RDP [56], GENECONV [57], Bootscan [58], Maximum Chi Square [59], Chimaera [60], SiScan [61], and 3Seq [62]. The default search parameters for scanning the aligned sequences for recombination were used and the highest acceptable probability (p value) was set to 0.001.

Diversity and genetic analysis
Identical DNA sequences at a given locus for different strains were assigned the same arbitrary allele number (i.e. each allele has a unique identifier). Each unique allelic combination corresponded to a haplotype.
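The allele and haplotype assignment described above can be sketched in a few lines. The strain names and sequences below are made up for illustration, and numbering alleles in order of first appearance is our assumption, since the text only states that the numbers are arbitrary.

```python
from typing import Dict, Tuple

LOCI = ("fbaA", "ftsK", "yaeT")


def assign_alleles(sequences: Dict[str, str]) -> Dict[str, int]:
    """Give identical sequences at one locus the same arbitrary allele number."""
    allele_of_sequence: Dict[str, int] = {}
    allele_of_strain: Dict[str, int] = {}
    for strain, seq in sequences.items():
        if seq not in allele_of_sequence:
            # New sequence: next unused number (order of first appearance).
            allele_of_sequence[seq] = len(allele_of_sequence) + 1
        allele_of_strain[strain] = allele_of_sequence[seq]
    return allele_of_strain


def allelic_profiles(data: Dict[str, Dict[str, str]]) -> Dict[str, Tuple[int, ...]]:
    """Combine per-locus allele numbers into one profile per strain."""
    per_locus = {locus: assign_alleles(data[locus]) for locus in LOCI}
    strains = list(data[LOCI[0]].keys())
    return {s: tuple(per_locus[locus][s] for locus in LOCI) for s in strains}


# Hypothetical toy data: locus -> strain -> aligned sequence.
data = {
    "fbaA": {"s1": "ACGT", "s2": "ACGT", "s3": "ACGA"},
    "ftsK": {"s1": "GGCA", "s2": "GGCA", "s3": "GGCA"},
    "yaeT": {"s1": "TTAG", "s2": "TTAC", "s3": "TTAG"},
}

profiles = allelic_profiles(data)
haplotypes = set(profiles.values())  # each unique allelic combination
print(profiles, len(haplotypes))
```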
Table 2 (fragment). Ars-23S2: 5'-GGTCCTCCAGTTAGTGTTACCCAAC-3'
Genetic diversity was assessed using several functions from the DnaSP package [63] by calculating the average number of pairwise nucleotide differences per site among the sequences (π), the total number of mutations (η), the number of polymorphic sites (S) and the haplotype diversity (Hd). The software Arlequin v.3.01 [64] was used to test the putative occurrence of geographical or species structure for the different population groups by an AMOVA (analysis of molecular variance). The analyses partitioning the observed nucleotide diversity were performed between and within sampling sites (countries, localities) or species (B. tabaci species, T. vaporariorum and B. afer). For each analysis, genetic variation was partitioned into the three following levels: between groups (FCT), between populations within groups (FSC) and within populations (FST). Significance was tested by 10,000 permutations as described by Excoffier et al. [64].
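Three of the diversity statistics named above can be illustrated with a minimal sketch. It assumes gap-free aligned sequences of equal length and uses the usual textbook definitions (S as the number of variable columns, π as the mean number of pairwise differences divided by alignment length, and Hd = n/(n-1) × (1 - Σ p_i²)); it is not the DnaSP code, and the toy sequences are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import List


def polymorphic_sites(seqs: List[str]) -> int:
    """S: number of alignment columns with more than one state."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def nucleotide_diversity(seqs: List[str]) -> float:
    """pi: average number of pairwise differences per site."""
    n, length = len(seqs), len(seqs[0])
    total = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in combinations(seqs, 2)
    )
    n_pairs = n * (n - 1) / 2
    return total / n_pairs / length


def haplotype_diversity(seqs: List[str]) -> float:
    """Hd: n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    counts = Counter(seqs)
    return n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))


# Hypothetical toy alignment: four sequences of eight sites.
seqs = ["ACGTACGT", "ACGTACGT", "ACGAACGT", "ACGAACGA"]

S = polymorphic_sites(seqs)
pi = nucleotide_diversity(seqs)
Hd = haplotype_diversity(seqs)
print(S, pi, Hd)
```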

Results
Three bacterial genes of Arsenophonus, fbaA, yaeT and ftsK, were sequenced for 152 Aleyrodidae individuals sampled from different geographical locations and host plants (Figure 1, Table 1). The obtained sequences exhibited a high degree of identity to sequences from the bacterial genus Arsenophonus available in the NCBI database (http://www.ncbi.nlm.nih.gov), ranging from 91 to 100% for fbaA, 94 to 98% for yaeT, and 91 to 100% for ftsK. The G-C content varied from 39 to 46% (Table 3), the expected range for these bacteria [65].

Prevalence and co-occurrence of Arsenophonus
The prevalence of Arsenophonus was highly variable among and within genetic groups and locations (Table 1). Within the Q3 and ASL groups found only in Africa, more than 80% of the individuals were infected with Arsenophonus, whereas the prevalence was lower in the AnSL group (50% on average). The infection level was much more variable in Q2 (from 33 to 100%) and Ms (from 4 to 100%). Furthermore, all individuals tested from T. vaporariorum (30) and B. afer (2) were infected with Arsenophonus. Since the sampling was not performed on the same host plants, or in the same locations or countries for a given group, we could not test for the influence of host plant or locality. Based on the three sequenced genes, we could not detect individual co-infection by two lineages of Arsenophonus in the same whitefly.

Allelic variation
Nine alleles were found for both ftsK and fbaA, and 11 for yaeT (Table 4). In these three genes, only 12.1% of the sites showed variation (110/906; Table 3). The observed allelic diversity was not randomly distributed. In fact, strong and significant differentiation (Fct = 0.69*, explaining 69% of the total variation in the sample, Table S1 in Additional file 1) was observed between groups of alleles, with each group being mostly associated with a genetic group within the B. tabaci complex or one of the other Aleyrodidae species tested (T. vaporariorum or B. afer).
For the ftsK locus, we observed indels of two types: a 2-bp insertion found exclusively in the Arsenophonus hosted by the Q2 genetic group and a 1-bp deletion found in some ASL and Q2 individuals. These two indels resulted in hypothetical truncated FtsK proteins of 866 or 884 amino acids, respectively (the predicted FtsK has 1030 amino acids in Arsenophonus nasoniae [Genbank: CBA73190.1]; Table S2 in Additional file 1).
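Why an indel whose length is not a multiple of three can truncate a protein is shown in the sketch below. It scans a coding sequence codon by codon for the first in-frame stop codon, before and after a 1-bp deletion and a 2-bp insertion. The 21-nt sequence is an invented toy example, not the ftsK sequence, and the helper names are hypothetical.

```python
from typing import Optional

STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop_codon(cds: str, frame: int = 0) -> Optional[int]:
    """Return the 0-based index of the first in-frame stop codon, or None."""
    for i in range(frame, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return (i - frame) // 3
    return None


def delete_base(seq: str, pos: int) -> str:
    """Remove the single base at 0-based position `pos`."""
    return seq[:pos] + seq[pos + 1:]


def insert_bases(seq: str, pos: int, bases: str) -> str:
    """Insert `bases` before 0-based position `pos`."""
    return seq[:pos] + bases + seq[pos:]


# Toy coding sequence: ATG GCA TTG ACC GAA CGT TAA (stop only at the end).
reference = "ATGGCATTGACCGAACGTTAA"

deletion = delete_base(reference, 5)          # 1-bp deletion shifts the frame
insertion = insert_bases(reference, 7, "GA")  # 2-bp insertion shifts the frame

ref_stop = first_stop_codon(reference)
del_stop = first_stop_codon(deletion)
ins_stop = first_stop_codon(insertion)
print(ref_stop, del_stop, ins_stop)
```

In both mutated toy sequences the shifted reading frame exposes a TGA codon well before the original stop, which is the same kind of effect as the premature STOP codons reported for ftsK.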
Among the 152 individuals used in this study, a total of 19 haplotypes of Arsenophonus were identified, which is low compared to the theoretical 891 allelic combinations (9 x 9 x 11, 9 alleles for both ftsK and fbaA, and 11 for yaeT; Table 4).

Recombination analysis
Using the RDP3 package, recombination events were tested for each gene separately and for the concatenated data set using all sequences studied (see Figure 2). No recombination events were detected for any of the gene portions analyzed separately, suggesting that there is no intragenic recombination. For the concatenated data set, four of the seven algorithms tested (GENECONV, Bootscan, Maximum Chi Square, and Chimaera) showed two significant recombination events (Table S3 in Additional file 1). Recombination events were detected in individuals B1-47 and B1-42 (ASL genetic group) for the whole region of the ftsK gene (positions 366 to 617 in the concatenated alignment).
Parental-like sequences determined for the recombinant B1-42 were VILCU10 (Q2 genetic group, major parent) and B1-45 (ASL genetic group, minor parent), and parental-like sequences for the recombinant B1-47 were O2-22 (Q3 genetic group, major parent) and B1-34 (ASL genetic group, minor parent). These two recombinant sequences suggest recombination events between Arsenophonus sequences resembling those of the Q2 and ASL genetic groups for B1-42, and those of the Q3 and ASL genetic groups for B1-47.

Phylogenetic inference of relationships
All tree topologies (each gene separately and the combined analysis) were the same with both ML and Bayesian analyses, and we therefore present trees with both bootstrap statistics and Bayesian posterior probabilities (Figures 2, 3; Figure S2 in Additional file 1).

Phylogenetic analysis among Arsenophonus from Aleyrodidae
The phylogenetic trees obtained for each of the three loci were congruent except for the two recombinants (B1-42 and B1-47). Thus, we conducted analyses using the 907-bp concatenated fbaA, ftsK and yaeT sequences.
Table legend (fragment): Shown are mean GC%, number of polymorphic sites including gaps (S), the total number of mutations (η), average number of pairwise nucleotide differences per site among the sequences (π), number of haplotypes (h) and haplotype diversity (Hd). The total number of individuals includes the singleton B. afer.
The concatenated tree (Figure 3) revealed the existence of two highly supported clades composed of six groups and one singleton (the Arsenophonus found in B. afer, genetically distant from B. tabaci; Figure S1 in Additional file 1).
The first clade was composed of Q2, Ms, Trialeurodes and some ASL individuals. The second clade was composed of Q3, ASL and AnSL individuals. Interestingly, ASL individuals sampled from the same location and host plant (Burkina Faso, Bobo/Kuinima, Tomato, Marrow; Table 1) were found in both Arsenophonus clades, and included the recombinants as well.
The six phylogenetic groups of Arsenophonus correlated strongly with the B. tabaci genetic groups defined on the basis of the mitochondrial COI, and with the two other Aleyrodidae species. Indeed, four groups were each composed exclusively of individuals belonging to a single genetic group: Ms, ASL, Q3 and Q2, respectively. The two other groups included either two distinct COI groups of B. tabaci (ASL and AnSL) or individuals from two different host species: B. tabaci (with Ms genetic group individuals from Madagascar, Tanzania and Reunion) and T. vaporariorum (Tables 3, 4).
Comparative analysis of the genetic divergence of these groups at the three loci (Tables 3, 4) revealed that the group composed of ASL and AnSL individuals is the most polymorphic (π = 0.0068), while the Q2 group is highly homogeneous despite several sampling origins (Table 1). Overall, DNA polymorphism was rather low with an average value of group π means of 0.002.

Phylogenetic relatedness of Arsenophonus strains from other insect species
The Arsenophonus isolates observed in our B. tabaci samples proved to be phylogenetically very close to the Arsenophonus strains found in other insect species (Figure 3). One clade, composed of T. vaporariorum, B. afer, the B. tabaci groups Ms, Q2, and some individuals belonging to ASL, fell into the Aphis sp. and Triatoma sp. Arsenophonus clade described by Duron et al. [17]. The other clade comprised mainly Arsenophonus infecting Hymenoptera (Nasonia vitripennis, Pachycrepoideus vindimmiae, Muscidifurax uniraptor) and the dipteran Protocalliphora azurea.

Discussion
In this paper we report on a survey of the Arsenophonus bacterial symbiont in whitefly species, and in particular in B. tabaci. The data revealed considerable within-genus diversity at this fine host taxonomic level. Previous studies conducted in several arthropod species have found Arsenophonus to be one of the richest and most widespread symbiotic bacteria in arthropods [9,15]. However, those studies were performed with 16S rRNA, which is present in multiple copies in the genome of the bacterium [25] and has proven to be a marker that is highly sensitive to methodological artifacts, leading to an overestimation of the diversity [15]. The phylogenetic analyses performed on concatenated sequences of three Arsenophonus genes from whiteflies identified two well-resolved clades corresponding to the two clades obtained in the MLST study performed by Duron et al. on a larger insect species scale [17]. One clade was composed of Arsenophonus lineages from three B. tabaci genetic groups (Ms, ASL, Q2), T. vaporariorum and B. afer, and strains found in other Hemiptera. The other clade, initially clustering Arsenophonus strains found in Hymenoptera and Diptera, also contained whitefly symbionts of the AnSL, ASL and Q3 genetic groups of the B. tabaci species complex. This clade thus combines insect hosts from phylogenetically distant taxa. The lineages of Arsenophonus from this clade were most likely acquired by whiteflies more recently through lateral transfers from other insect species. The genetic groups of B. tabaci represented in this clade all originated from Africa (AnSL, ASL and Q3), which could be explained by horizontal transmission events among groups of B. tabaci after a first interspecific transfer of Arsenophonus from another insect genus. There have been many reports of interspecific horizontal transfers of facultative symbiotic bacteria, suggesting that this phenomenon is frequent in arthropods and probably represents the most common process in the establishment of new symbioses [8]. For example, extensive horizontal transmissions of the reproductive manipulator Wolbachia have occurred between insect species [66]. However, horizontal transfers of Arsenophonus have been poorly documented to date. Nevertheless, a bacterium called Candidatus Phlomobacter fragariae, which is a pathogen of strawberry plants, is phylogenetically close to Arsenophonus associated with some Hemiptera (from cixiids) and more distantly related to psyllid and delphacid secondary endosymbionts [20,67], showing probable evidence of horizontal transfer between plants and insects. Recently Duron et al. [17] demonstrated, by phylogenetic analysis and experimental studies, the existence of such horizontal transmission of Arsenophonus strains among different wasp species through multi-parasitism. Here we provide indirect phylogenetic evidence of horizontal transmission of Arsenophonus among distantly related species that do not have clear intimate ecological contact (via predation or parasitism for instance) and thus have fewer opportunities for horizontal transfers. This could be explained by the particular features of Arsenophonus, most notably its broad spectrum of host species (many insect taxa but also plants) and its ability to grow outside the host [68].
Table legend (fragment): Number of individuals per haplotype and frequencies are indicated. The name of each haplotype is the name of one of its representatives. The genetic groups of B. tabaci associated with the haplotype are indicated in parentheses.
On a lower taxonomic scale, within the whitefly species, 19 haplotypes were identified among the 152 concatenated sequences of Arsenophonus obtained in this study. They formed six phylogenetic groups and one singleton corresponding to the Arsenophonus strain found in the host species B. afer. These groups did not cluster individuals according to host plant or sampling site, and four of them were congruent to the B. tabaci genetic groups.
Among the two other phylogenetic groups, one clustered B. tabaci individuals that belonged to two strongly diverse genetic groups, ASL and AnSL, which are considered two different species [29] and which were not collected on either the same host plant or in the same country (Burkina Faso and Benin/Togo, respectively). Only some of the ASL individuals belonged to this group, while the others clustered together. These two groups split into the two clades found in whiteflies, which may reflect two separate acquisition events.
The other group of Arsenophonus comprised individuals of two whitefly species, T. vaporariorum and B. tabaci (Ms individuals originated from different countries: Madagascar, Tanzania or Reunion). The Arsenophonus strains found in Ms individuals clustered into two groups, but they fell into the same clade (close to Hemiptera). The haplotype diversity of this group was very low, suggesting a recent transfer between T. vaporariorum and Ms. One hypothesis is that the exchange of Arsenophonus lineages between these two species occurred through their parasitoids, as previously described for Wolbachia in planthoppers [69], since T. vaporariorum and B. tabaci share some parasitoid species (such as Encarsia or Eretmocerus) and are usually found in sympatry. A second pathway of infection could be through their feeding habit via the plant, as both species are found in sympatry in the field and share the same host plant range. Such a method of symbiont acquisition has been hypothesized for Rickettsia in B. tabaci [70].
Within the B. tabaci species complex, we found, for the first time for Arsenophonus, intergenic recombination events in two individuals belonging to the ASL genetic group. The parental-like sequences came from Q2, Q3 and ASL individuals. Although unexpected for intracellular bacteria, homologous recombination has been described in some endosymbiotic bacteria [26,27]. For example, Wolbachia showed extensive recombination within and across lineages resulting in chimeric genomes [27]; Darby et al. [25] also found evidence of genetic transfer from Wolbachia symbionts, and phage exchange with other gammaproteobacterial symbionts, suggesting that Arsenophonus is not a strictly clonal bacterium, in agreement with the present study. These recombination events may have important implications for the bacteria, notably in terms of phenotypic effects and capacity of adaptation to new hosts, and thus for the bacterial-host association [8], and might prevent the debilitating effects of obligate intracellularity (e.g., Muller's ratchet [71]). In the Wolbachia genome, intergenic and intragenic recombinations occur; we detected only intergenic recombination events between ftsK and the two other genes in Arsenophonus. Surprisingly, we detected indels inducing STOP codons in this gene. These indels, found in all individuals of the Q2 genetic group sampled in Israel, France, Spain, and Reunion, disable the end of the ftsK portion sequenced in this study. In bacteria, ftsK is part of an operon of 10 genes necessary for cell division [72]. However, a recent study has demonstrated that, in Escherichia coli, overexpression of one of the 10 genes of this operon (ftsN) is able to rescue cells in which ftsK has been deleted [73]. This gene, ftsN, is also present in the Arsenophonus genome [Genbank: CBA75818.1]. These data suggest that ftsK may not be suitable for an MLST approach and that other conserved genes should be targeted instead.
Future studies should focus on obtaining extensive data related to the specificity of Arsenophonus-Q2 interactions. It would be interesting to sample more Q2 individuals infected with Arsenophonus to determine the prevalence of this STOP codon in natural populations and its consequences for the bacteria.

Conclusions
In this study, we found that the diversity of Arsenophonus strains in B. tabaci corresponds with the diversity observed on a larger scale in insect species. It would be interesting, in further studies, to extend the sampling to more host species in order to get an accurate idea of the diversity of Arsenophonus lineages. However, a complete understanding of the Arsenophonus phylogeny would require more molecular markers. This could be achieved through the use of other housekeeping genes for the MLST approach or insertion sequences and mobile elements, which is now possible since the genome of Arsenophonus has been completely sequenced. We found intergenic recombinations using only three genes, suggesting that such events could be frequent in the Arsenophonus genome. Understanding the Arsenophonus genomic features is crucial for further research on the evolution and infection dynamics of these bacteria, and on their role in the host phenotype and adaptation. Depending on their effects on host physiology and phenotype, they could then potentially be exploited in efforts to manipulate pest species such as B. tabaci.

Additional material
Additional file 1:
Figure S1. Partial mitochondrial COI gene phylogeny of Aleyrodidae individuals used in this study. The tree was constructed using a Bayesian analysis. Node supports were
Figure S2. Arsenophonus phylogeny using maximum-likelihood (ML) and Bayesian analyses based on sequences of the three genes fbaA (A), ftsK (B) and yaeT (C). Different evolution models were used to reconstruct the phylogeny for each gene [fbaA (HKY), ftsK (GTR), yaeT (HKY+I)]. Bootstrap values are shown at the nodes for ML analysis and the second number represents the Bayesian posterior probabilities.
Table S1. Analysis of molecular variance computed by the method of Excoffier et al. [69] on samples of Arsenophonus from several Aleyrodidae species. Group denomination was according to their hosts, i.e. Bemisia tabaci: ASL, AnSL, Q2, Q3, Ms, Bemisia afer, Trialeurodes vaporariorum. Each species (group) was separated into populations corresponding to location of sampling. *p < 0.05.
Table S2. Haplotypes of the three sequenced genes fbaA (A), ftsK (B), yaeT (C) recovered across all 152 samples of Aleyrodidae collected in this study. Only polymorphic positions are shown, and these are numbered with reference to the consensus sequence. Dots represent identity with respect to reference. The frequency indicates the number of times the haplotype was found in the total sample. *non-synonymous mutations.
Table S3. Recombination in Arsenophonus. Details of the Arsenophonus recombination events detected in this study, including parental-like sequences, and p-values for various recombination-detection tests, using RDP3 [60].