Bayesian phylogenetic tree of Perkinsea diversity using the V4 region of the SSU rDNA gene (part 1) and provenance of the Perkinsea BioMarKs sequences. A subsection of the phylogenetic tree is shown (see Figure 3 for the rest of the phylogeny). Bayesian posterior probability, Maximum Likelihood (ML) bootstrap (1,000 replicates), and LogDet distance bootstrap (1,000 replicates) values are added at each node using the following convention: support values are summarised by black circles on the branch when they are equal to or higher than 0.9/80%/80%, and by ringed circles when they are between 0.6/60%/60% and 0.9/80%/80%. When the bootstrap value is below 60%, a “+” is added if the topology of the tree is recovered in the ML and LogDet analyses, and a “-” is shown when these tree topologies are not consistent. Nine sequences of Dinoflagellata and five sequences of Marine Alveolata group II were used as an outgroup. Branches shortened by ½ are labelled with a double slashed line. The black and grey branches on the tree indicate marine and freshwater lineages, respectively. The distribution and provenance of sequences across RNA- and DNA-derived libraries are illustrated in the columns on the right, as shaded triangles where they represent a cluster group. Numbers in brackets refer to multiple identical sequence reads from the same sample. Circles represent the provenance of a single-environment unique sequence cluster. The colour of each triangle designates the number of sequences recovered from each location (surface, DCM and sediment) or from rDNA/rRNA for each cluster group: white represents between 0 and 5 sequences, grey between 6 and 10, and black more than 10. For correspondence between the 17 freshwater cluster groups identified by Bråte et al. (2010) and the 5 freshwater cluster groups identified in our analyses, see Additional file 4: Table S3.
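The symbol and shading conventions in this legend can be read as simple threshold rules. The following is a minimal sketch of one possible reading, not the procedure used to draw the figure. The function names `support_symbol` and `triangle_shade` are hypothetical. The sketch assumes that all three support values must reach a threshold for the corresponding circle to be drawn, which the legend does not state explicitly.

```python
# Thresholds taken from the legend, in the order:
# Bayesian posterior probability, ML bootstrap (%), LogDet bootstrap (%).
HIGH = (0.9, 80.0, 80.0)
LOW = (0.6, 60.0, 60.0)


def support_symbol(pp, ml_boot, logdet_boot, topology_consistent):
    """Return the node symbol under the assumed reading of the legend.

    Assumption (not stated in the legend): all three values must meet a
    threshold for the corresponding circle to be used.
    """
    values = (pp, ml_boot, logdet_boot)
    if all(v >= t for v, t in zip(values, HIGH)):
        return "black circle"
    if all(v >= t for v, t in zip(values, LOW)):
        return "ringed circle"
    # Below the lower thresholds: "+" if the ML and LogDet analyses
    # recover the same topology, "-" otherwise.
    return "+" if topology_consistent else "-"


def triangle_shade(n_sequences):
    """Map a sequence count to the triangle shade given in the legend."""
    if n_sequences < 0:
        raise ValueError("sequence count cannot be negative")
    if n_sequences <= 5:
        return "white"
    if n_sequences <= 10:
        return "grey"
    return "black"
```

For example, under these assumptions a cluster group with 7 sequences from sediment would be drawn as a grey triangle in the sediment column.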