PCA plot showing the clustering of the samples. The PCA is based on taxonomic (phylum level) and metabolic (SEED subsystems, level I) parameters combined. The geochemical parameters were overlaid using the envfit function of the vegan library in R. The first principal component accounted for 95 % of the variation in the dataset, and the second principal component accounted for 3 %. All metagenome data are given as percent of total reads. The geochemical parameters were normalized by dividing by the standard deviation and subtracting the smallest value from all values in each row. Plot A: the metagenomic parameters are represented by red arrows; labels are shown for parameters with a Euclidean distance of more than 0.1 from the origin. The geochemical parameters are represented by blue arrows; only the most significant geochemical parameters (p-value < 0.1) are shown. Plot B: an excerpt of plot A, magnifying the central region of the plot. Labels are included for all metagenomic parameters with a Euclidean distance of more than 0.02 from the origin.
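
The row-wise normalization and the label thresholds described in the caption can be sketched as follows. This is a minimal illustration in Python with NumPy, not the R code used for the figure. The function names `normalize_rows` and `labelled` are hypothetical, and the use of the sample standard deviation (`ddof=1`, matching the default of R's `sd`) is an assumption, since the caption does not state which estimator was used.

```python
import numpy as np


def normalize_rows(values, ddof=1):
    """Scale each row by its standard deviation, then shift its minimum to 0.

    Assumption: sample standard deviation (ddof=1). A constant row has a
    standard deviation of 0 and cannot be normalized this way.
    """
    values = np.asarray(values, dtype=float)
    sd = values.std(axis=1, ddof=ddof, keepdims=True)
    scaled = values / sd
    # Subtract the smallest value of each (scaled) row from that row.
    return scaled - scaled.min(axis=1, keepdims=True)


def labelled(coords, threshold):
    """Return True for arrows whose tip lies farther than `threshold`
    (Euclidean distance) from the origin of the two plotted axes."""
    coords = np.asarray(coords, dtype=float)
    return np.hypot(coords[:, 0], coords[:, 1]) > threshold


if __name__ == "__main__":
    # Made-up values for illustration only; these are not data from the study.
    geochem = [[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]]
    print(normalize_rows(geochem))

    arrows = [[0.2, 0.0], [0.05, 0.05], [0.0, -0.15]]
    print(labelled(arrows, 0.1))
```

Because the standard deviation is positive, dividing first and then subtracting the row minimum gives the same result as subtracting the minimum first and then dividing, so the order of the two steps in the caption does not affect the outcome. Calling `labelled` with a threshold of 0.1 corresponds to the labelling rule for plot A, and a threshold of 0.02 to that for plot B.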