- Research article
- Open Access
Diversity and transcription of proteases involved in the maturation of hydrogenases in Nostoc punctiforme ATCC 29133 and Nostoc sp. strain PCC 7120
BMC Microbiology volume 9, Article number: 53 (2009)
The last step in the maturation process of the large subunit of [NiFe]-hydrogenases is a proteolytic cleavage of the C-terminal by a hydrogenase specific protease. Contrary to other accessory proteins, these hydrogenase proteases are believed to be specific, whereby one type of hydrogenase specific protease cleaves only one type of hydrogenase. In cyanobacteria this is achieved by the gene product of either hupW or hoxW, specific for the uptake or the bidirectional hydrogenase, respectively. The filamentous cyanobacteria Nostoc punctiforme ATCC 29133 and Nostoc sp. strain PCC 7120 may contain a single uptake hydrogenase or both an uptake and a bidirectional hydrogenase, respectively.
In order to examine these proteases in cyanobacteria, transcriptional analyses were performed of hupW in Nostoc punctiforme ATCC 29133 and of hupW and hoxW in Nostoc sp. strain PCC 7120. These studies revealed numerous transcriptional start points together with putative binding sites for NtcA (hupW) and LexA (hoxW). In order to investigate the diversity and specificity among hydrogenase specific proteases we constructed a phylogenetic tree, which revealed several subgroups that showed a striking resemblance to the subgroups previously described for [NiFe]-hydrogenases. Additionally, the protease specificity was addressed by amino acid sequence analysis and protein-protein docking experiments with 3D-models derived from bioinformatic studies. These studies revealed a so called "HOXBOX"; an amino acid sequence specific for proteases of Hox-type which might be involved in docking with the large subunit of the hydrogenase.
Our findings suggest that the hydrogenase specific proteases are under similar regulatory control as the hydrogenases they cleave. The result from the phylogenetic study also indicates that the hydrogenase and the protease have co-evolved since ancient times and suggests that at least one major horizontal gene transfer has occurred. This co-evolution could be the result of a close interaction between the protease and the large subunit of the [NiFe]-hydrogenases, a theory supported by protein-protein docking experiments performed with 3D-models. Finally we present data that may explain the specificity seen among hydrogenase specific proteases, the so called "HOXBOX"; an amino acid sequence specific for proteases of Hox-type. This opens the door for more detailed studies of the specificity found among hydrogenase specific proteases and the structural properties behind it.
Cyanobacteria evolved more than 2.0 billion years ago and were the first organisms to perform oxygenic photosynthesis [1, 2]. They exist in many different shapes and forms, e.g. unicellular, filamentous and colonial, and can even form symbioses with a variety of organisms. Several cyanobacterial strains also have the ability to fix atmospheric nitrogen into ammonium, a process performed by the enzyme complex nitrogenase. Among filamentous cyanobacteria like Nostoc sp. strain PCC 7120 and Nostoc punctiforme ATCC 29133 (from now on referred to as Nostoc PCC 7120 and Nostoc punctiforme), both used in the present study, this process takes place in specialised cells called heterocysts, in which a thick envelope and lack of photosystem II activity create a nearly oxygen free environment for the nitrogenase [3, 4]. The same nitrogenase is also a key player in the hydrogen (H2) metabolism by producing H2 as a by-product during the fixing of atmospheric nitrogen (N2). In addition, cyanobacteria may also possess distinct [NiFe]-hydrogenases.
The cyanobacterial hydrogenases can functionally be divided into two groups: uptake hydrogenases (dimeric HupSL), which consume H2, and bi-directional hydrogenases (pentameric HoxYHEFU), which can both consume and produce H2. In the case of Nostoc PCC 7120 both hydrogenases may be present, while Nostoc punctiforme only contains the uptake hydrogenase [3, 5].
The cyanobacterial uptake hydrogenase is closely connected to both the N2-fixing process and the occurrence of a nitrogenase, recycling the H2 and thereby regaining energy and electrons. The function of the bi-directional hydrogenase is less clear, and suggestions range from functioning as a mediator of reducing power during anaerobic conditions to it being part of respiratory complex I.
Both types of hydrogenases go through an extensive maturation process that involves several different accessory proteins. Even though much is still to be learned about this maturation process in cyanobacteria, comprehensive studies in other organisms like Escherichia coli have been performed [6, 7]. Particularly the large subunit of [NiFe]-hydrogenase (HupL and HoxH in cyanobacteria) requires numerous accessory proteins responsible for metal transport, biosynthesis and insertion of the metal atoms nickel and iron into its active site. The genes encoding these proteins are usually referred to as the hyp-genes and have been identified in many organisms including several cyanobacterial strains. The Hyp-proteins are considered unspecific and there is usually only one set of hyp-genes irrespective of the number of hydrogenases in a single strain [8, 9]. It was recently suggested that a set of protein encoding genes within the extended hyp-operon of Nostoc PCC 7120 may be involved in the maturation of the small subunit of the cyanobacterial uptake hydrogenase.
The final step in the maturation process of the large subunit is a proteolytic cleavage of the C-terminal, which results in a conformational change and the association of the large subunit with the small subunit [11, 12]. The number of amino acids that are cleaved off varies between different hydrogenases and organisms, but the cleavage always takes place after the conserved motif DPCXXCXXH/R, resulting in the histidine being the new C-terminal amino acid [11–14]. Several experiments together with sequencing data have indicated that these putative proteases, contrary to the Hyp-proteins, are specific to different hydrogenases; not only to hydrogenases in different bacterial strains but also to different hydrogenases within the same strain [12, 15]. In both Nostoc punctiforme and Nostoc PCC 7120 putative proteases have been identified through secondary and tertiary structure alignments. The protein product of the gene hupW is believed to process HupL (the large subunit of the uptake hydrogenase) and can be found in both cyanobacterial strains. Nostoc PCC 7120, however, which in addition harbours a bi-directional hydrogenase, also contains hoxW, whose protein product is believed to be involved in the processing of HoxH [5, 16].
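The cleavage rule described above can be illustrated with a short sketch. It assumes that the motif DPCXXCXXH/R can be written as a regular expression in which X is any residue, and that cleavage follows the last occurrence of the motif. The peptide is a made-up toy sequence (not a real HupL or HoxH sequence) and the function name `split_at_cleavage_site` is hypothetical.

```python
import re

# DPCXXCXXH/R: X is any residue; the final position is His or Arg.
CLEAVAGE_MOTIF = re.compile(r"DPC..C..[HR]")

def split_at_cleavage_site(large_subunit: str):
    """Return (mature_protein, c_terminal_extension), or None if the motif is absent.

    Cleavage is modelled as occurring directly after the last residue of the
    final motif occurrence, so that H (or R) becomes the new C-terminal residue.
    """
    matches = list(CLEAVAGE_MOTIF.finditer(large_subunit))
    if not matches:
        return None
    cut = matches[-1].end()
    return large_subunit[:cut], large_subunit[cut:]

# Hypothetical toy sequence, for illustration only.
toy = "MSKTLVIDPCLACAVHVVDLRGNKLA"
mature, extension = split_at_cleavage_site(toy)
print(mature[-1], len(extension))
```

The length of the returned extension corresponds to the number of amino acids cleaved off, which according to the text varies between hydrogenases and organisms.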
It is still unknown exactly how the recognition of the different hydrogenases takes place and which part(s) of the protease determine specificity. A crystal structure of a large subunit-protease complex is not yet available from any organism. However, the protease HybD from E. coli has been crystallised, giving vital clues about its function. The importance of Ni-incorporation into the active site for any cleavage to occur has been addressed [13, 18, 19] and, together with amino acid replacement experiments, it has been shown that nickel is an important substrate recognition motif. In addition, the protease binds directly to the metal [17, 19], and the crystal structure of HybD in E. coli showed that three amino acids, Glu16, Asp62 and His93, are most likely to be involved in the metal binding.
Contrary to the lack of functional studies of cyanobacterial hydrogenases, extensive studies have been done on the transcriptional regulation of cyanobacterial hydrogenases and their accessory genes. Several putative binding sites of different transcription factors have been reported in connection with the uptake hydrogenase, such as FNR (fumarate-nitrate reduction) in Anabaena variabilis, the global nitrogen regulatory protein NtcA in Nostoc punctiforme, Lyngbya majuscula CCAP 1446/4 and Gloeothece sp. strain ATCC 27152, and IHF (integration host factor) in Nostoc punctiforme and Lyngbya majuscula CCAP 1446/4. Participation by the transcription factor NtcA fits in well with the known connection between the uptake hydrogenase and N2 fixation. Further, it has been shown that the uptake hydrogenase is only transcribed under N2-fixing conditions and in connection with heterocyst formation [20, 21].
The genes encoding the bi-directional hydrogenase, contrary to those of the uptake hydrogenase, are transcribed in both heterocysts and vegetative cells and under both non N2-fixing and N2-fixing conditions. So far, two transcription factors have been identified in connection with the bi-directional hydrogenase, LexA and an AbrB-like protein [22–24].
In the present study we investigate the transcriptional regulation of the genes encoding hydrogenase specific proteases hupW in Nostoc punctiforme and hupW and hoxW in Nostoc PCC 7120, under both N2-fixing and non N2-fixing conditions. In addition, we address the question of the diversity, specificity and evolution of the hydrogenase specific proteases in cyanobacteria.
Diversity of cyanobacterial hydrogenase specific proteases
To examine the diversity of hydrogenase specific proteases and their relationship to each other, in cyanobacteria and other microorganisms, a phylogenetic tree was constructed using both PAUP and MrBayes analysis. Since no suitable outgroup has been found for the proteases at this stage, a non-rooted tree was constructed including clade credibility values. The resulting tree from the MrBayes analysis revealed several subgroups among the hydrogenase specific proteases, which correlate with the respective hydrogenase groups according to Vignais et al. (Figure 1):
1. Bacterial proteases (cleave group 1 hydrogenases)
2. Cyanobacterial proteases, HupW type (cleave group 2 hydrogenases)
3. Bacterial and archaeal proteases
a. Archaeal proteases (cleave group 3a hydrogenases)
d. Bacterial proteases, HoxW type (cleave group 3d hydrogenases)
4. Bacterial and archaeal proteases, Hyc type (cleave group 4 hydrogenases)
The phylogenetic groups of the hydrogenase specific protease have been named according to the nomenclature used for [NiFe]-hydrogenase.
The result from the PAUP analysis is less resolved but supports the result from the MrBayes analysis, with some minor differences within group 3d (HoxW in Synechocystis sp. strain PCC 6803 and HoxW in Synechococcus sp. strain PCC 7002 are shown as more closely related).
An extended phylogenetic tree was also constructed containing more strains including hydrogenase specific proteases cleaving type 3b-hydrogenases. This tree was unfortunately less reliable and far from robust with several weak nodes (Additional file 1 and Additional file 2). However the result showed putative group 1 proteases and putative group 3b proteases as less clustered and instead spread around point X (Figure 1 and Additional file 1).
Transcriptional studies of hupW in Nostoc punctiforme ATCC 29133 and Nostoc sp. strain PCC 7120
Northern hybridisations were performed of hupW in both Nostoc punctiforme and Nostoc PCC 7120 using both N2-fixing and non N2-fixing cultures (Figure 2). The results from Nostoc PCC 7120 revealed two transcripts. The first is shorter (approx. 500 nt) and present under both N2-fixing and non N2-fixing conditions, while the second, longer transcript (approx. 1600 nt) is only present under N2-fixing conditions. The size of the longer transcript is comparable with the size of a two-gene operon containing hupW together with the upstream gene alr1422, a gene of unknown function (Figure 3a). RT-PCR confirmed that the two genes are co-transcribed (Figure 3a). Additional 5'RACE experiments revealed three TSPs, of which the first is located 234 bp upstream of hupW. Subsequent bio-informatic studies identified a putative σ70-like -10 and -35 box (TATAAT and TTAAAA, respectively) (Figure 3a) and two imperfect putative NtcA binding sites (TGAN8CAC and GTAN12TAC). By running the complete intergenic region in BLAST at Cyanobase two conserved regions were also discovered. Both can be found in the intergenic regions of several genes in Nostoc PCC 7120 and Anabaena variabilis ATCC 29413 (data not shown). Their function is unclear, but one of them shows similarity to the consensus sequence WATCAANNNNTTR from the previously described IHF binding sites. The second and third TSPs were identified inside the gene alr1422, 4 bp and 14 bp downstream of the putative translation start site. A new putative translation start site within the same frame was found 115 bp downstream from the previously suggested start site. By analysing the sequence of the promoter region, a -10 box (TATTTT and TATCAT), a -35 box (TTAAAC and TACCGA) and two putative NtcA binding sites (GTAN8AAC/GTN10AC) 147/157 bp and 62/72 bp upstream of the two TSPs were also identified.
For Nostoc punctiforme a transcript of hupW of about 1300 nt is only present in N2-fixing cultures (Figure 2). 5'RACE identified a single TSP 607 bp upstream of hupW in Nostoc punctiforme, together with a σ70-like -10 box sequence (TAGGCT) and a putative NtcA binding site (GTAN8CAC) located 40 bp upstream from the TSP (Figure 3b). The resulting transcript includes the upstream gene Npun_F0373, which was confirmed by RT-PCR using primers for the subsequent PCR covering the intergenic region, and agrees with the result from the Northern blot experiments (Figure 2 and 3b).
In silico analysis of alr1422 and Npun_F0373 in Nostoc sp. strain PCC 7120 and Nostoc punctiforme ATCC 29133
Homologues to alr1422 in Nostoc PCC 7120 are present in two other strains, Anabaena variabilis ATCC 29413 (ava3972) and Trichodesmium erythraeum IMS101 (tery_3492). It shows no transmembrane regions or domains that would give an indication of its function.
The gene Npun_F0373 is of unknown function, but a search with NCBI BLAST revealed four homologues in other microorganisms, all cyanobacterial: Nostoc PCC 7120, Anabaena variabilis ATCC 29413, Nodularia spumigena CCY 9414 and Nostoc sp. strain PCC 7422 (Figure 4, Additional file 3). In Nostoc sp. strain PCC 7422 only parts of the genome are sequenced, and in the 5' end of GenBank accession number AB237640 the first 63 bp of the gene can be identified. The gene is truncated in Nodularia spumigena CCY 9414 but is intact in the other strains, and in two cases (Nostoc punctiforme and Nodularia spumigena CCY 9414) it is located directly upstream of hupW and/or the uptake hydrogenase genes. Alignments of the promoter sequences of these genes show highly conserved promoter regions, all containing putative NtcA binding sites, a -10 box and a putative Shine-Dalgarno sequence, and even suggest a putative TSP for four out of the five genes (the Npun_F0373 homologue in Nodularia spumigena CCY 9414 is probably transcribed with the upstream gene, hupL) (Figure 4). Bio-informatic studies of Npun_F0373 propose a transmembrane region between amino acids 84–105 but showed no other domains or sites giving clues to its function. However, when comparing strains that either harbour or lack the gene, it was found that among the strains containing Npun_F0373 and its homologues, the ability to form heterocysts is a shared feature (Additional file 4).
Transcriptional studies of hoxW in Nostoc sp. strain PCC 7120
hoxW is located between the genes all0771 (4-hydroxyphenylpyruvate dioxygenase) and all0769 (acetyl-CoA synthetase), both with no known relationship to H2 metabolism, and around 4.7 kbp downstream of the hoxHYU operon on the opposite strand (Figure 5).
Northern blot hybridisation of hoxW was performed using RNA isolated from both N2-fixing and non N2-fixing cultures, indicating an increased level of hoxW transcript under N2-fixing conditions and revealing several transcripts ranging from ~500 to ~1000 nt (Figure 5b). This was confirmed by 5'RACE experiments that showed TSPs at both 44 bp and 70 bp upstream of hoxW. When analysing the promoter region, a σ70-like -10 box (TAGCTT) but no -35 box was identified for the TSP 70 bp upstream of hoxW, while the TSP 44 bp upstream of hoxW contains a putative -35 box (TTAAAA) but no clear -10 box (Figure 5a).
When analysing the complete intergenic region between hoxW and its upstream gene all0771, two conserved regions appeared (Figure 5a). Both regions can be found between genes in numerous cases, especially in the genomes of Nostoc PCC 7120 and Anabaena variabilis ATCC 29413. The first conserved region, situated 204–231 bp upstream of hoxW, consists of four repeats, which when run through Mfold form a putative hairpin (dG = -10.21). The second region is located 162–195 bp upstream of hoxW, and its sequence TAGTAGTTATGTAAT(N12)TAGCTT shows resemblance to a LexA binding site, according to the previously defined motif RGTACNNNDGTWCB, together with a putative -10 box.
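The comparison of a promoter sequence against a degenerate consensus such as RGTACNNNDGTWCB can be sketched as follows. This is a minimal illustration, not the procedure used in the study: it assumes the standard IUPAC nucleotide ambiguity codes, scans one strand only, and simply counts mismatching positions per window. The function names are hypothetical.

```python
# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "W": "AT", "S": "CG", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def mismatches(window: str, consensus: str) -> int:
    """Count positions where the base is not allowed by the IUPAC code."""
    return sum(base not in IUPAC[code] for base, code in zip(window, consensus))

def best_window(sequence: str, consensus: str):
    """Return (offset, mismatch_count) of the best-matching window on this strand."""
    width = len(consensus)
    scores = [
        (mismatches(sequence[i:i + width], consensus), i)
        for i in range(len(sequence) - width + 1)
    ]
    count, offset = min(scores)
    return offset, count

lexa_consensus = "RGTACNNNDGTWCB"
hoxw_region = "TAGTAGTTATGTAAT"  # first part of the conserved region quoted above
print(best_window(hoxw_region, lexa_consensus))  # (1, 2)
```

A mismatch count above zero corresponds to the wording "shows resemblance" rather than an exact match; scanning the reverse complement as well would be a natural extension.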
Specificity of HupW and HoxW in cyanobacteria
To address the protease specificity, an alignment of protein sequences was performed to search for conserved regions specific to each protease group, HupW and HoxW (group 2 and 3d, Figure 1), in cyanobacteria. This study revealed that one of the conserved regions among the proteases is highly dissimilar when comparing HupW and HoxW in cyanobacteria (Figure 6 and Figure 7a). In most proteases, including HupW, this region consists of the sequence D(G/C/F)GT (aa 41–44 in HupW of Nostoc PCC 7120), while among the HoxW proteases it is replaced by the sequence H(Q/I)L (aa 42–44 in HoxW of Nostoc PCC 7120) (the latter from now on referred to as the HOXBOX).
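The two variants of this region can be written as simple patterns, as in the sketch below. It assumes that the residues at the aligned region have already been extracted from a multiple sequence alignment; matching the patterns anywhere in a full-length sequence would not be equivalent. The function name `classify_region` and the test segments are illustrative only.

```python
import re

# Patterns taken from the alignment result described above.
COMMON_REGION = re.compile(r"D[GCF]GT")  # most proteases, including HupW
HOXBOX = re.compile(r"H[QI]L")           # HoxW-type proteases

def classify_region(aligned_segment: str) -> str:
    """Classify the residues found at the aligned conserved region.

    `aligned_segment` is assumed to be the short stretch of residues from
    this alignment block, not a full-length protein sequence.
    """
    if COMMON_REGION.fullmatch(aligned_segment):
        return "D(G/C/F)GT"
    if HOXBOX.fullmatch(aligned_segment):
        return "HOXBOX"
    return "unclassified"

for segment in ("DGGT", "DFGT", "HQL", "HIL", "AAAA"):
    print(segment, classify_region(segment))
```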
To get a better understanding of this region and its possible function, bio-informatic work was performed targeting conserved and similar amino acids on the surface of putative HoxW and putative HupW in Nostoc PCC 7120 and HybD in E. coli, together with protein-protein docking experiments using the docking algorithm BiGGER. The studies showed that the conserved residues are not evenly distributed but clustered around the proposed nickel binding residues Glu16 and His93 (HybD – E. coli) and around the conserved "HOXBOX" region in all three cases. In HupW and HybD conserved surface areas could also be found along alpha helix 1, beta sheet 2 and alpha helix 4 [16, 17] (Figure 7a–b).
Protein docking experiments resulted in 11 hits for HybC-HybD (E. coli), 84 hits for HybB-HynC (Desulfovibrio vulgaris str. Miyazaki F) and 28 hits for HoxH-HoxW (Nostoc PCC 7120). The best hit for HybD in E. coli and HoxW in Nostoc PCC 7120 can be seen in Figure 7c, a target-probe complex in which the HOXBOX of the protease is in a less favourable position for C-terminal cleavage. This means that the HOXBOX is either facing away from the C-terminal or that other residues are blocking it, making it difficult for physical contact to occur without major conformational changes. This was the case for 70% of the hits, and the average distance of Gly42/His42 (HybD/HoxW) in the HOXBOX to the last amino acid of the C-terminal was around 17–20 Å. The majority of the hits indicated that the HOXBOX region and the areas around alpha helix 1, beta sheet 2 and alpha helix 4 are in close interaction with the large subunit of the hydrogenase. This is especially true for the HybC-HybD complex, while HoxH-HoxW showed a preference for a narrower interaction, with only the closest residues around Asp16 and His88 and the HOXBOX involved in the contact with HoxH. The preferred docking result for HybD in E. coli and HoxW in Nostoc PCC 7120 reflects the results from the studies of the conserved residues, as can be seen when comparing Figure 7b and Figure 7c.
Diversity of cyanobacterial hydrogenase specific proteases
Previous phylogenetic studies of hydrogenases in different microorganisms [3, 28, 29] clearly divide the proteins into four classes [28, 29]. One of the most extensive studies, using over 80 microorganisms, showed that the large and the small subunit of the hydrogenase enzyme evolved together and have been two tightly connected subunits for probably all of their evolutionary history. When comparing the evolution of hydrogenases with the present study of hydrogenase specific proteases, some striking resemblances appear, which indicate a similar development and co-evolution between the large subunit of the hydrogenases and their specific proteases (Figure 1).
Within the phylogenetic tree of the hydrogenase specific proteases, groups appear that are similar to those seen among the hydrogenase subunits. This is especially true for the proteases in group 1, 2, 3a and 4. Just as the hydrogenase subunit HycE in E. coli (group 4) is most closely related to the archaeal hydrogenases (group 3), so is its hydrogenase specific protease HycI (group 4) most closely related to group 3 proteases. The resemblance between the phylogenetic trees suggests that the co-evolution between the hydrogenase and the hydrogenase specific protease is of ancient origin, and an explanation for this might be found in the mechanism of the cleavage process. It has previously been suggested that a conformational recognition takes place between the protease and the large subunit, which may over time have enhanced the specificity seen among proteases.
The Hox-specific proteases of group 3d are the exception and can be found as an independent group (Figure 1). Further studies, even though not as robust, also show proteases of 3b type and additional proteases of group 1 type being spread either individually or on branches around point X (Additional file 1). These results contradict previous evolutionary studies of their respective hydrogenases, which have placed group 3b/3d hydrogenases as clearly defined subgroups within group 3 [NiFe]-hydrogenases. By comparing the [NiFe]-hydrogenase phylogenetic tree with the protease phylogenetic tree presented in this study, it also becomes apparent that none of groups 1, 2 or 3d would be the deepest branch in a rooted version of the tree. Such a tree would suggest that proteases within the groups 3b/3d developed before the proteases of group 3a and 4, which seems far-fetched since proteases of group 3a and 4 type cleave hydrogenases that are more deeply branched than the 3b/3d hydrogenases.
We therefore suggest that the placement of Hox-specific proteases (3d) and the scattered result of 3b proteases in the phylogenetic tree may be the result of horizontal gene transfer (HGT). HGT is today seen as a major force in evolution and has occurred numerous times between archaea and bacteria [30–33]. Within prokaryotes almost no gene family is untouched by HGT, and there are also numerous cases of HGT within cyanobacteria. [NiFe]-hydrogenases have not been spared from this mechanism, and an archaeal organism is believed to be the origin of the Ech-hydrogenase in Thermotoga maritima.
By comparing the phylogenetic trees of hydrogenases and their specific proteases, and assuming that the [NiFe]-hydrogenase and its specific protease have evolved together, the most likely scenario is that an early group 3 [NiFe]-hydrogenase, with or without its specific protease, was transferred, most probably from an archaeal organism to a bacterial one. If we assume that the type 3 hydrogenase and the protease were transferred together, then this indicates that the root of the tree should most likely be placed between group 3a and 4 (point Z; Figure 1) and that the transferred protease is the ancestor of all type 1, 2 and 3d proteases (Figure 8). If we assume the opposite (that the hydrogenase was transferred alone), then the root should instead be placed between type 1/2/3d and type 3a/4 proteases (point Y; Figure 1), and the transferred hydrogenase must have incorporated an already existing type 1 protease into its maturation process. The scattered impression of type 1 and 3b proteases from the less robust phylogenetic tree with additional hydrogenase specific proteases (Additional file 1) could be the result of, e.g., older phyla branching off close to the HGT point, poor resolution of the phylogenetic tree or additional HGT, and so does not contradict our proposed theory of HGT. Rooting the tree with an outgroup, germination protease (GPR), the closest relative to the [NiFe]-hydrogenase specific proteases (data not shown), placed the root between group 3a and 4, suggesting that the first scenario is more plausible (point Z; Figure 1). However, all attempts at rooting the tree resulted in very unstable phylogenetic trees. When considering both the GPR endopeptidase function (bacterial sporulation) and its taxonomic location (bacterial phylum Firmicutes only), it is plausible that the [NiFe]-hydrogenase specific proteases are instead the ancestor of GPR, making any tree with GPR as outgroup unreliable.
Based on the tree of life we also propose that the HGT of a protease/hydrogenase, probably of a 3b-like type, most likely took place before the diversification of the bacterial phyla and group 1 hydrogenases [37, 38]. By comparing our result with genomic timescales of prokaryotic evolution we can even suggest a time for the event of around 3–3.5 billion years ago [39, 40]. This is based on the observation that the archaeal phyla and classes started to evolve earlier (between 4 and 3 billion years ago) than the bacterial ones (~3–2.5 billion years ago) and on the proposition that methanogenesis was one of the first metabolic pathways to be developed. Since group 3a-3b hydrogenases have previously been shown to be connected to methanogenesis, these data support our suggestion of an early differentiation of group 3 hydrogenases. It should be noted that this proposed theory does not contradict previous suggestions of an early pre-LUCA existence and diversification of hydrogenases but rather clarifies the picture [29, 41]. The effect this proposed HGT had on bacterial evolution is not clear, but HGT in general may have had a significant effect on the diversification of bacterial species by introducing new metabolic pathways and traits [42, 43].
Large-scale molecular genetic analysis of the DNA sequence (like studies of gene order and G-C content) could give a clearer picture; however, because the HGT might have occurred more than 3 billion years ago, mechanisms like amelioration will most likely have erased all evidence.
Transcriptional studies of hupW in Nostoc punctiforme ATCC 29133 and Nostoc sp. strain PCC 7120
It is interesting, even though not surprising, that hupW in both Nostoc punctiforme and Nostoc sp. strain PCC 7120 is only or mainly transcribed under N2-fixing conditions. The same pattern has been observed for the uptake hydrogenase, whose function has previously been connected to N2 fixation. This suggests that the hupW proteases are under the same or similar transcriptional regulation as the hydrogenases they cleave. This expression pattern could be explained by the putative NtcA binding sites in the promoter region of hupW in both Nostoc punctiforme and Nostoc PCC 7120 (Figure 3b). NtcA binding sites have been found upstream of hupSL in Gloeothece sp. ATCC 27152, Nostoc punctiforme, Lyngbya majuscula CCAP 1446/4 and Anabaena variabilis ATCC 29413, and putative binding sites have been observed upstream of the hyp-genes in Nostoc punctiforme.
The two putative NtcA binding sites (TGAN8CAC and GTAN12TAC) identified upstream of the TSP of hupW in Nostoc PCC 7120 are imperfect when compared with the sequence signature of NtcA (GTAN8TAC) [49, 50]. These sites are therefore likely to have no or only a very weak binding affinity for NtcA, and the two conserved regions observed downstream of the TSP may be the target of additional transcription factors. Sequences similar to these conserved regions were also found in the intergenic regions of several other genes in Nostoc PCC 7120 and Anabaena variabilis ATCC 29413 (data not shown), and one of the conserved regions shows resemblance to an IHF binding site and the consensus sequence WATCAANNNNTTR [26, 51]. Binding sites for IHF have previously been found in the promoter region of hupSL in Nostoc punctiforme and Lyngbya majuscula but have also been observed upstream of the hup genes in Bradyrhizobium japonicum, the nif genes in purple bacteria and the nif operon in Anabaena azollae.
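How the putative sites deviate from the NtcA signature GTAN8TAC can be made explicit with a small sketch. It assumes the shorthand notation used in the text (half-site, N followed by the spacer length, half-site) and half-sites of equal length to those of the signature; the function names are hypothetical, and the counts say nothing about actual binding affinity.

```python
import re

def parse_site(site: str):
    """Split a site written like 'GTAN8TAC' into (left half, spacer length, right half)."""
    match = re.fullmatch(r"([ACGT]+)N(\d+)([ACGT]+)", site)
    if match is None:
        raise ValueError(f"unrecognised site notation: {site}")
    left, spacer, right = match.groups()
    return left, int(spacer), right

def compare_to_signature(site: str, signature: str = "GTAN8TAC"):
    """Return (half-site mismatches, spacer length difference) relative to the signature."""
    left, spacer, right = parse_site(site)
    sig_left, sig_spacer, sig_right = parse_site(signature)
    if len(left) != len(sig_left) or len(right) != len(sig_right):
        raise ValueError("half-sites must have the same length as the signature")
    half_site_mismatches = sum(
        a != b for a, b in zip(left + right, sig_left + sig_right)
    )
    return half_site_mismatches, spacer - sig_spacer

# Sites quoted in the text for hupW in Nostoc PCC 7120 and Nostoc punctiforme.
for site in ("TGAN8CAC", "GTAN12TAC", "GTAN8CAC"):
    print(site, compare_to_signature(site))
```

Separating half-site mismatches from spacer deviations keeps the two kinds of imperfection apart: one site differs in its half-sites, the other only in spacer length.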
Transcriptional studies of hoxW in Nostoc sp. strain PCC 7120
Contrary to the hupW regulation, the result from the Northern blot studies of the transcript level of hoxW in Nostoc PCC 7120 showed only a minor difference between non N2-fixing (lower) and N2-fixing conditions (higher). The very small difference seen in transcript level indicates that the main function of the bi-directional hydrogenase and its specific protease is not connected to N2 fixation. Studies of the transcript levels of the bi-directional hydrogenase subunit hoxH, when shifted from non N2-fixing to N2-fixing (Nostoc muscorum) or to N2 limiting (Gloeocapsa alpicola) conditions, show either no effect (Nostoc) or a very small effect (Gloeocapsa). However, further studies of the bi-directional hydrogenase activity in Gloeocapsa alpicola actually showed significantly increased activity even though the relative abundance of hoxH (and hoxY) transcript did not change.
Conserved regions were identified in the promoter region of hoxW. The first region, containing a short tandemly repeated repetitive (STRR) sequence, has the ability to form a hairpin loop, which is not unusual in filamentous cyanobacteria and has been found between hupS and hupL in Anabaena variabilis ATCC 29413, Nostoc PCC 7120, Nostoc punctiforme and Lyngbya majuscula CCAP 1446/4 [46, 56, 57]. In cyanobacteria they are usually made up of 7 bp repeats, and even if their function is still not known they may be involved in increasing transcript stability or in conferring translational coupling between genes [3, 56, 58]. Hairpin structures in the DNA sequence can also result in pauses during transcription or even act as a termination site. The latter is a more likely scenario in this case since the putative hairpin is positioned close to the 3' end of the previous gene all0771 (4-hydroxyphenylpyruvate dioxygenase), which is not co-transcribed with hoxW.
The second conserved region in the hoxW promoter region shows a strong resemblance to the consensus sequence RGTACNNNDGTWCB of a LexA binding site. LexA has previously been shown to bind to the promoter region of the hox-genes in Synechocystis sp. strain PCC 6803 [22, 59] and Nostoc PCC 7120, and the hyp-genes in Lyngbya majuscula CCAP 1446/4.
Specificity of HupW and HoxW in cyanobacteria
An alignment of the deduced amino acid sequence of several groups of proteases revealed that one of the conserved regions found in hydrogenase specific proteases was replaced by a new, unique region in HoxW proteases (group 3d), the so called HOXBOX (aa 42–44 in HoxW, Nostoc PCC 7120). This novel observation of a conserved group specific region may be an important finding for the understanding of the specificity and function of hydrogenase specific proteases. The function of this region in hydrogenase specific proteases has previously been under speculation with some suggesting that it functions as a catalytic site for the proteolytic cleavage [17, 61] and others that it is involved in substrate binding . Amino acid replacement, whereby Asp38 in HycI in E. coli was changed to an asparagine showed no effect on the cleavage process  which of course does not rule out that other parts of this region might be of importance.
In silico location studies of conserved surface residues of different proteases identified that the conserved amino acids are unevenly distributed on the surface and concentrated to certain regions (Figure 7b). To find conserved residues around the proposed nickel binding amino acids Glu16 and His93 (HybD – E. coli) is to be expected considering the importance of these residues for substrate binding. Interestingly, conserved residues were also observed around the HOXBOX region and further on along alpha helix 1, beta sheet 2 and alpha helix 4 [16, 17], especially in group 1 and 2 of the proteases. This could be due to their importance for the overall structure of the protein but could also indicate that these areas are involved in either cleavage function or docking between the protease and the large hydrogenase subunit. The latter theory coincides well with the result from the protein docking studies (Figure 7c). The same areas that contain a high degree of conserved residues were in the docking result often seen in close contact with the hydrogenase. The protein docking results, performed with hydrogenases and proteases from several organisms, places the HOXBOX alternatively the corresponding region continuously in unfavourable positions for C-terminal cleavage making its possible function as a catalytic site unlikely. Added to the already mentioned observation that this region exist in two variations (i.e. the HOXBOX or D(G/C/F)GT) it seems more reasonable it is involved in substrate binding and recognition and might even be important for the proteases specificity.
It should be mentioned that these protein-docking studies were mostly performed with 3D-models constructed through protein threading, since no crystallised hydrogenase and protease exist from the same organism. Even though the proteins used in this study are related, the sequence identities are sometimes low (20–25%) but increase in the putative docking areas (30–40%). The large subunit of the hydrogenase is also believed to exist in an open conformation, which probably makes the nickel associated with the active site of the hydrogenase accessible to the protease. An open conformation could have an immense effect on any kind of protease-hydrogenase interaction but is, with today's knowledge, impossible to predict.
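The comparison of overall and regional sequence identity mentioned above can be illustrated with a minimal sketch. It assumes two already aligned sequences of equal length with "-" as the gap character; the function name `percent_identity` and the toy sequences are hypothetical and are not taken from the proteins used in this study.

```python
def percent_identity(seq_a, seq_b, start=0, end=None):
    """Percent identity of two aligned sequences over columns [start, end)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    columns = list(zip(seq_a[start:end], seq_b[start:end]))
    # Columns where both sequences carry a gap are not compared.
    compared = [(a, b) for a, b in columns if not (a == "-" and b == "-")]
    if not compared:
        return 0.0
    # A gap opposite a residue counts as a mismatch.
    matches = sum(1 for a, b in compared if a == b and a != "-")
    return 100.0 * matches / len(compared)


# Hypothetical toy alignment (illustration only).
model = "MKILVLGVGN-LLGDDGAG"
template = "MRILVLGIGNILLSDEGFG"

overall = percent_identity(model, template)
region = percent_identity(model, template, start=2, end=10)
print(f"overall: {overall:.1f}%  region: {region:.1f}%")
```

Restricting the column range to a putative docking area is what allows a regional identity to be set against the overall value.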
An understanding of the transcriptional regulation of hydrogenase specific proteases in cyanobacteria is starting to emerge. It suggests that the hydrogenase specific proteases in cyanobacteria are under regulatory control very similar to that of the hydrogenases they cleave. The two proteins also appear to interact closely at the moment of cleavage, which could explain the specificity seen among proteases and the resemblance between the protease and the hydrogenase phylogenetic trees, and this interaction might be of very ancient origin. After comparing the phylogenetic trees of hydrogenases and their specific proteases we suggest that a group 3 hydrogenase spread through HGT to the bacterial domain, probably together with a hydrogenase specific protease, indicating that the proteolytic cleavage first evolved within group 3a/4 hydrogenases. We also propose that all 3d-type hydrogenases within bacteria evolved from this group 3 hydrogenase and therefore are the result of the same HGT event. Finally, the novel observation of the so called HOXBOX may help in understanding the specificity seen among hydrogenase specific proteases and is an interesting target for further studies.
Bacterial strains and culture conditions
The cyanobacterial strains used in these experimental studies, Nostoc sp. strain PCC 7120 (also known as Anabaena sp. strain PCC 7120) and Nostoc punctiforme ATCC 29133 (also known as Nostoc sp. strain PCC 73102), were grown in BG11₀ medium (N2-fixing cultures) at 30°C under continuous light (40 μmol photons s⁻¹ m⁻²) and sparged with air as previously described. For non N2-fixing growth (cultures with no heterocysts), NH4Cl (2.5 mM) and MOPS (0.5 mM), adjusted to pH 7.8, were added to the medium. All cultures were mixed using a magnetic stirrer. Escherichia coli strains were grown at 37°C in LB medium or on agar plates containing LB medium and the antibiotics of interest.
RNA and DNA isolation
N2-fixing cell cultures were harvested at room temperature for DNA isolation as previously described, with the exception that 2 M instead of 3 M NaAc was used. RNA was extracted from both N2-fixing and non N2-fixing cultures by centrifugation of the cells (4,500 × g for 10 min) at room temperature followed by resuspension in 1 ml TRIzol reagent (Sigma). The cells were then disrupted with 0.2 g of acid washed 0.6-mm-diameter glass beads by using a Fast-prep (Precellys®24) at a speed of 5.5 for 3 × 20 s, keeping the samples on ice between runs. Phases were separated by centrifugation at 15,000 × g for 10 min at 4°C and the cleared solution was then transferred to new tubes and incubated at room temperature for 5 min. Chloroform (0.2 ml) was added to the samples, which were thereafter gently turned by hand for 15 s followed by a 2 min incubation at room temperature. The samples were then centrifuged at 15,000 × g for 15 min at 4°C and the upper liquid phase was transferred to new tubes. The RNA was precipitated by adding 0.25 ml isopropanol and 0.25 ml of salt solution (0.8 M sodium citrate and 1.2 M NaCl) followed by incubation at room temperature for 10 min. The RNA was then collected by centrifugation at 15,000 × g for 10 min at 4°C and washed with 75% ethanol before treatment with DNase I (GE Healthcare) in 20 μl DNase buffer (40 mM Tris-HCl, 6 mM MgCl2, pH 7.5) for 30 min at 37°C. A phenol:chloroform extraction was performed and the RNA was precipitated in 2.5 volumes of ice-cold ethanol (99.5%) and 0.2 volumes of cold LiCl (10 M). After precipitation at -20°C overnight the samples were centrifuged at 20,000 × g, washed and resuspended in DEPC-treated distilled H2O.
Identification of transcriptional start points (TSP)
TSP studies were performed using RNA from N2-fixing cultures and the "5'RACE System for Rapid Amplification of cDNA Ends" kit (Invitrogen) according to the manual. Resulting bands were cloned into the pCR 2.1-TOPO vector (Invitrogen) and transformed into DH5α competent cells, all according to instructions from the manufacturer. The obtained vectors were purified with the "Genelute Plasmid Mini-prep Kit" (Sigma-Aldrich) followed by sequencing (Macrogen Inc).
In the case of hoxW in Nostoc PCC 7120, the primers used for the reactions were modified and designed according to the TAG-method, and only the first of the two nested PCRs described in the "5'RACE System for Rapid Amplification of cDNA Ends" kit manual was performed (Table 1).
cDNA for transcriptional studies by RT-PCR was produced from RNA from N2-fixing and non N2-fixing cultures by using the RevertAid™ First Strand cDNA Synthesis Kit (Fermentas) containing RevertAid™ H Minus M-MuLV Reverse Transcriptase and RiboLock™ Ribonuclease Inhibitor according to the instructions. The subsequent PCRs were done using Taq polymerase (Fermentas) according to the manufacturer's instructions and visualized on a 1% agarose gel.
The probe used for Northern blot was produced by PCR amplification with appropriate primers (Table 1) and purified with the GFX PCR DNA and Gel Band Purification Kit (GE Healthcare). 7 μg of total RNA from N2-fixing and non N2-fixing cultures of Nostoc PCC 7120 and Nostoc punctiforme was separated by electrophoresis in denaturing agarose gels and blotted to Hybond-N+ (GE Healthcare) according to the manufacturer's instructions, using the modified Church and Gilbert buffer described therein. The probes were labelled using the Rediprime II Random prime labelling system (GE Healthcare), and unincorporated 32P dCTP was thereafter removed using Probe Quant G-50 microcolumns (GE Healthcare). Equal loading of the RNA was verified by the relative amount of rnpB transcripts. The positions of the bands were visualized using a Pharos FX™ plus Molecular Imager (Bio-Rad) and analyzed with the accompanying software.
Protein and nucleotide sequence analysis and construction of phylogenetic tree
All strains and proteins used in this study, together with their GenBank accession numbers, are shown in Table 2 [69–87]. Protein sequences used for the phylogenetic tree were retrieved from the NCBI database. All alignments were performed in BioEdit version 188.8.131.52 using ClustalW multiple alignment and the resulting alignments were corrected manually. For the construction of the unrooted phylogenetic tree the alignments were run through PAUP version 4.0 beta and MrBayes 3.1 software [90–92]. The maximum parsimony analysis (PAUP) was performed with a heuristic algorithm and random addition of the sequences, and bootstrap support values were calculated from 1000 replicates. For the Bayesian analysis MrBayes was executed for 1 000 000 generations with a sample frequency of 100 using the WAG model. A burn-in of 2500 trees was used and the support values indicate the proportion of the 7500 remaining trees. The online program ModelGenerator was used to determine the optimal model (WAG) [93, 94]. For graphic outputs the resulting trees were then visualised by using Treeview [95, 96].
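The relation between the number of generations, the sample frequency and the burn-in can be checked with a short calculation. The sketch below is only an illustration: the function names are hypothetical, and it assumes one tree is counted per sampling interval (the initial state at generation 0 is ignored) and that a support value is the fraction of retained trees containing a given clade.

```python
def retained_trees(generations, sample_frequency, burn_in):
    """Return (sampled, retained) tree counts for one MCMC run."""
    sampled = generations // sample_frequency
    if burn_in > sampled:
        raise ValueError("burn-in exceeds the number of sampled trees")
    return sampled, sampled - burn_in


def clade_support(trees_with_clade, retained):
    """Proportion of retained trees that contain a given clade."""
    return trees_with_clade / retained


# Settings stated in the text: 1 000 000 generations, sampling every 100,
# burn-in of 2500 trees.
sampled, retained = retained_trees(1_000_000, 100, 2500)
print(sampled, retained)

# Hypothetical clade found in 6000 of the retained trees.
print(clade_support(6000, retained))
```

With these settings the calculation gives 10 000 sampled trees, of which 7500 remain after burn-in, in agreement with the numbers given above.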
Searches for homologous sequences of Npun_F0373 (Nostoc punctiforme) and Alr1422 (Nostoc PCC 7120), and for promoter regions, were done using both the NCBI and CyanoBase databases and their respective BLAST programs. Prediction of DNA secondary structure was done using the online program MFold [97, 98]. Transmembrane regions were predicted using the online program SOSUI [99–101].
For location studies of conserved residues on the surface of the proteases, alignments were performed for three of the protease groups revealed in the phylogenetic tree: group 3d – proteases of HoxW type (HoxW from Nostoc PCC 7120, Anabaena variabilis ATCC 29413, Lyngbya sp. strain PCC 8106, Ralstonia eutropha H16, Thiocapsa roseopersicina, Synechococcus sp. strain PCC 7002, Synechocystis sp. strain PCC 6803, Mycobacterium vanbaalenii PYR-1, and Methylococcus capsulatus strain Bath), group 2 – cyanobacterial proteases of HupW type (HupW from Nostoc PCC 7120, Nostoc punctiforme, Lyngbya sp. strain PCC 8106, Anabaena variabilis ATCC 29413, Nodularia spumigena CCY 9414 and Gloeothece sp. strain PCC 6909) and group 1 – proteases of HybD type (HupD/Azoarcus sp. BH72, HupD/Bradyrhizobium japonicum, HynC/Desulfovibrio gigas, HynC/Desulfovibrio vulgaris str. Miyazaki F, Desulfovibrio vulgaris subsp. vulgaris DP4, HyaD/HybD/E. coli K12, HoxM/Ralstonia eutropha H16, HupD/Rhizobium leguminosarum bv. viciae, HyaD/HupD/HybD/Salmonella enterica subsp. enterica serovar Choleraesuis str. SC-B67, HyaA/HybD/Shigella boydii Sb227 and HupD/Thiocapsa roseopersicina). Conserved residues shared by 100%, 90%, and 80% of the sequences were then visualised on the surface of a representative 3D structure from each group: the 3D models of HoxW and HupW from Nostoc PCC 7120 and the crystallized structure of HybD from E. coli (protein data bank accession number 1CFZ.pdb).
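How residues shared by a given fraction of sequences can be picked out of an alignment is sketched below. This is a minimal illustration under stated assumptions, not the procedure used in this study: the function name `conserved_columns` and the toy alignment are hypothetical, columns are reported as 0-based alignment positions, gaps are never counted as conserved, and the mapping of columns onto a 3D structure is left out.

```python
from collections import Counter


def conserved_columns(alignment, threshold):
    """Return (column, residue) pairs where one residue is shared by at
    least `threshold` (a fraction between 0 and 1) of all sequences."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have equal length")
    conserved = []
    for col in range(length):
        counts = Counter(seq[col] for seq in alignment)
        counts.pop("-", None)  # gaps are not treated as residues
        if not counts:
            continue
        residue, n = counts.most_common(1)[0]
        if n / len(alignment) >= threshold:
            conserved.append((col, residue))
    return conserved


# Hypothetical toy alignment of five sequences (illustration only).
toy = ["DGGTAL", "DGGTAV", "DCGTSL", "DGGTAL", "DFGT-L"]

for level in (1.0, 0.9, 0.8):
    print(level, conserved_columns(toy, level))
```

Running the three thresholds separately gives nested sets of positions, which is what allows the 100%, 90% and 80% levels to be shown in different colours on a surface representation.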
3D modelling and protein docking
3D models of proteases were constructed by using the online program SWISS-MODEL with HybD from E. coli as a template (1CFZ.pdb). The same method was also used for the 3D models of the large subunits of the hydrogenases, using HydB from Desulfovibrio vulgaris Miyazaki F as template (protein data bank accession number 1UBJ:L). The results were visualised in the program Swiss-PDB-viewer [103, 104].
Protein-protein docking simulations were done using the docking program BiGGER V2. The following constraints were set: Glu16 and His93 in the protease had to be at a minimum distance of 8 Å from Cys61 and Cys546 in the hydrogenase large subunit (amino acid numbers refer to HybD and HybC in E. coli). The docking experiments were then run as soft docking with an angular step of 15° and a minimum contact of 300. The residues used for constraints were chosen since they are suggested to bind to the nickel in the active site of the large subunit of the hydrogenase [17, 62, 106]. The docking simulations were done for the following combinations: HybC model – HybD (1CFZ) (E. coli), HydB (1UBJ:L) – HynC model (Desulfovibrio vulgaris str. Miyazaki F) and HoxH model – HoxW model (Nostoc PCC 7120). The best solutions were selected according to the global score from BiGGER V2 and with regard to the possibility of nickel binding.
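A residue-pair distance constraint of this kind can be checked on any docked pose with a few lines of code. The sketch below is independent of BiGGER and does not reproduce its internals: the function names and the coordinates are hypothetical, each residue is represented by a single point (for example one side-chain atom), and the 8 Å value is read here as an upper bound on the separation, which is an assumption; if the opposite reading is intended, the comparison would simply be reversed.

```python
import math
from itertools import product


def pair_distances(protease_sites, hydrogenase_sites):
    """Distance in Å between every protease/hydrogenase residue pair.

    Both arguments map a residue number to an (x, y, z) coordinate.
    """
    return {
        (p, h): math.dist(p_xyz, h_xyz)
        for (p, p_xyz), (h, h_xyz) in product(
            protease_sites.items(), hydrogenase_sites.items()
        )
    }


def satisfies_constraint(distances, cutoff=8.0):
    """True if every residue pair lies within `cutoff` Å (assumed reading)."""
    return all(d <= cutoff for d in distances.values())


# Hypothetical coordinates for one docked pose (illustration only).
protease = {16: (0.0, 0.0, 0.0), 93: (3.0, 0.0, 0.0)}
hydrogenase = {61: (0.0, 4.0, 0.0), 546: (3.0, 0.0, 4.0)}

distances = pair_distances(protease, hydrogenase)
for pair, d in sorted(distances.items()):
    print(pair, round(d, 2))
print(satisfies_constraint(distances))
```

A filter of this kind would only pre-select poses; ranking by a docking score and inspection of the nickel binding geometry would still be needed.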
Tomitani A, Knoll AH, Cavanaugh CM, Ohno T: The evolutionary diversification of cyanobacteria: molecular-phylogenetic and paleontological perspectives. Proc Natl Acad Sci USA. 2006, 103 (14): 5442-5447. 10.1073/pnas.0600999103.
Cavalier-Smith T: Cell evolution and Earth history: stasis and revolution. Philos Trans R Soc Lond B Biol Sci. 2006, 361 (1470): 969-1006. 10.1098/rstb.2006.1842.
Tamagnini P, Leitao E, Oliveira P, Ferreira D, Pinto F, Harris DJ, Heidorn T, Lindblad P: Cyanobacterial hydrogenases: diversity, regulation and applications. FEMS Microbiol Rev. 2007, 31 (6): 692-720. 10.1111/j.1574-6976.2007.00085.x.
Dunn JH, Wolk CP: Composition of the cellular envelopes of Anabaena cylindrica. J Bacteriol. 1970, 103 (1): 153-158.
Tamagnini P, Troshina O, Oxelfelt F, Salema R, Lindblad P: Hydrogenases in Nostoc sp. Strain PCC 73102, a strain lacking a bidirectional enzyme. Appl Environ Microbiol. 1997, 63 (5): 1801-1807.
Forzi L, Sawers RG: Maturation of [NiFe]-hydrogenases in Escherichia coli. Biometals. 2007
Bock A, King PW, Blokesch M, Posewitz MC: Maturation of hydrogenases. Adv Microb Physiol. 2006, 51: 1-71. 10.1016/S0065-2911(06)51001-X.
Jacobi A, Rossmann R, Bock A: The hyp operon gene products are required for the maturation of catalytically active hydrogenase isoenzymes in Escherichia coli. Arch Microbiol. 1992, 158 (6): 444-451. 10.1007/BF00276307.
Lutz S, Jacobi A, Schlensog V, Bohm R, Sawers G, Bock A: Molecular characterization of an operon (hyp) necessary for the activity of the three hydrogenase isoenzymes in Escherichia coli. Mol Microbiol. 1991, 5 (1): 123-135. 10.1111/j.1365-2958.1991.tb01833.x.
Agervald A, Stensjo K, Holmqvist M, Lindblad P: Transcription of the extended hyp-operon in Nostoc sp. strain PCC 7120. BMC Microbiol. 2008, 8: 69-10.1186/1471-2180-8-69.
Gollin DJ, Mortenson LE, Robson RL: Carboxyl-terminal processing may be essential for production of active NiFe hydrogenase in Azotobacter vinelandii. FEBS Lett. 1992, 309 (3): 371-375. 10.1016/0014-5793(92)80809-U.
Menon NK, Robbins J, Vartanian MD, Patil D, Harry D, Peck J, Menon AL, Robson RL, Przybyla AE: Carboxy-terminal processing of the large subunit of [NiFe] hydrogenases. FEBS Lett. 1993, 331 (1–2): 91-95. 10.1016/0014-5793(93)80303-C.
Rossmann R, Sauter M, Lottspeich F, Böck A: Maturation of the large subunit (HYCE) of Escherichia coli hydrogenase 3 requires nickel incorporation followed by C-terminal processing at Arg537. Eur J Biochem. 1994, 220 (2): 377-384. 10.1111/j.1432-1033.1994.tb18634.x.
Magalon A, Bock A: Dissection of the maturation reactions of the [NiFe] hydrogenase 3 from Escherichia coli taking place after nickel incorporation. FEBS Lett. 2000, 473 (2): 254-258. 10.1016/S0014-5793(00)01542-8.
Thiemermann S, Dernedde J, Bernhard M, Schroeder W, Massanz C, Friedrich B: Carboxyl-terminal processing of the cytoplasmic NAD-reducing hydrogenase of Alcaligenes eutrophus requires the hoxW gene product. J Bacteriol. 1996, 178 (8): 2368-2374.
Wünschiers R, Batur M, Lindblad P: Presence and expression of hydrogenase specific C-terminal endopeptidases in cyanobacteria. BMC Microbiol. 2003, 3 (8): 8-10.1186/1471-2180-3-8.
Fritsche E, Paschos A, Beisel H-G, Böck A, Huber R: Crystal Structure of the Hydrogenase Maturating Endopeptidase HYBD from Escherichia coli. J Mol Biol. 1999, 288 (5): 989-998. 10.1006/jmbi.1999.2719.
Maier T, Bock A: Generation of Active [NiFe] Hydrogenase in Vitro from a Nickel-Free Precursor Form. Biochemistry. 1996, 35 (31): 10089-10093. 10.1021/bi960567l.
Theodoratou E, Paschos A, Magalon A, Fritsche E, Huber R, Böck A: Nickel serves as a substrate recognition motif for the endopeptidase involved in hydrogenase maturation. Eur J Biochem. 2000, 267: 1995-1999. 10.1046/j.1432-1327.2000.01202.x.
Axelsson R, Oxelfelt F, Lindblad P: Transcriptional regulation of Nostoc uptake hydrogenase. FEMS Microbiol Lett. 1999, 170: 77-81. 10.1111/j.1574-6968.1999.tb13357.x.
Boison G, Bothe H, Schmitz O: Transcriptional Analysis of Hydrogenase Genes in the Cyanobacteria Anacystis nidulans and Anabaena variabilis Monitored by RT-PCR. Curr Microbiol. 2000, 40 (5): 315-321. 10.1007/s002849910063.
Oliveira P, Lindblad P: LexA, a transcription regulator binding in the promoter region of the bidirectional hydrogenase in the cyanobacterium Synechocystis sp. PCC 6803. FEMS Microbiol Lett. 2005, 251 (1): 59-66. 10.1016/j.femsle.2005.07.024.
Sjöholm J, Oliveira P, Lindblad P: Transcription and regulation of the bidirectional hydrogenase in the cyanobacterium Nostoc sp. strain PCC 7120. Appl Environ Microbiol. 2007, 73 (17): 5435-5446. 10.1128/AEM.00756-07.
Oliveira P, Lindblad P: An AbrB-Like protein regulates the expression of the bidirectional hydrogenase in Synechocystis sp. strain PCC 6803. J Bacteriol. 2008, 190 (3): 1011-1019. 10.1128/JB.01605-07.
Vignais PM, Billoud B, Meyer J: Classification and phylogeny of hydrogenases. FEMS Microbiol Rev. 2001, 25 (4): 455-501.
Wagner R: Transcription Regulation in Prokaryotes. 2000, Oxford: Oxford University Press Inc
Mazon G, Lucena JM, Campoy S, Fernandez de Henestrosa AR, Candau P, Barbe J: LexA-binding sequences in Gram-positive and cyanobacteria are closely related. Mol Genet Genomics. 2004, 271 (1): 40-49. 10.1007/s00438-003-0952-x.
Wu LF, Mandrand MA: Microbial hydrogenases: primary structure, classification, signatures and phylogeny. FEMS Microbiol Rev. 1993, 10 (3–4): 243-269.
Vignais PM, Billoud B: Occurrence, classification, and biological function of hydrogenases: an overview. Chem Rev. 2007, 107 (10): 4206-4272. 10.1021/cr050196r.
Deppenmeier U, Johann A, Hartsch T, Merkl R, Schmitz RA, Martinez-Arias R, Henne A, Wiezer A, Baumer S, Jacobi C, et al: The genome of Methanosarcina mazei: evidence for lateral gene transfer between bacteria and archaea. J Mol Microbiol Biotechnol. 2002, 4 (4): 453-461.
Lawrence JG, Ochman H: Molecular archaeology of the Escherichia coli genome. Proc Natl Acad Sci USA. 1998, 95 (16): 9413-9417. 10.1073/pnas.95.16.9413.
Nesbo CL, L'Haridon S, Stetter KO, Doolittle WF: Phylogenetic analyses of two "archaeal" genes in Thermotoga maritima reveal multiple transfers between archaea and bacteria. Mol Biol Evol. 2001, 18 (3): 362-375.
Woese CR: Interpreting the universal phylogenetic tree. Proc Natl Acad Sci USA. 2000, 97 (15): 8392-8396. 10.1073/pnas.97.15.8392.
Dagan T, Artzy-Randrup Y, Martin W: Modular networks and cumulative impact of lateral transfer in prokaryote genome evolution. Proc Natl Acad Sci USA. 2008, 105 (29): 10039-10044. 10.1073/pnas.0800679105.
Raymond J, Zhaxybayeva O, Gogarten JP, Gerdes SY, Blankenship RE: Whole-Genome Analysis of Photosynthetic Prokaryotes. Science. 2002, 298 (5598): 1616-1620. 10.1126/science.1075558.
Calteau A, Gouy M, Perriere G: Horizontal transfer of two operons coding for hydrogenases between bacteria and archaea. J Mol Evol. 2005, 60 (5): 557-565. 10.1007/s00239-004-0094-8.
Hedges SB: The origin and evolution of model organisms. Nat Rev Genet. 2002, 3 (11): 838-849. 10.1038/nrg929.
Ciccarelli FD, Doerks T, von Mering C, Creevey CJ, Snel B, Bork P: Toward Automatic Reconstruction of a Highly Resolved Tree of Life. Science. 2006, 311 (5765): 1283-1287. 10.1126/science.1123061.
Battistuzzi FU, Feijao A, Hedges SB: A genomic timescale of prokaryote evolution: insights into the origin of methanogenesis, phototrophy, and the colonization of land. BMC Evol Biol. 2004, 4: 44-10.1186/1471-2148-4-44.
Sheridan PP, Freeman KH, Brenchley JE: Estimated Minimal Divergence Times of the Major Bacterial and Archaeal Phyla. Geomicrobiology Journal. 2003, 20: 1-14. 10.1080/01490450303891.
Baymann F, Lebrun E, Brugna M, Schoepp-Cothenet B, Giudici-Orticoni M-Trs, Nitschke W: The redox protein construction kit: pre-last universal common ancestor evolution of energy-conserving enzymes. Phil Trans Biol Sci. 2003, 358 (1429): 267-274. 10.1098/rstb.2002.1184.
Ochman H, Lawrence JG, Groisman EA: Lateral gene transfer and the nature of bacterial innovation. Nature. 2000, 405 (6784): 299-304. 10.1038/35012500.
Goldenfeld N, Woese C: Biology's next revolution. Nature. 2007, 445 (7126): 369-10.1038/445369a.
Oliveira P, Leitao E, Tamagnini P, Moradas-Ferreira P, Oxelfelt F: Characterization and transcriptional analysis of hupSLW in Gloeothece sp. ATCC 27152: an uptake hydrogenase from a unicellular cyanobacterium. Microbiology. 2004, 150 (Pt 11): 3647-3655. 10.1099/mic.0.27248-0.
Lindberg P: Cyanobacterial Hydrogen Metabolism – Uptake Hydrogenase and Hydrogen Production by Nitrogenase in Filamentous Cyanobacteria. 2003, Uppsala: Uppsala University
Leitao E, Oxelfelt F, Oliveira P, Moradas-Ferreira P, Tamagnini P: Analysis of the hupSL operon of the nonheterocystous cyanobacterium Lyngbya majuscula CCAP 1446/4: regulation of transcription and expression under a light-dark regimen. Appl Environ Microbiol. 2005, 71 (8): 4567-4576. 10.1128/AEM.71.8.4567-4576.2005.
Weyman PD, Pratte B, Thiel T: Transcription of hupSL in Anabaena variabilis ATCC 29413 is regulated by NtcA and not by hydrogen. Appl Environ Microbiol. 2008, 74 (7): 2103-2110. 10.1128/AEM.02855-07.
Hansel A, Axelsson R, Lindberg P, Troshina OY, Wünschiers R, Lindblad P: Cloning and characterisation of a hyp gene cluster in the filamentous cyanobacterium Nostoc sp. strain PCC 73102. FEMS Microbiol Lett. 2001, 201 (1): 59-64. 10.1111/j.1574-6968.2001.tb10733.x.
Herrero A, Muro-Pastor AM, Flores E: Nitrogen control in cyanobacteria. J Bacteriol. 2001, 183 (2): 411-425. 10.1128/JB.183.2.411-425.2001.
Luque I, Flores E, Herrero A: Molecular mechanism for the operation of nitrogen control in cyanobacteria. Embo J. 1994, 13 (12): 2862-2869.
Goodrich JA, Schwartz ML, McClure WR: Searching for and predicting the activity of sites for DNA binding proteins: compilation and analysis of the binding sites for Escherichia coli integration host factor (IHF). Nucleic Acids Res. 1990, 18 (17): 4993-5000. 10.1093/nar/18.17.4993.
Black LK, Maier RJ: IHF- and RpoN-dependent regulation of hydrogenase expression in Bradyrhizobium japonicum. Mol Microbiol. 1995, 16 (3): 405-413. 10.1111/j.1365-2958.1995.tb02406.x.
Hoover TR, Santero E, Porter S, Kustu S: The integration host factor stimulates interaction of RNA polymerase with NIFA, the transcriptional activator for nitrogen fixation operons. Cell. 1990, 63 (1): 11-22. 10.1016/0092-8674(90)90284-L.
Jackman DM, Mulligan ME: Characterization of a nitrogen-fixation (nif) gene cluster from Anabaena azollae 1a shows that closely related cyanobacteria have highly variable but structured intergenic regions. Microbiology. 1995, 141: 2235-2244.
Sheremetieva ME, Troshina OY, Serebryakova LT, Lindblad P: Identification of hox genes and analysis of their transcription in the unicellular cyanobacterium Gloeocapsa alpicola CALU 743 growing under nitrate-limiting conditions. FEMS Microbiol Lett. 2002, 214 (2): 229-233. 10.1111/j.1574-6968.2002.tb11352.x.
Lindberg P, Hansel A, Lindblad P: hupS and hupL constitute a transcription unit in the cyanobacterium Nostoc sp. PCC 73102. Arch Microbiol. 2000, 174 (1–2): 129-133. 10.1007/s002030000186.
Tamagnini P, Axelsson R, Lindberg P, Oxelfelt F, Wunschiers R, Lindblad P: Hydrogenases and Hydrogen Metabolism of Cyanobacteria. Microbiol Mol Biol Rev. 2002, 66 (1): 1-20. 10.1128/MMBR.66.1.1-20.2002.
Mazel D, Houmard J, Castets AM, Tandeau de Marsac N: Highly repetitive DNA sequences in cyanobacterial genomes. J Bacteriol. 1990, 172 (5): 2755-2761.
Gutekunst K, Phunpruch S, Schwarz C, Schuchardt S, Schulz-Friedrich R, Appel J: LexA regulates the bidirectional hydrogenase in the cyanobacterium Synechocystis sp. PCC 6803 as a transcription activator. Mol Microbiol. 2005, 58 (3): 810-823. 10.1111/j.1365-2958.2005.04867.x.
Ferreira D, Leitao E, Sjoholm J, Oliveira P, Lindblad P, Moradas-Ferreira P, Tamagnini P: Transcription and regulation of the hydrogenase(s) accessory genes, hypFCDEAB, in the cyanobacterium Lyngbya majuscula CCAP 1446/4. Arch Microbiol. 2007, 188 (6): 609-617. 10.1007/s00203-007-0281-2.
Yang F, Hu W, Xu H, Li C, Xia B, Jin C: Solution structure and backbone dynamics of an endopeptidase HycI from Escherichia coli : implications for mechanism of the [NiFe] hydrogenase maturation. J Biol Chem. 2007, 282 (6): 3856-3863. 10.1074/jbc.M609263200.
Theodoratou E, Huber R, Böck A: [NiFe]-Hydrogenase maturation endopeptidase: structure and function. 7th International Hydrogenase Conference: 2005. 2005, Reading, UK: Biochemical Society Transactions, 108-111.
Kaneko T, Nakamura Y, Wolk CP, Kuritz T, Sasamoto S, Watanabe A, Iriguchi M, Ishikawa A, Kawashima K, Kimura T, et al: Complete genomic sequence of the filamentous nitrogen-fixing cyanobacterium Anabaena sp. strain PCC 7120. DNA Res. 2001, 8 (5): 205-213. 10.1093/dnares/8.5.205. 227–253
Meeks JC, Elhai J, Thiel T, Potts M, Larimer F, Lamerdin J, Predki P, Atlas R: An overview of the genome of Nostoc punctiforme, a multicellular, symbiotic cyanobacterium. Photosynth Res. 2001, 70 (1): 85-106. 10.1023/A:1013840025518.
Stensjö K, Ow SY, Barrios-Llerena ME, Lindblad P, Wright PC: An iTRAQ-based quantitative analysis to elaborate the proteomic response of Nostoc sp. PCC 7120 under N2 fixing conditions. J Proteome Res. 2007, 6 (2): 621-635. 10.1021/pr060517v.
Pinto FL, Svensson H, Lindblad P: Generation of non-genomic oligonucleotide tag sequences for RNA template-specific PCR. BMC Biotechnol. 2006, 6: 31-10.1186/1472-6750-6-31.
Rozen S, Skaletsky H: Primer3 on the WWW for general users and for biologist programmers. Methods Mol Biol. 2000, 132: 365-386.
Agrawal AG, Voordouw G, Gartner W: Sequential and structural analysis of [NiFe]-hydrogenase-maturation proteins from Desulfovibrio vulgaris Miyazaki F. Antonie Leeuwenhoek. 2006, 90 (3): 281-290. 10.1007/s10482-006-9082-x.
Bult CJ, White O, Olsen GJ, Zhou L, Fleischmann RD, Sutton GG, Blake JA, FitzGerald LM, Clayton RA, Gocayne JD, et al: Complete genome sequence of the methanogenic archaeon, Methanococcus jannaschii. Science. 1996, 273 (5278): 1058-1073. 10.1126/science.273.5278.1058.
Chiu CH, Tang P, Chu C, Hu S, Bao Q, Yu J, Chou YY, Wang HS, Lee YS: The genome sequence of Salmonella enterica serovar Choleraesuis, a highly invasive and resistant zoonotic pathogen. Nucleic Acids Res. 2005, 33 (5): 1690-1698. 10.1093/nar/gki297.
Colbeau A, Kovacs KL, Chabert J, Vignais PM: Cloning and sequence of the structural (hupSLC) and accessory (hupDHI) genes for hydrogenase biosynthesis in Thiocapsa roseopersicina. Gene. 1994, 140 (1): 25-31. 10.1016/0378-1119(94)90726-9.
Halboth S, Klein A: Methanococcus voltae harbors four gene clusters potentially encoding two [NiFe] and two [NiFeSe] hydrogenases, each of the cofactor F420-reducing or F420-non-reducing types. Mol Gen Genet. 1992, 233 (1–2): 217-224. 10.1007/BF00587582.
Hendrickson EL, Kaul R, Zhou Y, Bovee D, Chapman P, Chung J, Conway de Macario E, Dodsworth JA, Gillett W, Graham DE, et al: Complete Genome Sequence of the Genetically Tractable Hydrogenotrophic Methanogen Methanococcus maripaludis. J Bacteriol. 2004, 186 (20): 6956-6969. 10.1128/JB.186.20.6956-6969.2004.
Hidalgo E, Palacios JM, Murillo J, Ruiz-Argueso T: Nucleotide sequence and characterization of four additional genes of the hydrogenase structural operon from Rhizobium leguminosarum bv. viciae. J Bacteriol. 1992, 174 (12): 4130-4139.
Kaneko T, Sato S, Kotani H, Tanaka A, Asamizu E, Nakamura Y, Miyajima N, Hirosawa M, Sugiura M, Sasamoto S, et al: Sequence analysis of the genome of the unicellular cyanobacterium Synechocystis sp. strain PCC 6803. II. Sequence determination of the entire genome and assignment of potential protein-coding regions. DNA Res. 1996, 3 (3): 109-136. 10.1093/dnares/3.3.109.
Kaneko T, Tanaka A, Sato S, Kotani H, Sazuka T, Miyajima N, Sugiura M, Tabata S: Sequence analysis of the genome of the unicellular cyanobacterium Synechocystis sp. strain PCC 6803. I. Sequence features in the 1 Mb region from map positions 64% to 92% of the genome. DNA Res. 1995, 2 (4): 153-166. 10.1093/dnares/2.4.153. 191–198
Krause A, Ramakumar A, Bartels D, Battistoni F, Bekel T, Boch J, Bohm M, Friedrich F, Hurek T, Krause L, et al: Complete genome of the mutualistic, N2-fixing grass endophyte Azoarcus sp. strain BH72. Nat Biotechnol. 2006, 24 (11): 1385-1391. 10.1038/nbt1243.
Maeder DL, Weiss RB, Dunn DM, Cherry JL, Gonzalez JM, DiRuggiero J, Robb FT: Divergence of the hyperthermophilic archaea Pyrococcus furiosus and P. horikoshii inferred from complete genomic sequences. Genetics. 1999, 152 (4): 1299-1305.
Maroti G, Fodor BD, Rakhely G, Kovacs AT, Arvani S, Kovacs KL: Accessory proteins functioning selectively and pleiotropically in the biosynthesis of [NiFe] hydrogenases in Thiocapsa roseopersicina. European Journal of Biochemistry. 2003, 270 (10): 2218-2227. 10.1046/j.1432-1033.2003.03589.x.
Oxelfelt F, Tamagnini P, Lindblad P: Hydrogen uptake in Nostoc sp. strain PCC 73102. Cloning and characterization of a hupSL homologue. Arch Microbiol. 1998, 169 (4): 267-274. 10.1007/s002030050571.
Rakhely G, Kovacs AT, Maroti G, Fodor BD, Csanadi G, Latinovics D, Kovacs KL: Cyanobacterial-Type, Heteropentameric, NAD+-Reducing NiFe Hydrogenase in the Purple Sulfur Photosynthetic Bacterium Thiocapsa roseopersicina. Appl Environ Microbiol. 2004, 70 (2): 722-728. 10.1128/AEM.70.2.722-728.2004.
Riley M, Abe T, Arnaud MB, Berlyn MK, Blattner FR, Chaudhuri RR, Glasner JD, Horiuchi T, Keseler IM, Kosuge T, et al: Escherichia coli K-12: a cooperatively developed annotation snapshot – 2005. Nucleic Acids Res. 2006, 34 (1): 1-9. 10.1093/nar/gkj405.
Rousset M, Magro V, Forget N, Guigliarelli B, Belaich JP, Hatchikian EC: Heterologous expression of the Desulfovibrio gigas [NiFe] hydrogenase in Desulfovibrio fructosovorans MR400. J Bacteriol. 1998, 180 (18): 4982-4986.
Schwartz E, Henne A, Cramm R, Eitinger T, Friedrich B, Gottschalk G: Complete nucleotide sequence of pHG1: a Ralstonia eutropha H16 megaplasmid encoding key enzymes of H(2)-based lithoautotrophy and anaerobiosis. J Mol Biol. 2003, 332 (2): 369-383. 10.1016/S0022-2836(03)00894-5.
Ward N, Larsen O, Sakwa J, Bruseth L, Khouri H, Durkin AS, Dimitrov G, Jiang L, Scanlan D, Kang KH, et al: Genomic insights into methanotrophy: the complete genome sequence of Methylococcus capsulatus (Bath). PLoS Biol. 2004, 2 (10): e303-10.1371/journal.pbio.0020303.
Yang F, Yang J, Zhang X, Chen L, Jiang Y, Yan Y, Tang X, Wang J, Xiong Z, Dong J, et al: Genome dynamics and diversity of Shigella species, the etiologic agents of bacillary dysentery. Nucleic Acids Res. 2005, 33 (19): 6445-6458. 10.1093/nar/gki954.
NCBI database. [http://www.ncbi.nlm.nih.gov/]
Hall TA: BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucl Acids Symp Ser. 1999, 41: 95-98.
Swofford DL: PAUP*. Phylogenetic Analysis Using Parsimony (*and Other Methods). Version 4. 2003, Sunderland, Massachusetts: Sinauer Associates
Huelsenbeck JP, Ronquist F: MRBAYES: Bayesian inference of phylogenetic trees. Bioinformatics. 2001, 17 (8): 754-755. 10.1093/bioinformatics/17.8.754.
Ronquist F, Huelsenbeck JP: MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics. 2003, 19 (12): 1572-1574. 10.1093/bioinformatics/btg180.
Keane T, Creevey C, Pentony M, Naughton T, McInerney J: Assessment of methods for amino acid matrix selection and their use on empirical data shows that ad hoc assumptions for choice of matrix are not justified. BMC Evolutionary Biology. 2006, 6 (1): 29-10.1186/1471-2148-6-29.
Page RD: TreeView: an application to display phylogenetic trees on personal computers. Comput Appl Biosci. 1996, 12 (4): 357-358.
Zuker M: Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Res. 2003, 31 (13): 3406-3415. 10.1093/nar/gkg595.
Hirokawa T, Boon-Chieng S, Mitaku S: SOSUI: classification and secondary structure prediction system for membrane proteins. Bioinformatics. 1998, 14 (4): 378-379. 10.1093/bioinformatics/14.4.378.
Mitaku S, Hirokawa T: Physicochemical factors for discriminating between soluble and membrane proteins: hydrophobicity of helical segments and protein length. Protein Eng. 1999, 12 (11): 953-957. 10.1093/protein/12.11.953.
Guex N, Peitsch MC: SWISS-MODEL and the Swiss-PdbViewer: an environment for comparative protein modeling. Electrophoresis. 1997, 18 (15): 2714-2723. 10.1002/elps.1150181505.
Palma PN, Krippahl L, Wampler JE, Moura JJ: BiGGER: a new (soft) docking algorithm for predicting protein interactions. Proteins. 2000, 39 (4): 372-384. 10.1002/(SICI)1097-0134(20000601)39:4<372::AID-PROT100>3.0.CO;2-Q.
Massanz C, Friedrich B: Amino acid replacements at the H2-activating site of the NAD-reducing hydrogenase from Alcaligenes eutrophus. Biochemistry. 1999, 38 (43): 14330-14337. 10.1021/bi9908080.
This work was supported by the Swedish Energy Agency, the Knut and Alice Wallenberg Foundation, the Nordic Energy Research Program (project BioH2), the EU/NEST FP6 project, BioModularH2 (contract # 043340), and the EU/Energy FP7 project SOLAR-H2 (contract # 212508). We would also like to thank Anneleen Kool (Uppsala University) and Björn Brindefalk (Uppsala University) for the excellent support and help with constructing and analysing the phylogenetic tree and Fernando Lopes Pinto (Uppsala University) for his help with designing the TAG primers used in the 5'RACE experiments.
ED performed most of the experimental work: most of the transcriptional studies of hupW and hoxW, all in silico studies (including the phylogenetic and specificity studies), and the analysis of the data. She is the primary author of the final manuscript. MH identified the TSPs of alr1422/hupW in Nostoc PCC 7120. KS supervised the experimental work and was also involved in parts of the writing of the manuscript. PL conceived and coordinated the project and the manuscript. All authors have read and approved the manuscript.
Electronic supplementary material
Additional file 1: Supplementary extended tree. This PDF file contains an extended phylogenetic tree with more hydrogenase specific proteases from both bacterial and archaeal strains, including putative type 3b proteases. The proposed subgroup for each protease is marked in the figure: 1 (red), 2 (orange), 3a (blue), 3d (purple), 4 (green). When the protease subgroup is unknown, the group number of the proposed cleavage substrate (hydrogenase) is written in brackets. It is based on the protease's placement within the phylogenetic tree, the number of hydrogenases within each strain and the possibility for co-transcription with a hydrogenase. X: The point in the phylogenetic tree when horizontal gene transfer might have occurred. Y/Z: Suggested positions of root. Archaeal strains: red text. Bacterial strains: black text. For abbreviations used see Additional file 2. The tree was constructed using the MrBayes software, which was executed for 1 500 000 generations with a sample frequency of 100 using the WAG model. A burn-in of 3750 (25%) trees was used. For graphic output the resulting trees were visualised using TreeView. (PDF 267 KB)
Additional file 2: Table organisms. This Excel file contains a table of all hydrogenase specific proteases used in the extended phylogenetic tree (Additional file 1), including strain, organism, locus_tag, abbreviation, accession number, and proposed phylogenetic group. This file also contains the number of hydrogenases in each strain, including accession numbers. The proposed cleavage substrate (hydrogenase large subunit) for each protease is marked with grey background/bold text and is based on each protease's position in the phylogenetic tree, the number of hydrogenases within each strain and the location within the genome (i.e. possibility for co-transcription with a hydrogenase gene). B; unknown phylogenetic group. (XLS 34 KB)
Additional file 4: Supplementary figure NpunF0373homologoues. This Word document shows the presence/absence of homologues of the gene Npun_F0373 of Nostoc punctiforme in selected cyanobacterial strains together with, when present, their locus_tag and GenBank accession number. The presence of hupL, hupW, hoxH and hoxW and different metabolic functions (the ability to produce heterocysts and filaments and the capacity for nitrogen fixation) are also indicated. (+); present, (-); absent, (?); presence/absence unknown. (DOC 44 KB)