SORGOdb: Superoxide Reductase Gene Ontology curated DataBase
© Lucchetti-Miganeh et al; licensee BioMed Central Ltd. 2011
Received: 15 December 2010
Accepted: 16 May 2011
Published: 16 May 2011
Superoxide reductases (SOR) catalyse the reduction of superoxide anions to hydrogen peroxide and are involved in the oxidative stress defences of anaerobic and facultative anaerobic organisms. Genes encoding SOR were discovered recently and suffer from annotation problems. These genes, named sor, are short and the transfer of annotations from previously characterized neelaredoxin, desulfoferrodoxin, superoxide reductase and rubredoxin oxidase has been heterogeneous. Consequently, many sor remain anonymous or mis-annotated.
SORGOdb is an exhaustive database of SOR that proposes a new classification based on domain architecture. SORGOdb supplies a simple user-friendly web-based database for retrieving and exploring relevant information about the proposed SOR families. The database can be queried using an organism name, a locus tag or phylogenetic criteria, and also offers sequence similarity searches using BlastP. Genes encoding SOR have been re-annotated in all available genome sequences (prokaryotic and eukaryotic genomes, complete and in draft, updated in May 2010).
SORGOdb contains 325 non-redundant and curated SOR, from 274 organisms. It proposes a new classification of SOR into seven different classes and allows biologists to explore and analyze sor in order to establish correlations between the class of SOR and organism phenotypes. SORGOdb is freely available at http://sorgo.genouest.org/index.php.
Two and a half billion years ago, the intense photosynthetic activity of cyanobacteria caused the largest environmental change in Earth's history: the oxygenation of the atmosphere and the oceans, which were hitherto largely anoxic [1, 2]. This profound transformation of the biosphere exerted an evolutionary selection pressure on organisms and led to the development of new pathways, including the highly exergonic respiratory chain based on O2 as the terminal electron acceptor. Currently, most living organisms, except anaerobic microbes, require oxygen. O2 is used as a substrate by many enzymes involved in metabolizing amines, purines and amino acids. Oxygen is a relatively inert molecule due to its spin triplet ground state. However, it can be activated by photons or by one-electron oxidation or reduction processes to generate reactive oxygen species (ROS), particularly hydroxyl radicals (•OH), hydrogen peroxide (H2O2) and superoxide anion radicals (O2-).
The superoxide anion is generated fortuitously by flavoenzymes such as NADH dehydrogenase II, succinate dehydrogenase, fumarate reductase, and sulphite reductase [3, 4]. The superoxide anion is one of the deleterious reactive oxygen species: it can damage DNA, proteins and lipids indirectly by releasing iron from damaged dehydratase clusters [4, 5]. In anaerobes, most of the essential "central metabolic" redox enzymes (for example aconitase, fumarase, dihydroxyacid dehydratase, and pyruvate:ferredoxin oxidoreductase) contain iron sulphur [Fe-S] clusters that are rapidly inactivated when exposed to oxygen [5–8].
To survive and protect themselves from the toxicity of superoxide anion, many species, and especially anaerobes, have developed defence mechanisms.
Superoxide dismutase (SOD) was first isolated by Mann and Keilin (1938) and its catalytic function, the dismutation of O2- into molecular oxygen and hydrogen peroxide, was discovered in 1969 by McCord and Fridovich. Mammals have two forms of SOD isozymes: the manganese SOD (Mn-SOD), present in the mitochondria, and the copper/zinc SOD (Cu/Zn-SOD), present in the cytoplasm [10, 11]. In plants, SOD have been classified into three distinct types on the basis of their metal cofactor: Cu/Zn-SOD (in the cytosol and chloroplasts), Mn-SOD (in mitochondria), and Fe-SOD (often in chloroplasts) [12–14]. There are three known SOD in E. coli: MnSOD, FeSOD and CuZnSOD. The first two are located in the cytoplasm and the last in the periplasmic space. A distinct fourth class of SOD containing nickel (NiSOD) was recently discovered in Streptomyces [16, 17] and cyanobacteria. SOD-driven dismutation was the only biological mechanism identified for scavenging superoxide anion radicals until the early 1990s. McCord et al. established a correlation between oxygen tolerance and SOD production and suggested that SOD was the single most important enzyme for enabling organisms to survive in the presence of molecular oxygen. They proposed that the hypersensitivity of obligate anaerobes to oxygen was a consequence of SOD deficiency. However, most anaerobic organisms, which indeed lack SOD, show various degrees of tolerance to oxygen when they are occasionally exposed to this molecule in their environments.
Two novel iron-sulphur-containing proteins that detoxify superoxide molecules were then discovered in sulphate-reducing and hyperthermophilic anaerobes: desulfoferrodoxin (Dfx) in Desulfovibrio desulfuricans, Desulfovibrio vulgaris Hildenborough and Desulfoarculus baarsii, neelaredoxin (Nlr) in Desulfovibrio gigas and superoxide reductase (SOR) in Pyrococcus furiosus. This revealed the existence of alternative mechanisms for ROS detoxification in anaerobes. The function of these proteins was first studied in 1996 by Dfx complementation of superoxide detoxification activity in E. coli SOD mutants. Later, Nlr from Treponema pallidum and D. gigas were also shown to complement such SOD mutants. Liochev and Fridovich suggested that Dfx catalyzes the reduction of superoxide rather than its dismutation, and that it uses cellular reductants such as NAD(P)H. Subsequently, the Dfx enzyme was confirmed as an oxidoreductase [23–25, 27]. Finally, the superoxide reductase activity of those proteins was established by two groups [21, 23].
Dfx and Nlr proteins have different numbers of iron sites: both contain a similar C-terminal single iron-containing site (centre II) but Dfx also has a second N-terminal site (centre I) [22, 28]. Centre II is the active site of SOR and consists of a pentacoordinated Fe2+ centre with four equatorial histidines and one axial cysteine in a square pyramidal geometry (Fe(His)4(Cys) [29–31]). The binding site for the substrate O2- is the free sixth axial site of the reduced enzyme centre. The additional N-terminal domain of the 2Fe-SOR contains a rubredoxin-like centre, with Fe3+ ligated by four cysteines in a distorted tetrahedral geometry (centre I, Fe(Cys)4). A first classification of these enzymes was proposed according to the number of metal centres: neelaredoxin or 1Fe-SOR and desulfoferrodoxin or 2Fe-SOR [33, 34]. An additional class was proposed after the isolation of a Treponema pallidum SOR that contains an extended non-iron N-terminal domain of unknown function [25, 35]. In all three classes, only the reduced form of the iron-containing active centre II is able to react with the superoxide anion O2•-.
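The three-class scheme described above can be sketched as a small decision rule. This is only an illustration of the earlier metal-centre classification, not the seven-class SORGOdb ontology; the names `SorDomains` and `classify` and the returned labels are hypothetical and do not come from the database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SorDomains:
    """Hypothetical summary of the domains detected in one protein."""
    has_centre_ii: bool                 # Fe(His)4(Cys) active site
    has_centre_i: bool = False          # rubredoxin-like Fe(Cys)4 site
    has_non_iron_n_term: bool = False   # extended N-terminal domain without iron


def classify(domains: SorDomains) -> str:
    """Assign one of the three earlier classes from domain content."""
    if not domains.has_centre_ii:
        # Centre II is the SOR active site, so nothing can be assigned without it.
        return "no SOR active site"
    if domains.has_centre_i:
        return "2Fe-SOR (desulfoferrodoxin-like)"
    if domains.has_non_iron_n_term:
        return "class III (T. pallidum-like)"
    return "1Fe-SOR (neelaredoxin-like)"


print(classify(SorDomains(has_centre_ii=True, has_centre_i=True)))
```

Checking centre II first reflects the statement that only centre II reacts with superoxide; the order of the remaining tests is an assumption made for the sketch.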
SOD are found in nearly every living organism except in some strictly anaerobic species [36, 37]. Tally et al. suggested that the diversity in the oxygen tolerance of anaerobes is generally related to their level of SOD. SOR were first thought to be restricted to anaerobic prokaryotes but were subsequently discovered in some micro-aerophilic and micro-aerotolerant Bacteria and Archaea [39, 40]. More recently, a SOR encoding gene was also discovered in a eukaryote, Giardia intestinalis, a microaerophilic protozoan (cited by ). Although SOD and SOR both detoxify superoxide, there is a fundamental difference in their properties: SOD generate one-half mole of oxygen and one-half mole of hydrogen peroxide per mole of superoxide whereas SOR produce only one mole of hydrogen peroxide. The physiological conditions that determine SOR or SOD preference in organisms have not been completely determined, although the presence of SOR rather than SOD may be associated with the amount of redox proteins produced by organisms.
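The difference in stoichiometry can be made explicit with the two overall reactions. The proton and electron balance shown here is the standard textbook form rather than something stated in the text; the electron in the SOR reaction is assumed to come from a cellular reductant.

```latex
\begin{align*}
\text{SOD:}\quad & 2\,\mathrm{O_2^{\bullet-}} + 2\,\mathrm{H^+} \longrightarrow \mathrm{O_2} + \mathrm{H_2O_2} \\
\text{SOR:}\quad & \mathrm{O_2^{\bullet-}} + 2\,\mathrm{H^+} + e^- \longrightarrow \mathrm{H_2O_2}
\end{align*}
```

Dividing the SOD reaction by two gives one-half mole each of O2 and H2O2 per mole of superoxide, whereas the SOR reaction yields one mole of H2O2 and no O2.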
Most genomes, even those of anaerobic species, contain both SOD and SOR although some species have only one of the two enzymes. The increasing number of sequenced genomes allows comparative genomic analyses to elucidate the evolutionary or functional processes of SOR. Unfortunately, there are several problems with the annotation of superoxide reductase genes, partly a consequence of heterogeneous transfer of annotations from previously characterized neelaredoxin, desulfoferrodoxin, superoxide reductase or rubredoxin oxidase. Moreover, due to the absence of updating or correction of databases, many sor genes remained anonymous because of the transfer of annotations from SOR genes initially annotated as "hypothetical", "function unknown" or "putative activity". Also, SOR are small proteins, ca. 200 amino acids on average, and mis-annotations are frequent for proteins of this length.
For all these reasons, we developed SORGOdb, the first resource specifically dedicated to superoxide reductase genes in entirely sequenced and in-draft genomes. SOR sequences were curated manually, analysed and stored using a new ontology in a publicly available resource (http://sorgo.genouest.org/). SOR genes were detected in the three kingdoms of life, and only on chromosomal replicons. Although no N-terminal signal sequences were previously described for bacterial SOR, we predicted seven SOR to be potentially TAT-secreted (Twin-arginine translocation) in some bacteria, including for example Desulfovibrio salexigens DSM 2638, Desulfuromonas acetoxidans DSM 684 and Geobacter uraniireducens Rf4. Our analysis confirms the observations by Pinto et al. in 2010 that (1) the distribution of SOR classes does not correlate with organism phylogeny and that (2) sor genes occur in very diverse genetic environments. Indeed, although some sor are clustered with genes encoding electron donors (such as rubredoxin in D. vulgaris) or inter-related oxidative responsive genes, most are close to functionally unrelated genes. This is consistent with sor genes being acquired, or lost, through lateral gene transfer.
SOR proteins with entries in PubMed and/or a PDB structure
Desulfovibrio desulfuricans ssp. desulfuricans ATCC 27774
Desulfovibrio desulfuricans ssp. desulfuricans G20
2JI1, 1VZI, 1VZG, 1VZH
Pyrococcus horikoshii Ot3
Pyrococcus furiosus DSM 3638
1DQI, 1DO6, 1DQK
Treponema pallidum ssp. pallidum str. Nichols
Archaeoglobus fulgidus DSM 4304
Desulfovibrio vulgaris 'Miyazaki F'
Desulfovibrio vulgaris ssp. vulgaris str. Hildenborough
Clostridium acetobutylicum ATCC 824
Nanoarchaeum equitans Kin4-M
At the end of this integrative research, we had a collection of 325 non-redundant and curated predicted SOR in 274 organisms, covering all three kingdoms: Bacteria (270 genes), Archaea (52 genes) and Eukaryota (3 genes).
Classes of SOR in SORGOdb (number of proteins per class)
The navigation menu (on the left) provides access to SORGOdb functions through three modules: (i) Browse: browse SOR proteins according to phylogenetic criteria (kingdom, phylum, class and order) or locus tag name; (ii) Search: by organism name query and by sequence similarity through a BlastP form that allows users to enter primary sequences to find similar entries in the SORGOdb database; and (iii) Pre-computed Results, which include data statistics (organized in three tabs), classes (details about SORGOdb classes and ontology) and useful links (references, tools and websites). Statistical results about the SORGOdb classification are presented in the Classification tab (http://sorgo.genouest.org/classif-Stat.php).
The results panel (on the right) provides intermediary selection options and displays SOR record information in tabular form, including organism name, locus tag name, SORGOdb classification and domain architecture. When available, SORGOdb includes a CGView representation of the distribution of SOR and all SOD genes (MnSOD, FeSOD, CuZnSOD and NiSOD) in the replicons and a gView map to illustrate the genetic organisation and encoded functions surrounding each SOR (window of 11 genes max.).
As an example, SORGOdb allows the study of the distribution of genes encoding superoxide reductase across a whole phylum. As a case study, we decided to consider the Archaea as these organisms are considered to originate from a hyperthermophilic anaerobic common ancestor and were probably already prevalent when the Earth had its primitive anoxic H2 and CO2 atmosphere.
Nanoarchaeota and Korarchaeota are obligately anaerobic sulphur-dependent organisms placed close to the root of the archaeal SSU rRNA tree. Nanoarchaeota is currently known from a single organism, Candidatus Nanoarchaeum equitans, a hyperthermophilic symbiont that grows on the surface of Ignicoccus hospitalis [62, 63]. There are currently no representatives of Korarchaeota in pure culture but the genome of K. cryptophilum, a very thin filamentous thermophilic heterotroph, has been determined from a sample from Obsidian Pool in Yellowstone National Park. Both C. N. equitans and K. cryptophilum are found together in the 16S tree, in the vicinity of the Crenarchaeota group; both contain genes encoding superoxide reductase with a SOR (centre II) functional domain and neither encodes superoxide dismutase.
Thermococcus and Pyrococcus are obligate anaerobes that live in environments where there is no oxygen. Both produce a SOR-type superoxide reductase that is catalytically active at temperatures below the optimum growth temperature, conditions that likely correspond to zones of oxygen exposure.
The protein tree also revealed two interesting phenomena. First, Msp_0788, a non-canonical Dx-SOR (the Dx active site is incomplete), branches as an out-group close to the entire archaeal Dx-SOR group (Figure 5, point 1). This is consistent with the presumed loss of function of Dx in Msp_0788 being relatively recent. Second, the Kcr_1172 locus forms a major divergent branch (Figure 5, point 2). Using the "Browse by locus tag" option, Kcr_1172 is revealed to be a fusion protein with an additional C-terminal module sharing significant similarities with archaeal proteins annotated as "hypothetical" or "redoxin domain-containing". The best-conserved component is a CXXC motif (i.e. cysteines separated by two amino acids), found in many redox proteins for the formation, the isomerization and the reduction of disulphide bonds and for other redox functions. Kcr_1172 has a new SOR-derived architecture with the presence of two CXXC active sites (in the C-terminal fusion and N-terminal "Dx" parts), separated by the functional SOR centre II. This arrangement is unique and interesting as a combination of two CXXC motifs has been shown to be involved in protein disulphide-shuffling in hyperthermophiles. Although the true function of this protein needs to be determined experimentally, we show with this example that SORGOdb can also be used to reveal possible new SOR features.
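A CXXC motif of the kind described above can be located in a protein sequence with a short pattern search. The sketch below is a minimal illustration, not the method used to build SORGOdb; the function name `find_cxxc` and the toy sequence are hypothetical, and the sequence is not a real database entry.

```python
import re

# A lookahead is used so that overlapping motifs (e.g. "CAACAAC") are all reported.
CXXC = re.compile(r"(?=(C[A-Z]{2}C))")


def find_cxxc(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based start position, motif) for every CXXC occurrence."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in CXXC.finditer(seq)]


# Hypothetical toy sequence with two CXXC motifs.
toy = "MKCPHCGAAHHECAACLL"
for position, motif in find_cxxc(toy):
    print(position, motif)
```

A match only indicates a candidate site; whether the two cysteines form a redox-active pair has to be assessed from the protein context or experimentally.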
The distribution of genes encoding SOR and SOD is extremely heterogeneous, both qualitatively and quantitatively, in the group of methanogenic archaea as shown in Figure 3. Thus, for the genus Methanosarcina, Methanosarcina acetivorans (5.8 Mb) possesses one SOR and two SOD whereas Methanosarcina mazei (4.1 Mb) encodes only one SOR. M. barkeri, which shares 80% identity with both M. acetivorans and M. mazei, encodes two SOD but no SOR. The presence of these various combinations of oxygen-dependent SOD and SOR genes confirms that methanogens, which are sensitive to oxygen and are rapidly killed by even very low concentrations of O2, protect themselves from ROS; however, the factors that influence the presence and evolution of these genes remain unidentified. No clear relationship can be established between oxygen tolerance and the existence of superoxide reductase functions in the genome of microbes. A difficulty is the different connotations of the term 'anoxia' as used by geologists, zoologists and microbiologists. Geologists call an environment 'aerobic' if the oxygen content exceeds 18%. Zoologists talk about 'hypoxic' conditions when referring to oxygen levels that limit respiration (usually less than ca. 50% O2). For microbiologists, the so-called 'Pasteur point' of switch from aerobic respiration to fermentation is generally less than about 1 per cent of the atmospheric levels of oxygen; microbes, though, are affected by very low levels of oxygen, often much less than 0.1 per cent, whereas some "anaerobes" living today are able to tolerate oxygen even at higher levels.
The SORGOdb server is the first web server that centralizes and provides an interface for information concerning superoxide reductase proteins. SORGOdb provides integrated features: (1) multiple options for data browsing and searching; (2) complete descriptions of SOR and a new domain-based classification; (3) a synthetic and downloadable synopsis for each locus tag; (4) a SOR-homology analysis tool using BlastP similarity searches with the SORGOdb-positive dataset; and (5) integrated access to external hyperlinks to various public data sources (notably NCBI GenBank and PubMed). SORGOdb is a unique mining tool that can assist researchers with diverse interests to retrieve, visualize and analyse superoxide reductase genes and proteins.
Database name: SORGOdb
Project home page: http://sorgo.genouest.org/index.php
Operating system(s): Platform independent; designed for the Safari and Firefox browsers and not available for Internet Explorer.
ROS: Reactive Oxygen Species
CLM is supported by Agence Nationale de la Recherche and DG by the Ministère de la Recherche. We wish to thank the bioinformatics platform of Biogenouest of Rennes for providing the hosting infrastructure.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.