fPoxDB: fungal peroxidase database for comparative genomics
© Choi et al.; licensee BioMed Central Ltd. 2014
Received: 9 September 2013
Accepted: 24 April 2014
Published: 8 May 2014
Peroxidases are a group of oxidoreductases which mediate electron transfer from hydrogen peroxide (H2O2) and organic peroxide to various electron acceptors. They have a broad impact on industry and fungal biology. Peroxidases have numerous industrial applications, such as the removal of highly reactive pollutants and the breakdown of lignin for recycling of carbon sources. Moreover, genes encoding peroxidases play important roles in fungal pathogenicity in both humans and plants. For better understanding of fungal peroxidases at the genome level, a novel genomics platform is required. To this end, the Fungal Peroxidase Database (fPoxDB; http://peroxidase.riceblast.snu.ac.kr/) has been developed to provide such a genomics platform for this important gene family.
In order to identify and classify fungal peroxidases, 24 sequence profiles were built and applied to 331 genomes, including 216 from fungi and Oomycetes. In addition, NoxR, which is known to regulate NADPH oxidases (NoxA and NoxB) in fungi, was also added to the pipeline. Collectively, 6,113 genes were predicted to belong to 25 gene families, showing a well-separated distribution across the taxonomy. For instance, the genes encoding lignin peroxidase, manganese peroxidase, and versatile peroxidase were concentrated in the rot-causing basidiomycetes, reflecting their ligninolytic capability. As a genomics platform, fPoxDB provides diverse analysis resources, such as gene family predictions based on fungal sequence profiles, pre-computed results of eight bioinformatics programs, similarity search tools, a multiple sequence alignment tool, domain analysis functions, and a taxonomic distribution summary, some of which are not available in the previously developed peroxidase resource. In addition, fPoxDB is interconnected with other family web systems, providing extended analysis opportunities.
fPoxDB is a fungi-oriented genomics platform for peroxidases. The sequence-based prediction and diverse analysis toolkits, with an easy-to-follow web interface, offer a useful workbench to study comparative and evolutionary genomics of peroxidases in fungi.
Peroxidases (EC 1.11.1.x) are a group of oxidoreductases that catalyse the oxidation of various compounds by using peroxides. While hydrogen peroxide (H2O2) is commonly used as an electron donor, peroxidases can take a variety of different substrates as electron acceptors. Peroxidases can be divided into two major groups, depending on the presence or absence of a haem cofactor. Among their numerous industrial applications, one good example is the removal of phenolic compounds from wastewater, in which haem peroxidases are involved. For instance, peroxidases including horseradish peroxidase enzymatically catalyse the conversion of phenolic substrates into phenoxy radicals. The resulting phenoxy radicals can chemically react among themselves or with other substrates, consequently causing precipitation of polymeric products, which can be easily separated from the wastewater [1, 2]. In addition, lignin peroxidase (LiP) and manganese peroxidase (MnP) are considered to be the most effective enzymes for recycling carbon sources fixed as lignin. As genes encoding LiP are largely limited to white rot fungi, including Phanerochaete chrysosporium [4, 5], P. sordida, Trametes versicolor, Phlebia radiata [8, 9], P. tremellosa, and Bjerkandera sp., genes encoding MnP have drawn attention as an alternative ligninolytic peroxidase due to their wider distribution among basidiomycetes compared to those encoding LiP. Furthermore, site-directed mutagenesis on LiP and MnP genes revealed that the catalytic residues play pivotal roles in switching enzymatic activities between LiP and MnP in P. chrysosporium [12, 13]. Recently, a new type of haem protein called versatile peroxidases (VPs), which can naturally perform both functions, has been found in Pleurotus and Bjerkandera species [14, 15]. Hence, they are considered to be additional candidates for ligninolysis.
Meanwhile, a dye-decolorizing peroxidase (DyP), MsP1, in Marasmius scorodonius is thought to be useful for industrial applications due to its high temperature and pressure stability. Besides their industrial impacts, peroxidases are also important in fungal pathogenicity on host animals and plants. For example, deletion mutants of a gene encoding thiol peroxidase, TSA1, in Cryptococcus neoformans showed significantly less virulence on mice. For plant pathogens, peroxidases are required to detoxify host-derived reactive oxygen species for Ustilago maydis and Magnaporthe oryzae. In addition, mutants of genes encoding NADPH oxidases (Nox) in Botrytis cinerea, bcnoxA and bcnoxB, showed attenuated virulence on citrus, and the double knockout or deletion of the gene encoding the regulatory protein, bcnoxR, gave additive effects.
Given the industrial and biological importance of peroxidases and the availability of fully sequenced fungal genomes, a genomics resource is required for better understanding of peroxidases at the genome level. Peroxidase genes might be identified by using domain prediction tools, such as InterPro scan or Pfam. However, identification based on domain profiles could result in false positives. For example, NoxA and a metalloreductase (FREA) in Aspergillus nidulans showed the same domain profiles predicted by InterPro scan and Pfam. Since ferric reductases (FRE) and ferric-chelate reductases (FRO) share high structural similarity with Nox, the gene encoding FREA would become a false positive in domain-based prediction of Nox genes. Because filtering out false positives is an important issue in studying comparative or evolutionary genomics of Nox genes, the Nox family is divided into three subfamilies, NoxA, NoxB, and NoxC. Previously, a database named PeroxiBase was developed to archive the genes encoding peroxidases across a wide taxonomic range. Although PeroxiBase contains fungal peroxidases, it does not specifically focus on fungi, nor does it archive genes encoding NoxR, which are known to regulate NoxA and NoxB in fungi [27–29]. Hence, it is necessary to build a peroxidase database for comparative and evolutionary analysis in fungi.
Here, we developed a new web-based fungal peroxidase database (fPoxDB; http://peroxidase.riceblast.snu.ac.kr/) to provide a fungi-oriented archive with a manually improved catalogue of Nox genes and to support comparative and evolutionary genomics of genes encoding various peroxidases. Finally, we show an overview of the taxonomic distribution of peroxidase genes in the kingdom Fungi, which could be applied to the investigation of phylogenetic relationships.
Summary of peroxidase families found in fungal and Oomycete genomes
Number of genes
Number of genomes
Cytochrome C peroxidase
DyP-type peroxidase D
Hybrid Ascorbate-Cytochrome C peroxidase
Linoleate diol synthase (PGHS like)
NADPH oxidase, NoxA
NADPH oxidase, NoxB
NADPH oxidase, NoxC
NADPH oxidase, Duox**
NADPH oxidase, Rboh***
Other class II peroxidase
Prostaglandin H synthase (Cyclooxygenase)
Atypical 2-Cysteine peroxiredoxin (typeII, typeV)
Atypical 2-Cysteine peroxiredoxin (typeQ, BCP)
Fungi-Bacteria glutathione peroxidase
No haem, Vanadium chloroperoxidase
Typical 2-Cysteine peroxiredoxin
Six peroxidase families, including 1-Cysteine peroxiredoxin, atypical 2-Cysteine peroxiredoxin (typeII, typeV), atypical 2-Cysteine peroxiredoxin (typeQ, BCP), catalase, cytochrome C peroxidase, and Fungi-Bacteria glutathione peroxidase, were found in at least 200 fungal and Oomycete genomes. In particular, species belonging to the subphyla Saccharomycotina and Taphrinomycotina had only two haem peroxidase families, but had five and four non-haem peroxidases, respectively (Additional file 1). This result might imply that the non-haem peroxidases were horizontally transferred to fungi from bacteria before diversification, as they are shown to be constrained in bacteria. In addition, horizontal gene transfer of haem catalase-peroxidase genes of fungi from bacteria has been reported in several previous studies [35–37]. Further study is needed to clarify the origin of the non-haem peroxidases of fungi. Surprisingly, a few gene families were limited to a certain taxon, implying their specific roles in different fungal lifestyles. For example, lignin peroxidase (LiP) and manganese peroxidase (MnP) were only found in the subphylum Agaricomycotina. Phanerochaete chrysosporium was the only species in fPoxDB that possesses genes encoding LiP. On the other hand, MnP was found in multiple species belonging to the subphylum Agaricomycotina, particularly in rot fungi including Phanerochaete chrysosporium, Pleurotus ostreatus PC9, Dichomitus squalens, and Heterobasidion irregulare TC 32–1 (Additional file 1). This is in agreement with the previous findings that these enzymes are critical in oxidation and degradation of lignin and lignocellulose. According to the Fungal Secretome Database (FSD; http://fsd.snu.ac.kr/), all 10 LiPs and 26 MnPs belonging to these rot fungi were predicted to be secretory, which strongly supports the importance of their roles at the interface between fungal and host cells.
In order to evaluate the prediction accuracy, 77 protein sequences annotated as peroxidase gene families were downloaded from the UniProtKB/SwissProt database and used as a positive set. In addition, to test the discrimination power against other oxidoreductase sequences, expert-curated fungal protein sequences of 39 laccases and 197 other oxidoreductases were also downloaded from the UniProtKB/SwissProt database as a negative set. Laccases and other oxidoreductases make a good negative set, since these enzymes, like peroxidases, transfer electrons from one molecule to another but take different electron donors and acceptors. As a result, all 77 protein sequences belonging to eight peroxidase families were correctly predicted by the corresponding sequence profiles in our pipeline. Furthermore, none of the 236 protein sequences from the negative set showed any significant hits. In fact, many sequences in the negative set showed only insignificant hits, with E-values far higher than the identification threshold of 1.0e-5. These results clearly support the quality of the pipeline in terms of accuracy on the positive set and discrimination power against the negative set.
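The thresholding step behind this evaluation can be illustrated with a minimal Python sketch. It assumes that each test sequence has already been searched against the sequence profiles and reduced to its best E-value (None when no hit was reported). The function names and the toy E-values are hypothetical and are not taken from the fPoxDB pipeline; only the 1.0e-5 threshold comes from the text, and treating a hit exactly at the threshold as significant is an assumption.

```python
from typing import Optional, Sequence

E_VALUE_THRESHOLD = 1.0e-5  # identification threshold stated in the text


def is_significant(best_evalue: Optional[float],
                   threshold: float = E_VALUE_THRESHOLD) -> bool:
    """Return True if the best hit is at or below the threshold (assumed inclusive)."""
    return best_evalue is not None and best_evalue <= threshold


def evaluate(positive: Sequence[Optional[float]],
             negative: Sequence[Optional[float]]) -> dict:
    """Compute sensitivity and specificity from best E-values of both sets."""
    true_positives = sum(is_significant(e) for e in positive)
    false_positives = sum(is_significant(e) for e in negative)
    return {
        "sensitivity": true_positives / len(positive),
        "specificity": (len(negative) - false_positives) / len(negative),
    }


# Invented E-values for illustration only; not results from the paper.
positive_set = [1e-50, 3e-12, 2e-7]
negative_set = [0.5, 12.0, None, 1e-3]
print(evaluate(positive_set, negative_set))
# prints {'sensitivity': 1.0, 'specificity': 1.0}
```

In this framing, the result reported above corresponds to 77 of 77 positive sequences and 0 of 236 negative sequences passing the threshold.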
fPoxDB is built on a three-tiered system which consists of database, application, and user interface tiers. The database tier comprises database servers which run on the MySQL relational database management system. The application tier comprises system monitoring servers and computing nodes which coordinate and schedule BLAST, HMMER, BLASTMatrix, ClustalW, and analysis jobs submitted from the website. The user interface tier adopts the data-driven user interface (DUI), originally designed for the CFGP 2.0, which runs on the Apache HTTP Server. Servers for each tier are physically separated to balance load, providing a comfortable user experience. In-house scripts for the identification pipeline were written in Perl. The web interface follows the HTML5 and CSS3 standards to support cross-browser compatibility.
Investigation of gene duplication and loss could help us to understand how fungi adapt to different environments. Catalases are haem peroxidases whose structure is well conserved throughout all domains of life. They have been phylogenetically studied in both prokaryotes and eukaryotes [37, 43], but not in detail for fungi. To demonstrate how fPoxDB could be used in comparative and evolutionary studies, amino acid sequences of a domain commonly found in 109 catalases from 32 species were analysed. To elucidate the evolutionary history of catalases, a reconciliation analysis was conducted. The reconciled tree revealed that duplication or loss events of catalase genes occurred frequently in most of the internal and leaf nodes (Additional file 2). Except for three nodes, all internal nodes underwent multiple gene losses or duplications in fungal clades. Interestingly, only gene losses occurred in members of Ascomycota at the species level. In contrast, gene losses as well as duplications were found to have occurred in species belonging to Basidiomycota. The fact that basidiomycetes possess more peroxidase genes than ascomycetes suggests that the genes have evolved to adapt to their wood-decaying lifestyle. They require a large amount of catalase activity to reduce the high concentration of reactive oxygen species involved in wood decay. Comparative and evolutionary analyses, such as the above-mentioned example, can be performed on other families of peroxidases as well.
“Browse by Species” displays species name, taxonomy, and the number of predicted peroxidase genes/gene families. For each species, the detail page shows the number of predicted genes for each gene family as a graphical chart and table to present an overview of the peroxidase composition in a genome. The hierarchy implemented in the browser is easy to follow, so that users can readily retrieve data. “Browse by Species” also provides a taxonomically ordered summary table for every peroxidase family, in which kingdom-level and subphylum-level distributions are available. A summary of the whole database that lists the number of predicted genes for each genome can be downloaded in .csv format. This makes it possible to study gene family expansion or contraction across a number of genomes. “Browse by Classes” lists the peroxidase gene families and the number of genes and genomes corresponding to each gene family. The distribution of genes for each gene family is depicted in a box plot in order to show the subphylum-level taxonomic distribution at a glance. These distribution summaries could be used for searching peroxidase families which are limited to a certain range of taxonomy, such as LiP and MnP.
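As an illustration of how such a summary could be mined for taxon-restricted families, the following Python sketch scans a per-genome count table and reports gene families that occur in only one subphylum. The column layout (a "species" column, a "subphylum" column, then one column per gene family), the placeholder species names, and the counts are all assumptions made for this example; the actual layout of the .csv file provided by fPoxDB may differ.

```python
import csv
import io
from collections import defaultdict


def restricted_families(csv_text, taxon_column="subphylum",
                        id_columns=("species", "subphylum")):
    """Map each gene family found in exactly one taxon to that taxon."""
    taxa_by_family = defaultdict(set)
    for row in csv.DictReader(io.StringIO(csv_text)):
        for column, value in row.items():
            if column is None or column in id_columns:
                continue  # skip identifier columns and surplus fields
            if int(value or 0) > 0:
                taxa_by_family[column].add(row[taxon_column])
    return {family: next(iter(taxa))
            for family, taxa in taxa_by_family.items()
            if len(taxa) == 1}


# Invented table for illustration only; not data from fPoxDB.
toy_summary = "\n".join([
    "species,subphylum,Catalase,LiP,MnP",
    "species_A,Agaricomycotina,4,10,5",
    "species_B,Agaricomycotina,3,0,7",
    "species_C,Pezizomycotina,5,0,0",
    "species_D,Saccharomycotina,2,0,0",
])
print(restricted_families(toy_summary))
# prints {'LiP': 'Agaricomycotina', 'MnP': 'Agaricomycotina'}
```

The same per-genome counts could also be compared between related species to look for expansion or contraction of a given family.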
In order to systematically manage the sequence data, the fPoxDB website is equipped with the “Favorite Browser”, a virtual personal storage and data analysis hub originally developed for the CFGP 2.0 (http://cfgp.snu.ac.kr/). In the “My Data” menu, users can create and manage their own data collections, which are synchronized with the CFGP 2.0. The “Favorite” folders and their contents can also be used in the CFGP 2.0 as well as many other family web systems [39, 52–54] for further analysis options. For example, the FSD could be jointly used to check how many peroxidases in a Favorite are predicted to be secretory. Furthermore, users can also try 27 bioinformatics tools available at the CFGP 2.0 in the same way. Via the Favorite Browser in fPoxDB, users can submit BLAST, HMMER, BLASTMatrix, and ClustalW jobs with the sequences saved in a Favorite. BLASTMatrix is a parallel BLAST search program which enables searching multiple queries against multiple genomes. BLASTMatrix presents the taxonomic distribution of the query sequences across a wide range of genomes with various viewing options. Users can browse i) gradient-aided taxonomic distribution, ii) the actual E-value/bit score matrix, and iii) taxonomic conservation of the query sequences. This also enables users to mine putative orthologues in other genomes, which can be stored into a Favorite on the fly. In addition, a domain browsing function is available in the Favorite Browser that provides graphical diagrams for selected domains. The image files of domain structures for the sequences in a Favorite can also be downloaded as a zip archive for further use. fPoxDB also has a novel function for investigation of trans-membrane helices (TMHs). By using the “Distribution of TMHs” function in the Favorite Browser, position information and sequences corresponding to TMH regions, predicted by TMHMM2.0, can be retrieved as a text file.
This function may offer starting material for studying structural features or evolutionary relationships of Nox genes, as they are known to have conserved histidine residues in their TMHs [56, 57]. Multiple sequence alignment by ClustalW is also available via the Favorite Browser. Since many protein domains found in peroxidases are highly conserved, site-directed mutagenesis of conserved catalytic residues has been a vibrant research field [12, 13, 58–61]. Users can align their sequences in a Favorite as full length or a domain of choice, enabling targeted investigation of catalytic domains.
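A downstream use of the retrieved TMH positions can be sketched in Python as follows: given a protein sequence and the coordinates of its predicted trans-membrane helices, the function lists the histidine residues that fall inside each helix. The input representation (1-based, inclusive start and end positions) and the toy sequence are assumptions for illustration; the sketch does not parse the text file produced by fPoxDB or the output of TMHMM2.0.

```python
def histidines_in_tmhs(sequence, tmh_regions):
    """Return 1-based positions of histidines (H) inside each TMH region.

    tmh_regions holds (start, end) pairs, 1-based and inclusive (assumed).
    """
    result = {}
    for start, end in tmh_regions:
        segment = sequence[start - 1:end]  # convert to 0-based slice
        result[(start, end)] = [start + offset
                                for offset, residue in enumerate(segment)
                                if residue == "H"]
    return result


# Invented sequence and helix coordinates for illustration only.
toy_sequence = "MAAHLLLVVHAAGGHKK"
toy_regions = [(3, 11), (13, 17)]
print(histidines_in_tmhs(toy_sequence, toy_regions))
# prints {(3, 11): [4, 10], (13, 17): [15]}
```

Applied to aligned Nox sequences, positions collected in this way could be compared across species to check whether the histidines are conserved.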
fPoxDB is a fungi-oriented database for studying comparative and evolutionary genomics of various peroxidase gene families. This database provides more accurate prediction of genes encoding Nox and NoxR in fungi. The web interface of fPoxDB provides i) browsing by species/gene family, ii) kingdom-/subphylum-level distribution, iii) similarity search tools (BLAST, HMMER, and BLASTMatrix), iv) multiple sequence alignment by ClustalW, and v) domain and TMH analysis functions via the Favorite Browser. By taking full advantage of these functionalities, fPoxDB will be a valuable platform for i) preparation of data sets for evolutionary study, ii) finding candidate catalytic residues from domain alignment, and iii) finding possible orthologues in other genomes from BLASTMatrix results. In order to provide better prediction and usability, this database will be updated with continuous improvement of gene family definitions, additional fungal genome sequences, and installation of useful analysis functions. Collectively, fPoxDB will serve as a fungi-specialized peroxidase resource for comparative and evolutionary genomics.
All data and functions described in this paper can be freely accessed through the fPoxDB website at http://peroxidase.riceblast.snu.ac.kr/ via the latest versions of web browsers, such as Google Chrome, Mozilla Firefox, Microsoft Internet Explorer (9 or higher), and Apple Safari. The data sets supporting the results of this article are included within the article and its additional files.
This work was supported by the National Research Foundation of Korea grant funded by the Korea government (2008–0061897 and 2013–003196) and the Cooperative Research Program for Agriculture Science & Technology Development (Project No. PJ00821201), Rural Development Administration, Republic of Korea. JC and KTK are grateful for a graduate fellowship through the Brain Korea 21 Plus Program. This work was also supported by the Finland Distinguished Professor Program (FiDiPro) from the Academy of Finland (FiDiPro # 138116). We also thank Da-Young Lee for critical reading of the manuscript.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.