Adaptive DFBA algorithm
Predefined metabolic changes known prior to the simulation can also be implemented using the DOA formalism, but this entails iteratively running successive simulations between concentration change points, i.e. running a separate DOA calculation for each continuous period without nutrient changes. Simulating interactive metabolite changes triggered by unpredictable events with DOA would require running a full-length calculation, inspecting it to detect the first key event, extracting the concentrations at that time, applying the desired changes, restarting the calculation from that point with the new concentrations, computing the system evolution to completion again, reviewing the new results to detect the next key event, and repeating the cycle as many times as needed.
Our approach does not require calculations to be restarted while permitting on-the-fly introduction of arbitrary changes to the model or to nutrient concentrations.
When changes are known in advance, specifying metabolite changes in static tabular format provides a convenient solution, whereas using a function provides the additional control needed to implement feed-back controls in response to dynamic and unpredictable system changes: by monitoring the levels of all metabolites at each time point, it is possible to immediately detect when trigger points are reached and react accordingly.
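The two ways of specifying changes can be illustrated with a minimal sketch, written here in Python purely for illustration. It is not the Adaptive DFBA implementation: `toy_step` is a hypothetical stand-in for the FBA/MTF solve and concentration update at each time point, and the kinetic constants, yield, state layout, and the names `simulate`, `schedule` and `feed_controller` are all assumptions. The point is only that a table of scheduled changes and a monitoring function can both act inside a single loop, so the calculation never needs to be restarted.

```python
from typing import Callable, Dict, List, Optional

State = Dict[str, float]                 # "glucose" in mM, "biomass" in gDW/L
Schedule = Dict[int, Dict[str, float]]   # step index -> {metabolite: amount added (mM)}
Controller = Callable[[float, State], None]


def toy_step(state: State, dt: float) -> State:
    """Stand-in for one FBA/MTF solve plus concentration update (not a real LP)."""
    glc, biomass = state["glucose"], state["biomass"]
    uptake = 10.0 * glc / (0.5 + glc)            # hypothetical kinetic limit, mmol/gDW/h
    uptake = min(uptake, glc / (biomass * dt))   # cannot take up more than is available
    new = dict(state)
    new["glucose"] = max(glc - uptake * biomass * dt, 0.0)
    new["biomass"] = biomass * (1.0 + 0.05 * uptake * dt)  # hypothetical yield, 0.05 gDW/mmol
    return new


def simulate(initial: State, n_steps: int, dt: float,
             schedule: Optional[Schedule] = None,
             controller: Optional[Controller] = None) -> List[State]:
    """Single pass over time; changes are applied on the fly before each step."""
    state = dict(initial)                # work on a copy, leave the caller's dict intact
    history = [dict(state)]
    for step in range(n_steps):
        # Tabular form: changes known in advance, keyed by step index.
        for name, amount in (schedule or {}).get(step, {}).items():
            state[name] = state.get(name, 0.0) + amount
        # Function form: inspect current levels and react to trigger points.
        if controller is not None:
            controller(step * dt, state)
        state = toy_step(state, dt)
        history.append(dict(state))
    return history


def feed_controller(t: float, state: State) -> None:
    """Hypothetical feed-back rule: add 5 mM glucose whenever it drops below 1 mM."""
    if state["glucose"] < 1.0:
        state["glucose"] += 5.0


start = {"glucose": 10.0, "biomass": 0.1}
batch = simulate(start, 40, 0.5)                                     # no intervention
pulsed = simulate(start, 40, 0.5, schedule={10: {"glucose": 5.0}})   # one scheduled pulse
fed = simulate(start, 40, 0.5, controller=feed_controller)           # sensor-driven feeding
```

In this sketch the scheduled pulse and the feed-back rule are both applied before the step is solved, so an event detected at one time point takes effect immediately rather than after a restart.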
Example uses of this functionality include the response of a sensor in a feed-back controlled, fed-batch experiment, the effect of addition of a given substrate in flask cultures at desired time points, or the effect of depletion of nutrients caused by concomitant processes.
More complex schemes are possible by modifying the model status at any point. Using a tabular format permits modification of reaction rates when their course is known in advance and can be modelled mathematically, e.g. multi-phase linear models [27], sigmoid fitting [22], or any other. A function provides the additional versatility needed to implement arbitrarily complex changes in metabolic behaviour, such as implementing any desired expression for calculating uptake rates (e.g. Michaelis-Menten [9, 11]), setting reaction bounds in response to changes of directly or indirectly related fluxes or nutrient concentrations (e.g. dependence of CO2 on pH [22]), implementing concentration-dependent boundary conditions (e.g. phosphate-dependent secretory and metabolic changes [11, 33]), activating specific genes in response to environmental and internal conditions [10, 29], implementing time-dependent behavioural changes (e.g. CO2/O2 uptake/excretion switches in response to day-night light cycles [22]), or detecting differential trends to act accordingly.
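As an illustration of the kind of expression such a function might compute, the following Python sketch derives a Michaelis-Menten uptake bound from the current concentration and a simple day-night switch from the simulation time. The function names, parameter values, and the convention that uptake is a negative exchange flux are assumptions made for the example, not part of the published method.

```python
def michaelis_menten_rate(conc: float, vmax: float, km: float) -> float:
    """Kinetic uptake limit v = vmax * S / (Km + S), in mmol/gDW/h."""
    if conc <= 0.0:
        return 0.0
    return vmax * conc / (km + conc)


def uptake_lower_bound(conc: float, biomass: float, dt: float,
                       vmax: float, km: float) -> float:
    """Lower bound for an exchange reaction at the current time point.

    Assumes the common convention that uptake is a negative exchange flux.
    The bound is the tighter of the kinetic limit and the amount that is
    physically available during one time step.
    """
    kinetic = michaelis_menten_rate(conc, vmax, km)
    available = conc / (biomass * dt) if conc > 0.0 else 0.0
    return -min(kinetic, available)


def is_daytime(t_hours: float, day_length: float = 12.0,
               period: float = 24.0) -> bool:
    """Hypothetical light-cycle switch: True during the light phase."""
    return (t_hours % period) < day_length


# Half-saturation: at S = Km the kinetic limit is vmax / 2.
half = michaelis_menten_rate(0.5, vmax=10.0, km=0.5)
# With little substrate left, availability rather than kinetics sets the bound.
bound = uptake_lower_bound(0.5, biomass=1.0, dt=1.0, vmax=10.0, km=0.5)
```

A controlling function could call helpers like these at every time point and write the result into the bounds of the relevant exchange reaction before the next solve.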
These modifications permit unprecedented arbitrary complexity in steering simulations to model dynamic regulatory, genomic or even epigenomic changes with Adaptive DFBA.
Users may choose FBA or MTF to solve the metabolic problem at each step. We have tested several combinations of LP solvers and architectures. To summarize, GLPK provides the best cost-benefit trade-off; if it fails on a specific calculation (which should be exceptional), CLP (which is slower) or CPLEX provide suitable alternatives.
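The fallback order described above can be expressed generically. The Python sketch below makes no calls to GLPK, CLP or CPLEX: the solver entries are hypothetical stand-in callables, and `SolverFailure` and `solve_with_fallback` are illustrative names. It only shows the pattern of trying solvers in order of preference and moving on when one fails.

```python
from typing import Callable, Dict, Sequence, Tuple


class SolverFailure(Exception):
    """Raised by a (hypothetical) solver wrapper that cannot return a solution."""


Solver = Callable[[dict], float]


def solve_with_fallback(problem: dict,
                        solvers: Sequence[Tuple[str, Solver]]) -> Tuple[str, float]:
    """Try each solver in order; return the name and result of the first success."""
    failures: Dict[str, str] = {}
    for name, solve in solvers:
        try:
            return name, solve(problem)
        except SolverFailure as exc:
            failures[name] = str(exc)   # remember why it failed, then try the next one
    raise SolverFailure(f"all solvers failed: {failures}")


# Stand-ins for illustration only: the first always fails, the second succeeds.
def failing_solver(problem: dict) -> float:
    raise SolverFailure("numerical difficulties")


def working_solver(problem: dict) -> float:
    return float(problem["objective_value"])


used, value = solve_with_fallback(
    {"objective_value": 0.42},
    [("preferred", failing_solver), ("alternative", working_solver)],
)
```

Catching only a dedicated exception type keeps genuine programming errors visible instead of silently switching solvers.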
Overproduction of secretory proteins in S. lividans
It has been shown that poor cell growth can be associated with increased secretion of endogenous and heterologous products [3, 34], and that secretion cannot easily be coupled to other objectives [3, 18, 33, 34, 35]. Protein secretion poses a well-known additional problem since conflicting objectives lead to a controversial DFBA solution: if priority is given to growth, the cell will limit secretion to save resources, whereas if priority is given to secretion, it will proceed at the expense of growth. Various approaches to deal with these conflicts have been proposed, with varying success, such as assuming that secretion is tied to cell growth [12] or to glucose uptake [18], or using alternate objectives such as ATP production [17].
We have shown several examples of DFBA limitations when applied to secretory protein overproduction and how to overcome them using Adaptive DFBA, reporting the simplest models that still gave good results in each case. More complex parameter-tuning and pre-processing (e.g. growth estimation using sigmoid functions or rate approximations using non-linear functions) can be easily implemented in R. Using Adaptive DFBA we could enforce specific rates, such as those of growth, L-alanine, C-source or protein secretion, and explore which limits are compatible with the observed experimental behaviour.
Even with minimal intervention in the system constraints and a coarse time step, different cellular responses and their metabolic associations could be identified using Adaptive DFBA: harbouring the pIJ486 multicopy plasmid or overexpressing secretory proteins resulted in changes to the dynamic consumption of various nutrients in a delicate balance. As in previous metabolomics studies [36], both the plasmid-bearing and mTNF-α producing strains diverge from the plasmid-free strain, with the mTNF-α overproducing strain becoming more divergent with time.
S. lividans showed an initial preference for using amino acids as the main C and N sources, resulting in an initial increase in NH4+ levels and sustained saccharide concentration, consistent with previous observations in batch and fed-batch cultures [27, 28]. As amino acid levels decreased, glucose/mannitol and NH4+ became the preferred sources of C and N, and L-alanine was excreted to the medium in large quantities [36]. This switch was associated with a temporary alteration of the growth rate when the change was too abrupt. Once the switch was complete, growth resumed its former speed until the reduction in glucose/mannitol forced a new switch to L-alanine consumption. This reduced the need for glucose/mannitol and NH4+ and was associated with the switch to the stationary phase (often near the crossover point between glucose/mannitol and L-alanine levels). These data also yield useful information to interpret previous observations of Tat- or Sec-dependent secretory protein overproduction in S. lividans TK21 grown on glucose or mannitol [3, 28, 29]. Interestingly, DFBA calculations without enforcement of proper L-alanine exchange rates were also viable, although the simulated cultures did not survive for as long as in simulations where it was controlled. This suggests that L-alanine excretion may not be required in early growth, which may better fit an immediate optimal FBA solution, and that its excretion might be a suboptimal metabolic mechanism that builds an external reservoir able to sustain viability after the main nutrients are depleted, which would explain the need for forced controls using Adaptive DFBA.
The size and composition of the overproduced secretory protein affect metabolic patterns: interestingly, heterologous proteins like mouse mTNF-α and Rhodothermus cellulase-A have a larger, size-dependent metabolic footprint and lead to faster amino acid uptake than overexpression of α-amylase from S. lividans or agarase from the closely related S. coelicolor, suggesting mutual adaptation of metabolism and protein composition in S. lividans.
Analysis of time-dependent relationships
The availability of a versatile method to model complex systems paves the way for exploring time-dependent interactions. FBA-based studies consider a single point in time: the associations they find depend heavily on the magnitude of fluxes and their differences, and necessarily ignore time-dependent changes.
Having access to dynamic simulation data opens the possibility of exploring time-related associations, such as identifying fluxes or metabolites that display a highly correlated, anti-correlated or independent time behaviour, clustering reactions by flux patterns, grouping and ranking them functionally and investigating associations to specific targets.
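A simple way to look for such time-related associations is to compare the time courses pairwise. The Python sketch below computes Pearson correlations between flux time series and labels each pair; the series names, values and the 0.9 threshold are invented for the example, and this is not necessarily the analysis used in this work. A low linear correlation is also only a rough proxy for independent behaviour.

```python
from math import sqrt
from typing import Dict, List, Tuple


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation of two equally long time series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0   # constant series: correlation undefined, treated as unrelated here
    return sxy / sqrt(sxx * syy)


def classify_pairs(series: Dict[str, List[float]],
                   threshold: float = 0.9) -> Dict[Tuple[str, str], Tuple[str, float]]:
    """Label every pair of series as correlated, anti-correlated or independent."""
    labels = {}
    names = sorted(series)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(series[a], series[b])
            if r >= threshold:
                label = "correlated"
            elif r <= -threshold:
                label = "anti-correlated"
            else:
                label = "independent"   # only means |r| is below the threshold
            labels[(a, b)] = (label, r)
    return labels


# Invented flux time courses, one value per simulated time point.
fluxes = {
    "glucose_uptake": [1.0, 2.0, 3.0, 4.0, 5.0],
    "growth": [2.0, 4.0, 6.0, 8.0, 10.0],
    "alanine_export": [5.0, 4.0, 3.0, 2.0, 1.0],
    "unrelated": [1.0, -1.0, 0.0, -1.0, 1.0],
}
pairs = classify_pairs(fluxes)
```

The resulting correlation values could also serve as a distance measure for clustering reactions by flux pattern.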
Our results suggest that using only roughly approximate limits for key reactions may be enough to obtain acceptable predictive models. While these limits may be provided by educated guesses, proper modelling should rely on statistically sound multivariate models, as no single metabolite could be completely associated with growth or protein secretion. Selecting an appropriate metabolite combination then becomes the most important issue. All three methods gave relatively consistent results; however, in our hands, Boruta’s Random Forest based approach seems better suited than traditional feature extraction methods to identify biologically meaningful variables.
Application of feature extraction methods identified relevant associations with H2O2 (likely as an indicator of oxidative stress, which is associated with the onset of the stationary phase, the decline in α-amylase secretion and the surge in agarase production), L-alanine, mannitol, other amino acids and, surprisingly, four metallic ions (Mo2+, Ni2+, Cu2+, Co2+) whose association with secretion had not been noticed before. Coincidentally, a recent study [34] reported that S. lividans grown in minimal medium (MM) had a low yield of secreted mRFP protein, whereas growth in CM/glucose gave a high yield, paralleled only by NB medium (a rich medium with peptic digest of animal tissue and beef extract). The only qualitative difference between MM and CM/glucose is the presence of CuSO4 and CoCl2 in CM/glucose. Furthermore, the fact that a carefully controlled minimal medium like CM/glucose, containing only glucose and various ions, can match the efficiency of a very rich, complex medium like NB further supports their key role. While the relevance of these ions has gone unnoticed to date, these experimental findings strongly support the use of deductions based on feature extraction from simulated calculations.
We used heatmaps to identify metabolites whose time-dependent evolution differed significantly and which could be selected as targets for elaborating roughly predictive time-dependent models for use in the specification of approximate exchange limits. Although these methods may be used to obtain approximate predictions, further work is clearly needed to construct better predictive models. A potentially promising approach should likely consider the dependence of each metabolite on all other relevant ones to build multivariate models.
Adaptive DFBA expands the field of metabolic systems whose time-dependent evolution can now be analysed, extending DFBA applicability to deal with uncorrelated objectives, dynamically adapting and unforeseeable systems, and improving predictive power in situations with reduced experimental data.
Our analyses may certainly be improved. Possible enhancements to Adaptive DFBA include additional fine-tuning, further exploring the effect of other key metabolites identified thanks to these simulations, modelling the role of relevant post-secretory mechanisms such as the effect of protein folding and modification, the role of external proteases in degrading misfolded proteins [37] to account for losses in activity, or implementing reactions for “endogenous metabolism” (digestion of dead cells to reduce biomass and produce constituent monomers for survival) in late growth phases [11]. At the simulation level, while our calculations using relatively large models can be completed in acceptable times, further optimizations might include enabling dynamic modification of the time step size [38] and adding further extensions to the algorithm, such as Robustness Analysis, Phenotypic Phase Plane Analysis, or Minimization of Metabolic Adjustment Analysis, which might improve its utility in biotechnology engineering [18]. As mentioned, there is still room for improvement of the predictive models; whether these enhancements are really worth implementing will depend on future research.