(1) Overview
Introduction
Well designed mathematical models are excellent theoretical frameworks to analyse and understand the dynamics of a biological system. Here, the design process itself is the first important scientific exercise, in which biological knowledge is collected, organised and represented in a new, systematic way, that allows defining the model assumptions and formulating them in the language of mathematics. A working model then enables testing new hypotheses and allows for novel predictions of the system’s behaviour. Kinetic models allow simulating the dynamics of the complex biochemistry of cells. Therefore, they have the power to explain which processes are responsible for observed emergent properties and they facilitate predictions on how the system behaves under various scenarios, such as changed environmental conditions or modification of molecular components. Unfortunately, the construction of mathematical models is often already a challenging task, hampered by the limited availability of measured physiological and kinetic parameters, or even incomplete information regarding the network structure. It is therefore highly desirable to make the overall process of model construction as easy, transparent and reproducible as possible. Providing a toolbox with a wide range of methods that flexibly adapt to the scientific needs of the user, modelbase greatly simplifies the modelbuilding process, by facilitating a systematic construction of kinetic models fully embedded in the Python programming language, and by providing a set of functionalities that help to conveniently access and analyse the results.
Despite the fact that mathematical models vary significantly in their complexity, from very simple and abstract models to extremely detailed ones, they share a set of universal properties. The process of building a kinetic model can be divided into a number of mandatory steps such as i) establishing the biological network structure (the stoichiometry), ii) defining the kinetic rate expressions, iii) formulation of the differential equations, iv) parametrisation, v) validation and, finally, vi) application [1]. modelbase supports researchers in every step of model development and application with its simple design aimed at being minimally restrictive. It has been written in Python, an open source, generalpurpose, interpreted, interactive, objectoriented, and highlevel programming language. Due to a long list of its general features, such as clear syntax, useful builtin objects, a wealth of generalpurpose libraries, especially NumPy and SciPy, Python has become a widely used scientific tool [2]. Needless to say, the usage of Python over other, proprietary software, such as MATLAB or Wolfram Mathematica, decreases the risk of limited reproducibility and transparency, two critical factors while conducting research. Unfortunately, several powerful models of central biochemical pathways [3, 4] have been published before this need became apparent. As a consequence, some of these models are extremely difficult to implement to even attempt to reproduce their results. Therefore, modelbase provides an environment for relatively easy implementation of models that were published without source code, in a generalpurpose and reusable format. Moreover, modelbase supports the export of a structural (stoichiometric) model into Systems Biology Markup Language (SBML) for further structural analysis with the appropriate software.
In recent years, several other Pythonbased modelling tools have been developed, such as ScrumPy [5] or PySCeS [6]. They allow performing various analyses of biochemical reaction networks, ranging from structural analyses (nullspace analysis, elementary flux modes) to kinetic analyses and calculation of control coefficients. To the best of our knowledge they do not provide dedicated methods for model construction inside Python, and the standard usage relies on loading previously assembled model definition files.
The modelbase package presented here provides an alternative toolbox, complementing the functionalities of existing programs for computer modelling. Its power lies mainly in integrating the model construction process into the Python programming language. It is envisaged that modelbase will greatly facilitate the model construction and analysis process as an integral part of a fully developed programming environment.
Motivation
In the course of our photosynthetic research, we identified several shortcomings that are not adequately met by available free and open source research software. When constructing a series of similar models, which share the same basic structure but differ in details, it is, in most modelling environments, necessary to copy the model definition file (or even pieces of code) and perform the desired modifications. This makes even simple tasks, such as changing a particular kinetic rate law, hideous and unnecessarily complicated, affecting the overall code readability. To facilitate a systematic and structured model definition, exploiting natural inheritance properties of Python objects, our intention was to fully integrate the model construction process into the Python programming language, allowing for an automated construction of model variants. The necessity for this fully Pythonembedded approach became further evident for isotope labelspecific models [7], where an automatic construction of isotopespecific reactions from a common rate law and an atom transition map is desired. Such models are, for example, required to explain complex phenomena, such as the asymmetric label distribution during photosynthesis, first observed by Gibbs and Kandler in the 1950s [8, 9].
Implementation and architecture
modelbase is a consolebased application written in Python. It supplies methods to construct various dynamic mathematical models, using a bottomup approach, to simulate the dynamic equations, and analyse the results. We deliberately separated construction methods from simulation and analysis, in order to reflect the experimental approach. In particular, a model object constructed using the Model class can be understood as a representation of a model organism or any subsystem, on which experiments are performed. Instances of the Simulator class in turn correspond to particular experiments. The software components of modelbase are summarised in the Unified Modeling Language (UML) diagram in Figure 1.
Model construction
The user has the possibility to build two types of models, using one of the classes defined in the module model: Model, for differentialequation based systems, or LabelModel, for isotopelabelled models.
Class Model
Every model object is defined by:
 model parameters,
 model variables,
 rate equations,
 stoichiometries.
Model parameters can be simply defined in a dictionary, d. To build a mathematical model the user needs first to import the modelbase package and instantiate a model object (called m):
import modelbase
m = modelbase.Model(d)
After instantiation, the keys of the parameter dictionary d become accessible as attributes of an object of the internal class modelbase.parameters.ParameterSet, which is stored as the model’s attribute m.par.
To add reacting entities of the described system (referred to as species in SBML), e.g., metabolites, we pass a list of compounds names to the set_cpds method:
m.set_cpds(list_of_compounds)
Each of the added compounds becomes a state variable of the system. The full list of all variables is stored in the attribute m.cpdNames.
If S denotes the vector of concentrations of the biochemical reactants (as defined with the method set_cpds), the temporal change of the concentrations is governed by:
where N denotes the stoichiometric matrix and v(S, k) the vector of reaction rates as functions of the substrate concentrations S and parameters k. The system of ordinary differential equations is assembled automatically after providing all reaction rates and their stoichiometries to the method m.add_reaction(). The stoichiometric matrix of a model can be retrieved by the method m.print_stoichiometries() or m.print_stoichiometries_by_compounds(), for the transposed matrix. A detailed example of instantiating objects and solving a simple biochemical system with three reactions and two metabolites is provided in Box 1.
Box 1: Basic model useWe use modelbase to simulate a simple chain of reactions, in which the two state variables X and Y describe the concentrations of the intermediates. We assume a constant influx rate v_{0}, a reversible conversion of X to Y, described with mass action kinetics with forward and backward rate constants k_{1p} and k_{1m}, respectively, and an irreversible efflux of Y described by mass action kinetics with the rate constant k_{2}. We import the modelbase package, numpy and matplotlib.pyplot, define a list of metabolite species and a dictionary with parameters
import numpy as np
import matplotlib.pyplot as plt
import modelbase
cmpds = ['X','Y']
p = {'v0':1,'k1p':0.5,'k1m':1,'k2':0.1}
We instantiate a model object of class Model
m = modelbase.Model(p)
and pass metabolites to the model (variables are always defined by names)
m.set_cpds(cmpds)
In the next step we define reaction rate functions. The rate functions always accept the model parameters as first argument, whilst the remaining arguments are metabolite concentrations.
v0 = lambda p: p.v0
def v1(p,x,y):
return p.k1p*x – p.k1m*y
def v2(p,y):
return p.k2*y
and then pass them to the model using add_reaction()
m.add_reaction('v0', v0, {'X':1})
m.add_reaction('v1', v1, {'X':1,'Y':1}, 'X', 'Y')
m.add_reaction('v2', v2, {'Y':1}, 'Y').
To perform the computation we generate an instance of a simulation class using the function Simulator()
s = modelbase.Simulator(m)
To integrate the system over a given period of time (T=np.linspace(0,100,1000)), with the initial concentrations set to 0 (y0=np.zeros(2)), we use the method timeCourse()
s.timeCourse(T, y0)
Convenient access to the results of simulation through various get*() methods enables easy graphical display.
plt.figure()
plt.plot(s.getT(),s.getY())
plt.legend(m.cpdNames)
plt.show()

Working with algebraic modules
A particularly useful function of the class Model has been developed to facilitate the incorporation of algebraic expressions, by which dependent variables can be computed from independent ones. Examples include conserved quantities, such as the sum of adenine phosphates, which is often considered to be constant, and rapidequilibrium or quasi steadystate approximations (QSSA), which are applicable for systems with timescale separation and allow calculation of fast from slow variables. The function add_algebraicModule() accepts as arguments a function describing the rule how the dependent variables are calculated from independent ones, the name of the newly created module, and lists of names of the independent and dependent variables. After definition of an algebraic module, all dependent variables become directly accessible. The full list of independent and dependent variables can be accessed using the method allCpdNames().
Various analysis methods
With import modelbase.Analysis the user has access to advanced analysis methods on the model object. Currently, it provides methods to numerically calculate elasticities and the Jacobian, find steady states by attempting to solve the algebraic equations, and calculate concentration control coefficients. We expect the range of analysis methods to increase continuously in the future.
Class LabelModel for isotopelabelled models
The modelbase package includes a class to construct isotopelabelled versions of developed models. In order to simulate the dynamic distribution of isotopes, modelbase defines dynamic variables representing all possible labelling patterns for all intermediates. In contrast to instances of the class Model, for instances of the class LabelModel the number of potentially labelled atoms (usually carbon) needs to be defined for every compound. This is done with the method add_base_cpd(), which accepts the name and the number of labelled atoms of the compound. It automatically creates all 2^{N} isotope variants of the compound, where N denotes the number of labelled atoms. Finally, the method add_carbonmap_reaction() automatically generates all isotopespecific versions of a reaction. It accepts as arguments the reaction name, rate function, carbon map, list of substrates, list of products and additional variables to be passed.
To instantiate a model object for an isotopelabelled version of developed model simply call
m = modelbase.LabelModel(d),
where d is again a dictionary holding parameters. With an instance of this class a dynamic process, such as the dynamic incorporation of radioactive carbon during photosynthesis, can be easily defined and simulated, using the Simulator class described below. An example of how to use this class is provided in Box 2.
Box 2: Isotopelabelled modelA minimal example of an isotopelabel specific model simulates equilibration of isotope distribution in a system consisting of the two reactions of triosephosphate isomerase and fructosebisphosphate aldolase: (2)
$\text{GAP}\rightleftharpoons \text{DHAP}$
(3)
$\text{GAP}+\text{DHAP}\rightleftharpoons \text{FBP}$
We import the modelbase package, numpy and matplotlib.pyplot.
import numpy as np
import matplotlib.pyplot as plt
import modelbase
We define a dictionary of parameters and instantiate the model of class LabelModel
p ={'kf_TPI': 1,'Keq_TPI': 21,'kf_Ald': 2000,'Keq_Ald': 7000}
m = modelbase.LabelModel(p).
Compounds are added with an additional argument defining the numbers of carbons
m.add_base_cpd('GAP', 3)
m.add_base_cpd('DHAP', 3)
m.add_base_cpd('FBP', 6)
leading to an automatic generation of 80 = 2^{6} + 2^{3} + 2^{3} isotopespecific compounds. All reactions are assumed to obey massaction rate laws. Standard rate laws are defined in the modelbase.ratelaws module. Due to simplicity, the following steps are only shown for the forward triosephosphate isomerase reaction. For more details please see the file examples/isotopeLabels.py in the modelbase package.
import modelbase.ratelaws as rl
def v1f(p,y):
return rl.massAction(p.kf_TPI,y)
All isotopespecific rates are generated by the add_carbonmap_reaction() method, based on a list defining in which positions the carbons appear in the products.
m.add_carbonmap_reaction('TPIf',v1f,[2,1,0],['GAP'],['DHAP'],'GAP')
We set the initial conditions such that the total pools are in equilibrium, but carbon 1 of GAP is fully labeled
GAP0 = 2.5e5
DHAP0 = GAP0 * m.par.Keq_TPI
y0d = {'GAP': GAP0,
'DHAP': DHAP0,
'FBP': GAP0 * DHAP0 * m.par.Keq_Ald}
y0 = m.set_initconc_cpd_labelpos(y0d,labelpos={'GAP':0})
and simulate equilibration of the labels for 20 arbitrary time units
s = modelbase.LabelSimulate(m)
T = np.linspace(0,20,1000)
s.timeCourse(T,y0).
We plot the result using the getLabelAtPos() method (see examples/isotopeLabels.py). 
Integration methods and simulation subpackages
Methods for the numeric integration of models are provided by the two subclasses Simulate and LabelSimulate, where the latter inherits many methods from the first. The first is used for standard kinetic models, the latter for isotopespecific models. Both classes provide computational support for dynamic simulations and methods to numerically simulate the differential equation system and to analyse the results. To provide an automatic instantiation of the correct class, we provide the function Simulator. Calling
s = modelbase.Simulator(m)
returns an instance of either Simulate or LabelSimulate, depending on the class of model m, providing all methods to numerically simulate the differential equation system and to analyse the results. Simple applications to run and plot a time course are given in boxes 1 and 2. By default, the dynamic equations are numerically integrated using a CVODE solver for stiff and nonstiff ordinary differential equation (ODE) systems. The default solver uses the Assimulo simulation package [10], with the most central solver group originating from the SUNDIALS (a SUite of Nonlinear and DIfferential/ALgebraic equation Solvers) package [11]. If Assimulo is not available, standard integration methods from the SciPy library [12] are used. When needed, almost every integrator option can be overridden by the user by simply accessing
s.integrator
Additionally, the Simulate class includes methods to integrate the system until a steadystate is reached (sim2SteadyState()), and to estimate the period of smooth limit cycle oscillations (estimatePeriod()). The solution arrays are accessed with the methods getT() and getY(). The advantage of using this method over using Assimulo’s integrator.ysol is that getY() also returns the result for all the derived variables (for which algebraic modules have been used). In addition, the methods getVarByName(), getVarsByName() and getVarsByRegExp() allow to access the simulated values of one or several variables by their variable names or by regular expressions. Moreover, the method getV() gives access to the arrays of reaction rates and getRate() allows to access particular rates by the reaction name. The powerful Python plotting library matplotlib [13] provides numerous methods for graphical display of the results.
Systems Biology Markup Language (SBML)
modelbase supports export of a structural (stoichiometric) version of a created model into an XML file in the computerreadable SBML format. To export the model (m) simply use the method m.ModelbaseToSBML(file_name). A minimal working example can be found in our repository (https://gitlab.com/ebenhoeh/modelbase/blob/master/examples/sbml_export.py). Structural and stoichiometric analyses are currently not implemented in modelbase, therefore such export allows to take advantage of other SBML compatible modelling environments developed for such tasks (e.g. PySCeS or CobraPy [14]). The import of SBML models into modelbase is currently not supported, mainly because of the complementary purpose for which it was developed. The modelbase framework simplifies construction of kinetic models, allowing to perform this task with minimal modelling experience. Therefore, the main purpose of modelbase is the model design process itself, rather than importing a predefined construct to perform complex computations. However, a full SBML export and import functionality is currently under development to allow model sharing across different environments and platforms.
Quality control
modelbase has been continuously developed and used within our lab since 2016. It has been successfully applied to study the complexity of photosynthesis and carbon assimilation in plants [7] and is being further maintained and developed.
(2) Availability
Operating system
modelbase is compatible with all platforms with working Python distribution.
Programming language
modelbase is written in the Python programming language, a generalpurpose interpreted, interactive, objectoriented, and highlevel programming language. It is available for every major operating system, including GNU/Linux, Mac OSX and Windows and has been tested with Python 3.6.
Additional system requirements
None
Dependencies
Dependencies are provided in the setup.py file and include:
 numpy == 1.14.3
 scipy == 1.1.0
 numdifftools == 0.9.20
 assimulo == 2.9
 pandas == 0.22.0
 pythonlibsbml == 5.17.0
Support for the differential equation solver sundials (CVODE) through the python package assimulo requires moreover:
 Sundials2.6.0 (for 64bits machines, install Sundials using fPIC)
 Cython 0.18
 C compiler
 Fortran compiler
The detailed instruction how to install the prerequisites is included in the repository in our installation guide.
List of contributors
In alphabetic order: Marvin van Aalst, Oliver Ebenhöh, Anna Matuszyńska, Nima P. Saadat.
Software location
Archive
Name: Python Package Index (PyPI)
Persistent identifier:https://pypi.org/project/modelbase/
Licence: GPL3
Publisher: Oliver Ebenhöh
Version published: 0.2.5
Date published: 09/10/18
Code repository
Name: GitLab
Persistent identifier:https://gitlab.com/ebenhoeh/modelbase
Licence: GNU General Public License v3.0
Date published: 09/10/18
Language
modelbase was entirely developed in English.
(3) Reuse potential
The strength of our package lies in its flexibility to be applied to simulate and analyse various distinct biological systems. It can be as efficiently used for the development of new models, as for the reconstruction of existing ones. Here, we demonstrate its power by reimplementing three mathematical models that have been previously published without providing the source code (Table 1). This includes i) a model of the photosynthetic electron transport chain (PETC) used to model photoprotective mechanisms in plants and green algae, originating from our lab and initially developed in MATLAB [15]; ii) a model of the CalvinBensonBassham (CBB) Cycle by Poolman et al. [16], developed to study the dynamics of the carbon assimilation and iii) a model of the Pentose phosphate pathway (PPP), adapted by Berthon et al. [17] to investigate label distribution dynamics in isotope labelling experiments.
Process  Original publication  GitHub.com/QTBHHU/  Developer 

Photosynthetic Electron Transport Chain  [15]  ./petcmodelbase  A.M. 
CalvinBensonBassham Cycle  [16]  ./cbbmodelbase  M.v.A. 
Pentose Phosphate Pathway  [4, 17]  ./pppmodelbase  T.N. 
Modelling the PETC to study photoprotective mechanisms
Part of our research focuses on understanding the dynamics of various photoprotective mechanisms present in photosynthetic organisms [18, 15, 19]. The foundation of our further work constitutes the model of the photosynthetic electron transport chain in green algae Chlamydomonas reinhardtii published in 2014 [15]. We have reimplemented the original work in Python and reproduced the results published in the main text (Figure 2), providing a photosynthetic electron transport chain core model, compatible with other modelbaseadapted modules, to further our studies on the dynamics of light reactions of photosynthesis.
CBB Cycle and the dynamics of carbon assimilation
Using modelbase, we have reimplemented a model of the CBB Cycle by Poolman et al. [16]. The model is a variant of the Pettersson and RydePettersson [3] model, where the strict rapid equilibrium assumption is relaxed and fast reactions are modelled by simple mass action kinetics. Its main purpose is to study short to medium time scale responses to changes in extrastromal phosphate concentration and incident light. The concentrations of NADPH, NADP^{+}, CO_{2} and H^{+} are considered constant, leaving the 13 CBB cycle intermediates, ATP, ADP and inorganic phosphate as dynamic variables. The model further incorporates a simplified starch production using glucose 6phosphate and glucose1phosphate and a simple ATP recovery reaction. We used the modelbase implementation of the Poolman model to simulate the steady state concentrations of the metabolites depending on the extrastromal phosphate concentration (Figure 3), reproducing original work by Pettersson and RydePettersson [3]. We have observed that the system is not stable any more for [P_{ext}] > 1.5, a feature not discussed in the Poolman paper [16].
The compatible mathematical representation of the two photosynthetic subsystems, the ATPproducing light reactions and the ATPconsuming CBB cycle, is a prerequisite to merge those two models. Technically, in the modelbase framework, this is a straight forward process. Scientifically, it turned out to be not a trivial task (unpublished work).
PPP and isotope labelling experiments
We envisage that especially our LabelModel extension will find a wide application in metabolic network analysis. Radioactive and stable isotope labelling experiments constitute a powerful methodology for estimating metabolic fluxes and have a long history of application in biological research [20]. Here, we showcase the potential of modelbase for the isotopelabelled experiments by reimplementing the model of the Ftype nonoxidative PPP in erythrocytes originally proposed by McIntyre et al. [4]. This was later adapted by Berthon et al. for label experiments and in silico replication of ^{13}C nuclear magnetic resonance (NMR) studies [17]. We have reproduced the results obtained by the authors, including the time course of diverse Glucose6phosphate isotopomers (Figure 4).
Other possible applications
Among many other applications, modelbase provides tools to reproduce the ‘photosynthetic Gibbs effect’. Gibbs and Kandler described it in 1956 and 1957 [8, 9], when they observed the atypical and asymmetrical incorporation of radioactive ^{14}CO_{2} in hexoses. An example of label incorporation by the CBB cycle intermediates is presented schematically in Figure 5.
Finally, our package provides a solid foundation for additional extensions to the framework architecture, its classes and modelling routines. To encourage its use and to facilitate the first steps to apply the modelbase package, we have prepared an interactive tutorial using a Jupyter Notebook [21], which showcases basic implementation of modelbase and each of its classes in easy to follow and thoroughly explained examples (see https://gitlab.com/ebenhoeh/modelbase/blob/master/Tutorial.ipynb).