PyPSA: Python for Power System Analysis

Python for Power System Analysis (PyPSA) is a free software toolbox for simulating and optimising modern electrical power systems over multiple periods. PyPSA includes models for conventional generators with unit commitment, variable renewable generation, storage units, coupling to other energy sectors, and mixed alternating and direct current networks. It is designed to be easily extensible and to scale well with large networks and long time series. In this paper the basic functionality of PyPSA is described, including the formulation of the full power flow equations and the multi-period optimisation of operation and investment with linear power flow equations. PyPSA is positioned in the existing free software landscape as a bridge between traditional power flow analysis tools for steady-state analysis and full multi-period energy system models. The functionality is demonstrated on two open datasets of the transmission system in Germany (based on SciGRID) and Europe (based on GridKit).


I. INTRODUCTION
Power system tools model the interactions between the electrical grid and the consumers and generators which use the grid. The importance of software modelling of the grid has risen in recent years given the increase in distributed and fluctuating wind and solar generation, and the increasing electrification of all energy demand. On the generation side, variable renewable generation causes loading in parts of the grid where it was never expected, and introduces new stochastic influences on the flow patterns. On the demand side, the need to decarbonise the transport and heating sectors is leading to the electrification of these sectors and hence higher electrical demand, replacing internal combustion engines with electric motors in the transport sector, and replacing fossil fuel boilers with heat pumps, resistive heaters and cogeneration for low-temperature space and water heating. In addition, the increasing deployment of storage technologies introduces many network users which are both consumers and generators of energy.
The increasing complexity of the electricity system requires new tools for power system modelling. Many of the tools currently used for power system modelling were written in the era before widespread integration of renewable energy and the electrification of transport and heating. They therefore typically focus on network flows in single time periods. Examples of such tools include commercial products like DIgSILENT PowerFactory [1], NEPLAN [2], PowerWorld [3], PSS/E [4] and PSS/SINCAL [5], and open tools such as MATPOWER [6], PSAT [7], PYPOWER [8] and pandapower [9] (see [10] for a full list of power system analysis tools).
The consideration of multiple time periods is important on the operational side for unit commitment of conventional generators and the optimisation of storage and demand side management, and on the investment side for optimising infrastructure capacities over representative load and weather situations. Several tools have subsets of these capabilities, such as calliope [11], minpower [12], MOST [13], oemof [14], OSeMOSYS [15], PLEXOS [16], PowerGAMA [17], PRIMES [18], TIMES [19] and urbs [20], but their representations of electrical grids are often simplified.
Python for Power System Analysis (PyPSA), the tool presented in this paper, was developed at the Frankfurt Institute for Advanced Studies to bridge the gap between power system analysis software and general energy system modelling tools. PyPSA can model the operation and optimal investment of the energy system over multiple periods. It has models for the unit commitment of conventional generators, time-varying renewable generators, storage units, all combinations of direct and alternating current electricity networks, and the coupling of electricity to other energy sectors, such as gas, heating and transport. It can perform full load flow calculations and linearised optimal load flow, including under consideration of security constraints. It was written from the start with variable renewables, storage and sector-coupling in mind, so that it performs well with large networks and long time series.
Given the complexity of power system tools and the different needs of different users, it is crucial that such tools are both transparent in what they do and easily extendable by the user. To this end, PyPSA was released as free software under the GNU General Public Licence Version 3 (GPLv3) [21]. This means that the user is free to inspect, use and modify the code, provided that if they redistribute the software, they also provide the source code. Free software and open data also guarantee that research results can be reproduced by any third party, which is important given the large investment decisions that will need to be made on the basis of energy system modelling to reduce greenhouse gas emissions and combat global warming [22], [23].
PyPSA is available online in the Python Package Index (PyPI), on GitHub [24] and is archived on Zenodo [25]. Documentation and examples are available on PyPSA's website [26]. PyPSA is already used by more than a dozen research institutes and companies worldwide, 70 people are registered on the forum [27] and the website [26] has been visited by people from over 160 countries. As of October 2017 it has been used in six research papers [28], [29], [30], [31], [32], [33]. Users have already extended PyPSA for integer transmission expansion [28], [34] and in the grid planning tool open eGo [35]. This paper describes version 0.11.0 of PyPSA [25]. In Section II the mathematical functionality of PyPSA is described, while in Section III the focus shifts to the implementation in software. Quality control is discussed in Section IV; the computational performance of PyPSA is described in Section V; and then its functionality is compared with other software in Section VI. Several example applications are given in VII before conclusions are drawn in VIII.

II. FUNCTIONALITY
In this section the basic components, power flow, linear optimal power flow, energy system optimisation, unit commitment, contingency modelling and other functionality of PyPSA are described. The definitions of the main variables used in this section can be found in Table I, along with units where applicable.

A. Components
PyPSA's representation of the power system is built by connecting the components listed in Table II.
Buses are the fundamental nodes to which all other components attach. Their mathematical role is to enforce energy conservation at the bus at all times (essentially Kirchhoff's Current Law).
Loads, generators, storage units, stores and shunt impedances attach to a single bus and determine the power balance at the bus. Loads represent a fixed power demand; a generator's dispatch can be optimised within its power availaiblity; stores can shift power from one time to another with a standing loss efficiency for energy leakage; storage units behave like stores, but they can also have efficiency losses and power limits upon charging and discharging; finally shunt impedances have a voltage-dependent power consumption.
Lines and transformers connect two buses with a given impedance. Power flows through lines and transformers according to the power imbalances at the buses and the impedances in the network. Lines and transformers are referred to collectively as 'passive branches' to distinguish them from controllable link branches. The impedances of the passive branches are modelled internally using the equivalent PI model. The relation between the series impedance z = r + jx, the shunt admittance y = g + jb, the transformer tap ratio τ , the transformer phase shift θ shift , and the complex currents I 0 , I 1 and complex voltages V 0 , V 1 at the buses labelled 0 and 1 is given by (For lines, for which neither the tap ratio or the phase shift are relevant, set τ = 1 and θ shift = 0 in this equation.) The equivalent circuit is shown in Figure 1. This circuit is for the case where the tap-changer is on the primary side; a similar equation and figure for the case where the tap-changer is on the secondary side is given in the documentation [26]. The line model defaults to the PI model, while the transformer model defaults to the more accurate T model, which is converted to the PI model using the standard delta-wye transformation. For convenience standard types for lines and transformers in networks at 50 Hz are provided following the conversion formula from nameplate parameters to impedances and the typical parameters provided in pandapower [9], so that the user does not have to input the impedances manually. The typical parameters in pandapower are based on [36], [37], [38].
Links connect two buses with a controllable active power dispatch that can be set by the user or optimised by PyPSA. Links can be used to represent point-to-point high voltage direct current (HVDC) lines, import-export capacities in transport models such as Net-Transfer-Capacity (NTC) Generator whose feed-in can be flexible subject to minimum loading or minimum down and up times, or variable according to a given time series of power availability.

Storage Unit
A device which can shift energy from one time to another, subject to efficiency losses. Store A more fundamental storage object with no restrictions on charging or discharging power. Shunt Impedance An impedance in shunt to a bus. Line A branch which connects two buses of the same voltage. Transformer A branch which connects two buses of different voltages. Link A branch with a controllable power flow between two buses. models, or general energy conversion processes with a given efficiency, such as resistive heaters or heat pumps (from electricity to heat) or gas boilers (from gas to heat). Their efficiency can also be time-varying (e.g. to represent the ambient temperature dependence of a heat pump's coefficient of performance). Networks of links implement Kirchoff's Current Law (energy conservation at each bus), but not Kirchoff's Voltage Law, which is obeyed by networks of passive branches. A generator can also be represented in terms of more basic components: a bus is added for the fuel source with a store to represent the amount of fuel available. It is then connected to the electricity bus with a link to represent the energy conversion loss. Similarly a storage unit can be represented with an additional bus for the storage medium with a store attached, and then two links connected to the electricity bus to represent charging and discharging. Energy enters the model in generators; in storage units or stores with higher energy levels before than after the simulation; and in any components with efficiency greater than 1 (such as heat pumps). Energy leaves the model in loads; in storage units or stores with higher energy levels after than before the simulation; and in lines, links or storage units with efficiency less than 1.

B. Power flow without optimisation
In a power flow calculation, the user specifies the power dispatch of all dispatchable components (loads, generators, storage units, stores and links) and then PyPSA computes the resulting voltages in the network and hence the power flows in passive branches (lines and transformers) based on their impedances.
1) Power flow equations for AC networks: A power flow calculation for an alternating current (AC) network ensures that for all buses labelled by n we have where S n = P n + jQ n is the apparent power injected at the bus, I n is the complex current and V n = |V n |e jθn is the complex voltage, whose rotating angle is measured relative to a chosen slack bus. Y nm is the bus admittance matrix, which is constructed for all buses based on the contributions from the individual branch admittance matrices from equation (1) and any shunt impedances at the nodes, following the example of MATPOWER [6]. The inputs and outputs for the buses are given as follows: • For the chosen slack bus n = 0, it is assumed that the voltage magnitude |V 0 | and the voltage angle θ 0 are given. PyPSA must find the powers P 0 and Q 0 . • For P Q buses, P n and Q n are given; |V n | and θ n are to be found. • For P V buses, P n and |V n | are given; Q n and θ n are to be found. The non-linear equation system (2) is then solved using the Newton-Raphson algorithm [39] and, by default, an initial 'flat' guess of θ n = θ 0 and |V n | = 1 (per unit). The initial guess can also be specified ('seeded') by the user, using for example the linearised power flow solution.
2) Power flow equations for DC networks: A power flow calculation for a direct current (DC) network ensures that for all buses labelled by n we have where P n is the active power injected at the bus and the voltage, current and the conductance matrix G ij are now all real quantities. This non-linear equation is also solved with the Newton-Raphson algorithm.
3) Linearised power flow equations for AC networks: In some circumstances a linearisation of the AC power flow equations (2) can provide a good approximation to the full non-linear solution [40], [41]. The linearisation is restricted to calculating active power flows based on voltage angle differences and branch series reactances. It assumes that reactive power flow decouples from active power flow, that there are no voltage magnitude variations, voltage angles differences across branches are small enough that sin θ ∼ θ and branch resistances are negligible compared to branch reactances. This makes it suitable for short overhead transmission lines close to their natural loading.
In this case it can be shown [6] that the voltage angles are related to the active power injections by a matrix where K is the incidence matrix of the network, B is the diagonal matrix of inverse branch series reactances x multiplied by the tap ratio τ , i.e. B = b = 1 x τ , and θ shift is the phase shift for a transformer. The matrix KBK T is singular with a single zero eigenvalue for a connected network and can be inverted by first deleting the row and column corresponding to the slack bus. 4) Linearised power flow equations for DC networks: For DC networks the equation (3) is linearised by positing V n = 1 + δV n and assuming that δV n is small. The resulting equations mirror the linearised AC approximation with the substitutions θ n → δV n and x → r .

C. Optimisation with linear power flow equations
PyPSA is a partial equilibrium model that can optimise both short-term operation and long-term investment in the energy system as a linear problem using the linear power flow equations. PyPSA minimises total system costs, which include the variable and fixed costs of generation, storage and transmission, given technical and physical constraints. The objective function is given by min F ,Gn,r,Hn,s,En,s f ,t ,gn,r,t,hn,s,t,sucn,r,t,sdcn,r,t c · F + n,r c n,r · G n,r + n,r,t (w t · o n,r · g n,r,t + suc n,r,t + sdc n,r,t ) It consists of the branch capacities F for each branch and their annuitised fixed costs per capacity c , the generator capacities G n,r at each bus n for technology r and their annuitised fixed costs per capacity c n,r , the dispatch g n,r,t of the unit at time t and the associated variable costs o n,r , the start up and shut down costs suc n,r,t and sdc n,r,t when unit commitment is activated, the storage unit power capacities H n,s and store energy capacities E n,s at each bus n for storage technology s and their associated fixed costs c n,s andĉ n,s , and finally the positive part of the storage dispatch [h n,s,t ] + and the associated variable costs o n,s .
The branch flows f ,t are optimisation variables but do not appear in the objective function. The optimisation is run over multiple time periods t representing different weather and demand conditions. Each period can have a weighting w t ; the investment costs must then be annuitised for the total period t w t (typically a full year). The dispatch of generators g n,r,t is constrained by their capacities G n,r and time-dependent availabilitiesg n,r,t and g n,r,t , which are given per unit of the capacities G n,r : g n,r,t · G n,r ≤ g n,r,t ≤ḡ n,r,t · G n,r ∀ n, r, t (6) For conventional generators the availabilities are usually constant; a fully flexible generator would haveg n,r,t = 0 andḡ n,r,t = 1. For variable renewable generators such as wind and solar,ḡ n,r,t represents the weather-dependent power availability, while curtailment may also be limited by introducing a lower bound on the dispatchg n,r,t .
The dispatch can also be limited by ramp rate constraints ru n,r and rd n,r per unit of the generator nominal power: −rd n,r ·G n,r ≤ (g n,r,t −g n,r,t−1 ) ≤ ru n,r ·G n,r ∀ n, r, t > 0 (7) Unit commitment for conventional generators is described in Section II-E.
The power capacity G n,r can also be optimised within minimumG n,r and maximumḠ n,r installable potentials: The dispatch of storage units h n,s,t , whose energy carriers are labelled by s, is constrained by a similar equation to that for generators in equation (6): h n,s,t · H n,s ≤ h n,s,t ≤h n,s,t · H n,s ∀ n, s, t (9) excepth n,s,t is now negative, since the dispatch of storage units can be both positive when discharging into the grid and negative when absorbing power from the grid. The power capacity H n,s can also be optimised within installable potentials.
The energy levels e n,s,t of all storage units have to be consistent between all hours and are limited by the storage energy capacity E n,s e n,s,t = η wt n,s,0 e n,s,t−1 +η n,s, e n,s,t · E n,s ≤ e n,s,t ≤ē n,s,t · E n,s ∀ n, s, t Positive and negative parts of a value are denoted as [·] + = max(·, 0), [·] − = − min(·, 0). The storage units can have a standing loss (self-discharging leakage rate) η n,s,0 , a charging efficiency η n,s,+ , a discharging efficiency η n,s,− , inflow (e.g. river inflow in a reservoir) and spillage. The initial energy level can be set by the user, or it is assumed to be cyclic, i.e. e n,s,t=0 = e n,s,t=T . The store component is a more basic version of the storage unit: its charging and discharging power cannot be limited and there are no charging and discharging efficiencies η n,s,+ , η n,s,− . The energy levels of the store can also be restricted by time seriesẽ n,s,t ,ē n,s,t given per unit of the energy capacity E n,s ; this allows the demand-side management model of [42] to be implemented in PyPSA. The energy capacity E n,s can also be optimised within installable potentials.
Global constraints related to primary energy consumption, such as emission limits, can also be implemented. For example, CO 2 emissions can be limited by a cap CAP CO2 , implemented using the specific emissions e r in CO 2 -tonneper-MWh th of the fuel r and the efficiency η n,r of the generator: n,r,t 1 η n,r w t · g n,r,t · e r ≤ CAP CO2 ↔ µ CO2 (11) µ CO2 is the shadow price of this constraint.
The (inelastic) electricity demand d n,t at each bus n must be met at each time t by either local generators and storage or by the flows f ,t from the branches r g n,r,t + s h n,s,t + α ,n,t · f ,t = d n,t ↔ w t · λ n,t ∀ n, t where α ,n,t = −1 if starts at n, α ,n,t = 1 if is a line or transformer and ends at n, and α ,n,t = η ,t if is a link and ends at n (note that for lines and transformers, α ,n,t is the incidence matrix of the network, α ,n,t = K n ). η ,t represents an efficiency loss for a link (it can be time-dependent for efficiency that, for example, depends on the outside temperature, like for a heat pump). λ n,t is the marginal price at the bus. This equation implements Kirchhoff's Current Law (KCL), which guarantees energy conservation at each node. The flows in all passive branches are constrained by their For links, the flows can be more finely controlled with timedependent per unit availabilitiesf ,t ,f ,t which allows, for example, time-dependent demand-side management schemes to be modelled [42]. For both passive branches and links, the upper and lower limits are associated with KKT multipliersμ ,t and μ ,t . The flows in links are fully controllable. Power flows in networks of passive branches (lines and transformers) according to the linear power flow equations. It is assumed that the network is lossless, so that η ,t = 1 for passive branches. To guarantee the physicality of the network flows, in addition to KCL, Kirchhoff's Voltage Law (KVL) must be enforced in each connected network. KVL states that the voltage differences around any closed cycle in the network must sum to zero. If each independent cycle c is expressed as a directed combination of passive branches by a matrix C c then KVL becomes the constraint where x is the series inductive reactance of branch . In a recent paper it is demonstrated that this formulation of the linear load flow using cycles solves up to 20 times faster than standard formulations using the voltage angles [30]; voltage angle and PTDF formulations are also implemented in PyPSA and deliver identical results. Since branch capacities F can be continuously expanded to represent the addition of new circuits to an aggregated transmission corridor , the impedances x of the branches would also decrease. In principle this would introduce a bilinear coupling in equation (15) between the x and the f ,t . To keep the optimisation problem linear and therefore computationally fast, x can be left fixed in each optimisation problem, updated and then the optimisation problem rerun, in up to 5 iterations to ensure convergence, following the methodology of [43]. Another author has implemented an integer transmission expansion in PyPSA [34]  the bilinearity with a disjunctive big-M relaxation [44]; this will be incorporated into the main code base of PyPSA soon.

D. Coupling to other energy sectors
PyPSA can also optimise operation and investment in other energy sectors, such as natural gas, heating and transport. These sectors can be modelled using a network of links with efficiencies for energy conversion losses; an example from a recent paper [29] is shown in Figure 2. For example, links from electricity to heat buses can represent resistive heaters and/or heat pumps (the latter can also be modelled with a time-dependent coefficient of performance, given the importance of capturing the dependence of heat pump performance on outside temperature [45]). Combined Heat and Power plants (CHPs) can also be modelled by adding additional constraints for the back pressure and fuel consumption (see the PyPSA examples [26]). Depletable resources such as natural gas are modelled with stores.

E. Unit Commitment
Unit commitment can be turned on for any generator. This introduces a times series of new binary status variables u n,r,t ∈ {0, 1}, which indicates whether the generator is running (1) or not (0) in period t. The restrictions on generator output now become: u n,r,t ·g n,r,t · G n,r ≤ g n,r,t ≤ u n,r,t ·ḡ n,r,t · G n,r ∀ n, r, t (16) so that if u n,r,t = 0 then also g n,r,t = 0.
If T min up For non-zero start up costs suc n,r a new variable suc n,r,t ≥ 0 is introduced for each time period t and added to the objective function. The variable satisfies suc n,r,t ≥ suc n,r (u n,r,t − u n,r,t−1 ) ∀ n, r, t (19) so that it is only non-zero if u n,r,t − u n,r,t−1 = 1, i.e. the generator has just started, in which case the inequality is saturated suc n,r,t = suc n,r . Similarly for the shut down costs sdc n,r,t ≥ 0 we have sdc n,r,t ≥ sdc n,r (u n,r,t−1 − u n,r,t ) ∀ n, r, t The ramp-rate limits in equation (7) can also be suplemented by ramping limits at start-up and shut-down.

F. Security-Constrained LOPF
PyPSA has functionality to examine the steady state of the power system after outages of passive branches, based on an analysis of the linear power flow.
PyPSA calculates the Branch Outage Distribution Factor (BODF) from the Power Transfer Distribution Factors (PTDF) (see [46]). The BODF gives the change in linearised power flow on passive branch given the outage of passive branch k f Here f is the flow before the outage and f (k) is the flow after the outage of branch k.
The BODF can then be used in Security-Constrained Linear Optimal Power Flow (SCLOPF). SCLOPF builds on the LOPF by including additional constraints that branches may not become overloaded after the outage of a selection of branches. For each potential outage of a branch k, a set of constraints for all other branches is included, guaranteeing that they do not become overloaded beyond their capacity F |f (k)

G. Network clustering
PyPSA also implements a variety of network clustering algorithms to reduce the number of buses in a network while preserving important transmission lines. For example, the k-means clustering algorithm was recently used in [32] to examine the effect of clustering on investment optimisation results.

H. Planned new features
PyPSA is currently in version 0.11.0. PyPSA has been designed to be modular, so that it is possible to develop the code for many other types of calculations. Currently features being considered by the development team include, in rough order of priority: • Integer transmission expansion, following an existing implementation in PyPSA [34] using the disjunctive big-M relaxation [44]; • Multi-horizon dynamic investment optimisation over several years, following for example the implementation in OSeMOSYS [15]; • Transient analysis using the Root-Mean-Square (RMS) values of phasor quantities, following the implementation in PSAT [7]; • An implementation of the non-linear power flow solution using analytic continuation in the complex plane [47], following the implementation in GridCal [48]; • Short-circuit analysis, following the implementation in pandapower [9]; • OPF with the full non-linear network equations, following the implementations in PYPOWER and MAT-POWER; • An interactive web-based GUI for analysing and manipulating the network topology.
III. IMPLEMENTATION AND ARCHITECTURE PyPSA was written in the Python programming language [49] because it is widely used in the modelling community, it is easy to learn and its implementation is also free. It is available for every major operating system, including GNU/Linux, Mac OSX and Windows. PyPSA has been tested with versions 2.7 and 3.5 of Python.
PyPSA stores all data about network components in the DataFrame objects of the Python library pandas [50]. This enables easy and efficient indexing of the data, while mantaining the fast calculation speeds of the underlying array objects of the Python library NumPy [51]. For each of the components listed in Table II (except the overall Network container component) there is a DataFrame listing the static attributes (such as line impedance or capital cost) and a dictionary of DataFrames containing the time-varying attributes (such as wind power availability or consumer demand) that are in addition indexed by the list of snapshots. The specification of some attributes (such as generator maximum output) can be either static or time-varying; if the time series is not specified, then the static value is taken.
All matrix calculations and solutions of linear equation systems are carried out either with NumPy [51] or, in the case of sparse matrices, with SciPy [52]. These Python libraries interface with lower-level programming language libraries to benefit from the speed of strongly-typed languages.
Optimisation problems are formulated using the Pythonbased optimization modeling language Pyomo [53], [54], which is solver agnostic and intuitive to extend. The use of Pyomo also allows inter-operability with other energy system frameworks that use Pyomo, such as calliope [11], oemof [14] and urbs [20]. In PyPSA lower-level functions in Pyomo have been exploited to improve computational performance.
PyPSA has no graphical user interface, but integrates closely with the IPython [55] interactive notebooks, where networks and their properties can be visualised using the Python library Matplotlib [56] (see Figures 4 and 5) or the interactive JavaScript-based library plotly [57].
Internally PyPSA converts all power system quantities (voltage, power, current, impedances) to per unit values. Set points for loads and generation are stored separately from the power values which are actually dispatched.
IV. QUALITY CONTROL PyPSA comes with a large test suite that covers all of its major functionality. Tests are implemented using the Python library pytest [58]. Tests are also included that compare PyPSA's results with other software such as PYPOWER [8] and pandapower [9]. Users can and do report bugs by raising issues in the GitHub repository [24] or on the forum [27]. In Figure 3 computation times are given for a full power flow on the MATPOWER [6] test cases (the IEEE standard cases as well as snapshots from the French TSO RTE and European networks [59]) using MATPOWER and PyPSA. In both cases the complete execution of the load flow function ('runpf' for MATPOWER and 'network.pf' for PyPSA) was timed on a computer with Intel Core i5-2520M processors at 2.50GHz each with a tolerance of 10 −8 for the summed error in the apparent power S from equation (2). The timings were averaged over 10 attempts for each network. The computation times are similar, thanks to the fact that both MATPOWER and PyPSA (via the SciPy library [52]) use the same C library umfpack [60] for solving sparse linear equation systems, but PyPSA is in all cases slightly slower due to the overhead of preparing the admittance matrices in pure Python code. If the admittance matrix remains the same for several calculations, PyPSA has the option to avoid recalculating it, which can save some of this time; further acceleration is possible by using the just-in-time (jit) compiler numba [61], as has been done in the pandapower project [9] with success for larger networks.
For the linear optimal power flow (LOPF) the computation performance depends strongly on the choice of linear solver. To give an indication of typical calculation times, if dispatch in the SciGRID model of the German transmission network described in Section VII (585 buses, 1423 generators including curtailable wind and solar at each node, 38 pump storage units, 852 lines, 96 transformers) is optimised over 4 snapshots, it takes 5 seconds using the COIN-OR Clp free solver on the computer described above. Extensive timings for different formulations of the LOPF problem can be found in [30].

VI. COMPARISON TO OTHER POWER SYSTEM TOOLS
Given the proliferation of software tools available for modelling power systems, a guide is provided here that briefly compares PyPSA to other power system tools, with a particular focus on free software in the Python programming language. The advantages of Python are discussed above in Section III.
Selected features for a selection of different software tools are compared in Table III. Many of the tools have specialised features that are not shown in the table, so this table should only be treated as an indicative overview of their features in relation to PyPSA's features.
Many power system tools concentrate on steady-state, dynamic (i.e. short-term transient) and single-period OPF analysis of power networks. They neglect the multi-period unit commitment, investment optimisation and energy system coupling which PyPSA offers. In Python we focus our comparison on two tools: PYPOWER and pandapower.
PYPOWER [8] is a port of an older version of MAT-POWER [6] from Matlab to Python. It does not make strong use of Python's object-oriented interface and structures data using NumPy arrays, which makes it difficult to track component attributes. It has no functionality to deal with multiperiod OPF, which makes it unsuitable for unit commitment, storage optimisation or investment optimisation. This reflects the functionality of older versions of MATPOWER, but the latest version 6.0 of MATPOWER includes the MATPOWER Optimal Scheduling Tool (MOST) [13], which does multiperiod OPF, but no investment optimisation. Unlike PyPSA, PYPOWER has the ability to do full non-linear OPF for single snapshots.
pandapower [9] provides a pandas [50] interface to PY-POWER [8], which makes it easier to use, and adds useful functionality such as standard types (on which PyPSA's standard types are based), short circuit calculations, state estimation, and modelling of switches and three-winding transformers. The last four functions are currently missing in PyPSA, along with non-linear OPF, but like PYPOWER, pandapower does not have multi-period OPF functionality. pandapower is under active development and the PyPSA team stays in contact with the pandpower team to exchange tips and features, which is a clear benefit for both developers and users of free software.
PyPSA differs from more general energy system models such as calliope [11], oemof [14], OSeMOSYS [15] and urbs [20] by offering more detailed modelling of power networks, in particular the physics of power flow according to the impedances in the network. PyPSA can model a more general energy network using link components (see Section II-D), but cannot, for example, yet do the multiyear dynamic investment that OSeMOSYS does. The nonfree PLEXOS software [16] comes the closest to matching PyPSA's functionality, but PLEXOS is missing non-linear power flow.
These differences with other software tools are the reason that it was decided to develop a new tool rather than to extend an existing one. Existing tools for power flow such as PYPOWER did not have the internal code and data structures for economic optimisation over multiple time periods with many inter-temporal actors, whereas the energy system tools were missing the tight integration with power flow analysis that we believe is necessary for future research.    (12)) for Germany in an hour with high wind and low load; Middle: Line loading during this hour: highly loaded lines in the middle of Germany prevent the transport of cheap wind energy to consumers in the South; Right: Reactive power feed-in (positive in red, negative in blue) necessary to keep all buses at unit nominal voltage. users just starting out with the software. These range from basic small-scale networks demonstrating the features of PyPSA, to a one-node-per-country model of the European power system with high shares of renewables [62], to full transmission network models available as open data from the SciGRID [63] and GridKit projects [64], [65] which we demonstrate here.

VII. DEMONSTRATION OF FEATURES
The SciGRID model of Germany provides geo-referenced data for substations and transmission lines (220 kV and above). In one code example, data from openly-available sources on power plant locations and capacities, load distribution and time series are added to the SciGRID data so that load flow calculations can be carried out. The results of one such simulation for Germany with nodal pricing is shown in Figure 4. In this snapshot there was a large amount of zero-marginal-cost wind feed-in suppressing the locational marginal prices (λ n,t from equation (12)) in the North of Germany. Transmission bottlenecks in the middle of Germany prevent the transportation of this cheap electricity to the South, where more expensive conventional generators set the price. The linearly-optimised dispatch was then fed into a full non-linear power flow calculation where each bus was set to maintain nominal voltage; the resulting reactive power feed-in is also shown in Figure 4.
The data quality for the transmission grid in Open-StreetMap outside Germany is not of uniform quality, so for the European grid, an extract of the ENTSO-E interactive map [66] was made [64] using GridKit [65]. The details of how load, conventional power plants and renewable generation time series and expansion potentials were added to the grid data are provided in a forthcoming paper [67]. The result of generation and storage investment optimisation for a clustering of the network from 5000 buses down to 256 buses, allowing no grid expansion and assuming a CO2 offshore wind onshore wind solar run of river battery storage hydrogen storage gas Fig. 5. Results of optimisation of generation and storage capacities in Europe to reduce CO 2 emissions in the European electricity sector by 95% compared to 1990 levels [32]. The grid topology is based on the GridKit network for Europe, clustered from 5000 buses to 256 buses.
reduction of 95% compared to 1990 levels, is shown in Figure 5. The lack of grid expansion forces some balancing of renewable variability locally with storage. Short-term battery storage (grey) combines with solar power (yellow) in Southern Europe, while longer-term hydrogen storage (purple) pairs with wind power (blue) in Northern Europe. This system has an average cost of e 82/MWh. If the grid is optimally expanded, much of the storage can be eliminated and costs are as low as e 65/MWh [32].

VIII. CONCLUSIONS
In this paper a new toolbox has been presented for simulating and optimising power systems. Python for Power System Analysis (PyPSA) provides components to model variable renewable generation, conventional power plants, storage units, coupling to other energy sectors and multiplyconnected AC and DC networks over multiple periods for the optimisation of both operation and investment. Tools are also provided for steady-state analysis with the full load flow equations. PyPSA's performance for large datasets, comparisons with other software packages and several example applications are demonstrated.
As free software, the code of PyPSA can easily be inspected and extended by users, thereby contributing to further research and also transparency in power system modelling. Given that public acceptance of new infrastructure is often low, it is hoped that transparent modelling can contribute to public understanding of the various options we face when designing a sustainable energy system.