The Madden-Julian Oscillation (MJO) is a prominent feature of the intraseasonal variability of the atmosphere. The MJO strongly modulates tropical precipitation and has implications around the globe for weather, climate and basic atmospheric research. The time-dependent state of the MJO is described by MJO indices, which are calculated from meteorological variables using sometimes complicated statistical approaches. One of these indices is the OLR-based MJO Index (OMI; OLR stands for outgoing longwave radiation). The Python package mjoindices, which is described in this paper, provides, to our knowledge, the first open source implementation of the OMI algorithm. The package meets state-of-the-art criteria for sustainable research software, such as automated tests and persistent archiving, to aid the reproducibility of scientific results. The agreement between the OMI values calculated with this package and the original OMI values is also summarized here. There are several reuse scenarios; the most probable one is MJO-related research based on atmospheric models, since the index values have to be recalculated for each model run.

The Madden-Julian Oscillation (MJO) is a prominent feature of the Earth’s atmosphere-ocean system with a high relevance for weather, climate and basic atmospheric research. In order to explain the purpose and structure of the software package introduced here, we first mention a few meteorological key points of the MJO.

The MJO was described for the first time by [

The passage of the convective anomaly from the Indian Ocean to the Pacific is conventionally split into 8 temporal phases, roughly defined by the position of the anomaly as shown in

Depiction of the definition of the 8 MJO phases according to the position of the convection anomaly. The figure is taken from the original publication by Madden and Julian [

The software package introduced here provides a modern and sustainable reimplementation of the OMI algorithm and we first outline the general approach of the index calculation, before the software package itself is introduced. For details on the underlying algorithm see Kiladis et al. [

While the RMM index is a good choice to conduct real-time analyses of the MJO, OMI overcomes a few drawbacks of RMM at the expense of the real-time capability [

In contrast to these widespread possible applications, to our knowledge there is no publicly available software to compute OMI from data. Only available is the official description paper [

Given the importance that OMI has gained in MJO research, there are a number of reasons to provide a quality-tested open source code to compute OMI:

Facilitate MJO research using the OMI index based on modeled data.

Facilitate MJO research based on real-world data without being dependent on updates of the respective web page (although the dependency on the availability of the observational OLR data remains).

Enable researchers to easily further understand the characteristics of OMI by modifying an established and tested version of the original OMI calculation (which depends on some choices of thresholds etc.). A recent example of such a case has been brought up by Hoffmann and von Savigny [

Make OMI conveniently available to all researchers whose research questions might involve the MJO, sparing them the effort of characterizing the MJO themselves.

Publish the source code as a special kind of technical documentation of the rather complicated calculation approach with 100%-coverage of all its details in addition to the original publication [

Check and demonstrate the reproducibility of the involved and potentially error-prone statistical approach as a contribution to good scientific practice. (The reimplementation actually led to an update [

The reimplementation in Python, which is presented here, was motivated by two of the points listed above, particularly the analysis of modeled data and the further investigation of OMI characteristics. While the implementation approach was at first solely based on the description paper [

Since the package has been released only recently, it does not yet have a large user community. However, numerous research papers use OMI, and we had already received numerous requests regarding the usage of the package prior to the release. In one case we have already shared a preliminary version of the code.

The package structure is designed to be easily extendable to other MJO indices in the future. Basic modules are therefore placed directly in the

The basis for the data handling and all numerical operations is the

There are four classes that handle the data exchange between the

The class

The other three classes represent the calculation results:

A calculated pair of EOFs and associated statistical diagnostic quantities are stored in

The list of all 366 pairs of EOFs is stored in

The PC time series, which is the basic output that represents the temporal evolution of the MJO, is represented by

All these classes come with routines for I/O and basic diagnostic plots.

Note that it is in principle possible to provide the OLR data on freely chosen spatial and temporal grids as input for the OMI calculation. However, the original OMI calculation has been performed on spatial grids with a spacing of 2.5° between 20° S and 20° N in latitude and 0° to 360° in longitude as well as daily averages in the time domain [
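The original grid described above can be sketched with NumPy as follows. This is only an illustration of the stated spacing and bounds; the variable names are hypothetical, and the assumption that the longitude axis stops at 357.5° (because 360° coincides with 0°) is ours and is not taken from the package.

```python
import numpy as np

# Assumed spacing of the original OMI grid as stated in the text.
GRID_SPACING_DEG = 2.5

# Latitudes from 20°S to 20°N, both ends included.
lat = np.arange(-20.0, 20.0 + GRID_SPACING_DEG, GRID_SPACING_DEG)

# Longitudes from 0° eastward; 360° is omitted here because it is
# identical to 0° on a periodic axis (an assumption of this sketch).
lon = np.arange(0.0, 360.0, GRID_SPACING_DEG)

# An OLR array on this grid for n_days daily averages would then have
# the shape (n_days, lat.size, lon.size).
n_days = 10
olr_placeholder = np.zeros((n_days, lat.size, lon.size))
```

With these settings the grid has 17 latitudes and 144 longitudes. Data on other grids would have to be interpolated to this grid first if results comparable to the original OMI are required.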

The calculation of the OMI EOFs and PCs itself is implemented in the module

The preprocessing consists of a temporal and spatial filtering of the input data. This is actually a rather involved centerpiece of the OMI calculation and has been implemented in the separate module

For the PCA step, two different implementations can be chosen via an argument of the method call: the internal implementation, which follows the description by Kutzbach [
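To illustrate what a PCA step of this kind involves, the following is a minimal, generic sketch of an EOF analysis based on the eigendecomposition of the sample covariance matrix. It is a textbook formulation written for this illustration, not the code of the package or of either of its two PCA options; the function name `eof_analysis` and the random input data are hypothetical.

```python
import numpy as np


def eof_analysis(samples):
    """Generic EOF analysis of a (n_times, n_gridpoints) array.

    Returns the EOFs as rows, sorted by decreasing explained variance,
    and the fraction of variance explained by each EOF.
    """
    # Remove the temporal mean at each grid point.
    anomalies = samples - samples.mean(axis=0)
    # Sample covariance matrix between grid points.
    cov = anomalies.T @ anomalies / (anomalies.shape[0] - 1)
    # eigh is appropriate for symmetric matrices; eigenvalues come out ascending.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    eigvals = eigvals[order]
    eigvecs = eigvecs[:, order]
    explained = eigvals / eigvals.sum()
    return eigvecs.T, explained


# Usage with synthetic data (200 time steps, 12 flattened grid points).
rng = np.random.default_rng(0)
data = rng.standard_normal((200, 12))
eofs, explained = eof_analysis(data)
eof1, eof2 = eofs[0], eofs[1]  # the leading pair of EOFs
```

In the OMI context, such a decomposition would be applied to the filtered OLR samples belonging to one DOY, with the two-dimensional maps flattened to vectors beforehand.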

The post processing consists of two pragmatic steps, introduced in the original calculation of OMI. First, the signs of the EOFs calculated by the PCA are arbitrary. This means that the signs may switch from one DOY to another, which is undesirable. Therefore, the signs of all 366 pairs of EOFs are aligned after their computation as the first step, i.e. arbitrary sign reversals of EOFs between neighboring DOYs will be removed. Note that this post processing step might cause problems if the calculation is not performed on the original spatial grids. In this case, the users should call the preprocessing and the PCA calculation separately and then implement individual post processing solutions themselves if needed at all. Second, it was found that reasonable EOFs for some DOYs at the beginning of November were difficult to obtain [
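The sign alignment described above can be sketched as follows. The criterion used here (flip an EOF if its dot product with the EOF of the preceding DOY is negative) is an assumption made for illustration; the package may use a different criterion, and the function name `align_signs` is hypothetical.

```python
import numpy as np


def align_signs(eofs):
    """Remove arbitrary sign reversals between neighboring DOYs.

    eofs: array of shape (n_doys, n_gridpoints), one flattened EOF per DOY.
    Each EOF is flipped if it points opposite to its (already aligned)
    predecessor, as measured by the sign of their dot product.
    """
    aligned = np.array(eofs, dtype=float, copy=True)
    for i in range(1, aligned.shape[0]):
        if np.dot(aligned[i], aligned[i - 1]) < 0:
            aligned[i] *= -1.0
    return aligned


# Usage: one spatial pattern with randomly flipped signs for 366 DOYs.
rng = np.random.default_rng(1)
base = np.sin(np.linspace(0.0, 2.0 * np.pi, 50))
flips = rng.choice([-1.0, 1.0], size=366)
eofs = flips[:, None] * base[None, :]
aligned = align_signs(eofs)
```

Because the sign of the first DOY is kept as a reference, the overall sign of the aligned set remains arbitrary; only the reversals between neighboring DOYs are removed.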

The reimplementation presented here should not be understood as a one-to-one porting of the original code. Instead, it is essentially a new implementation following the statistical steps described in Kiladis et al. [

One subtle detail, probably responsible for a part of any differences in the results obtained, is the treatment of leap years in the selection of the data samples. We have included two options in the code, which are selectable for the users with the keyword argument

For convenience, the package is listed in the Python package index (

The package includes a basic example, which is available as a common Python script (

The software quality control includes three levels of automated testing routines, which are based on the

As stated before, the code in the

Comparison of recalculated and original EOFs summarized over all DOYs. Note that we did not include numbers for the setup “strict leap year treatment/DOY 366 included”, since these numbers are only determined by the EOFs of DOY 366, which is intentionally different from the original. Hence, no conclusion on the overall agreement can be drawn from these numbers.

| EOF | INDICATOR | LEAP YEAR TREATMENT | DOY 366 | VALUE |
|---|---|---|---|---|
| 1 | Correlation | not strict | both | >0.994 |
| 2 | Correlation | not strict | both | >0.993 |
| 1 | 99% percentile | not strict | both | <0.0084 W/m² |
| 2 | 99% percentile | not strict | both | <0.0065 W/m² |
| 1 | Correlation | strict | excluded | >0.994 |
| 2 | Correlation | strict | excluded | >0.993 |
| 1 | 99% percentile | strict | excluded | <0.0084 W/m² |
| 2 | 99% percentile | strict | excluded | <0.0065 W/m² |

Comparison of recalculated and original PCs considering the complete period of the available original data (01/01/1979 to 28/08/2018).

| PC | INDICATOR | LEAP YEAR TREATMENT | DOY 366 | VALUE |
|---|---|---|---|---|
| 1 | Correlation | not strict | both | >0.998 |
| 2 | Correlation | not strict | both | >0.998 |
| 1 | Std.-Dev. of difference | not strict | both | <0.0458 |
| 2 | Std.-Dev. of difference | not strict | both | <0.0488 |
| 1 | 99% percentile | not strict | both | <0.157 |
| 2 | 99% percentile | not strict | both | <0.1704 |
| 1 | Correlation | strict | excluded | >0.998 |
| 2 | Correlation | strict | excluded | >0.998 |
| 1 | Std.-Dev. of difference | strict | excluded | <0.0449 |
| 2 | Std.-Dev. of difference | strict | excluded | <0.0484 |
| 1 | 99% percentile | strict | excluded | <0.1523 |
| 2 | 99% percentile | strict | excluded | <0.1671 |
| 1 | Correlation | strict | included | >0.998 |
| 2 | Correlation | strict | included | >0.998 |
| 1 | Std.-Dev. of difference | strict | included | <0.0509 |
| 2 | Std.-Dev. of difference | strict | included | <0.0501 |
| 1 | 99% percentile | strict | included | <0.1552 |
| 2 | 99% percentile | strict | included | <0.1708 |
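Indicators of the kind listed in the tables above could be computed along the following lines. This is a sketch under assumptions: we take the "99% percentile" to refer to the absolute differences between recalculated and original values, which is our reading and is not confirmed by the text reproduced here. The function name `comparison_metrics` and the synthetic series are hypothetical.

```python
import numpy as np


def comparison_metrics(recalculated, original):
    """Return correlation, standard deviation of the difference, and the
    99% percentile of the absolute difference of two 1-D series."""
    diff = recalculated - original
    corr = np.corrcoef(recalculated, original)[0, 1]
    std_diff = np.std(diff)
    # Assumption: the percentile refers to the absolute differences.
    p99 = np.percentile(np.abs(diff), 99)
    return corr, std_diff, p99


# Usage with a synthetic "original" series and a slightly perturbed copy.
rng = np.random.default_rng(2)
t = np.arange(1000)
original = np.sin(2.0 * np.pi * t / 45.0)
recalculated = original + 0.01 * rng.standard_normal(t.size)
corr, std_diff, p99 = comparison_metrics(recalculated, original)
```

For the EOFs, the same function could be applied to the flattened maps of one DOY; for the PCs, to the two time series over the comparison period.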

Examples of recalculated EOFs in comparison to the original EOFs for DOY 23, which is among the DOYs with the best agreement, and DOY 218, which has the worst agreement. Note that the color scale of the panels with the differences varies.

Detailed comparison statistics for the EOFs of all DOYs. See text for details.

Comparison of the recalculated and original PCs for an arbitrarily chosen sample period (the year 2011).

Tested on Ubuntu 18.04 Linux and Windows 10.

Python >= 3.6 (tested with Python 3.6, 3.7, and 3.8)

There are no special hardware requirements in addition to a state-of-the-art personal computer system (e.g., 1.5 GHz Processor, 8 GB memory, a few GB free disk space). The complete algorithm will run for a few hours on such a system.

The package depends on the following standard Python packages. These can be installed using common Python package managers (e.g.,

In order to run the unit and integration tests, the following package is needed:

To have the possibility to use an external implementation of the PCA as described before, the following package can be installed:

Some of the unit and integration tests depend on external datasets, which serve either as input or as reference for the results. These datasets are also permanently available from Zenodo (

Christoph G. Hoffmann (University of Greifswald, Germany) has written the code and led the project.

George N. Kiladis (NOAA/Physical Sciences Laboratory, Boulder, Colorado) has contributed code samples as a reference (which are not included in the package) and discussed several implementation issues.

Maria Gehne (CIRES, University of Colorado Boulder, and NOAA/Physical Sciences Laboratory, Boulder, Colorado) has tested the package from the perspective of the original designers of OMI.

Juliana Dias (CIRES, University of Colorado Boulder, and NOAA/Physical Sciences Laboratory, Boulder, Colorado) has provided a file with reference data.

Christian von Savigny (University of Greifswald, Germany) has contributed to the discussion of the general approach.

The language of the code and the documentation is English.

The reimplementation of the OMI algorithm as open source code can be helpful for climate, weather, and basic atmospheric research in diverse aspects as has been outlined in the introduction. This also includes documentation aspects and good scientific practice.

The primary reuse case of the package, in terms of actually running the code to calculate OMI values, is the analysis of the MJO behavior in complex models of the atmosphere. The major point is that it is necessary to recalculate OMI for the atmospheric conditions simulated by a specific model to get a consistent representation of the MJO in that particular model. Due to the chaotic nature of the Earth’s atmosphere, even an ideal numerical model would not be able to precisely reproduce the Earth’s weather at a particular point in space and time for many days after its initialization. Hence, although state-of-the-art atmospheric models produce realistic weather patterns and realistic climatological conditions (in the sense of large-scale and long-term averaging), the conditions for individual periods and locations cannot be reasonably compared to the real world for long-term runs. This implies that the original OMI index, calculated for the real world based on OLR observations, cannot be used to describe the MJO in a modeled atmosphere. In other words, each MJO-related study based on free-running atmospheric models has to recompute the OMI index for each model run based on the modeled OLR data to get a consistent representation. This is easily possible with the presented Python package. Given the various atmospheric models (all of which are run with many different setups depending on the particular science questions) and the rising awareness of the relevance of the MJO for tropical and extra-tropical meteorology, we expect a high reuse potential, provided that the community becomes aware of the existence of this code.

A more specific reuse case is to understand the characteristics of OMI itself. This knowledge can become useful when subtle interactions between the MJO and other processes in the Earth system are studied. In this case, it must be considered that the particular representation of the MJO (here OMI) influences the results. For this, it can be helpful to be able to recompute OMI with slight modifications, e.g., with different values for the filter constants of the bandpass filter. We do not expect, however, that these individual exploratory variations of the implementation should feed back into the basic source code, as the basic code should unambiguously represent the original documented and scientifically approved OMI algorithm.

We expect that the results, which are produced by the package, will be stable right from the outset, since all major features for the complete reproduction of OMI have already been implemented and tested. Nevertheless, we welcome contributions to the code, such as code optimizations or implementations of other MJO indices. These contributions will also have to meet the high quality standards in terms of automated testing etc. to keep the results stable and scientifically reliable. For contributions and questions, we can be contacted using the project’s GitHub page (e.g., “Pull requests” and “Issues”) and the author contacts of this manuscript.

We thank Alejandro Jaramillo Moreno for fruitful discussions on the implementation of the Wheeler-Kiladis-Filter in Python and Rattana Chhin for beta-testing the package. We would like to thank the two reviewers for their valuable comments on the manuscript and the software. We acknowledge support for the Article Processing Charge from the DFG (German Research Foundation, 393148499) and the Open Access Publication Fund of the University of Greifswald.

This work was supported by the University of Greifswald.

The authors have no competing interests to declare.