A Python Package to Preprocess the Data Produced by Novonix High-Precision Battery-Testers

We present preparenovonix, a Python package that handles common issues encountered in data files generated with a range of software versions from the Novonix battery-testers.1 The package can also add extra information that makes coulombic counting easier and helps relate a measurement to the experimental protocol. The package provides a master function that can run the cleaning and the addition of derived information at once, with the flexibility to choose only some features. There is a separate function to simply read a column by its given name. The usage of all the functions is documented in the code, including examples. The code presented here can be installed either as a Python package2 or from a GitHub repository.3


Introduction
The growing exploitation of renewable energies is only possible thanks to the development of adequate energy storage systems [2]. Li-ion batteries have become one of the fastest growing electric energy storage systems in the automotive market [4]. Cycling these batteries through charging and discharging is one of the cornerstone experiments to understand their working performance and ageing behaviour, which are essential to help improve their design [5] and the prediction of their lifespan [3]. This cycling can be done in different types of testers, using a range of batteries not limited to the Li-ion ones. Novonix is a relatively new company in the market of battery-testing systems, catering to high-precision coulometry [1]. Accurate tracking of the coulombic efficiency can provide insights into battery ageing mechanisms and lifetime prediction at early experimental stages. The preparenovonix package prepares the raw data exported from Novonix battery-testers so that it can later be analysed with ease. Traditionally, this type of code is not widely shared among the different groups working on battery research. However, opening this code to the community has the potential to benefit all users of the Novonix battery-testers and to promote further collaboration in developing code relevant to the battery research field.
The preparenovonix package prepares exported data files produced by Novonix battery-testers 4 by (i) cleaning them and (ii) adding derived information to the file. The package also allows reading an individual column given its name. The derived information includes:
1. A State column with explicit information on the start and end of a given type of measurement. Novonix provides a Step number with a different value for each type of measurement; for example, 0 corresponds to an open circuit. However, it is possible to have two consecutive measurements of the same type but with different experimental conditions, for example charging at different currents. These can now be told apart using the State value.
2. A reduced protocol summarising the experimental protocol so that each command and its corresponding experimental conditions appear in a single line. This is needed to directly relate a measurement to the experimental protocol. The reduced protocol is output as an array of strings and it is stored as part of the header when using the prepare_novonix function (see Figure 2 and the text below).
The example data provided within the repository for this code is shown in Figure 1. This figure compares the raw Novonix data with the data after being processed by the preparenovonix package. The example raw data contains individual measurements for which the experimental run time decreases. As can be seen in Figure 1, these measurements are removed by the preparenovonix package. The example raw data file also includes a failed test. The preparenovonix package takes the capacity from the failed test and adds it to the capacities from the completed experiment. This shifts the resulting capacity curve by a constant value, as can be seen in Figure 1. This figure also shows the increasing loop number when the measurements are within a repeat loop, and the protocol line each measurement corresponds to.
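The removal of measurements with a decreasing run time, described above, can be illustrated with a minimal sketch. This is not the implementation used in preparenovonix: the function name drop_decreasing_time is hypothetical, and the rule applied here (keep a row only if its run time is not lower than that of the last kept row) is an assumption about how such rows could be filtered.

```python
def drop_decreasing_time(run_times):
    """Return the indices of the rows to keep so that run time never decreases.

    Hypothetical helper, not part of preparenovonix: a row is dropped when
    its run time is lower than the run time of the last row that was kept.
    """
    keep = []
    last_kept = float("-inf")
    for index, run_time in enumerate(run_times):
        if run_time >= last_kept:
            keep.append(index)
            last_kept = run_time
    return keep


# Illustrative run times (arbitrary units) with two out-of-order measurements
example_times = [0.0, 1.0, 2.0, 1.5, 3.0, 2.9, 4.0]
kept_rows = drop_decreasing_time(example_times)
```

In this example the rows with run times 1.5 and 2.9 are discarded, since each of them is lower than the run time of the preceding kept row.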

Implementation and architecture
The main functions available in the preparenovonix package 5 are listed below in alphabetical order. The list contains the module name followed by the function name with the expected input parameters in brackets.
Figure 2. Running all the available features from the preparenovonix package through this function can take from a few seconds up to a few minutes, depending on the size of the input file.
In what follows, the above functions will be referred to simply by their name, without stating the modules they belong to.
As shown in Figure 2, the preparenovonix package only cleans data files that are considered to have been exported from the Novonix battery-testers, and it only derives information for cleaned Novonix files. The master function prepare_novonix allows the user to call either the cleaning process or the addition of extra columns, ensuring that these dependencies are taken into account. The input parameters for this function are the path to a file and four optional boolean parameters: addstate, lprotocol, overwrite and verbose. The last parameter provides the option to output more information about the run. If the overwrite parameter is set to False, a new file will be generated with a name similar to the input one, except for the addition of _prep before the extension of the file.
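The naming of the output file can be sketched as follows. This is only an illustration under stated assumptions: the helper prepared_path is hypothetical and is not part of preparenovonix, and it is assumed here that setting overwrite to True means the result is written back to the input path.

```python
from pathlib import Path


def prepared_path(infile, overwrite=False):
    """Return the path a prepared file would be written to.

    Hypothetical helper, not part of preparenovonix. When overwrite is
    False, _prep is inserted before the file extension; when it is True,
    the input path is assumed to be reused.
    """
    path = Path(infile)
    if overwrite:
        return path
    return path.with_name(path.stem + "_prep" + path.suffix)


# The file name below is illustrative only
new_file = prepared_path("data/example.csv")
same_file = prepared_path("data/example.csv", overwrite=True)
```

Here new_file points to data/example_prep.csv, while same_file is the input path itself.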
The function isnovonix decides if a file has the expected structure (including a full header) for an exported file produced by the Novonix battery-testers. If the file is lacking the header or if it has not been exported with a Novonix battery-tester using the covered software, 6 the code will exit with an error message and without generating a new file. The function cleannovonix produces a new Novonix type file after performing its cleaning tasks.
The State column takes one of the following values:
• 0 for the first measurement of a given type (for example, a constant current charge).
• 1 for measurements between the first and last of a given type.
• 2 for the last measurement of a given type.
• -1 for single measurements. This can happen under different circumstances: a type of measurement can end after a single measurement when some experimental conditions are met, which usually happens while the time resolution is coarse; the current can, at times, overshoot from negative to positive values at the beginning of a measurement; or a bug in the Novonix software can lock certain values. If two single measurements happen together, the two lines are discarded in the new file containing the additional State column.
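The State values described above (0, 1, 2 and -1) can be illustrated with a short sketch. This is not the code used in preparenovonix: the function assign_state is hypothetical, and it assumes that a new measurement starts whenever the Step number changes or the Step time decreases with respect to the previous row. The discarding of two consecutive single measurements is not reproduced here.

```python
def assign_state(step_numbers, step_times):
    """Return one State value per row: 0 first, 1 middle, 2 last, -1 single.

    Hypothetical helper, not part of preparenovonix. A new measurement is
    assumed to start when the step number changes or the step time resets.
    """
    n_rows = len(step_numbers)
    if n_rows == 0:
        return []

    # Indices of the first row of each measurement
    starts = [0]
    for i in range(1, n_rows):
        new_type = step_numbers[i] != step_numbers[i - 1]
        time_reset = step_times[i] < step_times[i - 1]
        if new_type or time_reset:
            starts.append(i)

    states = [1] * n_rows  # default: a row in the middle of a measurement
    for k, start in enumerate(starts):
        end = (starts[k + 1] if k + 1 < len(starts) else n_rows) - 1
        if start == end:
            states[start] = -1  # single measurement
        else:
            states[start] = 0   # first row of the measurement
            states[end] = 2     # last row of the measurement
    return states


# Illustrative values only: four measurements, one of them a single row
steps = [0, 0, 0, 1, 1, 2, 7, 7, 7]
times = [0, 1, 2, 0, 1, 0, 0, 1, 2]
example_states = assign_state(steps, times)
```

With this rule, two consecutive measurements that share a Step number are still separated, because the Step time resets at the start of the second one.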
The State column is generated based on the following quantities provided in the raw Novonix data files: Step number (an integer indicating the type of measurement) and Step time (this time is assumed to reset to 0 each time a new type of measurement starts).
The function create_reduced_protocol reads the complete header from the input file and generates (or reads) the reduced protocol. This function returns the reduced protocol itself and a boolean flag, viable_prot. The reduced protocol consists of an array of strings. Each string contains a line number, a command from the experimental protocol and the corresponding experimental conditions (if applicable); for example: [4 : Repeat 49 times :]. Only commands referring to the following processes will appear in the reduced protocol: 7
• Open circuit storage (or rest)
The reduced protocol is tested against the number of unique measurements in the file, determined using the State column. If the number of measurements expected from the protocol is less than the actual number of measurements, the flag viable_prot is set to False, indicating that the construction of the reduced protocol was not viable.
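The viability test described above can be sketched as follows. The function names are hypothetical and this is not the implementation in preparenovonix. It is assumed here that each unique measurement is counted once through its first row (State 0) or through a single-row measurement (State -1), and that the number of measurements expected from the protocol is already known.

```python
def count_measurements(states):
    """Count unique measurements from a State column.

    Hypothetical helper: each measurement is assumed to contribute exactly
    one row with State 0 (its first row) or State -1 (a single row).
    """
    return sum(1 for state in states if state in (0, -1))


def protocol_is_viable(n_expected, states):
    """Return False when the protocol predicts fewer measurements than found."""
    return n_expected >= count_measurements(states)


# Illustrative State column containing four measurements
example_column = [0, 1, 2, 0, 2, -1, 0, 1, 2]
n_found = count_measurements(example_column)
viable = protocol_is_viable(4, example_column)
not_viable = protocol_is_viable(3, example_column)
```

A protocol expecting more measurements than those found is still treated as viable in this sketch, which is consistent with an experiment that was stopped before completing its protocol.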
The Protocol line and Loop number columns can be generated either by directly calling the function novonix_add_loopnr or by setting the parameter lprotocol to True when calling the function prepare_novonix. The Protocol line column associates a measurement with its corresponding line in the reduced protocol. The Loop number column has a value of 0 if a measurement does not correspond to any repetition statement in the protocol; otherwise, it grows monotonically with each repetition (see Figure 1).
If the flag viable_prot was set to False by the create_reduced_protocol function, the Protocol line and Loop number columns are populated with the value -999.

Quality control
Each function in the preparenovonix package is tested with internal checks and with pytest, both locally and through the Travis Continuous Integration service. 8 The tests have been performed on different platforms and using different Python versions. The tests use an example data file. This file is automatically retrieved when the dedicated GitHub repository is either cloned or downloaded (see the 'Software location' section for the relevant URLs).
Each function is documented with an example of usage. The expected result when used on the example data is also provided. Moreover, an example script, example.py, is provided at the root directory of the dedicated GitHub repository. This script also produces Figure 1.
The complete documentation for the preparenovonix package can be found at: https://prepare-novonix-data.readthedocs.io/.

Operating system
Windows, OSX, Linux

Programming language
Python 3.5 and above.

Additional system requirements
The code presented here uses as input the data files exported directly from the Novonix battery-testers. The online documentation described in the 'Quality control' section provides an updated list of the Novonix software versions that the code presented here has been tested against.

Dependencies
This software requires the numpy Python library. Matplotlib is also required for using the plotting routine compare.plot_vct.py. Further details on how to