Software Metapapers

xarray: N-D labeled Arrays and Datasets in Python

Authors:

Abstract

xarray is an open-source project and Python package that provides a toolkit and data structures for N-dimensional labeled arrays. Our approach combines an application programming interface (API) inspired by pandas with the Common Data Model for self-described scientific data. Key features of the xarray package include label-based indexing and arithmetic, interoperability with the core scientific Python packages (e.g., pandas, NumPy, Matplotlib), out-of-core computation on datasets that don’t fit into memory, a wide range of serialization and input/output (I/O) options, and advanced multi-dimensional data manipulation tools such as group-by and resampling. As a data model and analytics toolkit, xarray has been widely adopted in the geoscience community, and it is also used more broadly for multi-dimensional data analysis in physics, machine learning, and finance.
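As a rough illustration of the label-based indexing, arithmetic, group-by, and resampling features named above, the following minimal sketch builds a small labeled array and applies each operation. The station names, dates, and temperature values are made up for illustration and do not come from the paper; the sketch assumes recent versions of xarray, pandas, and NumPy are installed.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical data: six daily readings at two made-up stations "A" and "B".
times = pd.date_range("2000-01-01", periods=6, freq="D")
temps = xr.DataArray(
    np.arange(12.0).reshape(6, 2),
    dims=("time", "station"),
    coords={"time": times, "station": ["A", "B"]},
    name="temperature",
)

# Label-based indexing: select by coordinate value rather than integer position.
station_a = temps.sel(station="A")
window = temps.sel(time=slice("2000-01-02", "2000-01-04"))  # both ends inclusive

# Label-based arithmetic: operands are matched by dimension name, so the
# per-station mean broadcasts along "time" without manual reshaping.
anomaly = temps - temps.mean(dim="time")

# Group-by: average all readings that fall in the same calendar month.
monthly = temps.groupby("time.month").mean()

# Resampling: aggregate the daily readings into two-day means.
two_day_mean = temps.resample(time="2D").mean()
```

Each result is itself a labeled array, so the dimension names and coordinates carry through to later steps such as plotting or writing to disk.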

Keywords:

Python, pandas, netCDF, multidimensional, data, data handling, data analysis
  • Year: 2017
  • Volume: 5 Issue: 1
  • Page/Article: 10
  • DOI: 10.5334/jors.148
  • Submitted on 7 Sep 2016
  • Accepted on 23 Feb 2017
  • Published on 5 Apr 2017
  • Peer Reviewed