(1) Overview

Introduction

Agent-based models (ABMs) are a popular way to model coupled human and natural systems (CHANS) because of their flexibility in representing complexity in human decision-making and in how individuals can respond to, and exert control on, flows and stores of materials and energy represented by process-oriented models of biophysical systems. Agents can represent individuals or groups of individuals with common beliefs and values; they can interact with each other and with their environment, and learn from and adapt to changing environmental conditions. ABMs of land use and land cover change (LULCC) have been created to examine agriculture, conservation strategies, and urban expansion across the globe, yet many of these are challenging to use outside of the original system for which they were developed because of the location-specific nature of the data inputs and model structure.

Many agent-based models and frameworks have been developed for a variety of applications. These range from general-purpose ABM platforms like NetLogo, MASON, and RePast to ABMs that have been formulated and developed for very specific problems. These platforms and models span a number of programming languages, including R, Java, C++, and Python, and some provide entire windowed integrated development environments for creating, running, and analyzing ABMs (e.g., NetLogo, RePast). The Network for Computational Modeling in Social and Ecological Sciences (CoMSES Net) is a community portal that supports agent-based modeling in the social and ecological sciences, giving researchers the opportunity to contribute their ABM code, be it the entire modeling package or the subset of model scripts that require execution in a general-purpose ABM framework. A subset of ABMs has been developed and applied specifically to the problem of LULCC. As in the broader ABM community, these ABMs of LULCC range from sets of code oriented toward specific research questions and problems to more generic toolsets for modeling LULCC within a general-use ABM framework.

The need for the modeling framework reported here arises from a growing subfield of CHANS research that addresses MultiSector Dynamics (MSD). MSD research seeks to understand growing interdependencies and risks at the intersection of the energy, water, and land sectors and therefore recognizes LULCC as inextricably coupled to the dynamics of water and energy systems. Accordingly, a goal of this model is to increase the flexibility of integrating various component models such as urban population dynamics, land surface models (e.g., CLM), surface and groundwater models (e.g., WRF-Hydro, ParFlow), transportation networks, and infrastructure and energy development. The interdependencies between these subsystems require either multi-model simulations or mechanisms for software coupling. Janus initiates this process by creating agents that can observe any number of environmental and social constraints while supporting a flexible library of decision-making options.

The model described here is intended to provide a framework for assembling ABMs of LULCC in ways that: (1) are grounded in data that characterize both the social and biophysical processes being represented, (2) can test alternative hypotheses about the role of social networking in potential spatiotemporal patterns of land use and land cover, and (3) can facilitate integration with models characterizing other sectors with which LULCC is connected, such as water supply systems, transportation, and energy distribution. It was developed to facilitate the use of consistent county-scale demographic data and 30 m land use data that are available throughout the United States. Although these data are specific to the United States, any data source that permits the development of distributions of age and land status (e.g., owner, tenant) may be used. Demographics from sub-national datasets enable local details about agents and their decision-making to be preserved in a regional-scale model. The motivation for this model is to compare LULCC projections derived from global integrated human-Earth system models with ABM-derived projections that explicitly incorporate local socio-political and environmental constraints. It was developed to use projections from GCAM but can be configured to take any crop profit time series and any combination of land use categories.

Janus was developed in Python to facilitate use in a variety of ABM applications. The suite of preprocessing tools allows for streamlined and reproducible data inputs, while the post-processing tools allow for efficient assessment of results (Figure 1). Instructions for running the model and details on the current decision-making process are provided in the Janus README. Although only one decision-making function is currently in place, the model is structured so that decision-making processes are interchangeable. Additional attributes can be given to agents to tailor them to a specific location or set of environmental constraints.

Figure 1 

Architecture of the file system and data sets associated with Janus. * Layers are not included in the example dataset, but denote additional data that decision-making could be based upon in the future.

Implementation and architecture

Janus follows a sequential workflow that includes preprocessing and the model run itself (Figure 2):

  • Step 1: Gather input data: land cover data and a shapefile of the counties in the area of interest. The county shapefile must store the county names in a field named ‘county’.
  • Step 2: GIS pre-processing (janus/preprocessing/get_gis_data.py)
  • Step 3: Profit signal generation (janus/preprocessing/generate_synthetic_prices.py)
  • Step 4: Declare variables (janus/example/config.yml)
  • Step 5: Run Model (janus/model.py)
Figure 2 

GIS preprocessing workflow, implementation and model workflow.

The Cropland Data Layer (CDL) may be downloaded using CropScape (nassgeodata.gmu.edu/CropScape/) or the Aether Platform (pypi.org/project/aether/), and the county shapefiles can be found in the TIGER dataset. Once these land cover and spatial extent files are placed in the data folder, use get_gis_data.py to prepare the modeling domain and initial land cover dataset. In this step the user defines the scale of interest, the initial model year, and the counties to include. A suite of GIS functions (geofxns.py) is then used to convert the CDL data to GCAM categories (or to user-defined categories in the “keyfile”), aggregate the data to a larger scale (1 km or 3 km), and create a grid of polygons and an extent grid. Although this preprocessing has been developed for CDL data, users could modify the code to use any land cover and land use classification product.
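The aggregation step can be illustrated with a minimal sketch. It assumes a majority (most common class) rule for coarsening a categorical grid; the rule actually used in geofxns.py is not stated here, and the function name `aggregate_majority` is hypothetical.

```python
import numpy as np


def aggregate_majority(landcover: np.ndarray, factor: int) -> np.ndarray:
    """Coarsen a 2-D categorical grid by keeping the most common class per block.

    Assumption: a majority rule; Janus may aggregate differently.
    """
    rows, cols = landcover.shape
    # Trim trailing rows/columns so the grid divides evenly into blocks.
    rows_trim = rows - rows % factor
    cols_trim = cols - cols % factor
    trimmed = landcover[:rows_trim, :cols_trim]

    # Rearrange to (block_row, block_col, cells_in_block).
    blocks = (
        trimmed.reshape(rows_trim // factor, factor, cols_trim // factor, factor)
        .swapaxes(1, 2)
        .reshape(rows_trim // factor, cols_trim // factor, factor * factor)
    )

    coarse = np.empty(blocks.shape[:2], dtype=landcover.dtype)
    for i in range(blocks.shape[0]):
        for j in range(blocks.shape[1]):
            values, counts = np.unique(blocks[i, j], return_counts=True)
            # Ties resolve to the smallest class code (np.unique sorts values).
            coarse[i, j] = values[np.argmax(counts)]
    return coarse


# Illustrative 4 x 4 grid of arbitrary class codes, coarsened by a factor of 2.
grid = np.array([[1, 1, 2, 2],
                 [1, 3, 2, 2],
                 [4, 4, 5, 6],
                 [4, 4, 5, 5]])
coarse_grid = aggregate_majority(grid, 2)
```

For 30 m data, a factor of roughly 33 would approximate a 1 km cell; reprojection and exact cell alignment are outside the scope of this sketch.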

To run this model for different regions both inside and outside the U.S., the user would simply need to prepare the following:

Geospatial and demographic information are loaded and declared. Land cover data and grids may be generated or based on real data. The domain, crop choices, profit profiles, and agents are initialized. Each agent type is an individual class and is populated and updated every time step. The number and type of agents used in the model can be modified by adding agents using an analogous class structure or by adding or changing attributes. The two agent types currently available are Farmer and Urban agents (Figure 3).

Figure 3 

Class diagram of currently available agents in Janus.
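As a rough illustration of the "one class per agent type" structure, the sketch below defines two small classes. The attribute and method names (`age`, `tenure`, `alpha`, `beta`, `density`, `advance_year`) are assumptions drawn from the quantities mentioned in the text (age and land status distributions, beta distribution parameters, urban density); the actual Janus classes shown in Figure 3 may differ.

```python
from dataclasses import dataclass


@dataclass
class Farmer:
    """Hypothetical farmer agent; field names are illustrative only."""
    age: int
    tenure: str   # land status, e.g. "owner" or "tenant"
    alpha: float  # shape parameters of the switching probability curve
    beta: float

    def advance_year(self) -> None:
        # Example of an attribute that is updated every time step.
        self.age += 1


@dataclass
class Urban:
    """Hypothetical urban agent with density as its only attribute."""
    density: float


farmer = Farmer(age=45, tenure="owner", alpha=2.0, beta=5.0)
farmer.advance_year()
urban = Urban(density=0.8)
```

A new agent type would follow the same pattern: a separate class carrying whatever attributes its decision-making requires.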

Once initialized, the decision-making process is looped through each agent in each time step. The functions that make up the decision-making process are all included in the crop_functions directory, chiefly in crop_decider.py. Within the crop decider, a switching probability curve is created and the profit comparison method the agent uses is defined (Table 1). After an agent decides whether to maintain the current crop or change to another, the land use decision is stored in the domain and agent attributes are updated accordingly. The resulting output consists of three 3-dimensional (time, x, y) NumPy arrays that contain the land cover, profit, and agent characteristics at each time step.
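The (time, x, y) output layout can be sketched as follows. This is not the Janus model loop: the decision step is replaced by a placeholder in which every agent keeps its current crop, agent characteristics are reduced to a single hypothetical attribute (age), and the function name `run_stub` is an assumption.

```python
import numpy as np


def run_stub(initial_cover: np.ndarray, crop_profit: np.ndarray, n_years: int):
    """Fill three (time, x, y) arrays using a placeholder decision rule.

    initial_cover: (x, y) integer crop identifiers.
    crop_profit:   (n_crops, n_years) profit for each crop in each year.
    """
    n_x, n_y = initial_cover.shape
    land_cover = np.empty((n_years, n_x, n_y), dtype=int)
    profit = np.empty((n_years, n_x, n_y), dtype=float)
    agent_age = np.empty((n_years, n_x, n_y), dtype=int)

    current = initial_cover.copy()
    starting_age = np.full((n_x, n_y), 40)  # arbitrary illustrative value

    for t in range(n_years):
        # Placeholder decision: no switching, so `current` never changes.
        land_cover[t] = current
        # Look up this year's profit for the crop grown in each cell.
        profit[t] = crop_profit[current, t]
        agent_age[t] = starting_age + t
    return land_cover, profit, agent_age


initial = np.array([[0, 1], [1, 0]])
profits = np.array([[10.0, 11.0, 12.0],
                    [20.0, 19.0, 18.0]])
cover_out, profit_out, age_out = run_stub(initial, profits, n_years=3)
```

Indexing any of the returned arrays as `array[t]` gives the full spatial grid for time step `t`.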

Table 1

Description of the three available price functions and associated parameters.

Price function | Parameters | Notes
Linear ramp | Beginning price, ending price, noise variance | Generates linearly increasing, decreasing, or constant prices.
Step function | Beginning price, ending price, time fraction during simulation of steep change, noise variance | Generates step increases or step decreases.
Sinusoidal function | Average price, amplitude of variation, number of periods during simulation, noise variance | Generates fluctuating prices or (using the number of periods simulated) a monotonically increasing or decreasing price.

The following is the decision-making process implemented in Janus:

  • LULCC decisions are made through a process by which farmer agents compare the profitability of current land use (i.e., crop choice) with potential alternatives.
  • The difference between the anticipated profit of each alternative crop choice for the next year and the anticipated profit of the current land use for the next year is computed.
  • For those alternatives for which there is an anticipated profit increase, a decision to adopt an alternative is made probabilistically.
    • Each agent is assigned behavioral characteristics that define the probability that the agent would choose the alternative as a function of the increase in profit associated with that alternative.
    • That probabilistic characteristic curve is parameterized using a beta distribution cumulative distribution function and, therefore, each agent is assigned parameters alpha and beta that describe the shape of the beta distribution.
    • The beta distribution was chosen for its relative simplicity and because it allows significant flexibility in describing agent behavior.
    • For instance, Figure 4 shows two alternative agent behaviors given different parameter values of the beta distribution. One agent exhibits a high probability of changing land use for a relatively modest increase in anticipated profit, and the other exhibits a correspondingly low probability of switching land use at the same anticipated increase in profit.
  • Based on the anticipated increase in profit associated with a particular alternative land use and the agent behavior, as defined by the parameters of the beta distribution, the value of the beta cumulative distribution function (CDF) is determined for that agent.
  • The CDF value is compared to a uniform random number on the interval from 0 to 1 and, if the random number is less than the value of the CDF, the alternative crop is identified as a potential selection.
  • This process of comparing anticipated profit between alternative crops and the current crops based on the associated agent characteristics captured by the beta distribution is repeated for all alternative crops for which profit is anticipated to increase.
  • At the end of the comparison step, if there are no potential alternatives selected based on the stochastic decision process, the land use remains the same as the previous time step.
  • If there is only one potential selection identified, then the land use will switch to that particular alternative land use.
  • If multiple potential selections have been identified, the user can configure the model to select the alternative with the largest anticipated increase in profit or select randomly from all potential selections.
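
The steps above can be sketched in a few lines of Python. This is a hedged illustration rather than the code in crop_decider.py: the function names are hypothetical, and the profit increase is assumed to be expressed as a fraction of current profit and clipped to the interval [0, 1], since the text does not state how the increase is scaled before it is passed to the beta CDF.

```python
import numpy as np
from scipy.stats import beta as beta_dist


def switch_probability(gain_fraction: float, alpha: float, beta: float) -> float:
    """Beta CDF evaluated at the (clipped) fractional profit increase."""
    x = float(np.clip(gain_fraction, 0.0, 1.0))
    return float(beta_dist.cdf(x, alpha, beta))


def choose_crop(current, expected_profit, alpha, beta, rng, pick="max"):
    """Return the crop an agent selects for the next year.

    expected_profit: dict mapping crop name -> anticipated profit next year.
    pick: "max" selects the largest profit increase, "random" selects
          uniformly among the potential selections.
    Assumption: current profit is positive, so the relative gain is defined.
    """
    base = expected_profit[current]
    candidates = []
    for crop, value in expected_profit.items():
        if crop == current or value <= base:
            continue  # only alternatives with an anticipated profit increase
        gain = (value - base) / base
        # Alternative becomes a potential selection if the draw is below the CDF.
        if rng.random() < switch_probability(gain, alpha, beta):
            candidates.append((crop, value))

    if not candidates:
        return current  # land use stays the same as the previous time step
    if pick == "max":
        return max(candidates, key=lambda item: item[1])[0]
    return candidates[rng.integers(len(candidates))][0]


rng = np.random.default_rng(0)
# Two illustrative agents evaluated at the same 20% profit increase.
eager = switch_probability(0.2, alpha=0.5, beta=5.0)
reluctant = switch_probability(0.2, alpha=5.0, beta=0.5)
# Gains of 100% or more give a CDF of 1, so both alternatives are selected.
choice = choose_crop("wheat", {"wheat": 100.0, "corn": 250.0, "alfalfa": 300.0},
                     alpha=2.0, beta=5.0, rng=rng)
```

Here `eager` and `reluctant` play the role of the two curves in Figure 4: the same profit increase yields very different switching probabilities depending on alpha and beta.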
Figure 4 

Example switching probability curves for two beta distributions that describe agent behavior.

In most envisioned scientific applications, the input price signals would be derived from the output of a global integrated human-Earth systems model that is simulating global markets and associated crop prices under alternative scenarios, or from some other external data source that includes annual price projections for a basket of alternative crops over some fixed time horizon. For purposes of functional testing of the model, however, we developed a simple script that can create an arbitrary number of synthetic price signals for use as model input. This synthetic price generator assumes that each synthetic price time series is represented by one of three functional forms: (1) a linearly increasing or decreasing function, (2) a step increase or decrease in price, or (3) a sinusoidally varying function (Table 1). Each of these functional forms is associated with specific parameters that the user specifies, giving the user significant flexibility in the behavior of the prices that drive agent decisions.
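The three functional forms can be sketched as below. This is an illustrative reimplementation, not the code in generate_synthetic_prices.py: the function names and signatures are hypothetical, and additive zero-mean Gaussian noise with the given variance is an assumption, since the noise model is not described here.

```python
import numpy as np


def _noise(n_years, noise_var, rng):
    # Assumption: additive, zero-mean Gaussian noise with variance noise_var.
    if rng is None:
        rng = np.random.default_rng()
    return rng.normal(0.0, np.sqrt(noise_var), n_years)


def linear_ramp(n_years, start, end, noise_var=0.0, rng=None):
    """Linearly increasing, decreasing, or constant prices."""
    return np.linspace(start, end, n_years) + _noise(n_years, noise_var, rng)


def step_change(n_years, start, end, change_fraction, noise_var=0.0, rng=None):
    """Price jumps from start to end at a given fraction of the simulation."""
    change_index = int(change_fraction * n_years)
    base = np.where(np.arange(n_years) < change_index, start, end).astype(float)
    return base + _noise(n_years, noise_var, rng)


def sinusoid(n_years, mean, amplitude, n_periods, noise_var=0.0, rng=None):
    """Sinusoidal prices; n_periods <= 0.25 gives a monotonic rise."""
    t = np.arange(n_years) / n_years
    base = mean + amplitude * np.sin(2.0 * np.pi * n_periods * t)
    return base + _noise(n_years, noise_var, rng)


ramp = linear_ramp(5, start=10.0, end=20.0)
step = step_change(10, start=1.0, end=2.0, change_fraction=0.3)
wave = sinusoid(8, mean=5.0, amplitude=2.0, n_periods=1)
rising = sinusoid(20, mean=5.0, amplitude=2.0, n_periods=0.25)
```

Passing a seeded generator, for example `rng=np.random.default_rng(42)`, together with a nonzero `noise_var` would make noisy series reproducible between runs.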

Quality Control

Janus was developed with a robust testing suite that has been built to ensure unit performance and functional accuracy. Tests are triggered upon alteration to the remote repository through continuous integration using Travis CI (https://travis-ci.org/). We have also developed a suite of tests that are executed at runtime to provide informative feedback for any warnings and errors that may be raised. Users are also provided with an example data set for testing that can be installed automatically (see README). Users may raise issues on GitHub for additional support on using the software.

(2) Availability

Operating System

Mac OS X; Linux

Programming Language

Python >= 3.3

Dependencies

numpy>=1.11

scipy>=0.18

matplotlib>=1.3.1,<3.0

pandas>=0.19

geopandas>=0.5.0

setuptools>=24.2.0

rasterstats>=0.13.0

gdal>=2.1

joblib>=0.11

rasterio>=1.0.8

pycrs>=1.0.1

shapely>=1.6.1

nass>=0.1.1

fiona>=1.7.13

pyyaml>=3.12

Software location

Archive: Zenodo

Name: GitHub

Persistent Identifier: https://doi.org/10.5281/zenodo.3763731

Publisher: Kendra E. Kaiser

Version published: v1.0.1

Date published: 23 April 2020

Code repository

Name: GitHub

Identifier: https://github.com/LEAF-BoiseState/janus/tree/v1.0.0

License: BSD 2-Clause

Date published: 23 April 2020

Language

English

(3) Reuse

The code uses docstrings throughout to document what each function does and the alternative options that the user can declare. One of the main utilities of the model is the ability for users to automatically pre-process land cover data to larger scales and to set how land categories are aggregated. We use CDL data in the example dataset, but other land cover data products could be aggregated with the preprocessing code after slight modifications. Additional environmental data such as elevation and slope could also be added for use in decision-making (Figure 1).

The methods and functions in Janus are well suited for future extension through the addition of agents, agent attributes, and decision-making processes. Additional agents could represent regulatory agents that constrain how and where land use changes occur. For example, in some places agricultural land may be expanding. For this to occur, agents would need to be assigned to land parcels that have the potential to be developed, and those agents would need additional decision-making processes that would enable them to develop their land or sell it to an urban or agricultural agent.

Additional agent attributes that characterize how individuals make decisions would better resolve local details about land use choices. Urban agents currently have urban density as their only attribute; additional attributes that characterize their values regarding agricultural land would enable additional decision-making options, including the ability for urban agents to buy agricultural land. Incorporating social networks and social learning strategies would allow experimental designs that explore alternative spatial patterns and decision-making outcomes.

In addition to modifying or expanding upon the existing code base, this model could be integrated with multisector models. Land use choices will be dependent on the availability of water resources, proximity to transportation for goods, and various nested scales of economic signals. The model structure is particularly suited to integrate with other gridded models.