model4you: An R Package for Personalised Treatment Effect Estimation

Typical models estimating treatment effects assume that the treatment effect is the same for all individuals. Model-based recursive partitioning allows to relax this assumption and to estimate stratified treatment effects (model-based trees) or even personalised treatment effects (model-based forests). With model-based trees one can compute treatment effects for different strata of individuals. The strata are found in a data-driven fashion and depend on characteristics of the individuals. Model-based random forests allow for a similarity estimation between individuals in terms of model parameters (e.g. intercept and treatment effect). The similarity measure can then be used to estimate personalised models. The R package model4you implements these stratified and personalised models in the setting with two randomly assigned treatments with a focus on ease of use and interpretability so that clinicians and other users can take the model they usually use for the estimation of the average treatment effect and with a few lines of code get a visualisation that is easy to understand and interpret.


Introduction
Studies in various fields randomly assign individuals to one of two groups with different exposure and then measure a response. For example, in clinical trials patients are assigned to one of two treatment groups where usually one treatment group receives a new treatment or drug and the other treatment group receives the standard of care or a placebo. Other examples are in A-B testing in marketing studies or any other two group comparisons such as the mathematics exam discussed below, where students were divided into different exam groups and received slightly different exam tasks. In the following we will refer to the two groups as treatment groups and to the group indicator as treatment indicator, which always takes values 0 (individual in first group) and 1 (individual in second group).
Treatment effect estimation is often done using simple models with the binary treatment indicator as only covariate. In the example of a clinical trial the treatment indicator would be 1 if the patient receives the new treatment and 0 if the patient receives standard of care. In R such a simple model can be estimated as follows: base_model <-model(response ~ treatment, data) with response being the response measured, treatment being the treatment indicator and data being the data set containing these variables. The function model() can be replaced for example by lm() to estimate a linear model, glm() to estimate a generalised linear model or survreg() to estimate a parametric survival model. These models estimate intercept and treatment effect for all individuals in the data and allow for predicting the response of other individuals given they do or do not receive the treatment of interest.
For cases where the assumption that all individuals have the same intercept and treatment effect is too strict the R package model4you offers two options: 1. Model-based trees identify subgroups where within the subgroups the model parameters are similar and between groups the model parameters are different. This is achieved by finding instabilities in the model parameters with respect to a variable (characteristic) and recursively partitioning the data into groups. If, for example the algorithm finds that men and women have differing treatment effects, the data are partitioned into two subgroups. Details on model-based trees in general can be found in [9] and for the special use case for stratified treatment effect estimation in [6]. Just a single line of code lets the user compute a model-based tree in R: strat_models <-pmtree (base_model) Note that pmtree() uses the data given in the call of the base model. It automatically uses variables not used in the model formula (in the example above response treatment) as potential subgroup defining variables. This can be edited using the zformula argument.
2. Personalised models use model-based random forests to estimate similarity of individuals in terms of model parameters. For each individual a personalised model can be estimated based on a weighted set of the original data, where the similarity measure corresponds to the weight. Details on the personalised models can be found in [7]. Computing personalised models for all observations in the training data is simple: Again here the potential effect-modifying variables are taken by default as all variables not given in the model formula and can be defined using the zformula argument in pmforest().

Mathematics exam analysis:
In 2014 first-year business and economics students at the University of Innsbruck were divided into two examination groups. Group 1 took the exam in the morning and group 2 started after the first group finished. The exams for the two groups were slightly different. The data can be accessed and prepared as follows: data("MathExam14W", package = "psychotools") ## scale points achieved to [0, 100] percent MathExam14W$tests <-100 * MathExam14W$tests/26 MathExam14W$pcorrect <-100 * MathExam14W$nsolved/13 ## select variables to be used MathExam <-MathExam14W[ , c( "pcorrect", "group", "tests", "study", "attempt", "semester", "gender")] To investigate the correlation between exam group and exam performance, we compute a simple linear model regressing the percentage points of correct answers on the exam group.  The model can be visualised by plotting the estimated densities (see Figure 1): Both the estimates and confidence intervals and the density curves suggest that there is almost no difference between the two groups. But does this really hold for all types of students? A tree based on this model can be computed and visualised in only two lines of code: tr_math <-pmtree(bmod_math, control = ctree_control(maxdepth = 2)) plot (tr_math, terminal_panel = node_pmterminal (tr_math, plotfun = lm_plot)) We restrict the depth of the tree to two (maxdepth= 2) for illustration purposes. If this setting is not used a more complex tree would be estimated. Also the cutpoints and effect estimates are rounded in the plot. To view the tree in the R console, the print() function can be used. Art. 17, page 3 of 6 The tree (see Figure 2) divides students based on the percentage of successful online tests. These online tests were conducted biweekly throughout the semester. The largest difference between the two exam groups is in the students who did very well in the online tests (more than 92.3 percent correct). The tree thus gives us much more information on the fairness of the exam than the simple linear model, which is that it does not seem to be fair for students who did very well throughout the semester (at this point we should state that the students self selected into the two exam groups which might also be the reason for differences in exam performance). Note that confidence intervals shown in the tree are not adjusted for selection of the cutpoints in the tree and should hence be interpreted as a measure for variablity and not be used for inference.
Estimating personalised models is almost as simple as the stratified models: Dependence plots with the group effect (treatment effect) on the y-axis and the student characteristics on the x-axis are a good way of visualising the personalised models and for getting knowledge about the interactions between student characteristics and the exam group. Note that the plot is related to but not the same as the classical partial dependence plot [3], as each point in the figure is one patient in the given data set. Since the percentage of successful online tests is measured on a grid, a beeswarm plot (a variant of jittered scatter plot) possibly shows the relationships even better than the scatter plot (both shown in Figure 3, Figure 3a includes a loess curve). We see a nonlinear relation between the percentage of successful online tests and the exam group effect. Students with great performance during the semenster are estimated to have the strongest effect.  For categorical variables such as the number of previous attempts to pass the exam (attempt) and gender we can use box plots, beeswarm plots or a combination thereof (as in Figure 4a). Figure 4b shows the number of previous attempts on the x-axis and the color indicates the gender of each student. This way we can see that for students writing the exam for the first time, gender seems to play a bigger role in terms of estimated individual exam group effect than for others.

Implementation and architecture
The R package model4you is focused on ease of use and interpretability. Users can take a simple model that they know and understand as basis. Models currently available for use are linear models (function lm() in R), generalised  linear models (glm()) and survival models (survreg() and coxph() from the R package survival). Models are restricted to those with a single binary covariate (e.g. two treatment options). The user can simply plug the model object into pmtree() or pmforest() depending on whether they want subgroup wise or personalised models.
The basis for these functionalities is provided by the partykit package which is a widely used R package for trees and forests [4,5]. The model4you package provides wrappers for the well implemented and tested functions partykit::ctree() and partykit::cforest() and extends the functionalities to allow for the computation of personalised models and to improve usability and interpretability.

Quality control
All packages on CRAN undergo standard checks for compatibility with the R package ecosystem. The R package contains examples and tests. These were run and checked on Linux 86_64 and Windows.

Operating system
Should work on all operating systems that run R.

(3) Reuse potential
The software is intentionally written to make usage as simple as possible. The most prominent use case are clinical trials where the assumption of an average treatment effect for all patients is too strict and the efficacy of the treatment depends on patient characteristics (e.g. gender, biomarkers, etc.). For subgroup analyses (stratified treatment effects) modelbased trees (pmtree()) can be used; For personalised treatment effects model-based forests (pmforest()) provide a way of estimating similarity between patients and using this similarity measure to estimate personalised models (pmodel()). The target audience are people who deal with heterogeneous treatment effects, such as medical researchers, pharmaceutical companies or analysts in marketing (A-B testing). In general the software is useful to researchers dealing with scenarios where two exposures are compared and responses of subjects possibly depend on other variables. Currently the packages supports a limited set of model types (linear and generalised linear models and survival models). Further models can be added.
We encourage users to use the party tag on Stackoverflow (http://stackoverflow.com/questions/tagged/party) in case of questions or problems.