Mediation refers to the effect transmitted by mediators that intervene in the relationship between an exposure and a response variable. Mediation analysis has been broadly studied in many fields. However, it remains a challenge for researchers to consider complicated associations among variables and to differentiate individual effects from multiple mediators.

A mediation effect is the effect conveyed by an intervening variable (denoted as $M$) in the relationship between an exposure variable ($X$) and a response variable.

Mediation Diagram.

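There are generally two settings for mediation analysis. One is based on linear models to assess the mediation effects. In this branch, there are usually three methods to test the mediation effect.

The other setting is the counterfactual framework. Let $Y_i(x)$ denote the potential outcome of subject $i$ under exposure level $x$. The individual causal effect when the exposure changes from $x^{*}$ to $x$ (e.g., from 0 to 1 for binary $X$) is $Y_i(x) - Y_i(x^{*})$. It is usually impossible to estimate the individual causal effect since only one of the responses, $Y_i(x)$ or $Y_i(x^{*})$, is observed. Instead, one considers the average causal effect, $E(Y_i(x) - Y_i(x^{*}))$. If the subjects are randomly assigned to control or treatment groups, the average causal effect equals the expected conditional causal effect, $E(Y_i \mid X = x) - E(Y_i \mid X = x^{*})$. Denote by $M_i(x)$ the potential value of the mediator $M$ when subject $i$ is exposed to $x$. The total effect of changing the exposure from $x^{*}$ to $x$ is defined as $Y_i(x, M_i(x)) - Y_i(x^{*}, M_i(x^{*}))$. Conventional mediation analysis decomposes the total effect into a direct effect from $X$ and an indirect effect through $M$. Two direct effects are in common use: the controlled direct effect, $Y_i(x, m) - Y_i(x^{*}, m)$, where $m$ is a fixed value of $M$, and the natural direct effect, $Y_i(x, M_i(x^{*})) - Y_i(x^{*}, M_i(x^{*}))$. Both direct effects measure the change in $Y$ when $X$ changes from $x^{*}$ to $x$ while $M$ is held fixed, at $m$ or at $M_i(x^{*})$ respectively. Consequently, the natural indirect effect is the difference between the total effect and the natural direct effect, $\delta_i(x) = Y_i(x, M_i(x)) - Y_i(x, M_i(x^{*}))$. In comparison, the difference between a total effect and a controlled direct effect cannot in general be interpreted as an indirect effect. These definitions require the exposure levels, $x$ and $x^{*}$, to be preset. However, when the relationship among variables cannot be assumed linear, it is hard to choose representative exposure levels, especially if the exposure variable is multi-categorical or continuous.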


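Let $M = (M_1, \ldots, M_p)^{T}$ be the vector of mediators, where $M_j$ is the $j$th mediator, and let $\mathrm{dom}_X$ be the domain of $X$. Let $u^{*}$ be the infimum positive unit such that there is a biggest subset of $\mathrm{dom}_X$, denoted as $\mathrm{dom}_{X^{*}}$, in which any $x$ satisfies $x + u^{*} \in \mathrm{dom}_X$, and $\mathrm{dom}_X = \mathrm{dom}_{X^{*}} \cup \{x + u^{*} \mid x \in \mathrm{dom}_{X^{*}}\}$. If $u^{*}$ exists, it is unique. Figure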

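The mediation effects include the total effect, the direct effect from $X$, and the indirect effects through the mediators, each defined for a change in $X$ of a $u^{*}$ unit.

Given the other covariates $Z$, the total effect of $X$ on the response at $X = x^{*}$ is defined as the change rate in the response when $X$ changes by a $u^{*}$ unit. The average total effect is $ATE_{|Z} = E_{x^{*}}[TE_{|Z}(x^{*})]$, where the density of $x^{*}$ is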

_{j}_{j}_{j}_{j}

The definition of direct effect not from _{j}_{j}

_{j}_{j}

Compared with conventional definitions of the average mediation effect, which focus on the differences in the expected response between exposure levels $x$ and $x^{*}$, we define mediation effects based on the rate of change, so that the effect does not change with either the unit or the changing unit ($x^{*} - x$) of the exposure.

When generalized linear models are used to fit the relationships among variables, the mediation effects can be estimated from the coefficients of those models. The variances of the estimates can be calculated using the Delta method.
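To make the coefficient-based idea concrete, the following is a minimal sketch for the simplest case only: one continuous mediator and ordinary linear models fitted to simulated data. It is not the package's implementation. All variable names and simulation settings are assumptions made for illustration, and the variance uses the first-order Delta method, treating the two coefficient estimates as uncorrelated.

```r
# Illustrative sketch with simulated data (not the package's code).
set.seed(1)
n <- 500
x <- rnorm(n)                       # exposure
m <- 0.5 * x + rnorm(n)             # single mediator
y <- 0.3 * x + 0.7 * m + rnorm(n)   # response

fit_m <- lm(m ~ x)        # mediator model
fit_y <- lm(y ~ x + m)    # response model

a <- coef(fit_m)[["x"]]   # effect of x on m
b <- coef(fit_y)[["m"]]   # effect of m on y, given x
var_a <- vcov(fit_m)["x", "x"]
var_b <- vcov(fit_y)["m", "m"]

indirect <- a * b                       # indirect effect through m
direct   <- coef(fit_y)[["x"]]          # direct effect of x
total    <- coef(lm(y ~ x))[["x"]]      # total effect of x

# First-order Delta method: Var(a * b) ~ b^2 Var(a) + a^2 Var(b)
se_indirect <- sqrt(b^2 * var_a + a^2 * var_b)

c(total = total, direct = direct, indirect = indirect, se = se_indirect)
```

With least-squares fits on the same sample, the total effect equals the sum of the direct and indirect effects exactly. That identity is specific to this linear, single-mediator setting and need not hold for the general definitions used in this paper.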

The algorithm is based on the assumption that the relationships among mediators are that for the

Also the fitted model for

where _{i}_{ki}_{k}_{i}_{ki}_{k}_{i}_{k}_{k}

Denote the observations (_{i}, x_{i}, M_{1}_{i}_{pi}^{∗} if it is positive. For continuous ^{∗} is zero, we may set _{x}_{i}_{i}_{X}, i

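Sample with replacement from the observed exposure values, and denote the samples as $x_j$.

Sample $(M_{1j1}, \ldots, M_{pj1})^{T}$ given $X = x_j$ from equation (1), for each $j$.

Sample $(M_{1j2}, \ldots, M_{pj2})^{T}$ given $X = x_j + a$ from equation (1), for each $j$.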

If _{j}_{k}_{k}_{k}

_{k}:

Use the samples generated by Steps 1 to 3 of Algorithm 1.

Combine the vectors _{k}

Because sampling introduces randomness, the two algorithms are repeated more than once, and the results averaged over the repetitions are taken as the estimates of the mediation effects. The two algorithms are realized by the function
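The sampling idea behind the total-effect estimate can be sketched in base R as follows. This is a simplified illustration under strong assumptions: a single mediator, linear models standing in for the fitted models, normal residuals, and simulated data. The object names (`fit_m`, `fit_y`, `one_run`, `n_draw`) are hypothetical and do not come from the package.

```r
# Simplified sketch of the sampling scheme (one mediator, simulated data).
set.seed(42)
n <- 400
x <- rnorm(n)
m <- 0.5 * x + rnorm(n)
y <- 0.3 * x + 0.7 * m + rnorm(n)
dat <- data.frame(x = x, m = m, y = y)

fit_m <- lm(m ~ x, data = dat)       # stand-in for the mediator model
fit_y <- lm(y ~ x + m, data = dat)   # stand-in for the response model
sd_m  <- summary(fit_m)$sigma        # residual sd of the mediator model

a      <- 1      # margin by which the exposure is shifted
n_draw <- 2000   # number of sampled exposure values per run

one_run <- function() {
  # Sample exposure values with replacement
  xj <- sample(dat$x, n_draw, replace = TRUE)
  # Sample the mediator given X = xj and given X = xj + a
  m1 <- predict(fit_m, data.frame(x = xj)) + rnorm(n_draw, sd = sd_m)
  m2 <- predict(fit_m, data.frame(x = xj + a)) + rnorm(n_draw, sd = sd_m)
  # Predicted responses under both scenarios
  y1 <- predict(fit_y, data.frame(x = xj, m = m1))
  y2 <- predict(fit_y, data.frame(x = xj + a, m = m2))
  # Average rate of change in the response per unit of exposure
  mean(y2 - y1) / a
}

# Repeat the run and average the results to reduce sampling noise
te_hat <- mean(replicate(20, one_run()))
te_hat
```

In this linear setting the sampled estimate should be close to the closed-form value `coef(fit_y)[["x"]] + coef(fit_m)[["x"]] * coef(fit_y)[["m"]]`, which gives a quick sanity check on the sketch.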

The proposed mediation analysis consists of three steps which are completed by three functions: function

The package also includes a real data set, “Weight_Behavior”.

Variables in data set “Weight_Behavior”.

Variable Name | Description |
---|---|
bmi | Body mass index |
age | Age at survey |
sex | Male or female |
race | African American, Caucasian, Indian, Mixed or Other |
numpeople | Number of people in family |
car | Number of cars in family |
gotosch | Four levels of methods to go to school |
snack | Eat a snack in a day or not |
tvhours | Number of hours watching TV per week |
cmpthours | Number of hours using computer per week |
cellhours | Number of hours playing with cell phones per week |
sports | In a sport team or not |
exercise | Number of hours of exercise per week |
sweat | Number of hours of sweat-producing activities per week |
overweigh | The child is overweight or not |

The function

The result of

The following code identifies mediators and covariates that explain the sex difference in being overweight in the data set

The function

The output from

In the above codes, algorithms are repeated 100 times (set by

For both models, the response variable is on the scale of log-odds of being overweight. The mediation effects calculated by the logit model and by MART are close but not identical. The main reason is that MART considers nonlinear relationships, while the logit model assumes a linear relationship between variables and the response. Also note that the sum of direct and indirect effects may not equal the total effect. This is because there are potential correlations, and therefore overlapping mediation effects, among mediators. If one would like to assume independent mediation effects, calculating the total effect by adding up the direct and indirect effects is preferred. Based on MART, overall, girls are more likely to be overweight than boys (the total effect is 0.14 > 0). That is, on average, the odds of being overweight for girls is
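Because the effects are reported on the log-odds scale, exponentiating an effect converts it to an odds ratio. As a small worked illustration, using only the total effect of 0.14 quoted in the text (the variable names are hypothetical):

```r
# Convert a log-odds effect to an odds ratio.
total_effect_log_odds <- 0.14   # MART total effect quoted in the text
odds_ratio <- exp(total_effect_log_odds)
round(odds_ratio, 2)
```

The result is approximately 1.15, which is the multiplicative change in the odds that corresponds to a log-odds difference of 0.14.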

We use the bootstrap method to measure the uncertainty in estimating the mediation effects. The following code calculates the variances and confidence intervals of the estimated mediation effects. The argument n2 indicates the number of bootstrap iterations. The mediation effects, variances, and confidence intervals are estimated from the mediation effects computed on the bootstrap samples.
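Independently of the package's own functions, the bootstrap logic can be illustrated with a short base R sketch. It assumes simulated data, a single mediator, and a product-of-coefficients indirect effect from linear models. The helper `indirect_effect` and the setting `n_boot` are hypothetical names introduced here and are not part of the package.

```r
# Illustrative nonparametric bootstrap for an indirect effect (simulated data).
set.seed(7)
n <- 300
x <- rbinom(n, 1, 0.5)              # binary exposure
m <- 0.8 * x + rnorm(n)             # single mediator
y <- 0.2 * x + 0.5 * m + rnorm(n)   # response
dat <- data.frame(x, m, y)

# Hypothetical helper: product-of-coefficients indirect effect
indirect_effect <- function(d) {
  a <- coef(lm(m ~ x, data = d))[["x"]]
  b <- coef(lm(y ~ x + m, data = d))[["m"]]
  a * b
}

n_boot <- 500   # number of bootstrap iterations
boot_est <- replicate(n_boot, {
  idx <- sample(nrow(dat), replace = TRUE)   # resample rows with replacement
  indirect_effect(dat[idx, ])
})

est      <- indirect_effect(dat)                 # estimate on the full data
boot_var <- var(boot_est)                        # bootstrap variance
ci       <- quantile(boot_est, c(0.025, 0.975))  # 95% quantile interval

list(estimate = est, variance = boot_var, ci = ci)
```

The quantile interval is used here because the sampling distribution of a product of coefficients is often skewed, so a normal-approximation interval can be less accurate.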

The results from the bootstrap functions are of class “mma”. Generic functions “print”, “summary”, and “plot” are provided for the class to help users interpret the results easily, as shown in the Results section.

Finally, the whole process, from identifying mediators to estimating and making inferences on the mediation effects, can be carried out by the function

The output from the

If the predictor is continuous, the default margin (“a” in Algorithms 1 and 2) is 1. It can be changed by setting

In general, the function

Outputs from the functions

Output from summary (mma).

Using the quantile confidence interval,

The

By the top plot of Figure

Output from plot(mma) on “sports”.

Output from plot(mma) on “exercise”.

All the functions of

The package works on Windows, Mac OS X, or Linux.

R version 2.14.1 or higher.

An Internet connection is required to install the

R packages:

This package was created by Drs. Qingzhao Yu and Bin Li.

R

Package

The mediation analysis can be extended to survival models and/or multilevel contexts.

One limitation of

The authors have no competing interests to declare.