Epidemiological studies typically examine the causal effect of exposure on a health outcome. Standardization is one of the most straightforward methods for estimating causal estimands. However, compared to inverse probability weighting, there is a lack of user-centric explanations for implementing standardization to estimate causal estimands. This paper explains the standardization method using basic R functions only and how it is linked to the R package stdReg, which can be used to implement the same procedure. We provide a step-by-step tutorial for estimating causal risk differences, causal risk ratios, and causal odds ratios based on standardization. We also discuss how to carry out subgroup analysis in detail.

Epidemiological studies often examine the causal relationship between exposures and outcomes. Randomized controlled trials (RCTs) are a key method for investigating whether a treatment causes an outcome of interest.

Researchers often have difficulty conducting RCTs despite their well-known advantages, since they are generally costly, time-consuming, and prone to ethical issues. For example, physicians would not randomly assign some liver cancer patients to undergo liver transplantation and other liver cancer patients not to receive transplantation. Instead, physicians provide patients with the best treatment option that is most likely to extend their lifetime or cure their condition based on scientific evidence. In addition, if researchers want to conduct an experimental study to determine the effect of a treatment on a disease, financial support and agreements with funders must be secured.

An observational study is an alternative option for investigating the effects of a treatment on an outcome. However, in observational studies, patients' treatment choices may be influenced by various confounders, so the resulting estimate reflects association rather than causation.

Causal estimands may take various forms, including the risk difference, relative risk, or odds ratio. Along with inverse probability weighting, standardization is a straightforward method for estimating these estimands.

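Consider 2 random variables. One is the dichotomously measured treatment variable $T$ (taking the values 1 and 0), and the other is the dichotomously measured outcome variable $Y$. $Y^{T=1}$ and $Y^{T=0}$ are defined as the outcome variables that would be observed under treatments $T=1$ and $T=0$, respectively. These variables are referred to as potential outcomes and are abbreviated as $Y^{T=1}=Y^{1}$ and $Y^{T=0}=Y^{0}$.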

The population causal effect of treatment exists if $\Pr[Y^{1}=1] \neq \Pr[Y^{0}=1]$, with Pr representing the probability.

The causal effect can be represented by several measures such as the causal risk difference, causal risk ratio, or causal odds ratio (

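If researchers randomly assign individuals to receive treatment $T$, the potential outcomes are independent of the treatment received, that is, $Y^{1} \perp T$ and $Y^{0} \perp T$, written compactly as $Y^{t} \perp T$ for $t=0, 1$. This condition is known as exchangeability.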

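In many cases, the potential outcome is naturally linked to the observed outcome $Y$: $Y=Y^{1}$ for the treatment group and $Y=Y^{0}$ for the control group. This condition is known as consistency in the literature on causal inference and can be written as $Y=Y^{1}\times 1(T=1)+Y^{0}\times 1(T=0)$, where $1(\cdot)$ is the indicator function. If $Y^{1} \perp T$ and $Y^{0} \perp T$ also hold, then $\Pr[Y^{t}=1]=\Pr[Y^{t}=1 \mid T=t]=\Pr[Y=1 \mid T=t]$, so the distribution of each potential outcome $Y^{t}$ can be obtained from the observed data.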

Randomization can be undertaken according to the specific characteristics of individuals, and this approach is referred to as a conditional randomized experiment. For example, doctors may want to measure the causal effect of regular vitamin C supplement consumption (vitamin C=1, placebo pill=0) on the probability of developing lung cancer (lung cancer=1, no event=0). They can choose to randomize participants within strata of smoking status (smoker=1, ex-smoker=2, non-smoker=3). In this case, the potential outcome would be equally distributed between a group of smokers who take vitamin C supplements and a group of smokers who take a placebo pill. The vitamin C and placebo pill groups are no longer exchangeable, but they are exchangeable within each stratum for smoking status. This result from the conditional randomized experiment implies that $\Pr[Y^{t}=1 \mid T=1, C=c]=\Pr[Y^{t}=1 \mid T=0, C=c]=\Pr[Y^{t}=1 \mid C=c]$, that is, $Y^{t} \perp T \mid C=c$, where $C$ denotes smoking status. Consequently, $\Pr[Y^{t}=1]=\sum_{c}\Pr[Y^{t}=1 \mid C=c]\Pr[C=c]$.

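Valid causal inferences from observational studies can be made if the study mimics a conditional randomized experiment. In other words, within subpopulations with the same set of confounders $C$, the treated and untreated individuals are assumed to be exchangeable, $Y^{t} \perp T \mid C=c$, so that $\Pr[Y^{t}=1 \mid C=c]=\Pr[Y^{t}=1 \mid T=t, C=c]$. This condition is known as conditional exchangeability.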

Expert knowledge is commonly believed to be most easily communicated using simple visual representations. A directed acyclic graph (DAG) is one of the most intuitive methods for graphically depicting the causal relationship between treatments, potential outcomes, and covariates. A DAG depicts the direction of the effect of each variable on target variables and helps to determine if covariates

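In addition to the conditional exchangeability and the consistency conditions, the probability of being assigned to either treatment group according to the values for covariate $C$ must be greater than zero, that is, $\Pr[T=t \mid C=c]>0$ for every $c$ with $\Pr[C=c]>0$. This condition is known as positivity.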
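The requirement that each covariate stratum contains both treated and untreated individuals can be inspected empirically before any model is fitted. The following is a minimal sketch in base R, assuming simulated data; the variable names `smoking` and `treatment` and all numeric settings are hypothetical and are not taken from the study data.

```r
# Simulated example (hypothetical data, not the study data)
set.seed(1)
n <- 500
smoking <- sample(1:3, n, replace = TRUE)      # covariate with 3 levels
p_treat <- c(0.3, 0.5, 0.7)[smoking]           # treatment probability by stratum
treatment <- rbinom(n, 1, p_treat)             # 1 = treated, 0 = untreated

# Cross-tabulate covariate strata against treatment groups
cell_counts <- table(smoking, treatment)

# Empirical proportion of each treatment within each stratum
prop_treated <- prop.table(cell_counts, margin = 1)

# A zero cell would signal a stratum in which one treatment never occurs
positivity_ok <- all(cell_counts > 0)

print(cell_counts)
print(round(prop_treated, 2))
print(positivity_ok)
```

A nonzero count in every cell is only a necessary check in the sample; with many or continuous covariates, sparse strata can still arise even when no cell of a coarse table is empty.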

An epidemiological study usually aims to establish evidence for implementing public health interventions. For example, if the ATE of vitamin C on lung cancer incidence suggests a preventive effect, campaigns to promote vitamin C intake should be recommended. Standardization is a method for estimating the causal estimand of treatment $T$ based on $\Pr[Y^{t}=1]=\sum_{c}\Pr[Y^{t}=1 \mid C=c]\Pr[C=c]=\sum_{c}\Pr[Y=1 \mid T=t, C=c]\Pr[C=c]$.
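The identity behind standardization can be written out step by step. The derivation below is a sketch that assumes the notation $Y^{t}$ for the potential outcome under treatment $t$ and $C$ for the set of confounders, and it relies on the conditions discussed earlier (conditional exchangeability, consistency, and positivity so that each conditional probability is well defined).

```latex
\begin{align*}
\Pr[Y^{t}=1]
  &= \sum_{c}\Pr[Y^{t}=1 \mid C=c]\,\Pr[C=c]
     && \text{(law of total probability)} \\
  &= \sum_{c}\Pr[Y^{t}=1 \mid T=t, C=c]\,\Pr[C=c]
     && \text{(conditional exchangeability)} \\
  &= \sum_{c}\Pr[Y=1 \mid T=t, C=c]\,\Pr[C=c]
     && \text{(consistency)}
\end{align*}
```

The last line contains only quantities that can be estimated from observed data, which is what makes the method usable in an observational study.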

Using the standardization method, once Pr[
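A small numeric illustration may help make the weighting concrete. The sketch below uses entirely hypothetical stratum sizes and stratum-specific risks (none of these numbers come from the paper) and computes the standardized risks and the three causal estimands by hand.

```r
# Hypothetical strata: sizes and stratum-specific risks are made up
strata <- data.frame(
  c              = c("smoker", "ex-smoker", "non-smoker"),
  n              = c(200, 300, 500),
  risk_treated   = c(0.30, 0.20, 0.10),   # Pr[Y=1 | T=1, C=c]
  risk_untreated = c(0.40, 0.25, 0.12)    # Pr[Y=1 | T=0, C=c]
)

# Pr[C=c]: share of the whole population in each stratum
pr_c <- strata$n / sum(strata$n)

# Standardized risks: weighted averages over the covariate distribution
risk1 <- sum(strata$risk_treated   * pr_c)
risk0 <- sum(strata$risk_untreated * pr_c)

# Causal estimands computed from the two standardized risks
risk_difference <- risk1 - risk0
risk_ratio      <- risk1 / risk0
odds_ratio      <- (risk1 / (1 - risk1)) / (risk0 / (1 - risk0))

print(c(risk1 = risk1, risk0 = risk0))
print(c(RD = risk_difference, RR = risk_ratio, OR = odds_ratio))
```

The same weights `pr_c` are used for both treatment levels, which is what distinguishes the standardized risks from the crude risks in the treated and untreated groups.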

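The causal effect of treatment in specific subgroups offers important evidence for determining the target populations of public health interventions. In these cases, standardization can be applied to each level of the variable $V$ that defines the subgroups, using $\sum_{c}\Pr[Y=1 \mid T=1, C=c, V=v]\Pr[C=c \mid V=v]$ and $\sum_{c}\Pr[Y=1 \mid T=0, C=c, V=v]\Pr[C=c \mid V=v]$.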

There are 2 methods for estimating the causal effects on subgroups using standardization. The first method is to fit a regression model and evaluate the estimand of the dataset separated by variable
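The first approach, in which the data are split by the subgroup variable and a model is fitted and standardized within each part, can be sketched in base R as follows. The data are simulated, and the names `stage`, `age`, `treatment`, `death`, and `standardize_subgroup` are hypothetical placeholders rather than the variables or functions used in the study.

```r
# Simulated data (hypothetical; not the liver cancer data)
set.seed(42)
n <- 2000
dat <- data.frame(
  stage = sample(c("A", "B"), n, replace = TRUE),   # subgroup variable
  age   = rnorm(n, mean = 60, sd = 10)              # confounder
)
dat$treatment <- rbinom(
  n, 1, plogis(-0.5 + 0.02 * (dat$age - 60) + 0.4 * (dat$stage == "B"))
)
dat$death <- rbinom(
  n, 1,
  plogis(-1 + 0.3 * dat$treatment + 0.03 * (dat$age - 60) +
           0.8 * (dat$stage == "B"))
)

# Fit an outcome model within one subgroup and standardize over its
# own covariate distribution
standardize_subgroup <- function(d) {
  fit <- glm(death ~ treatment + age, family = binomial(), data = d)
  risk1 <- mean(predict(fit, newdata = transform(d, treatment = 1),
                        type = "response"))
  risk0 <- mean(predict(fit, newdata = transform(d, treatment = 0),
                        type = "response"))
  c(risk1 = risk1, risk0 = risk0,
    rd = risk1 - risk0, rr = risk1 / risk0)
}

# One row of estimates per subgroup
subgroup_estimates <- t(sapply(split(dat, dat$stage), standardize_subgroup))
print(round(subgroup_estimates, 3))
```

Fitting separately in each subgroup lets every coefficient differ across subgroups, at the cost of smaller samples per model.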

This section provides a step-by-step tutorial on how to perform the analysis using the standardization method. The goal in the example is to estimate the average causal effect of initial liver cancer treatment modality on 3-year survival. We used synthetic data that was generated based on Liver Stage Data (LSD) from Korea. The LSD is a sample cohort composed of randomly selected liver cancer patients registered in the Korean Central Cancer Registry between 2008 and 2016.

Our study design was based on that of a previous study by Kim et al. [^{3}/μL), sodium level (0–135, 135–145, >145 mmol/L), alpha-fetoprotein (AFP) level (0–200, 200–400, >400 ng/mL), Child-Pugh classification (A, B, C, U), and Barcelona Clinic Liver Cancer (BCLC) stage (0, A, B, C, D). We set a dichotomous variable (1=death, 0=survived) as the outcome. In addition, the causal estimands of treatment according to individuals’ BCLC stage were explored. Therefore, we considered using fitted models that included the interaction term between treatment and the BCLC stage variable. The final model was selected and consisted of variables with

This tutorial presents 2 methods for estimating the causal effect of the treatment based on standardization using R version 4.0.5. One method used basic R functions only, while the other used the stdReg package in R. Hernán and Robins [

In the first method, a function for implementing the standardization for the causal estimands was created, and then bootstrapping was performed to calculate each estimand’s 95% confidence interval. The standardization function consisted of 4 parts. First, 2 duplicates of the original dataset were created, an index variable was added to distinguish the copies, and all copies were combined. The first dataset was a copy of the original dataset (liver_syn_data0), and the second dataset (liver_syn_data2) included data on individuals that were identical to the original but with the treatment and death outcomes set to 0 and null, respectively. The third dataset (liver_syn_data3) included data on individuals that were identical to the original, but the treatment and death outcomes were set to 1 and null, respectively. The outcome variables in the 2 duplicated datasets were set as “NA” so that they would be ignored in the fitting step. Second, we fitted a logistic regression model in which the binary outcome was death within 3 years after the initial treatment prescription. Third, the probability of the event was calculated for every row of the combined dataset based on the fitted model. Averaging these predicted probabilities within each duplicated dataset gave the standardized probabilities of death for when all liver cancer patients underwent surgical resection and for when all liver cancer patients underwent local ablation therapy. Finally, causal estimands such as the risk difference, relative risk, and odds ratio were computed.
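The 4 parts described above can be sketched with base R as follows. This is an illustration under stated assumptions rather than the authors' script: the data are simulated, the model formula is a simplified stand-in, and the names `original`, `copy`, `age`, and `bclc` are hypothetical.

```r
# Simulated stand-in for the study data (hypothetical values)
set.seed(2022)
n <- 1500
original <- data.frame(
  age  = rnorm(n, mean = 60, sd = 10),
  bclc = factor(sample(c("0", "A", "B"), n, replace = TRUE))
)
original$treatment <- rbinom(n, 1, plogis(-0.3 + 0.02 * (original$age - 60)))
original$death <- rbinom(
  n, 1,
  plogis(-1.2 + 0.25 * original$treatment + 0.04 * (original$age - 60) +
           0.5 * (original$bclc == "B"))
)

# Part 1: stack the observed data with two copies in which treatment is
# fixed and the outcome is missing; `copy` is the index variable
d_obs <- transform(original, copy = -1)
d_t0  <- transform(original, copy = 0, treatment = 0, death = NA_integer_)
d_t1  <- transform(original, copy = 1, treatment = 1, death = NA_integer_)
stacked <- rbind(d_obs, d_t0, d_t1)

# Part 2: fit the outcome model; rows with a missing outcome are dropped
# by glm, so only the observed copy contributes to the fit
fit <- glm(death ~ treatment * bclc + age, family = binomial(), data = stacked)

# Part 3: predict the event probability for every row, then average
# within each counterfactual copy
stacked$pred <- predict(fit, newdata = stacked, type = "response")
risk0 <- mean(stacked$pred[stacked$copy == 0])
risk1 <- mean(stacked$pred[stacked$copy == 1])

# Part 4: causal estimands from the two standardized risks
estimates <- c(
  risk_difference = risk1 - risk0,
  relative_risk   = risk1 / risk0,
  odds_ratio      = (risk1 / (1 - risk1)) / (risk0 / (1 - risk0))
)
print(round(estimates, 3))
```

Setting the outcome to missing in the two copies is what keeps them out of the likelihood while still allowing predictions for them.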

Next, the standardization function was applied to the boot function to perform bootstrapping with 1000 replications. The results of bootstrapping were used to construct 2 versions of the 95% confidence interval. The first version was referred to as the “normal-based confidence interval,” which used the standard error computed from the bootstrapped estimates. The second version used the 2.5% and 97.5% percentiles of the bootstrapped estimates as the left and right limits of the 95% confidence interval, respectively. In this tutorial, we report the first confidence interval in
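The two versions of the confidence interval can be sketched with the boot package as follows. The data are simulated and the model is a simplified stand-in; the function name `standardization` and the variables `age`, `treatment`, and `death` are hypothetical, and the function here uses counterfactual predictions directly instead of stacking copies of the data.

```r
library(boot)

# Simulated data (hypothetical; not the study data)
set.seed(7)
n <- 500
dat <- data.frame(age = rnorm(n, mean = 60, sd = 10))
dat$treatment <- rbinom(n, 1, plogis(0.02 * (dat$age - 60)))
dat$death <- rbinom(
  n, 1, plogis(-1 + 0.3 * dat$treatment + 0.04 * (dat$age - 60))
)

# Statistic in the form boot() expects: data plus resampled row indices
standardization <- function(data, indices) {
  d <- data[indices, ]
  fit <- glm(death ~ treatment + age, family = binomial(), data = d)
  risk1 <- mean(predict(fit, newdata = transform(d, treatment = 1),
                        type = "response"))
  risk0 <- mean(predict(fit, newdata = transform(d, treatment = 0),
                        type = "response"))
  c(risk1 - risk0,
    risk1 / risk0,
    (risk1 / (1 - risk1)) / (risk0 / (1 - risk0)))
}

# Bootstrap with 1000 replications
boot_out <- boot(data = dat, statistic = standardization, R = 1000)

estimand_names <- c("risk_difference", "relative_risk", "odds_ratio")

# Version 1: normal-based interval using the bootstrap standard error
se <- apply(boot_out$t, 2, sd)
normal_ci <- cbind(
  lower = boot_out$t0 - qnorm(0.975) * se,
  upper = boot_out$t0 + qnorm(0.975) * se
)
rownames(normal_ci) <- estimand_names

# Version 2: percentile interval from the bootstrap distribution
percentile_ci <- t(apply(boot_out$t, 2, quantile, probs = c(0.025, 0.975)))
rownames(percentile_ci) <- estimand_names

print(round(normal_ci, 3))
print(round(percentile_ci, 3))
```

The normal-based interval is symmetric around the point estimate by construction, whereas the percentile interval can be asymmetric, which may matter for ratio measures.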

The same causal estimands can also be computed using the stdReg package. The steps for applying the function in the stdReg package for estimating the estimands were simpler than the previous method. First, we fitted the same logistic regression model and passed the fitted model to the stdGlm function. The function required specifying the name of the exposure variable and the levels of exposure. Since the exposure of interest was the “treatment” variable, we set X="treatment" and x=seq(0, 1, 1) in the stdGlm function. The values within the parentheses of the x argument indicated the minimum value, the maximum value, and the unit of increment, in that order. Next, the estimates for the causal estimands were obtained using the summary function. The summary function enabled estimation of the causal estimand with the standard error and 95% confidence interval by setting the contrast and reference options.
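A sketch of this workflow is shown below, assuming simulated data and a simplified model; the variable names are hypothetical. The calls to `stdGlm` and `summary` use the `X`, `x`, `contrast`, `reference`, and `transform` arguments as documented for the stdReg package, and the block only runs them if the package is installed.

```r
# Simulated data (hypothetical; not the study data)
set.seed(11)
n <- 1000
dat <- data.frame(age = rnorm(n, mean = 60, sd = 10))
dat$treatment <- rbinom(n, 1, plogis(0.02 * (dat$age - 60)))
dat$death <- rbinom(
  n, 1, plogis(-1 + 0.3 * dat$treatment + 0.04 * (dat$age - 60))
)

# Outcome model, fitted once and reused by both approaches
fit <- glm(death ~ treatment + age, family = binomial(), data = dat)

# Manual standardized risks, for cross-checking the package output
manual_risk0 <- mean(predict(fit, newdata = transform(dat, treatment = 0),
                             type = "response"))
manual_risk1 <- mean(predict(fit, newdata = transform(dat, treatment = 1),
                             type = "response"))
print(c(risk0 = manual_risk0, risk1 = manual_risk1))

if (requireNamespace("stdReg", quietly = TRUE)) {
  # Standardize the fitted model over the exposure levels 0 and 1
  fit_std <- stdReg::stdGlm(fit = fit, data = dat,
                            X = "treatment", x = seq(0, 1, 1))

  # Standardized risks at each exposure level
  print(summary(fit_std))

  # Risk difference and risk ratio relative to treatment = 0
  print(summary(fit_std, contrast = "difference", reference = 0))
  print(summary(fit_std, contrast = "ratio", reference = 0))

  # Odds ratio: transform the risks to odds before taking the ratio
  print(summary(fit_std, transform = "odds",
                contrast = "ratio", reference = 0))
} else {
  message("Package 'stdReg' is not installed; skipping the stdGlm step.")
}
```

The standardized risks printed by `summary(fit_std)` should agree with the manual values, since both average model predictions over the same covariate distribution.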

The causal estimates from the standardization function and stdGlm function are compared in

Sensitivity analyses are generally performed to assess the robustness of findings in a study.

The R script for performing standardization on a subset of the population is shown in

Standardization is traditionally used in epidemiology when reporting the specific rate or ratio of a disease, such as the age-standardized incidence rate and the standardized mortality ratio.

Standardization in causal inference answers the causal question based on the potential outcomes that would be observed if everyone were treated or if everyone were untreated. Causal inference based on observational studies is difficult since the causal estimand is not identifiable without strong assumptions. The observational study must also have conditions that are similar to those of a conditional randomized experiment. However, the required conditional exchangeability between treated and untreated subjects cannot be proven by data and can only be examined using subject matter knowledge.

Our tutorial demonstrated how standardization is implemented for estimating various causal estimands. Before fitting a regression model, creating a DAG is important since it guides the systematic selection of the potential confounders that will be included in a model. In particular, nodes and arrows in a DAG are drawn based on background knowledge.

Standardization is only one of several methods for estimating the causal estimand of interest. Other methods, such as inverse probability weighting, propensity score matching, and instrumental variable methods, are alternatives that rely on different assumptions. Inverse probability weighting is usually compared to standardization when estimating the causal estimand.

No ethics approval or consent to participate was necessary since we used synthetic data.

Supplemental materials are available at

Appendix 1. Application of standardization for subsets of population using R

Table S1. Results of computed estimands in BCLC stage A subgroup using standardization and stdGlm function

Table S2. Results of sensitivity analysis for standardized estimates in BCLC stage A

None.

The authors have no conflicts of interest associated with the material presented in this paper.

Both authors contributed equally to conceiving the study, analyzing the data, and writing this paper.

This study was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No.2021R1A2C1014409).

Directed acyclic graph for presenting the relationship between variables. AFP, alpha-fetoprotein; MELD, Model for End-Stage Liver Disease; CPC, Child-Pugh classification; BCLC, Barcelona Clinic Liver Cancer.

Measures of causation and association

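Name | Measures of causation | Measures of association |
---|---|---|
Risk difference | $\Pr[Y^{1}=1]-\Pr[Y^{0}=1]$ | $\Pr[Y=1 \mid T=1]-\Pr[Y=1 \mid T=0]$ |
Risk ratio | $\dfrac{\Pr[Y^{1}=1]}{\Pr[Y^{0}=1]}$ | $\dfrac{\Pr[Y=1 \mid T=1]}{\Pr[Y=1 \mid T=0]}$ |
Odds ratio | $\dfrac{\Pr[Y^{1}=1]/\Pr[Y^{1}=0]}{\Pr[Y^{0}=1]/\Pr[Y^{0}=0]}$ | $\dfrac{\Pr[Y=1 \mid T=1]/\Pr[Y=0 \mid T=1]}{\Pr[Y=1 \mid T=0]/\Pr[Y=0 \mid T=0]}$ |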

Standardization method for the causal estimand of treatment

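Name | Causal estimand | Standardization method |
---|---|---|
Risk difference | $\Pr[Y^{1}=1]-\Pr[Y^{0}=1]$ | $\sum_{c}\Pr[Y=1 \mid T=1, C=c]\Pr[C=c]-\sum_{c}\Pr[Y=1 \mid T=0, C=c]\Pr[C=c]$ |
Risk ratio | $\dfrac{\Pr[Y^{1}=1]}{\Pr[Y^{0}=1]}$ | $\dfrac{\sum_{c}\Pr[Y=1 \mid T=1, C=c]\Pr[C=c]}{\sum_{c}\Pr[Y=1 \mid T=0, C=c]\Pr[C=c]}$ |
Odds ratio | $\dfrac{\Pr[Y^{1}=1]/\Pr[Y^{1}=0]}{\Pr[Y^{0}=1]/\Pr[Y^{0}=0]}$ | $\dfrac{\sum_{c}\Pr[Y=1 \mid T=1, C=c]\Pr[C=c]\,/\,\sum_{c}\Pr[Y=0 \mid T=1, C=c]\Pr[C=c]}{\sum_{c}\Pr[Y=1 \mid T=0, C=c]\Pr[C=c]\,/\,\sum_{c}\Pr[Y=0 \mid T=0, C=c]\Pr[C=c]}$ |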

Standardization for the causal effect in subgroups

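Name | Causal estimand for subgroups | Standardization for subgroups |
---|---|---|
Risk difference | $\Pr[Y^{1}=1 \mid V=v]-\Pr[Y^{0}=1 \mid V=v]$ | $\sum_{c}\Pr[Y=1 \mid T=1, C=c, V=v]\Pr[C=c \mid V=v]-\sum_{c}\Pr[Y=1 \mid T=0, C=c, V=v]\Pr[C=c \mid V=v]$ |
Risk ratio | $\dfrac{\Pr[Y^{1}=1 \mid V=v]}{\Pr[Y^{0}=1 \mid V=v]}$ | $\dfrac{\sum_{c}\Pr[Y=1 \mid T=1, C=c, V=v]\Pr[C=c \mid V=v]}{\sum_{c}\Pr[Y=1 \mid T=0, C=c, V=v]\Pr[C=c \mid V=v]}$ |
Odds ratio | $\dfrac{\Pr[Y^{1}=1 \mid V=v]/\Pr[Y^{1}=0 \mid V=v]}{\Pr[Y^{0}=1 \mid V=v]/\Pr[Y^{0}=0 \mid V=v]}$ | $\dfrac{\sum_{c}\Pr[Y=1 \mid T=1, C=c, V=v]\Pr[C=c \mid V=v]\,/\,\sum_{c}\Pr[Y=0 \mid T=1, C=c, V=v]\Pr[C=c \mid V=v]}{\sum_{c}\Pr[Y=1 \mid T=0, C=c, V=v]\Pr[C=c \mid V=v]\,/\,\sum_{c}\Pr[Y=0 \mid T=0, C=c, V=v]\Pr[C=c \mid V=v]}$ |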

Comparison of standardization results

Estimand | Standardization function | stdGlm function |
---|---|---|
Risk difference | 0.02 (−0.01, 0.05) | 0.02 (−0.01, 0.05) |
Relative risk | 1.11 (0.94, 1.27) | 1.11 (0.95, 1.27) |
Odds ratio | 1.13 (0.93, 1.34) | 1.13 (0.93, 1.33) |

Values are presented as estimate (95% confidence interval).

Results of sensitivity analyses based on different models

Estimand | Model 1 | Model 2 | Model 3 |
---|---|---|---|
Risk difference | 0.02 (−0.01, 0.04) | 0.02 (−0.01, 0.04) | 0.02 (−0.01, 0.04) |
Relative risk | 1.09 (0.94, 1.25) | 1.10 (0.94, 1.25) | 1.10 (0.94, 1.25) |
Odds ratio | 1.12 (0.92, 1.32) | 1.12 (0.92, 1.32) | 1.12 (0.92, 1.32) |

Values are presented as estimate (95% confidence interval).

BCLC, Barcelona Clinic Liver Cancer.

Model 1: interaction terms among treatment, BCLC stage, and alpha-fetoprotein level, together with the other confounders, are included in the model.

Model 2: interaction terms among treatment, BCLC stage, and cause of liver cancer, together with the other confounders, are included in the model.

Model 3: the interaction term between treatment and Child-Pugh classification, together with the other confounders, is included in the model.