### INTRODUCTION

### REVISITING THEORY

### Potential Outcome and Causal Estimand

$T$ (1: treatment, 0: no treatment), and the other is the dichotomously measured outcome variable $Y$ (1: event, 0: no event). $Y^{T=1}$ and $Y^{T=0}$ are defined as the outcome variables that would be observed under treatments $T=1$ and $T=0$, respectively, and they are referred to as the potential outcomes [4]. For simplicity, we will use $Y^{T=1}=Y^{1}$ and $Y^{T=0}=Y^{0}$.

$\Pr[Y^{1}=1] \neq \Pr[Y^{0}=1]$, with Pr representing the probability [4]. A measurement of a causal effect answers the question, “What would the outcome be if everyone were treated versus if everyone were untreated?” Meanwhile, a measurement of an association answers the question, “What is the difference between the treated group and the untreated group?” The causal effect is determined by comparing the effect of treatment within the same population, whereas the associational effect is determined by comparing 2 different subpopulations: those who were treated and those who were not.
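The distinction can be written out on the usual risk difference, risk ratio, and odds ratio scales. The block below is only a reading aid that uses the notation defined above; the choice of these three scales is an assumption made for illustration.

```latex
\begin{align*}
\text{Causal risk difference:} &\quad \Pr[Y^{1}=1] - \Pr[Y^{0}=1] \\
\text{Causal risk ratio:} &\quad \frac{\Pr[Y^{1}=1]}{\Pr[Y^{0}=1]} \\
\text{Causal odds ratio:} &\quad \frac{\Pr[Y^{1}=1]/\Pr[Y^{1}=0]}{\Pr[Y^{0}=1]/\Pr[Y^{0}=0]} \\[6pt]
\text{Associational risk difference:} &\quad \Pr[Y=1 \mid T=1] - \Pr[Y=1 \mid T=0] \\
\text{Associational risk ratio:} &\quad \frac{\Pr[Y=1 \mid T=1]}{\Pr[Y=1 \mid T=0]} \\
\text{Associational odds ratio:} &\quad \frac{\Pr[Y=1 \mid T=1]/\Pr[Y=0 \mid T=1]}{\Pr[Y=1 \mid T=0]/\Pr[Y=0 \mid T=0]}
\end{align*}
```

The causal measures compare the whole population under two hypothetical treatment assignments, while the associational measures compare the two subpopulations that actually received $T=1$ and $T=0$.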

### Randomized Experiments and Causal Effects

$T$, then the potential outcome should be independent of the treatment $T$, which is expressed as $Y^{1} \perp T$ and $Y^{0} \perp T$. This means that the potential outcome $Y^{t}$ would be equally distributed in both treatment and non-treatment groups. Thus, the subjects in both the treatment and non-treatment groups are considered exchangeable.

$Y$ as $Y=Y^{1}$ for the treatment group and $Y=Y^{0}$ for the control group. This condition is known as consistency in the literature on causal inference [4]. In our binary case, consistency implies $Y = Y^{1} \times 1(T=1) + Y^{0} \times 1(T=0)$, with $1(\cdot)$ representing the indicator function that equals 1 when the event in the parentheses is true. Note that, although $Y^{1} \perp T$ and $Y^{0} \perp T$, $Y$ is not independent of $T$ in general. Under randomized assignment and the consistency condition, the probability that $Y^{t}=1$ is equal to the probability of an observed outcome for individuals who receive treatment $T=t$, since $\Pr[Y=1 \mid T=t] = \Pr[Y^{t}=1 \mid T=t] = \Pr[Y^{t}=1]$. Therefore, the measure of association from an RCT becomes the corresponding causal measure.

$\Pr[Y=1 \mid T=t, C=c] = \Pr[Y^{t}=1 \mid T=t, C=c] = \Pr[Y^{t}=1 \mid C=c]$, with $C$ representing smoking status. In the above equation, the key condition maintained from the conditional randomized experiment is $Y^{t} \perp T \mid C$, which is referred to as “conditional exchangeability” in the causal inference literature [4]. The conditional probability $\Pr[Y^{t}=1 \mid C=c]$ provides all of the necessary quantities for evaluating the causal measures described in Table 1 since $\Pr[Y^{t}=1] = \sum_{c} \Pr[Y^{t}=1 \mid C=c] \Pr[C=c]$. This connection is called standardization, which is explained below in more detail.
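A small simulation may help make the role of randomization and consistency concrete. The sketch below is a toy example with made-up probabilities (0.30 and 0.50), not data from any study: both potential outcomes are generated for every subject, treatment is assigned by a fair coin, and the observed outcome is built through the consistency relation.

```r
# Toy simulation; all numbers are hypothetical and chosen only for illustration.
set.seed(2024)
n <- 100000

# Potential outcomes for every individual: y1 under treatment, y0 under no treatment.
y1 <- rbinom(n, 1, 0.30)
y0 <- rbinom(n, 1, 0.50)

# Marginal randomization: treatment is assigned independently of (y1, y0).
trt <- rbinom(n, 1, 0.5)

# Consistency: the observed outcome is the potential outcome under the received treatment.
y <- y1 * (trt == 1) + y0 * (trt == 0)

# Causal risk difference, Pr[Y^1 = 1] - Pr[Y^0 = 1] (computable only because this is a simulation).
causal_rd <- mean(y1) - mean(y0)

# Associational risk difference, Pr[Y = 1 | T = 1] - Pr[Y = 1 | T = 0].
assoc_rd <- mean(y[trt == 1]) - mean(y[trt == 0])

print(c(causal = causal_rd, associational = assoc_rd))
```

Under randomization the two quantities should agree up to sampling error; if `trt` were instead generated to depend on `y1` or `y0`, they would generally differ.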

### Identification

$C$, if an observational study is similar to an RCT, the causal effect from observational data can be inferred. In this case, the primary condition needed for making a valid causal estimate based on an observational study is the conditional exchangeability $Y^{t} \perp T \mid C$. If conditional exchangeability holds in an observational study, the causal effect of the treatment can be identified from observed data using $\Pr[Y=1 \mid T=t, C=c] = \Pr[Y^{t}=1 \mid T=t, C=c] = \Pr[Y^{t}=1 \mid C=c]$ in the same manner as in a conditional RCT. However, since the potential outcome is not fully observed in an observational study, conditional exchangeability cannot be verified from the data in any given application. Therefore, expert knowledge is needed to judge whether conditional exchangeability is plausible.

$C$ need to be adjusted for, and also identifies potential biases such as selection bias, collider bias, and confounding bias [4,6–8]. Shrier and Platt [8] elaborated on the specific steps for building a DAG. Fortunately, software is available for selecting the confounder set $C$ based on causal DAGs. For practical information, researchers can visit http://www.dagitty.net/.

$C$ must be greater than 0. In other words, $\Pr[T=t \mid C=c] > 0$ for all values $c$ with $\Pr[C=c] > 0$. Positivity guarantees that the causal effect can be assessed within the subpopulation with $C=c$. If the positivity condition is violated for some value $C=c$, the subpopulation with $C=c$ can be removed from the target population for evaluating the causal estimand of interest. Positivity can be violated in 2 ways: structural violations and random violations [4,9]. Structural violations occur if a treatment level cannot be assigned to individuals under certain conditions. For example, if doctors assign the treatment to every individual with $C=1$, positivity is structurally violated, because no untreated individuals exist in that stratum. Random violations occur when, by chance, a stratum of the covariates in the sample contains no individuals at some treatment level even though the corresponding probability in the population is not actually 0.

### Standardization

$T$ on outcome $Y$ in a conditional RCT. The probability of the potential outcome is the weighted average of the stratum-specific probabilities of the potential outcome across the values of covariate $C$, weighted by the proportion of the population with $C=c$, summed over all values $c$: $\Pr[Y^{t}=1] = \sum_{c} \Pr[Y^{t}=1 \mid C=c] \Pr[C=c]$. The probability of the potential outcome can be replaced with that of the observed outcome under conditional exchangeability and consistency: $\Pr[Y^{t}=1] = \sum_{c} \Pr[Y=1 \mid T=t, C=c] \Pr[C=c]$. Table 2 shows the method of estimating causal estimands based on standardization.
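Written step by step, the standardization formula follows from the conditions introduced earlier. The derivation below is a sketch in the notation of this section, with the justification for each equality noted on the right.

```latex
\begin{align*}
\Pr[Y^{t}=1]
  &= \sum_{c} \Pr[Y^{t}=1 \mid C=c]\,\Pr[C=c]
     && \text{(law of total probability)} \\
  &= \sum_{c} \Pr[Y^{t}=1 \mid T=t, C=c]\,\Pr[C=c]
     && \text{(conditional exchangeability, } Y^{t} \perp T \mid C\text{)} \\
  &= \sum_{c} \Pr[Y=1 \mid T=t, C=c]\,\Pr[C=c]
     && \text{(consistency)}
\end{align*}
```

The second line also relies on positivity, since $\Pr[Y^{t}=1 \mid T=t, C=c]$ is defined only when $\Pr[T=t \mid C=c] > 0$.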

$\Pr[Y=1 \mid T=t, C=c]$ is evaluated, the causal estimand can be straightforwardly computed since $\Pr[C=c]$ can simply be replaced with $1/n$, with $n$ representing the sample size (each subject's covariate value receives weight $1/n$, so the sum over $c$ becomes an average of the predicted probabilities over the $n$ subjects). $\Pr[Y=1 \mid T=t, C=c]$ can be obtained using logistic regression analysis with $Y$ as the response variable and $T$ and $C$ as the covariates. For different types of $Y$, other generalized linear models or machine learning techniques may be used.
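The estimation steps can be sketched on simulated data using base R only. Everything in the block is hypothetical: the data-generating coefficients, the single binary covariate `covar`, and the variable names are assumptions chosen so that the example runs on its own, not the liver cancer data analyzed later.

```r
# Simulated data; all coefficients are made up for illustration.
set.seed(1)
n <- 5000
covar <- rbinom(n, 1, 0.4)                                   # confounder C
trt   <- rbinom(n, 1, plogis(-0.5 + 1.0 * covar))            # treatment depends on C
Y     <- rbinom(n, 1, plogis(-1.0 + 0.8 * trt + 1.2 * covar))
dat   <- data.frame(Y = Y, trt = trt, covar = covar)

# Step 1: model Pr[Y = 1 | T = t, C = c] with logistic regression.
fit <- glm(Y ~ trt + covar, family = binomial, data = dat)

# Step 2: predict for every subject with treatment set to 1, then to 0.
treated   <- dat; treated$trt   <- 1
untreated <- dat; untreated$trt <- 0
p1 <- predict(fit, newdata = treated,   type = "response")
p0 <- predict(fit, newdata = untreated, type = "response")

# Step 3: average over the n subjects (weight 1/n each) to standardize.
risk1 <- mean(p1)
risk0 <- mean(p0)

estimands <- c(
  RD = risk1 - risk0,
  RR = risk1 / risk0,
  OR = (risk1 / (1 - risk1)) / (risk0 / (1 - risk0))
)
print(round(estimands, 3))
```

Averaging predictions over all rows is what replaces $\Pr[C=c]$ with $1/n$: covariate patterns that occur more often in the sample contribute more rows to the average.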

$S$ which classifies the subgroup of interest, in which $S$ contains $s$ groups. For example, if conditional exchangeability holds within the subgroup $S=1$, the ATE of treatment $T$ on $Y$ for the subgroup $S=1$ can be estimated using $\sum_{c} \Pr[Y=1 \mid T=1, C=c, S=1] \Pr[C=c \mid S=1] - \sum_{c} \Pr[Y=1 \mid T=0, C=c, S=1] \Pr[C=c \mid S=1]$ (Table 3).

$S$ in each subset. Therefore, variable $S$ would be excluded from the regression model. The second method is to fit a regression model that includes interaction terms with the variable $S$ using the whole dataset and then compute the estimand for each subgroup. The choice of estimation method depends on the sample size of each subgroup. If the subgroups are small, the second method is preferable, since a regression model fitted to a small subgroup may be unstable. However, if the sample size is large enough in each subgroup, the first method is preferred because it is more flexible: a different model can be fitted for each subgroup. Therefore, the bias-variance tradeoff must be considered when selecting the method for estimating the causal estimands of individual subgroups. The second method usually yields a more biased but less variable treatment effect estimate, whereas the first method usually yields a less biased but more variable estimate.
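The two approaches can be compared side by side on simulated data. The block below is an illustrative sketch only: the subgroup indicator `S`, the covariate `covar`, the helper `std_rd`, and every coefficient are hypothetical, and the standardized risk difference is used as the subgroup ATE.

```r
# Simulated data; the treatment effect is made larger in subgroup S = 1.
set.seed(7)
n <- 8000
S     <- rbinom(n, 1, 0.5)
covar <- rbinom(n, 1, 0.4)
trt   <- rbinom(n, 1, plogis(-0.3 + 0.8 * covar))
Y     <- rbinom(n, 1, plogis(-1 + 0.3 * trt + 0.9 * trt * S + 1.0 * covar))
dat   <- data.frame(Y, trt, covar, S)

# Hypothetical helper: standardized risk difference averaged over the rows of `target`.
std_rd <- function(fit, target) {
  treated   <- target; treated$trt   <- 1
  untreated <- target; untreated$trt <- 0
  mean(predict(fit, newdata = treated,   type = "response")) -
    mean(predict(fit, newdata = untreated, type = "response"))
}

# Method 1: a separate model per subgroup (S is constant within a subset, so it is omitted).
rd_separate <- sapply(c(0, 1), function(s) {
  sub <- dat[dat$S == s, ]
  std_rd(glm(Y ~ trt + covar, family = binomial, data = sub), sub)
})

# Method 2: one model with a treatment-by-subgroup interaction, fitted to all data,
# then standardized over the covariate distribution of each subgroup.
fit_all <- glm(Y ~ trt * S + covar, family = binomial, data = dat)
rd_interaction <- sapply(c(0, 1), function(s) std_rd(fit_all, dat[dat$S == s, ]))

print(rbind(separate = rd_separate, interaction = rd_interaction))
```

In this simulation the covariate effect is the same in both subgroups, so the two methods should give similar answers. They would be expected to diverge when the covariate effects truly differ by subgroup, because the second model forces them to be shared.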

### LIVER CANCER TREATMENT EXAMPLES

$^{3}$/μL), sodium level (0–135, 135–145, >145 mmol/L), alpha-fetoprotein (AFP) level (0–200, 200–400, >400 ng/mL), Child-Pugh classification (A, B, C, U), and Barcelona Clinic Liver Cancer (BCLC) stage (0, A, B, C, D). We set a dichotomous variable (1=death, 0=survived) as the outcome. In addition, the causal estimands of treatment according to individuals’ BCLC stage were explored. Therefore, we considered fitted models that included the interaction term between treatment and the BCLC stage variable. The final model retained the variables with *p*-values of less than 0.05.

```r
# Load libraries
library(readxl)
library(dplyr)
library(boot)
library(stdReg)

# Import liver synthetic data
liver_syn_data1 <- read.csv('~/liver_syn_data.csv')

# [1] Build function for standardization
standardization <- function(data, indices) {
  liver_syn_data0 <- data[indices, ]

  # [1]-1. Data expansion
  # Original
  liver_syn_data0$data.class <- 'ori'
  # 2nd copy: everyone untreated, outcome set to missing
  liver_syn_data2 <- liver_syn_data0 %>%
    mutate(data.class = 'T0', Treatment = 0, death = NA)
  # 3rd copy: everyone treated, outcome set to missing
  liver_syn_data3 <- liver_syn_data0 %>%
    mutate(data.class = 'T1', Treatment = 1, death = NA)
  # Combine all data
  onesample <- rbind(liver_syn_data0, liver_syn_data2, liver_syn_data3)

  # [1]-2. Fit the model
  # Rows with death = NA are dropped by glm, so only the original rows are used for fitting.
  fit1 <- glm(death ~ Treatment * i_bclc + age + Liver_Cancer_Cause + MELD +
                cpc_cat + platelet_cat + Sodium_level + AFP_level + Ascites_status,
              family = 'binomial', data = onesample)
  # Confounders in the dataset (i_bclc: BCLC stage; age: Age;
  # Liver_Cancer_Cause: cause of liver cancer; MELD: MELD score;
  # cpc_cat: Child-Pugh classification; platelet_cat: Platelet count;
  # Sodium_level: sodium level; AFP_level: alpha-fetoprotein;
  # Ascites_status: ascites status)

  # [1]-3. Predict the outcome Y
  onesample$predicted.meanY1 <- predict.glm(fit1, onesample, type = "response")
  Y1T1 <- mean(onesample$predicted.meanY1[onesample$data.class == 'T1'])
  Y1T0 <- mean(onesample$predicted.meanY1[onesample$data.class == 'T0'])

  # [1]-4. Calculate the causal estimands
  ATE <- Y1T1 - Y1T0                                # Risk difference
  RR  <- Y1T1 / Y1T0                                # Risk ratio
  OR  <- (Y1T1 / (1 - Y1T1)) / (Y1T0 / (1 - Y1T0))  # Odds ratio
  return(c(ATE, RR, OR))
}
```

```r
# [2] Generate confidence intervals
# [2]-1. Calculate the 95% confidence interval
set.seed(1234)
results <- boot(data = liver_syn_data1, statistic = standardization,
                R = 1000, parallel = "multicore")
se  <- c(sd(results$t[, 1]), sd(results$t[, 2]), sd(results$t[, 3]))
est <- results$t0  # point estimates from the original sample

# 95% normal confidence interval using se
ll1 <- est - qnorm(0.975) * se
ul1 <- est + qnorm(0.975) * se

# 95% percentile confidence interval
ll2 <- c(quantile(results$t[, 1], 0.025),
         quantile(results$t[, 2], 0.025),
         quantile(results$t[, 3], 0.025))
ul2 <- c(quantile(results$t[, 1], 0.975),
         quantile(results$t[, 2], 0.975),
         quantile(results$t[, 3], 0.975))

# [2]-2. Present the result
bootstrap <- data.frame(cbind(c("ATE", "RR", "OR"),
                              round(est, 4), round(se, 4),
                              round(ll1, 4), round(ul1, 4),
                              round(ll2, 4), round(ul2, 4)),
                        row.names = NULL)
colnames(bootstrap) <- c("Estimand", "mean", "se",
                         "Lower1", "Upper1", "Lower2", "Upper2")
```

```r
# [3] Standardization using the stdReg package
fit1 <- glm(death ~ Treatment * i_bclc + age + Liver_Cancer_Cause + MELD +
              cpc_cat + platelet_cat + Sodium_level + AFP_level + Ascites_status,
            family = 'binomial', data = liver_syn_data1)
fit.std <- stdGlm(fit = fit1, data = liver_syn_data1,
                  X = "Treatment", x = seq(0, 1, 1))

summary(fit.std, contrast = 'difference', reference = 0)                 # Risk difference
summary(fit.std, contrast = 'ratio', reference = 0)                      # Risk ratio
summary(fit.std, transform = 'odds', contrast = "ratio", reference = 0)  # Odds ratio
```

### DISCUSSION AND CONCLUSION

$\Pr[T=t \mid C=c]$, while standardization models the outcome using $\Pr[Y=1 \mid T=t, C=c]$. The propensity score matching method estimates the causal estimands from exposed and unexposed groups that are matched based on the propensity score [20]. Instrumental variables, which are associated with the treatment but not with the outcome (other than through the treatment) or with the confounders, can be used to estimate the estimands even when confounding variables are not measured. However, all methods require either the conditional exchangeability condition or a corresponding untestable condition. Therefore, researchers should be fully aware that causal inference from an observational study cannot replace causal inference from an RCT. Conditional exchangeability is an untestable assumption in an observational study, whereas in an RCT it holds by design.
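The contrast between modeling the treatment through $\Pr[T=t \mid C=c]$ and modeling the outcome through $\Pr[Y=1 \mid T=t, C=c]$ can be illustrated on a small table of counts. The sketch below assumes that the treatment model is used by weighting each subject by the inverse of the estimated $\Pr[T=t \mid C=c]$; this is one common choice, adopted here as an assumption. The counts, column names, and the helpers `std_risk` and `ipw_risk` are all hypothetical.

```r
# Hypothetical data: 100 subjects, binary covariate, treatment, and outcome.
dat <- data.frame(
  covar = rep(c(0, 0, 1, 1), times = c(40, 20, 10, 30)),
  trt   = rep(c(0, 1, 0, 1), times = c(40, 20, 10, 30)),
  Y     = c(rep(c(1, 0), times = c(10, 30)),  # covar = 0, trt = 0: 10 events in 40
            rep(c(1, 0), times = c(4, 16)),   # covar = 0, trt = 1:  4 events in 20
            rep(c(1, 0), times = c(6, 4)),    # covar = 1, trt = 0:  6 events in 10
            rep(c(1, 0), times = c(12, 18)))  # covar = 1, trt = 1: 12 events in 30
)

# Standardization: sum over c of Pr[Y = 1 | T = t, C = c] * Pr[C = c].
std_risk <- function(level) {
  strata <- sort(unique(dat$covar))
  sum(sapply(strata, function(c0) {
    mean(dat$Y[dat$trt == level & dat$covar == c0]) * mean(dat$covar == c0)
  }))
}

# Inverse probability weighting: weight subjects with T = t by 1 / Pr[T = t | C = c].
ipw_risk <- function(level) {
  # Stratum-specific proportion receiving treatment `level`, repeated for each subject.
  ps <- ave(as.numeric(dat$trt == level), dat$covar)
  mean((dat$trt == level) * dat$Y / ps)
}

print(c(std_T1 = std_risk(1), ipw_T1 = ipw_risk(1),
        std_T0 = std_risk(0), ipw_T0 = ipw_risk(0)))
```

With these nonparametric (stratum-by-stratum) estimates the two approaches give the same risks. They can differ once parametric models are used, because the treatment model and the outcome model may each be misspecified in different ways.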