A statistical technique used in causal inference and program evaluation to construct a synthetic control: a weighted combination of comparison units chosen to approximate the treated unit.
We then compare the outcome of interest for the synthetic control with that of the treated unit.
Case study: Using other Spanish regions to construct a synthetic control for the Basque Country
(Abadie & Gardeazabal, 2003) assessed the impact of terrorism on economic growth in the Basque Country. To do so, they created a weighted combination of other Spanish regions, chosen to resemble the characteristics of the Basque Country before terrorism, to act as a “synthetic” Basque Country without terrorism. They then compared its economic evolution with that of the actual Basque Country with terrorism.
Usage
- When it’s challenging to find comparable control units, synthetic control can act as a counterfactual
- Limited observational data
- Treated unit’s characteristics are unique
- A randomized control group is not feasible or ethical
- When the data satisfy the following conditions
- All units are observed at the same time periods
- The treatment has no effect on the treated unit and potential comparison units during the pre-treatment period
- The pre-treatment period is long
Strengths
- Avoids extrapolation: the synthetic control stays within the range of the observed data for the comparison units and never goes beyond it (when the weights are non-negative and sum to one, see ⬇️)
- Constructing the synthetic control doesn’t require access to the post-treatment outcomes during the design phase of the study, unlike regression
- Provides explicit weights for comparison units, allowing researchers to understand the contribution of each unit to the synthetic control
- Allows flexibility in the selection of covariates
Weaknesses
- The synthetic control might not reproduce the exact characteristics of the treated unit; it can only be as close as possible.
- Doesn’t remove subjective researcher bias in choosing the covariates, time periods, and combinations of them for the synthetic control ➡️ present various models
- The treatment might have spilled over to comparison units in the synthetic control, so the individual treatment effect observed in the treated unit might appear weaker
- The choice of covariates and the weighting scheme can impact the resulting synthetic control, and there is a need for sensitivity analysis to assess the robustness of findings.
- The synthetic control is specific to the treated unit and may not generalize well to other contexts or treatments.
Common mistakes
- Poor fit of synthetic control compared to treated unit during the pre-treatment period
- Pre-treatment period is short
Algorithms
(Abadie et al., 2015) Given:
- Y_1 (pre): outcome variable for the treated unit before the treatment
- Y_1 (post): outcome variable for the treated unit after the treatment
- J: number of potential comparison units
- k: number of variables/covariates
- X_1: (k × 1) vector of covariates for the treated unit before the treatment
- X_0: (k × J) matrix of the same covariates for the J comparison units
- V: (k × k) diagonal matrix with non-negative values across the diagonal reflecting the relative importance of the different variables; v_m is the weight that reflects the relative importance assigned to the m-th variable
- W: (J × 1) vector of weights for the comparison units
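To make the notation concrete, here is a toy computation (all numbers are made up purely for illustration) of the synthetic control’s covariates and the V-weighted distance to the treated unit:

```r
# Toy numbers, made up purely to illustrate the notation: k = 2 covariates, J = 3 units
X1 <- c(2.0, 5.0)                        # (k x 1) covariates of the treated unit
X0 <- matrix(c(1.0, 4.0,
               3.0, 6.0,
               2.5, 5.5), nrow = 2)      # (k x J) covariates of the comparison units
V  <- diag(c(0.7, 0.3))                  # relative importance of the two covariates
W  <- c(0.2, 0.3, 0.5)                   # non-negative unit weights summing to 1

# Covariates of the synthetic control and the V-weighted distance to the treated unit
diff <- X1 - X0 %*% W
dist <- sqrt(as.numeric(t(diff) %*% V %*% diff))   # ||X1 - X0 W||_V
```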
How to choose a donor pool (of potential comparison units)
(Abadie et al., 2015) Because comparison units are meant to approximate the counterfactual of the treated unit without the intervention, it is important to restrict the donor pool to units with outcomes that
- are thought to be driven by the same structural process as the treated unit and
- were not subject to structural shocks to the outcome variable
during the sample period of the study.
- Start with an initial guess for the variable weights V.
- Given V, choose the unit weights W*(V) that minimize the differences between the pre-intervention characteristics of the treated unit and the synthetic control:
  ||X_1 - X_0 W||_V = sqrt((X_1 - X_0 W)' V (X_1 - X_0 W))
  with the constraint that the weight values in W are non-negative and sum to one.
- Update V to improve the fit of the pre-treatment outcomes (e.g. to minimize the mean squared prediction error of the outcome over the pre-treatment period), and repeat.
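A minimal sketch of the inner step — finding W for a fixed V — using a brute-force grid search over the weight simplex (numbers are made up; real implementations such as the Synth package solve this with constrained quadratic optimization):

```r
X1 <- c(2.0, 5.0)                                # treated unit's covariates (made up)
X0 <- matrix(c(1.0, 4.0, 3.0, 6.0), nrow = 2)    # two comparison units (made up)
V  <- diag(c(0.7, 0.3))                          # fixed variable weights

# With J = 2 units the simplex constraint leaves one free parameter: W = (w, 1 - w)
loss <- function(w) {
  W <- c(w, 1 - w)
  d <- X1 - X0 %*% W
  as.numeric(t(d) %*% V %*% d)                   # squared ||X1 - X0 W||_V
}

grid   <- seq(0, 1, by = 0.001)                  # brute-force search over the simplex
w_star <- grid[which.min(sapply(grid, loss))]
W_star <- c(w_star, 1 - w_star)                  # optimal unit weights for this V
```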
Evaluation
Placebo Studies
We want to be sure that the individual treatment effect is observed only for the treated unit and only at the time of the treatment. To check this, we can run:
- in-time placebo: reapply the treatment at a different time period (before the intervention). A bad placebo would look like the figure below: the red vertical line is the true treatment (1998), the black vertical line the placebo treatment (1995). We expect NO gap between the solid line and the dotted line before the true treatment (1998), but there is an actual gap between 1995 and 1998.

- in-space placebo: reapply the treatment to a comparison unit (e.g. the unit with largest weight in the synthetic control)
- Check that the model works: no control unit should be affected by the treatment.
We want to see NO effect when applying the intervention to units or times other than the treated one.
Example: the reunification of Germany (1990), with a placebo reunification at 1975.
Robustness Test
- Iteratively exclude each comparison unit with a positive weight (leave-one-out), reconstruct the synthetic control, and re-estimate the individual treatment effect, to see if the conclusion still holds. Though we sacrifice some goodness of fit, this sensitivity analysis evaluates to what extent our results are driven by any particular comparison unit.
Tradeoff between number of comparison units and goodness of fit
Using a weighted combination of units generally achieves a better goodness of fit than using a single control unit (unless we do matching).
P-value
To get a permutation-style p-value, run the in-space placebo for every unit in the donor pool and compare the treated unit’s gap with the distribution of placebo gaps. Because a large post-treatment gap means little if the pre-treatment fit was poor, each unit’s post-treatment gap is normalized by its pre-treatment fit (the sum of squared differences between the synthetic control and the unit before the treatment). The p-value is the fraction of units whose post/pre ratio is at least as large as the treated unit’s.
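A sketch of this placebo-based p-value, with made-up post/pre ratios for the treated unit and its donor pool:

```r
# Made-up post/pre MSPE ratios: first the treated unit, then the in-space placebos
ratios <- c(treated = 12.0, 1.3, 0.8, 2.1, 0.5, 3.7, 1.1, 0.9, 2.4)

# p-value: share of units whose ratio is at least as extreme as the treated unit's
p_value <- mean(ratios >= ratios[["treated"]])
```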
Code
install.packages("Synth")
library(Synth)
## First Example: Toy panel dataset
# load data
data(synth.data)
# create matrices from panel data that provide inputs for synth()
dataprep.out <- dataprep(
foo = synth.data,
predictors = c("X1", "X2", "X3"),
predictors.op = "mean",
dependent = "Y",
unit.variable = "unit.num",
time.variable = "year",
treatment.identifier = 7,
controls.identifier = c(29, 2, 13, 17, 32, 38),
time.predictors.prior = c(1984:1989),
time.optimize.ssr = c(1984:1990),
unit.names.variable = "name",
time.plot = 1984:1996
)
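The example above only builds the inputs; the fit itself comes from passing them to synth(), and the results can be summarized and plotted with the package’s helper functions (these calls follow the Synth package’s documented interface and require the package and its toy dataset loaded as above):

```r
# run the optimization to obtain the unit weights W and variable weights V
synth.out <- synth(data.prep.obj = dataprep.out)

# tables of unit weights, variable weights, and pre-treatment covariate balance
synth.tables <- synth.tab(dataprep.res = dataprep.out, synth.res = synth.out)

# trajectories of the treated unit vs. its synthetic control, and the gap between them
path.plot(synth.res = synth.out, dataprep.res = dataprep.out)
gaps.plot(synth.res = synth.out, dataprep.res = dataprep.out)
```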