Quasi-experimental methods

Quasi‑experimental methods are a broad group of research designs and statistical methods that aim to identify the impact of a program or policy on outcomes of interest.

Unlike randomised controlled trials (RCTs), quasi‑experimental methods are typically used after the program or intervention is implemented and rely on data that has already been collected.

Quasi‑experimental methods require us to make stronger assumptions than an RCT to confidently attribute a change in outcomes to the program.

Three of the most popular quasi‑experimental methods are:

  • regression discontinuity design (RDD)
  • difference‑in‑differences
  • instrumental variables

Regression discontinuity design

Regression discontinuity design (RDD) is used to assess the impact of programs that have a cut‑off point determining who is eligible to participate.

The idea is that we can compare those immediately above and below this cut‑off point, where it is reasonable to assume that individuals are similar.

When implemented well, an RDD is generally regarded as a very strong way of estimating program impact.

RDD in practice

For example, we might want to look at the impact of an academic scholarship on future income where receiving a scholarship is determined by being at or above a grade point average (GPA) of 6 in the third year of study.

In this case, a small difference in GPA can make a large difference to what happens to the student. Those with a GPA of 5.9 miss out on the scholarship, while those with a GPA of 6 receive one.

It’s reasonable to assume that those with a GPA of 5.9 and those with a GPA of 6 are fairly similar. Because of this, a comparison of average post‑university income just around the GPA cut‑off can be used to estimate the causal impact of receiving a scholarship on future income.
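
As a rough sketch of how this comparison can be implemented, the simulated Python example below fits a separate line on each side of the GPA cut‑off and reads off the gap between the two lines at the cut‑off. Every number in it (the incomes, the $5,000 scholarship effect, the 0.5 bandwidth) is invented for illustration.

    # A minimal RDD sketch on simulated data (all numbers are illustrative).
    import numpy as np

    rng = np.random.default_rng(0)
    n = 2000

    gpa = rng.uniform(4, 7, n)              # running variable
    scholarship = (gpa >= 6).astype(float)  # treatment assigned at the cut-off
    # Simulated income: a smooth effect of GPA plus a $5,000 scholarship effect.
    income = 40_000 + 8_000 * (gpa - 6) + 5_000 * scholarship + rng.normal(0, 3_000, n)

    cutoff, bandwidth = 6.0, 0.5            # the bandwidth choice is a judgment call
    left = (gpa >= cutoff - bandwidth) & (gpa < cutoff)
    right = (gpa >= cutoff) & (gpa <= cutoff + bandwidth)

    # Fit a straight line on each side of the cut-off within the bandwidth.
    fit_left = np.polyfit(gpa[left], income[left], 1)
    fit_right = np.polyfit(gpa[right], income[right], 1)

    # The RDD estimate is the jump between the two fitted lines at the cut-off.
    estimate = np.polyval(fit_right, cutoff) - np.polyval(fit_left, cutoff)
    print(f"Estimated scholarship effect at the cut-off: ${estimate:,.0f}")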

Assumptions

The main assumption is that receiving a scholarship is determined entirely by the GPA threshold. That is, students have no way of manipulating their GPA to receive a scholarship.

It is worth keeping in mind that RDDs only estimate program impact for those close to the cut‑off. The impact of receiving a scholarship might be weaker for those with a GPA closer to 7, but the RDD only tells us about those around the threshold of 6.
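
One informal way to probe the no‑manipulation assumption is to check whether students ‘bunch’ just above the cut‑off, which would suggest GPAs are being nudged over the line. The sketch below, a simplified stand‑in for a formal density test such as McCrary’s, simply counts simulated students on either side of the threshold; the 0.1 window is arbitrary.

    # Count observations just below and just above the cut-off.
    # With no manipulation, the two counts should be roughly similar;
    # a pile-up just above the line is a warning sign.
    import numpy as np

    rng = np.random.default_rng(0)
    gpa = rng.uniform(4, 7, 2000)  # re-using the simulated GPAs from the sketch above

    just_below = int(np.sum((gpa >= 5.9) & (gpa < 6.0)))
    just_above = int(np.sum((gpa >= 6.0) & (gpa < 6.1)))
    print(f"Just below the cut-off: {just_below}, just above: {just_above}")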

Difference‑in‑differences

Difference‑in‑differences (DiD) is a widely used quasi‑experimental method that compares outcomes over time between those enrolled in a program and those who were not.

This method requires data on the outcome over time in both groups, spanning the period before and after the program was delivered.

It also requires us to make some strong assumptions.

DiD in practice

A simple DiD design involves two groups and two time points.

For example, we want to know whether providing employees with a gym membership improves self‑reported wellbeing.

In this example, we have two divisions in the same workplace. Workers from one division were given a gym membership mid‑year, while workers from the second division were not. In both divisions, self‑reported wellbeing was collected at the start and end of the year.

First, we take the difference in average wellbeing scores between the start and end of the year within the division that received the gym membership. By comparing the same group to itself like this, we remove the influence of characteristics that stay constant over time. This includes factors such as employees’ motivation and their tendency to report wellbeing as good or bad.

We can’t leave it here though because our estimate might still be impacted by time‑varying factors. What if the workplace rolls out another initiative, this time to all staff, that improves wellbeing over the course of the year? This will hinder our ability to estimate the direct effect of gym memberships.

We need to calculate the difference in wellbeing scores between the start and end of the year among those that didn’t receive the gym membership.

These individuals should have been exposed to the same workplace initiative as those that received the gym membership.

This difference should capture any factor that changes self‑reported wellbeing over time that isn’t related to the gym membership.

This group stands in as our counterfactual: it tells us about the change in self‑reported wellbeing that we would expect to see if those who received the gym membership had not received it.

We take the two differences we just calculated, one from those who received the gym membership and one from those who didn’t, and then take the difference of the differences.

By doing this, we ‘clean’ our estimate of program impact of the influence of time‑varying factors.
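
In the simple two‑group, two‑period case, the whole calculation is a few lines of arithmetic. The wellbeing scores below are invented purely to show the mechanics.

    # A minimal 2x2 difference-in-differences calculation.
    # All wellbeing scores are invented (say, averages on a 0-10 scale).
    gym_start, gym_end = 6.0, 7.5      # division that received the gym membership
    other_start, other_end = 6.2, 6.9  # division that did not

    # First differences: the change over time within each division.
    change_gym = gym_end - gym_start        # 1.5 points
    change_other = other_end - other_start  # 0.7 points, from time-varying factors

    # Second difference: the DiD estimate of the gym membership's effect.
    did_estimate = change_gym - change_other
    print(f"DiD estimate: {did_estimate:.1f} points")  # 0.8 points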

Assumptions

The main assumption required to interpret the DiD estimate as a causal effect of gym membership is that, in the absence of treatment, the difference between the ‘treatment’ and ‘control’ groups would remain constant over time. The two groups can’t diverge in any way that is unrelated to the gym membership.

We call this the ‘parallel trends assumption’, and it is a big assumption to make. We can never really prove that we have met the parallel trends assumption; however, visual inspection of historical outcome data from both groups can help to make a strong case.
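
A minimal sketch of that visual check, using invented historical wellbeing averages for the two divisions:

    # Plot pre-program outcome averages for both groups to eyeball the
    # parallel trends assumption. All numbers here are invented.
    import matplotlib.pyplot as plt

    years = [2019, 2020, 2021, 2022]       # years before the program
    gym_division = [5.6, 5.8, 5.9, 6.0]    # later receives gym memberships
    other_division = [5.9, 6.1, 6.2, 6.3]  # never receives gym memberships

    plt.plot(years, gym_division, marker="o", label="Gym membership division")
    plt.plot(years, other_division, marker="o", label="Comparison division")
    plt.xlabel("Year")
    plt.ylabel("Average self-reported wellbeing")
    plt.title("Roughly parallel pre-program trends support the assumption")
    plt.legend()
    plt.show()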

Instrumental variables

The method of instrumental variables (IV) is used to deal with unmeasured ‘confounders’, or variables that affect whether people participate in a program while also affecting the outcome of interest.

For example, high motivation levels could be a confounder for a job training program because it is likely to influence a person’s willingness to participate in the program as well as their chance of finding a job.

To be labelled an ‘instrument’, a variable must not have a direct effect on the outcome of interest. Instead it may only influence the outcome by changing the probability that an individual receives the program.

IV in practice

We might be interested in estimating the impact of a drug on high blood pressure.

If we simply compared those who took the drug and those who didn’t, we might be misled by confounders.

Instead, we could use ‘distance from a pharmacy’ as an instrument.

It works because those who live closer to a pharmacy are more likely to fill their script and thus take the drug.

We assume that distance from the pharmacy affects blood pressure solely through the drug. Because of this, any observed difference in blood pressure between those living closer to and further from a pharmacy must be due to the impact of the drug.

For argument’s sake, let’s say that those who live close to a pharmacy are 34 percentage points more likely to take the blood pressure drug.

We also find that blood pressure is 2 per cent lower among those who live close to a pharmacy.

We can conclude that a 34 percentage point increase in the probability of taking the drug results in a 2 per cent decrease in blood pressure.

We can then scale this figure up by dividing the 2 per cent decrease by the 34 percentage point increase to estimate that the drug reduces blood pressure by almost 6 per cent.
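
This ratio is sometimes called the Wald estimator. A minimal sketch of the arithmetic, using the illustrative figures above:

    # The Wald (instrumental variables) estimator: the instrument's effect on
    # the outcome, scaled by its effect on take-up of the treatment.
    effect_on_takeup = 0.34    # living close raises the probability of taking the drug
    effect_on_outcome = -0.02  # blood pressure is 2 per cent lower among those living close

    wald_estimate = effect_on_outcome / effect_on_takeup
    print(f"Estimated effect of the drug on blood pressure: {wald_estimate:.1%}")
    # Prints -5.9%: the drug reduces blood pressure by almost 6 per cent.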

This gives us an accurate estimate of the drug’s impact among those who are influenced by the instrument (distance from the pharmacy) to take up the drug.

Consequently, this isn’t necessarily a reliable estimate of the drug’s impact for the whole population.

Assumptions

The key assumption is called the ‘exclusion restriction’.

The exclusion restriction says that the instrument has no effect on the outcome in any way except through the treatment.

We can never verify that the exclusion restriction is true using the data. Because of this, we need to make the case for our instrument using domain‑specific knowledge.

In the above example, the exclusion restriction is unlikely to hold: it would be fair to argue that living far away from a pharmacy probably means you also live far away from a GP, which in turn may mean you are more likely to have uncontrolled high blood pressure.
