guides/analysis-procedures/randomization-inference_en.qmd
Lines changed: 59 additions & 27 deletions
@@ -1,7 +1,7 @@
 ---
 title: "10 Things to Know About Randomization Inference^[We focus here on randomization inference as applied to hypothesis testing. Randomization inference may also be used for construction of confidence intervals, but this application requires stronger assumptions. See @gerber_green_2012, chapter 3.]"
 author:
-  - name: "Donald Green^[I am grateful to Winston Lin and Gareth Nellis, who commented on an earlier draft.]"
+  - name: "Donald Green^[Originating author: Don Green. Thanks to Winston Lin and Gareth Nellis, who commented on an earlier draft. Revisions: Jake Bowers, 8 July 2025. The guide is a live document and subject to updating by EGAP members at any time.]"
     url: https://egap.org/member/donald-green/
 image: randomization-inference.png
 bibliography: randomization-inference.bib
@@ -20,36 +20,68 @@ After we have conducted an experiment, we observe outcomes for the control group
 
 ```{r, message = F, warning = F}
 # Worked example of randomization inference
-rm(list=ls()) # clear objects in memory
-library(ri) # load the RI package
+library(ri2) # load the RI2 package
+library(coin) # load the coin package
+
 set.seed(1234567) # random number seed, so that results are reproducible
 # Data are from Table 2-1, Gerber and Green (2012)
-Y0 <- c(10, 15, 20, 20, 10, 15, 15)
-Y1 <- c(15, 15, 30, 15, 20, 15, 30)
-Z <- c(1,0,0,0,0,0,1) # one possible treatment assignment
-Y <- Y1*Z + Y0*(1-Z) # observed outcomes given assignment
-probs <- genprobexact(Z,blockvar=NULL) # no blocking is assumed when generating probability of treatment and probs are 2/7 for all units
-ate <- estate(Y,Z,prob=probs) # estimate the ATE
-perms <- genperms(Z,maxiter=10000,blockvar=NULL) # set the number of simulated random assignments
-# show all 21 possible random assignments in which 2 units are treated
-Ys <- genouts(Y,Z,ate=0) # create potential outcomes under the sharp null of no effect for any unit
-# show the apparent potential outcomes under the sharp null
-Ys
-distout <- gendist(Ys,perms,prob=probs) # generate the sampling distribution based on the implied schedule of potential outcomes implied by the null hypothesis
-ate # estimated ATE
-sort(distout) # list the distribution of possible estimates under the sharp null of no effect
-sum( distout >= ate )/nrow(as.matrix(distout)) # one-tailed comparison used to calculate p-value
-sum(abs(distout) >= abs(ate))/nrow(as.matrix(distout)) # two-tailed comparison used to calculate p-value
-dispdist(distout,ate) # display p-values, 95% confidence interval, standard error under the null, and graph the sampling distribution under the null
+dat <- data.frame(
+  Y0 = c(10, 15, 20, 20, 10, 15, 15),
+  Y1 = c(15, 15, 30, 15, 20, 15, 30),
+  Z = c(1,0,0,0,0,0,1)) # one possible treatment assignment
+dat$Y <- with(dat,Y1*Z + Y0*(1-Z)) # observed outcomes given assignment
+# Represent the design with 2 units assigned to treatment and 5 to control
+declaration <- declare_ra(N = 7, m = 2)
+print(declaration)
+
+# Notice that there are 21 ways to assign 2 treatments to 7 total units
+# the first way is to assign treatments to units 1 and 2, and the second to units 1 and 3, etc.
+combn(7,2)
+
+# Conduct Randomization Inference
+# using a difference in means test statistic by default
+# The equivalent of the table showing the distribution above
+# only here using standardized test statistics
+rbind(support(t_test_exact),
+  # The probabilities of each possible test statistic value under the null.
+  dperm(t_test_exact,x=support(t_test_exact))*21)
+
 # Compare results to traditional t-test with unequal variance
-t.test(Y~Z,
+
+# notice that the results are not the same because the t.test is assuming a
+# t-distribution for the null distribution of the test statistic.
+t.test(Y~Z,data=dat,
   alternative = "less",
   mu = 0, var.equal = FALSE)
-t.test(Y~Z,
+t.test(Y~Z, data=dat,
   alternative = "two.sided",
   mu = 0, var.equal = FALSE)
 ```
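
The exact p-values in the worked example can be cross-checked without `ri` or `ri2`. The following is a minimal base-R sketch, not code from this change: it enumerates all 21 assignments with `combn()` and recomputes the difference in means under the sharp null of no effect. The helper `diff_in_means()` and the tolerance `tol` are names introduced here for illustration only.

```r
# Base-R cross-check of the exact randomization distribution (no packages needed).
# Data are the Table 2-1 values used in the worked example.
Y0 <- c(10, 15, 20, 20, 10, 15, 15)
Y1 <- c(15, 15, 30, 15, 20, 15, 30)
Z  <- c(1, 0, 0, 0, 0, 0, 1)          # observed assignment: units 1 and 7 treated
Y  <- Y1 * Z + Y0 * (1 - Z)           # observed outcomes

# Hypothetical helper: difference in means for a given set of treated indices
diff_in_means <- function(y, treated) mean(y[treated]) - mean(y[-treated])

ate_hat <- diff_in_means(Y, which(Z == 1))   # observed estimate

# Under the sharp null, Y is fixed; only the assignment changes.
assignments <- combn(7, 2)                   # 2 x 21 matrix of treated pairs
null_dist <- apply(assignments, 2, function(idx) diff_in_means(Y, idx))

tol <- 1e-9                                  # guard against floating-point ties
p_one <- mean(null_dist >= ate_hat - tol)            # one-tailed p-value
p_two <- mean(abs(null_dist) >= abs(ate_hat) - tol)  # two-tailed p-value

c(ate = ate_hat, p_one_tailed = p_one, p_two_tailed = p_two)
```

With these data the observed estimate is 6.5, and 5 of the 21 assignments give an estimate at least that large (one-tailed p = 5/21), while 8 of 21 are at least that large in absolute value (two-tailed p = 8/21).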
@@ -98,6 +130,6 @@ On the other hand, randomization inference cannot be applied with additional ass
 
 Old-fashioned approximate methods work well when the assumptions on which the approximations rest are sound. For example, when an experiment involves random assignment of individual subjects, outcomes are distributed more or less symmetrically around the mean, and the number of subjects is greater than 100, the difference between conventional p-values and RI p-values may be negligible. Randomization inference may still be useful as the final word, but it rarely changes inferences substantively under these circumstances. The method is valuable primarily for nonstandard applications in which outcomes are skewed, subject pools are small, or the method of assignment is complex.
 
-Note on available software for implementing randomization. For the latest R package for randomization inference, see [here](http://alexandercoppock.com/ri2/articles/ri2_vignette.html). For randomization inference code specifically tailored to the special features of binary outcomes, see [here](https://cran.r-project.org/web/packages/RI2by2/index.html). Stata users may find an all-purpose package [here](https://github.com/simonheb/ritest).
+Note on available software for implementing randomization inference. For the latest R package for randomization inference, see [here](http://alexandercoppock.com/ri2/articles/ri2_vignette.html). For randomization inference code specifically tailored to the special features of binary outcomes, see [here](https://cran.r-project.org/web/packages/RI2by2/index.html). Stata users may find an all-purpose package [here](https://github.com/simonheb/ritest).
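
The guide's point that conventional and RI p-values tend to agree in large, well-behaved experiments can be probed with a short simulation. This is a hedged base-R sketch with made-up data, not part of the change: the sample size (200), the effect size (0.2), the seed, and the 5,000 simulated assignments are all assumptions chosen for illustration, and the approximate RI p-value will vary with the seed.

```r
# Illustrative simulation: RI vs. Welch t-test p-values with many subjects
# and symmetric outcomes. All values below are hypothetical.
set.seed(20250708)                              # arbitrary seed for reproducibility
n_units <- 200
Z_sim <- sample(rep(c(0, 1), each = n_units / 2))  # complete random assignment
Y_sim <- rnorm(n_units) + 0.2 * Z_sim              # symmetric outcomes, small effect

obs_diff <- mean(Y_sim[Z_sim == 1]) - mean(Y_sim[Z_sim == 0])

# Approximate the randomization distribution under the sharp null by
# re-drawing the assignment while holding observed outcomes fixed.
sim_diffs <- replicate(5000, {
  Z_star <- sample(Z_sim)
  mean(Y_sim[Z_star == 1]) - mean(Y_sim[Z_star == 0])
})

p_ri <- mean(abs(sim_diffs) >= abs(obs_diff))                # two-tailed RI p-value
p_t  <- t.test(Y_sim ~ Z_sim, var.equal = FALSE)$p.value     # Welch t-test p-value

c(ri = p_ri, t_test = p_t)
```

Rerunning the sketch with a small `n_units` or a skewed outcome (for example, `rexp()` in place of `rnorm()`) is one way to see where the two p-values start to diverge.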