Why Urn Randomization?¶
Complete randomization — flipping a fair coin for each participant — is the simplest allocation method, but it provides no guarantee of treatment balance, especially in small-to-moderate trials. Urn randomization (Wei, 1978) adaptively adjusts allocation probabilities to maintain balance while preserving the unpredictability needed to prevent selection bias.
This page summarizes a Monte Carlo simulation comparing four randomization strategies across 1,000 independent trials of 2,500 participants each with three treatment arms.
Strategies Compared¶
Strategy |
Description |
|---|---|
Complete Randomization |
Each participant is assigned to a treatment arm with equal probability (1/k), independent of prior assignments. |
Urn (β=1, D=χ²) |
Wei’s urn model with β=1 ball added for the non-assigned treatment after each draw. The imbalance metric D is the χ² statistic across arms. |
Urn (β=1, D=range) |
Same as above, but the imbalance metric D is the range (max − min) of arm counts. |
Urn (β=2, D=χ²) |
Stronger adaptive correction: β=2 balls added for the non-assigned treatment, making the urn pull more aggressively toward balance. |
In all urn strategies, α=0 (no extra balls for the assigned treatment) and w=1 (one initial ball per treatment).
How the Simulation Works¶
For each of the 1,000 trials:
Initialize an urn with
wballs per treatment arm.For each of the 2,500 participants, draw from the urn to assign a treatment.
After each assignment, update the urn: add
αballs for the assigned arm andβballs for all other arms.After every assignment, record the maximum proportional imbalance across strata:
\[d = \frac{\max_k n_k - \min_k n_k}{\sum_k n_k}\]where \(n_k\) is the count assigned to arm k.
The simulation tracks this imbalance metric d at every enrollment
step across all 1,000 trials to compute means, confidence intervals, and
tail probabilities.
Results¶
Treatment Imbalance Over Enrollment¶
The mean maximum proportional imbalance d decreases as enrollment
grows. Urn strategies drive imbalance down faster than complete
randomization.
At small sample sizes (n < 50), complete randomization frequently produces imbalances above 20%, while urn methods keep imbalance below 10% on average. The gap is largest in the critical early phase of a trial when interim analyses are most sensitive to allocation imbalance.
Increasing β from 1 to 2 produces even tighter balance, at the cost of slightly more predictable allocation sequences.
Tail Probabilities¶
The fraction of trials where the maximum proportional difference exceeds a given threshold provides a practical measure of risk:
Strategy |
Trials with d ≥ 10% |
|---|---|
Complete Randomization |
~47% |
Urn (β=1, D=χ²) |
< 1% |
Urn (β=1, D=range) |
< 1% |
Urn (β=2, D=χ²) |
< 0.1% |
At 100 participants, nearly half of completely randomized trials have a ≥ 10% imbalance, while urn randomization virtually eliminates this risk.
Choice of Imbalance Metric (D)¶
The two D metrics — χ² and range — produce similar results for two-arm trials. With three or more arms, χ² is more sensitive to multiway imbalance because it considers all arms simultaneously, while range only looks at the gap between the largest and smallest arms.
Key Takeaways¶
Urn randomization reduces imbalance compared to complete randomization at every sample size, with the largest benefit at small n.
No block sizes need to be pre-specified — unlike permuted block randomization, the urn adjusts continuously without fixed block lengths.
Higher β values produce tighter balance but increase the predictability of the next assignment (Berger et al., 2003). β=1 is a standard default that balances these concerns.
Stratification multiplies the benefit — applying urn randomization within each stratum (factor-level combination) maintains balance both overall and within subgroups defined by prognostic factors.
References¶
Wei, L.J. (1978). The Adaptive Biased Coin Design for Sequential Experiments. Annals of Statistics, 6(1), 92–100. doi:10.1214/aos/1176344068
Wei, L.J. (1978). An Application of an Urn Model to the Design of Sequential Controlled Clinical Trials. Journal of the American Statistical Association, 73(363), 559–563. doi:10.1080/01621459.1978.10480054
Berger, V.W., Ivanova, A., & Deloria-Knoll, M. (2003). Minimizing predictability while retaining balance through the use of less restrictive randomization procedures. Statistics in Medicine, 22(19), 3017–3028. doi:10.1002/sim.1538