Why Urn Randomization?
======================

Complete randomization, flipping a fair coin for each participant, is the simplest allocation method, but it provides no guarantee of treatment balance, especially in small-to-moderate trials. Urn randomization (Wei, 1978) adaptively adjusts allocation probabilities to maintain balance while preserving the unpredictability needed to prevent selection bias.

This page summarizes a Monte Carlo simulation comparing four randomization strategies across 1,000 independent trials of 2,500 participants each, with three treatment arms.

.. contents:: On this page
   :local:
   :depth: 2

Strategies Compared
-------------------

.. list-table::
   :header-rows: 1
   :widths: 30 70

   * - Strategy
     - Description
   * - **Complete Randomization**
     - Each participant is assigned to a treatment arm with equal probability (1/k), independent of prior assignments.
   * - **Urn (β=1, D=χ²)**
     - Wei's urn model with β=1 ball added for each non-assigned treatment after each draw. The imbalance metric D is the χ² statistic across arms.
   * - **Urn (β=1, D=range)**
     - Same as above, but the imbalance metric D is the range (max − min) of arm counts.
   * - **Urn (β=2, D=χ²)**
     - Stronger adaptive correction: β=2 balls added for each non-assigned treatment, making the urn pull more aggressively toward balance.

In all urn strategies, α=0 (no extra balls for the assigned treatment) and w=1 (one initial ball per treatment).

How the Simulation Works
------------------------

For each of the 1,000 trials:

1. Initialize an urn with ``w`` balls per treatment arm.
2. For each of the 2,500 participants, draw from the urn to assign a treatment.
3. After each assignment, update the urn: add ``α`` balls for the assigned arm and ``β`` balls for each of the other arms.
4. After every assignment, record the maximum proportional imbalance across arms:

.. math::

   d = \frac{\max_k n_k - \min_k n_k}{\sum_k n_k}

where :math:`n_k` is the count assigned to arm *k*.

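
The steps above can be sketched in Python roughly as follows. This is a minimal illustration using only the standard library, not the code behind the reported results. The function names ``simulate_trial`` and ``tail_fraction`` are hypothetical, and the imbalance metric D from the strategy table is not modeled, since how it enters the draw is not described here.

.. code-block:: python

   import random


   def simulate_trial(n_participants, n_arms=3, w=1, alpha=0, beta=1, rng=None):
       """Simulate one trial and return d after each assignment.

       With alpha=0 and beta=0 the urn never changes, so every draw is
       equiprobable; that case reduces to complete randomization.
       """
       rng = rng or random.Random()
       balls = [w] * n_arms    # urn contents, one entry per arm
       counts = [0] * n_arms   # participants assigned to each arm so far
       d_path = []
       for _ in range(n_participants):
           # Draw an arm with probability proportional to its ball count.
           arm = rng.choices(range(n_arms), weights=balls)[0]
           counts[arm] += 1
           # Urn update: alpha balls for the assigned arm, beta for each other arm.
           for k in range(n_arms):
               balls[k] += alpha if k == arm else beta
           # d = (max_k n_k - min_k n_k) / sum_k n_k
           d_path.append((max(counts) - min(counts)) / sum(counts))
       return d_path


   def tail_fraction(n_trials, n_participants, threshold, seed=0, **urn_kwargs):
       """Fraction of simulated trials whose final d is at least threshold."""
       rng = random.Random(seed)
       hits = 0
       for _ in range(n_trials):
           d_final = simulate_trial(n_participants, rng=rng, **urn_kwargs)[-1]
           hits += d_final >= threshold
       return hits / n_trials

Under these assumptions, ``tail_fraction(1000, 100, 0.10, beta=1)`` estimates the share of trials with d ≥ 10% at n = 100, and passing ``beta=0`` gives a complete-randomization baseline. Because D is left out, the estimates from this sketch need not match the figures reported on this page.
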
The simulation tracks this imbalance metric ``d`` at every enrollment step across all 1,000 trials to compute means, confidence intervals, and tail probabilities.

Results
-------

Treatment Imbalance Over Enrollment
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The mean maximum proportional imbalance ``d`` decreases as enrollment grows. Urn strategies drive imbalance down faster than complete randomization. At small sample sizes (n < 50), complete randomization frequently produces imbalances above 20%, while urn methods keep imbalance below 10% on average. The gap is largest in the critical early phase of a trial, when interim analyses are most sensitive to allocation imbalance.

Increasing β from 1 to 2 produces even tighter balance, at the cost of slightly more predictable allocation sequences.

Tail Probabilities
^^^^^^^^^^^^^^^^^^

The fraction of trials in which the maximum proportional imbalance ``d`` reaches or exceeds a given threshold provides a practical measure of risk:

.. list-table:: Fraction of trials with d ≥ 10% (at n = 100 participants)
   :header-rows: 1
   :widths: 40 30

   * - Strategy
     - Trials with d ≥ 10%
   * - Complete Randomization
     - ~47%
   * - Urn (β=1, D=χ²)
     - < 1%
   * - Urn (β=1, D=range)
     - < 1%
   * - Urn (β=2, D=χ²)
     - < 0.1%

At 100 participants, nearly half of completely randomized trials have an imbalance of at least 10%, while urn randomization virtually eliminates this risk.

Choice of Imbalance Metric (D)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The two D metrics, χ² and range, produce similar results for two-arm trials. With three or more arms, χ² is more sensitive to multiway imbalance because it considers all arms simultaneously, while range only looks at the gap between the largest and smallest arms.

Key Takeaways
-------------

1. **Urn randomization reduces imbalance** compared to complete randomization at every sample size, with the largest benefit at small n.
2. **No block sizes need to be pre-specified.** Unlike permuted block randomization, the urn adjusts continuously without fixed block lengths.
3. **Higher β values** produce tighter balance but increase the predictability of the next assignment (Berger et al., 2003). β=1 is a standard default that balances these concerns.
4. **Stratification multiplies the benefit.** Applying urn randomization within each stratum (factor-level combination) maintains balance both overall and within subgroups defined by prognostic factors.

References
----------

- Wei, L.J. (1978). The Adaptive Biased Coin Design for Sequential Experiments. *Annals of Statistics*, 6(1), 92–100. doi:10.1214/aos/1176344068
- Wei, L.J. (1978). An Application of an Urn Model to the Design of Sequential Controlled Clinical Trials. *Journal of the American Statistical Association*, 73(363), 559–563. doi:10.1080/01621459.1978.10480054
- Berger, V.W., Ivanova, A., & Deloria-Knoll, M. (2003). Minimizing predictability while retaining balance through the use of less restrictive randomization procedures. *Statistics in Medicine*, 22(19), 3017–3028. doi:10.1002/sim.1538