Bandit Strategies for Ethical Sequential Allocation

Janis Hardwick     Quentin F. Stout
University of Michigan

 

Abstract: The problem of allocating patients in a two-treatment clinical trial with dichotomous responses is considered. The goal of the trial is to determine the better treatment while incurring as few patient losses as possible. Several sampling procedures are compared, including equal allocation, which maximizes power, and the uniform bandit, which minimizes expected failures. A modified bandit strategy is found to perform well on both criteria: it achieves nearly optimal power while keeping expected trial failures nearly minimal. The modified bandit model is based on an approximation to the Gittins index. Bandits form an important class of models for adaptive sampling problems, and this approach can be used in many settings other than clinical trials, achieving a good compromise between two objectives.

The rules are also evaluated according to their computational complexity. By using an approximation to the Gittins index, rather than the true value, the computational complexity of the modified bandit is significantly reduced.
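As a rough illustration of the two extremes named in the abstract, the sketch below computes by backward induction the minimum expected number of failures among n patients when both success probabilities have independent uniform priors, and compares it with equal allocation, whose expected failure count under those priors is n/2. It assumes that "uniform bandit" refers to uniform priors with an expected-failures objective. The function names are hypothetical, and this is not the modified bandit or the Gittins index approximation studied in the paper.

```python
from functools import lru_cache


def min_expected_failures(n):
    """Minimum expected failures over n patients, two arms, uniform priors.

    Assumption: each arm's success probability has an independent
    Beta(1, 1) prior and the objective is expected failures in the trial.
    """

    @lru_cache(maxsize=None)
    def value(s1, f1, s2, f2):
        # State: successes and failures observed so far on each arm.
        if s1 + f1 + s2 + f2 == n:
            return 0.0
        # Posterior mean success probability of each arm.
        p1 = (s1 + 1) / (s1 + f1 + 2)
        p2 = (s2 + 1) / (s2 + f2 + 2)
        # Expected failures from here on if the next patient gets arm 1 or arm 2.
        arm1 = p1 * value(s1 + 1, f1, s2, f2) + (1 - p1) * (1 + value(s1, f1 + 1, s2, f2))
        arm2 = p2 * value(s1, f1, s2 + 1, f2) + (1 - p2) * (1 + value(s1, f1, s2, f2 + 1))
        return min(arm1, arm2)

    return value(0, 0, 0, 0)


def equal_allocation_expected_failures(n):
    """Under uniform priors every patient fails with prior probability 1/2."""
    return n / 2


if __name__ == "__main__":
    for n in (2, 10, 20):
        print(n, min_expected_failures(n), equal_allocation_expected_failures(n))
```

The number of states grows on the order of n^4, which gives a sense of why an index-based rule that avoids the full dynamic program is attractive computationally.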

Keywords: medical ethics, controlled clinical trial, statistical computing, sequential allocation, adaptive sampling procedure, bandit problem, design and analysis of experiments, Gittins index, power, probability of correct selection, indifference region, computational learning theory

Complete paper. This paper appears in Computing Science and Statistics 23 (1991), pp. 421-424.

 



Copyright © 2005-2009 Quentin F. Stout