Design and Analysis of Adaptive Experiments
With Special Focus on Clinical Trials

Janis Hardwick   and   Quentin F. Stout

 

This page describes some of our work helping researchers produce adaptive designs that are useful in a range of applications. Toward this end, we emphasize very flexible approaches that can accommodate a variety of cost and risk structures and statistical criteria. Our work applies widely, though we have emphasized its use in clinical trials and have collaborated with researchers in the pharmaceutical industry.

We work on adaptive sampling procedures because of their power and efficiency. Classical "fixed" sampling procedures, in which all decisions regarding an experiment are made before any data are observed, are typically suboptimal, since what is learned can only be acted upon at the conclusion of the experiment. Sequential or adaptive procedures, on the other hand, allow adjustments to the design as it is being carried out. In this way, adaptive procedures can make more efficient use of resources without diminishing the statistical power of the procedure. Such designs exploit the inherently sequential nature of many real-life processes, and offer great flexibility.

A major disadvantage of adaptive procedures is that the sampling distributions of common statistics are affected by the fluidity of the design and generally cannot be described analytically. While there are asymptotic analyses of a few procedures, they rarely give accurate indications of what happens at practical sample sizes. Computer programs overcome this difficulty, and can design and analyze procedures for far more complicated scenarios, ones for which asymptotic results would be quite difficult to obtain. Some problems, however, impose significant computational challenges, and overcoming these has been an important aspect of our work.
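To give the flavor of this kind of exact computation, here is a minimal Python sketch (the greedy allocation rule and the parameter values are purely illustrative, not one of our designs): it computes the exact expected number of failures of a simple adaptive rule by forward recursion over the state space of success/failure counts, rather than by simulation or asymptotics.

def expected_failures(n, p1, p2):
    """Exact expected failures in an n-subject two-arm trial when each
    subject is given the arm with the higher empirical success rate
    (arm 1 on ties), given true success probabilities p1 and p2.
    Illustrative rule only."""
    # state: (successes1, failures1, successes2, failures2) -> probability
    states = {(0, 0, 0, 0): 1.0}
    for _ in range(n):
        nxt = {}
        for (s1, f1, s2, f2), pr in states.items():
            r1 = s1 / (s1 + f1) if s1 + f1 else 0.5   # empirical rates,
            r2 = s2 / (s2 + f2) if s2 + f2 else 0.5   # 0.5 when unsampled
            if r1 >= r2:                              # allocate to arm 1
                succ, fail, p = (s1 + 1, f1, s2, f2), (s1, f1 + 1, s2, f2), p1
            else:                                     # allocate to arm 2
                succ, fail, p = (s1, f1, s2 + 1, f2), (s1, f1, s2, f2 + 1), p2
            nxt[succ] = nxt.get(succ, 0.0) + pr * p
            nxt[fail] = nxt.get(fail, 0.0) + pr * (1 - p)
        states = nxt
    return sum(pr * (f1 + f2) for (s1, f1, s2, f2), pr in states.items())

print(expected_failures(50, 0.7, 0.4))

Even this toy rule has no closed-form operating characteristics; the recursion makes the state-space growth explicit, and it is exactly this growth that creates the computational challenges mentioned above.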

By analysis of an experimental procedure we mean determining its properties, such as expected sample size, mean squared error, or expected number of failures. It can include robustness studies to examine sensitivity to design assumptions, and studies to determine operating characteristics. Design means the creation of a procedure, i.e., a rule for deciding what to do at each step of the experiment. Typically there is an objective function, such as the probability that the better treatment will be identified, and the goal of the design is to optimize this. Often we have been able to find the optimal design, and then analyze other designs that have been suggested to determine how close they are to optimal. Some suboptimal designs have desirable properties such as simplicity, but until an optimal design has been found, one doesn't know how close to optimal the simpler design is.

"Arm" will be used to indicate one of the populations that may be sampled, such as treatments in a clinical trial or types of components in a reliability study. This terminology comes from the analogy to the bandit problem, where the arms represent different slot machines and the goal is to maximize profit. To achieve this goal one must do some exploration, collecting enough evidence to make informed decisions about which arm is best, while also exploiting the apparently better arm to accumulate profit. This corresponds to design goals such as treating each patient in a study as well as possible. Our work has focused on Bernoulli arms, though it can be extended to more general outcomes.
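As a toy illustration of this exploration/exploitation tradeoff, the sketch below compares fixed equal allocation with Thompson sampling, one well-known adaptive heuristic (not one of the optimal designs discussed on this page), on two Bernoulli arms; all numbers are illustrative.

import random

def run_trial(n, p, adaptive, rng):
    """Total failures in an n-subject trial on two Bernoulli arms with
    true success probabilities p[0], p[1]."""
    s, f = [0, 0], [0, 0]                   # successes/failures per arm
    for t in range(n):
        if adaptive:
            # Thompson sampling: draw from each arm's Beta(1+s, 1+f)
            # posterior and allocate to the arm with the larger draw
            draws = [rng.betavariate(1 + s[a], 1 + f[a]) for a in (0, 1)]
            arm = 0 if draws[0] >= draws[1] else 1
        else:
            arm = t % 2                     # alternate: fixed equal allocation
        if rng.random() < p[arm]:
            s[arm] += 1
        else:
            f[arm] += 1
    return f[0] + f[1]

rng = random.Random(1)
p, reps = (0.7, 0.4), 2000
for adaptive in (False, True):
    avg = sum(run_trial(100, p, adaptive, rng) for _ in range(reps)) / reps
    print("adaptive" if adaptive else "fixed   ", round(avg, 1))

The adaptive rule incurs noticeably fewer failures on average, at the cost of the analytically intractable sampling distributions discussed above.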

We work on a variety of problems, especially controlled clinical trials. Important classes include:

Two-arm trials
We have studied these extensively, with goals such as maximizing successes. Sometimes we look at tradeoffs, such as expected failures vs. the probability of selecting the better arm, as discussed in this presentation.
Dose-response trials
Phase I or phase I/II clinical trials often have the goals of quickly finding the optimal dosage and having most of the sampling occur at or near this dosage. There may be competing failure modes with complete or incomplete information about the failure mode.
Screening trials
These are a form of one-arm trial, deciding whether to reject the current candidate being tested or pass it on to the next stage. In acceptance/rejection testing the next stage might be to accept and use the shipment of parts being tested, while in preclinical or clinical studies it typically means continuing on to a larger, far more expensive, testing phase. Here is a presentation on our use of the flexible staged designs and the cost- and constraint-based approach described below.
There are often options as to how a trial is conducted. For example, it may be fully sequential, in which case decisions are made one observation at a time based on all of the information to date, or staged, in which case one decides how to sample the next batch of observations, i.e., how many to sample on each arm. Often one must take into account additional factors such as delayed responses or missing observations.
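As a hedged illustration of staged sampling, the sketch below splits the next batch between two arms in proportion to the posterior probability that each arm is better, estimated by Monte Carlo from Beta posteriors; this heuristic rule is for illustration only, not one of our optimal staged designs.

import random

def next_stage_allocation(m, s, f, draws=10000, seed=0):
    """Split the next stage of m observations between two arms in
    proportion to the estimated posterior probability, under Beta(1+s,1+f)
    posteriors, that arm 1 is the better arm. Illustrative rule only."""
    rng = random.Random(seed)
    wins = sum(
        rng.betavariate(1 + s[0], 1 + f[0]) > rng.betavariate(1 + s[1], 1 + f[1])
        for _ in range(draws)
    )
    n1 = round(m * wins / draws)            # observations for arm 1
    return n1, m - n1

# arm 1 has done better so far, so it receives most of the next batch
print(next_stage_allocation(20, s=[7, 3], f=[3, 7]))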

For several problems we have developed somewhat new goal and design options. These include

Flexible staged designs
Most statisticians fix the size of each stage in advance, and use the data collected so far only to determine the proportion of observations allocated to each arm. We also allow the size of each stage to vary, and allow the experiment to terminate before the maximum sample size or maximum number of stages is reached. Here is a presentation showing that this flexibility can yield significant performance improvements.
Efficiency measures for decisions and sampling
In dose-response settings the goal is often to maximize the probability of selecting the optimal dose at the end of the experiment. However, this treats all suboptimal doses equally, ignoring the fact that some may have success rates far below the optimum while others may be quite close. One can instead use efficiency measures for the objective function, trying to maximize the expected success rate of the dose selected. For example, if the best dose has success rate 0.60, then selecting a dose with rate 0.55 is about 92% efficient, rather than simply being counted as "wrong".
Cost- and constraint-based design and optimization
Rather than fixing many aspects in advance, such as the number of stages, or the α and β parameters in a hypothesis-testing setting, one can instead model a problem in terms of the cost of each sample, the cost of starting a new stage, the (probably nonlinear) costs of false positives and false negatives, an upper bound on the number of stages (and hence time) allowed, etc. This allows companies and researchers to model more accurately the complex setting of which the trial is a part. Given these costs and constraints we can determine the optimal design and evaluate arbitrary designs. This is also called a decision-theoretic approach; a small sketch of such a cost model follows.
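The following sketch conveys the flavor of such a cost-based evaluation; the particular cost structure (a per-observation cost, a fixed cost per stage, nonlinear losses for false positives and false negatives) and all numbers are assumptions made for illustration, and real models are richer.

def total_cost(n_samples, n_stages, p_false_pos, p_false_neg,
               sample_cost=1.0, stage_cost=25.0,
               fp_loss=lambda p: 5000 * p**2,    # illustrative nonlinear loss
               fn_loss=lambda p: 2000 * p):      # illustrative linear loss
    """Expected total cost of a design with the given operating
    characteristics, under the assumed cost structure."""
    return (n_samples * sample_cost
            + n_stages * stage_cost
            + fp_loss(p_false_pos)
            + fn_loss(p_false_neg))

# Comparing two hypothetical designs; optimization would search the design
# space (stage sizes, allocation, stopping rules) to minimize this quantity.
print(total_cost(n_samples=120, n_stages=3, p_false_pos=0.05, p_false_neg=0.10))
print(total_cost(n_samples=80,  n_stages=5, p_false_pos=0.08, p_false_neg=0.12))

Under such a model the error rates, sample size, and number of stages are outputs of the optimization, traded off against one another, rather than constraints fixed in advance.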

Our work relies on extensive development of new algorithms and of high-performance programs implementing them. Using these, we have been able to obtain exact evaluations and optimizations for many of the problems listed above. For example, we can analyze 2-arm fully sequential designs with hundreds of observations, and flexible staged designs with multiple stages and more than a hundred observations. By using serial and parallel computers we have been able to solve problems involving fully sequential 3-arm designs, and 2-arm problems with delayed responses or missing outcomes. Some of these problems had previously been declared infeasible by other researchers.

No matter how a design is created, it can be analyzed by either frequentist or Bayesian criteria. We create optimal designs using a Bayesian approach, partly because it provides a framework within which dynamic programming can be used for optimization. By using weak priors the design quickly adapts to the data obtained, making it quite robust and giving it desirable frequentist properties as well.
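As a small illustration of Bayesian optimization by dynamic programming, the sketch below uses backward induction to compute the maximal expected number of successes in a two-arm Bernoulli trial; the horizon, the uniform Beta(1,1) priors, and the objective are all illustrative choices, not a specific design from our papers.

from functools import lru_cache

N = 30  # horizon; exact dynamic programming is easy at this size

@lru_cache(maxsize=None)
def value(s1, f1, s2, f2):
    """Maximal expected number of future successes from the state with
    the given success/failure counts on each arm."""
    if s1 + f1 + s2 + f2 == N:
        return 0.0
    # posterior predictive success probabilities under Beta(1,1) priors
    p1 = (s1 + 1) / (s1 + f1 + 2)
    p2 = (s2 + 1) / (s2 + f2 + 2)
    v1 = p1 * (1 + value(s1 + 1, f1, s2, f2)) \
        + (1 - p1) * value(s1, f1 + 1, s2, f2)   # sample arm 1
    v2 = p2 * (1 + value(s1, f1, s2 + 1, f2)) \
        + (1 - p2) * value(s1, f1, s2, f2 + 1)   # sample arm 2
    return max(v1, v2)

print(value(0, 0, 0, 0))    # optimal expected successes out of N

The optimal design is read off as the arm attaining the maximum in each state. Note that the state space grows like N^4, which is one reason fully sequential 3-arm problems, or 2-arm problems with delayed responses, demand the algorithmic and parallel-computing work described above.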

In contrast, trying to optimize a frequentist design directly is often infeasible. For example, for a classical hypothesis-testing screening trial with specified error bounds, we can use a Bayesian approach to create designs with the optimal expected sample size under either the null or the alternative hypothesis. These designs use flexible stages and often have sample sizes significantly smaller than those used in practice. For some of these problems, trying to find the optimal sample size by the usual frequentist techniques would take longer than the age of the universe.

 

Further Information

Here is a searchable list of our papers and an introduction to the subject of adaptive designs. For further information, or for consulting help in this area, please contact us:

Janis Hardwick
University of Michigan, Ann Arbor, MI 48109-2121 USA
janishardwick @ gmail · com

Quentin F. Stout
Computer Science, University of Michigan, Ann Arbor, MI 48109-2121 USA
+1-734-763-1518      +1-734-763-8094 (fax)
qstout @ umich · edu      www.eecs.umich.edu/~qstout/


Copyright © 2006-2017 Quentin F. Stout.