Quentin F. Stout Research

Research Projects and Publications

Many projects are listed below, some from the past and some I'm pursuing now in group sizes varying from one to a dozen. Collaboration is welcomed, or you might suggest new projects.

Research projects for students: you should read this if you are interested in working with me.

Research areas are organized as follows:

Parallel computing (both real supercomputers and theoretical models)
Adaptive sampling for statistical and learning applications
Scientific computing and computational science
Algorithms, data structures, theoretical computer science
Discrete Mathematics: Graph theory, combinatorics, coding
Operator theory, analysis

Here is a listing of my books, papers, etc., organized by research area, and here is a standard academic vita.

Parallel Computing

Our work in parallel computing has involved developing efficient algorithms for large problems in computational science and engineering, and more theoretical algorithms for a variety of problems and models. Other work has involved understanding and improving the performance of parallel systems, and proposing new parallel architectures.

Most of the work on using large machines involves the research mentioned in scientific computing and computational science and some of that described in adaptive sampling. Research involving more theoretical algorithms includes algorithms for graph theory, geometry, image processing, sorting, routing, and message-passing. Russ Miller and I wrote Parallel Algorithms for Regular Architectures: Meshes and Pyramids which has many algorithms for these machines.

Some of the papers we've written for various machine models are

Mesh connected: geometry, graph theory, geometric properties of images.
Hypercube: convex hulls, optimal communication for large messages, images distributing resources within the hypercube
Pyramid: graphs and images, geometric properties of images. Russ Miller won a university Best Thesis award for his work in this area.
Reconfigurable meshes, and meshes with buses: Buses are broadcast mechanisms that send messages to all processors that are connected to the bus. They might have one bus for the entire system, or several. Reconfigurable meshes have buses that are generated dynamically as the computation proceeds. For a few problems they are more powerful than a PRAM.
PRAM (the most common theoretical model of parallelism): sublogarithmic expected time algorithms for Hamiltonian cycles and other problems on random graphs and for Voronoi diagrams and other problems on random points. Phil MacKenzie won a university Best Thesis award for this work. Note that the answers are found faster than a PRAM can count.

For fundamental problems such as sorting, finding a minimal spanning tree, etc., all of the above models are faster than quantum computers. Few people are aware that while quantum is extremely good for factoring, for many other problems it is slower than classical mechanics.

Some projects I'm involved with, or would like to start:

Energy-aware parallel algorithms and architectures: this has become increasingly important in areas ranging from supercomputers to hand-held devices. I'm interested in creating models and algorithms for these areas, and new ones such as fine-grained parallel computers. For example, here is a rat-based model of minimizing the peak energy used. Patrick Poon looked at energy-aware algorithms where optical interconnections on a chip were used to optimize time/power tradeoffs.
Algorithmic limits of classical physics: This is closely related to the work on energy. Matrix multiplication seems 2-dimensional, and it is fairly easy to use an nxn mesh to multiply nxn arrays in Θ(n) time. What happens if you embed the matrices in a 3-dimensional grid? You can think of this as a model of classical physics. It's known that you can multiply the matrices faster by spreading the entries out and using more processors (something that cannot help in 2-d), but the best algorithms aren't known.
Utilization of parallel computers: simple measurements show that sometimes individual users repeatedly run very inefficient parallel codes, which can be improved with very simple changes. I've also shown that the overall system utilization is often much worse than it could be, and again can be improved with reasonable changes. My goal is to develop simple tools that can help get a view of what the utilization is, and to emphasize simple changes that result in significant improvements.
Modeling and using cooperating agents: In some cases this just involves modeling the behavior of parallel computers, such as the degradation of MPI collective communication performance due to operating system interrupts. Ted Tabe first noticed this, and it lead to Julia Lipman's thesis on a mathematical model of the rate at which agents complete tasks when they are randomly interrupted, where each does part of the task and then passes it on. Reiko Tanese created the island model of the genetic algorithm, and I'm interested in other models of agents cooperating.

Some older work concerned the use of clerks to solve some long-standing open problems. Clerks work by having many cellular automata group together to function as a single more powerful processor, with these more powerful processors themselves forming a parallel computer.

The parallel computing topics I'm interested often change, and, as I said, I'm interested in collaboration and new problems. I'm particularly interested in problems where being able to exploit a very large number of processors helps one solve problems that otherwise are impossible to solve in a useful amount of time.

Here is a somewhat whimsical explanation of parallel computing.

Additional publications

Adaptive Sampling, Learning, Statistics, Probability

In standard clinical trials, one may allocate equal numbers of patients to one of two treatments, and then after all of the outcomes have been observed make a statistical decision as to which is the better treatment. With an adaptive design, however, one may modify the allocation as the outcomes are being observed, so that more patients in the trial can be allocated to the better treatment. This can simultaneously allow one to achieve a desired statistical goal (such as determining which is the better treatment) while also achieving ethical or cost goals (such as reducing patient suffering during the trial, or reducing costs in an industrial setting).

Adaptive designs have many natural uses in resource allocation problems or choosing parameter settings for computer simulations, since the computer can both collect the information and run a program to decide what to do next. They can be viewed as a dynamic learning process, and there are numerous ties to optimization.

Here is a more extensive discussion of adaptive sampling and bandit problems.

Additional publications

Scientific Computing, Computational Science

Here the emphasis is on developing efficient algorithms and data structures that are motivated by needs in computational science, and on ways of putting large scientific codes together. Most of my work in this area started with a scientist coming to me requesting help on a research project. I've worked with social scientists, environmental engineers, space scientists, statisticians, etc. Some of the scientists are internationally famous, while others are graduate students just starting to explore a topic. I married one of these scientists - you never know where exciting research will lead.

For adaptive sampling I'm involved in both the science and the computing, while within the Center for Space Environment Modeling (CSEM) and Center For Radiative Shock Hydrodynamics (CRASH) I concentrate on the computer science aspects. CRASH is a Dept. of Energy Center of Excellence in Predictive Science, where we are focusing not just on predicting the results of an experiment, but also quantifying the uncertainty of our predictions, and why we think we know the uncertainty. ``Uncertainty quantification'' is a research area which is attracting a lot of attention.

Because several of these applications involve large problems, much of our work in this area involves parallel computing, including some of the largest supercomputers in the world, though some involves serial algorithms. Projects include

Adaptive blocks: a scalable parallel data structure used for adaptive mesh refinement. Bob Oehmke developed a spherical version that is being used in atmospheric and space environment modeling codes, while a cartesian version is being used for the major codes in CSEM and CRASH.
Software frameworks: we created the Space Weather Modeling Framework for merging large scientific codes running on parallel machines into an efficient complex simulation of the Sun to Earth space environment. This is being used by NASA and the Air Force Space Command for space weather prediction. I'm also on the executive committee of the Earth Systems Modeling Framework, which is being used by several groups, including the daily national weather predictions by NOAA.
Load balancing: Andy Poe and I developed a new approach for balancing geometrically based problems where there are two aspects that need to be balanced. This improved the efficiency of a parallel program for car crash simulation. The algorithm is based on the Ham Sandwich theorem from algebraic topology. We've also used space-filling curves for geometric data.
Efficient parallel algorithms using dynamic programming: for optimizing and evaluating modeling some of the complex situations we encounter in the adaptive sampling work. For example, we were able to optimize a clinical trial with 200 patients and three treatment alternatives. This had previously been considered to be infeasible. This paper, with Bob Oehmke and Janis Hardwick, was finalist for best paper at the Supercomputing conference. A related research question is how to automatically turn recursive equations into efficient serial and parallel programs. Denny Vandenberg has worked on this.

Additional publications in modeling aspects of space and Earth and more general scientific computing

Algorithms, Data Structures, Theoretical Computer Science

Most of the work here concentrates on trying to develop optimal algorithms, optimal at least as measured by O-notation. The majority of my work on algorithms has been in parallel computing, adaptive allocation, or graph theory, but there are several other topics that my students or collaborators and I have looked at. Some of the work we did involves

Balancing binary search trees, requiring only linear time and constant space. This was with Bette Warren.
Computing the Busy Beaver function, with Rona Kopp. We computed the largest value known. It can be proven that there is a natural number such that the busy beaver cannot be computed for that number or any larger one, but no one knows what that number is.
Sublogarithmic algorithms for PRAMs, solving problems in geometry and graph theory, and proving lower bounds for them. One interesting aspect is that we were able to solve some problems faster than a PRAM can count. Phil MacKenzie won the university Best Thesis award for this work.
Reiko Tanese and Ted Belding studied an ``island'' model of genetic algorithms where there are evolving subpopulations, with occasional migration between islands. This is better than classical GA approaches. This started as a term project in my parallel computing class, and then we noticed that the behavior wasn't what we expected. Trying to figure out what was happening led to Reiko's thesis.

Some projects I'm currently involved with, or would like to start (in addition to the topics listed in parallel computing)

Algorithms for random data: Suppose you know that points are chosen from some 2-dimensional distribution, but you don't know what the distribution is. How fast (in expected time) can you find the convex hull of a set of n points, for either serial or parallel computers? Phil MacKenzie and I worked on algorithms for PRAMs when the distribution was known, and Doug Van Wieren worked on related algorithms for a reconfigurable mesh.
Isotonic and unimodal regression algorithms, and more general shape-constrained nonparametric regression. E.g., given data, you might want to predict weight as a function of height and shirt size. Suppose you are only willing to assume that the weight increases with height, and with shirt size, so you want to find the best regression that has no restrictions other than these. Standard linear regression makes no sense on ordered sets such as shirt size.

Additional publications in serial algorithms and data structures and other topics in theoretical computer science.

Discrete Mathematics: Graph Theory, Combinatorics, Coding

Many questions about resource allocation, fault tolerance, routing, scheduling, synchronization, etc., for distributed memory machines can be modeled via graphs, and one can exploit the graph structure to find solutions. We have particularly emphasized hypercube, mesh, and torus models, since many parallel computers are of this form. Example of this work are a rather comprehensive paper on the subcube fault tolerance of the hypercube and one on embedding graphs into hypercubes.

One set of graph theory questions Marilynn Livingston and I pursued concerns the determination of various properties of families such as k x n meshes, with k fixed. We have shown that values such as domination numbers, packing numbers, number of code words, number of tilings, etc., can be computationally determined in closed form, using dynamic programming and properties of finite state automata. This was a rather dramatic improvement upon earlier results. There are several interesting open questions and conjectures related to this.

The work with Julia Lipman mentioned above, concerning repeated synchronization of multiple agents, is also research in discrete mathematics.

Related publications

Operator Theory, Analysis

I don't work in this area any more. Most of the work in operator theory concerned Schur multiplication of matrices, where the Schur product is formed by multiplying matrices termwise (much as addition is performed). For infinite dimensional Hilbert spaces or l_p spaces, this forms an interesting Banach algebra, which has a many ties to normal multiplication. Investigating the Schur product led to questions about the essential numerical range and majorization of diagonal entries. Slightly related results concerned determining the numerical range of finite dimensional weighted shift operators.

One paper analyzed conditionally convergent series and their rearrangements. There is a natural duality between sets of conditionally convergent series and sets of permutations which leave their sums unchanged, and this duality has some unusual properties.