Research Projects and Publications
Many projects are listed below, and they are being pursued with varying
levels of activity and group sizes varying from zero to a dozen.
Collaboration is welcomed, or you might suggest new projects.
For example, while most of my graduate students work on projects I suggest,
others have worked on projects that they suggested, sometimes quite far
from my usual interests.
Most of the current research of myself and my students is in computer science,
adaptive sampling, or computational science, and often these areas overlap.
Research areas are organized as follows:
Here is a listing of my
books, papers, etc.,
organized by research area, and here is a standard
academic vita.
My work in parallel computing has involved developing efficient
algorithms for large problems in computational science and engineering,
and more theoretical algorithms for a variety of problems and
architectures.
Other work has involved evaluating the performance of parallel machines
(especially communication and synchronization aspects, on both real and
abstract systems), and proposing new
parallel architectures.
Most of the work on using large machines involves the research
mentioned in scientific
computing and computational science and some of that described
in adaptive sampling.
Research involving more theoretical algorithms includes
load balancing, graph theory, geometry, image processing, databases, sorting,
routing, and message-passing.
Russ Miller and I wrote
Parallel Algorithms for Regular Architectures: Meshes and Pyramids,
MIT Press, which incorporates many of the previously published algorithms
for these machines, and also includes algorithms that
have never appeared previously. A sequel volume, with an emphasis on
hypercube computers, also has both old and new algorithms, but it wasn't
published. If more people buy the first volume then MIT might
publish the second volume.
Some of the papers we've written for various machine models are
Some additional projects I'm involved with, or would like to start, are:
- Power-aware parallel algorithms and architectures:
this has become increasingly important in areas such
as sensor arrays, large parallel computers, and hand-held devices.
I'm interested in creating models and algorithms for these areas,
and new ones such as fine-grained
parallel computers. For example, sometimes it is useful to use
a model based on rats scurrying around on a surface, doing calculations
and leaving information at the sites they visit, working on problems
such as deciding if a maze can be solved.
This model is a useful way to think about
minimizing the peak energy used.
Another way to approach these concerns for high performance computing is
to use wafer scale integration. This goal failed 20 years
ago, but there may be new ways to view the problem.
- Utilization of parallel computers:
measuring and improving their performance.
Simple measurements show that
sometimes individual users repeatedly run very inefficient parallel
codes, which can be improved with very simple changes. I've also shown
that the overall system utilization is often much worse than it could
be, and again can be improved with reasonable changes. My goal is
to develop simple tools that can help get a view of what the
utilization is, and to emphasize simple changes that result in
significant improvements.
- Modeling the performance degradation due to random aspects of
computer systems:
An example is the degradation of MPI collective communication
performance due to
operating system
interrupts, which was examined by Ted Tabe.
Along with this modeling, of course,
comes suggestions for how to improve performance.
Related work concerns the effects of
transient load imbalances. Our
work compares the
local-synchronization
used in message-passing versus
the global-synchronization typically used in shared-memory systems.
The local-synch permits a relativistic view of computation and causality,
while the global-synch essentially keeps time globally consistent.
There are very nice mathematical models of the problems involved here,
and they lead to some interesting new questions.
Julia Lipman is working on this aspect.
- Cost/benefit tradefoffs for load-balancing:
Creating models, doing timing runs to determine parameters, and
then tuning for optimal performance. In many situations, one can
spend larger amounts of time on load-balancing to improve run-time, but
eventially the time spent on balancing exceeds the improvement in
run time. Thus one needs to determine how
long to spend on load-balancing. There are also interesting research
questions involving tradeoffs between computational balance vs.
communication time.
- Reconfigurable architectures:
developing new models, such as the reconfigurable mesh
with row and column buses (which was essentially implemented in the
MasPar series), and creating algorithms for them.
Here is an article that illustrates the
power of the reconfigurable mesh.
- Self-organizing structures:
some older work concerned the use of clerks to
solve some long-standing open problems.
Clerks work by having many cellular automata group together
to function as a single more powerful processor, with these
more powerful processors themselves forming a parallel computer.
Some papers using this approach are
Using Clerks
in Parallel Processing and Topological
Matching.
The parallel computing topics I'm interested often change, and, as I said,
I'm interested in collaboration and new problems.
Here is a somewhat whimsical explanation of
parallel computing.
Adaptive sampling designs are ones where the accruing data from experiment
(i.e., the observations) is used to make decisions affecting the experiment,
as opposed to the standard fixed statistical designs where all decisions
have been made in advance of any experiments.
For example, in standard fixed designs for clinical trials, one may
allocate equal numbers of patients to one of two treatments, and then
after all of the outcomes have been observed make a statistical decision
as to which is the better treatment. With an adaptive design, however,
one may modify the allocation as the outcomes are being observed, so
that more patients in the trial can be allocated to the better treatment.
This can simultaneously allow one to achieve a desired statistical goal
(such as determining which is the better treatment) while also achieving
ethical or cost goals (such as reducing patient suffering during the trial,
or reducing costs in an industrial setting).
Adaptive designs have many natural uses in computer-related areas
such as resource allocation problems or choosing parameter settings
for computer simulations, since the computer can both collect the information
and run a program to decide what to do next.
Here the emphasis is on developing efficient
algorithms and data structures that are motivated by needs
in computational science, and on ways of putting large scientific codes
together.
For adaptive sampling I'm involved in both the
science and the computing, while within the
Center for Space Environment Modeling (CSEM)
I concentrate on the computer science aspects.
Most of my work in this area started with a scientist coming to me requesting
help on a research project. I've worked with social scientists,
environmental engineers, space scientists, statisticians, etc.
Because several of these applications involve large problems, much
of our work in this area
involves parallel computing, though some
involves serial algorithms.
Projects include
- Adaptive blocks: a scalable parallel
data structure used for adaptive mesh refinement.
Bob Oehmke developed a spherical version
that is being used in atmospheric and space environment modeling
codes, while a cartesian version is
being used for the major codes in CSEM.
- Software frameworks: a
framework for merging large scientific
codes running on parallel machines into an efficient complex simulation.
- Load balancing:
Andy Poe and I developed a new approach for balancing
geometrically based problems where there
are two aspects that need to be balanced.
This was needed for improving the efficiency of parallel programs
for car crash simulation.
The algorithm is based on the Ham Sandwich theorem from algebraic
topology. We've also used space-filling curves for geometric data.
- Efficient parallel algorithms using dynamic programming:
for optimizing and evaluating modeling some of the complex situations
we encounter in the adaptive sampling work.
For example, we were able to optimize a
clinical trial with 200 patients
and three treatment alternatives.
This had previously been considered to be infeasible.
This paper, with Bob Oehmke and Janis Hardwick, was finalist for best
paper at the Supercomputing conference.
One related research question is how
to automatically turn recursive equations into efficient serial and parallel
programs.
Most of the work here concentrates on trying to develop optimal
algorithms, optimal at least as measured by O-notation.
The majority of my work on algorithms has been in
parallel computing,
adaptive allocation, or
graph theory, but there are several other
topics that my students or collaborators and I have looked at.
Some of the work we did involves
- Balancing binary search trees,
requiring only linear time and constant space. This was with
Bette Warren.
- Generating unbiased bits from a
biased coin. This too was with
Bette.
- Computing the
Busy Beaver function, with Rona Kopp.
It can't be computed for all values, but we computed
the largest value known.
- Sublogarithmic algorithms for PRAMs,
solving problems in geometry and graph theory, and proving lower
bounds for them. One interesting aspect is that
we were able to solve some problems faster than a PRAM can count.
Phil MacKenzie won a best thesis award for this work.
Some current projects involve
- Algorithms for iid random data:
One example is determining the expected number
of extreme points when points are randomly drawn from a distribution
in 2 or more dimensions.
Another is in extending linear-time results for points
randomly selected from the unit interval or square to results for arbitrary,
unknown, distributions. For example, extending bucket sort or finding
the nearest neighbor of each point. Phil MacKenzie and I worked on
such problems for PRAMs, and Doug Van Wieren worked on related
algorithms for a reconfigurable mesh.
- Self-organizing trees with fixed shapes,
where the items can move between locations.
The asymptotic probability distributions for the analogues of move-to-front
and transposition have been determined.
One unusual result is that,
for certain shapes and request probabilities, move-to-front is
asymptotically superior to transposition.
Another result is that the distributions for transposition depend
only on the request probabilities and number of locations at each
level, not on the shape of the tree.
This allows generalizations to self-organizing leveled digraphs.
- Unimodal regression algorithms,
and more general shape-constrained regression.
My original interest in this area came from dose-response problems
in clinical trials.
- Genetic algorithm: I worked with Reiko Tanese and Ted Belding
on an ``island population'' model where there are evolving subpopulations,
with
occasional migration between islands. There is more work to be done here,
and on comparing GA to other
optimization techniques.
Some early work involved
codings of the natural
numbers,
and more generally
codings of linearly
ordered sets.
These results can also be viewed in terms of search times
for the sets.
A different, so-far unpublished, result was the determination of all possible
order types of
finite words over countable linearly
ordered sets.
Many questions about simulation, resource allocation, fault tolerance,
routing, scheduling, mapping, synchronization,
etc., for distributed memory machines can be modeled via graphs, and one
can exploit the graph structure to find solutions.
We have particularly emphasized hypercube, mesh, and torus models, since many
parallel computers are of this form.
Example of this work are a rather comprehensive paper on the subcube
fault tolerance
of the hypercube and one on
embedding graphs into hypercubes.
One set of graph theory questions Marilynn Livingston and I
have pursued concerns the determination
of various properties of families such as k x n meshes,
with k fixed.
We have shown that values such as domination numbers, packing numbers,
number of code words, number of tilings, etc., can be
computationally
determined in closed form, using
dynamic
programming and
properties of finite state automata.
This was a rather dramatic improvement upon earlier results.
There are several interesting
open
questions and conjectures related to this.
A recent project that Jonathan Brown and I are working on concerns models
of the small world phenomena for citations and web links. We're looking
at a model where links are made based on relevance in different aspects,
rather than ignoring the multidimensional nature of picking relevant
references. Applying research to
real-world problems will likely require efficient parallel algorithms
since some applications, such as web links, involve huge graphs.
Of particular interest is locating subgraphs with specific properties.
The work with Julia Lipman mentioned above, concerning repeated
synchronization of multiple agents, is also research in discrete
mathematics.
I don't work in this area any more.
Most of the work in operator theory concerned Schur multiplication of
matrices, where the Schur product is formed by multiplying matrices
termwise (much as addition is performed).
For infinite dimensional
Hilbert
spaces or
lp spaces,
this forms an
interesting Banach algebra, which has a great many ties to normal
multiplication.
Investigating the Schur product led to questions about the
essential numerical
range and majorization of diagonal entries.
Slightly related results concerned determining the numerical range of
finite dimensional
weighted shift
operators.
One paper analyzed conditionally convergent series and
their rearrangements.
There is a natural duality between sets of
conditionally convergent series and sets of permutations which leave their
sums unchanged, and this duality has some unusual properties.
 |
Copyright © 2000-2006, Quentin F. Stout
|