Thursday, April 8
4:30 - 5:30 PM
Room 1001 EECS
Abstract-
We treat the problem of entropy estimation for a
multidimensional density function. In particular, we
consider estimation of p-th order Renyi entropy of a
distribution function P based on a random sample from the
mixture distribution (1-alpha) P + alpha D where alpha is an
unknown mixture parameter between zero and one and D is an
unknown high entropy contaminating distribution. Entropy is
an important parameter for pattern analysis, process
complexity assessment, vector quantization, and
discrimination tasks arising in random process models. Here
we present a direct method of entropy estimation based on
the length of a pruned minimal spanning tree (MST)
constructed by a greedy algorithm which rejects a portion of
the data points due to the contaminating distribution D.
The pruned graphs are naturally insensitive to contaminating
distributions and generalize the concept of rank order
statistics to multiple dimensions. We prove asymptotic
consistency, specify convergence rates, and use Hampel's
influence functions to establish quantitative robustness of
these optimally pruned trees.
The MST entropy statistic for multidimensional data samples
is naturally translation, scale and rotation invariant and
can also be used for clustering and pattern recognition
tasks. This invariance property is important for
applications where patterns or process trajectories of
interest may be articulated at arbitrary orientations and
spatial positions, e.g. as occurs in automated pattern
matching, word spotting, radar target detection, and
industrial inspection. This work is being applied in the
context of several applications including: estimation of
phase space complexity for detection of chaotic phase
transitions of turbulent flow fields, multi-modality image
registration using mutual information matching criteria, and
robust feature classification in ATR.
To link to Prof. Hero's Home Page just click here