Thursday, April 8
4:30 - 5:30 PM
Room 1001 EECS
We treat the problem of entropy estimation for a multidimensional density function. In particular, we consider estimation of p-th order Renyi entropy of a distribution function P based on a random sample from the mixture distribution (1-alpha) P + alpha D where alpha is an unknown mixture parameter between zero and one and D is an unknown high entropy contaminating distribution. Entropy is an important parameter for pattern analysis, process complexity assessment, vector quantization, and discrimination tasks arising in random process models. Here we present a direct method of entropy estimation based on the length of a pruned minimal spanning tree (MST) constructed by a greedy algorithm which rejects a portion of the data points due to the contaminating distribution D. The pruned graphs are naturally insensitive to contaminating distributions and generalize the concept of rank order statistics to multiple dimensions. We prove asymptotic consistency, specify convergence rates, and use Hampel's influence functions to establish quantitative robustness of these optimally pruned trees.
The MST entropy statistic for multidimensional data samples is naturally translation, scale and rotation invariant and can also be used for clustering and pattern recognition tasks. This invariance property is important for applications where patterns or process trajectories of interest may be articulated at arbitrary orientations and spatial positions, e.g. as occurs in automated pattern matching, word spotting, radar target detection, and industrial inspection. This work is being applied in the context of several applications including: estimation of phase space complexity for detection of chaotic phase transitions of turbulent flow fields, multi-modality image registration using mutual information matching criteria, and robust feature classification in ATR.
To link to Prof. Hero's Home Page just click here