EECS 651 Project Abstracts
Winter 1995

1. "On the Decoding of Tree-Structured Vector Quantization Over Noisy Channels"
Dennis Goeckel and Peter Moo

In this work, the decoding of tree-structured vector quantized (TSVQ) sources transmitted over noisy channels is considered. In standard decoding, the maximum a posteriori (MAP) rule is applied to the channel symbols, and the source vector corresponding to the chosen channel codeword is output as the estimate. However, if the decoder is allowed some knowledge of the source code, it is possible to obtain lower overall distortion. While the optimal decoder, which chooses the best vector from the continuum of vectors in the source signal space, is well known for unstructured VQ (and thus for TSVQ as well), its complexity can be prohibitive. Thus, in this work a number of suboptimal decoders are put forward. These are of two types: (i) decoders for general unstructured VQ, which are essentially low-SNR and high-SNR approximations to the optimal decoder, and (ii) novel decoders for TSVQ systems. The latter set of decoders is viewed as the main contribution of this work. Analytical expressions (where possible) for the distortion of each decoder are provided, along with numerical results for the transmission of TSVQ-quantized Gauss-Markov sources over an additive white Gaussian noise (AWGN) channel.

2. "A study of oversampling in quantization"
Tod Paulus

When coding an analog waveform, the waveform should be sampled at the Nyquist rate or higher to retain all the information. Oversampling the waveform clearly increases the rate of the encoder. It also decreases the distortion in several ways. This is a study of three ways by which oversampling affects distortion:

1) After oversampling a waveform, the resulting discrete-time signal has a band where there is no signal power, so the quantization noise that falls in this band can be filtered out.
2) When reconstructing a signal with a nonideal D/A converter, there is added distortion. This distortion is smaller when the signal was oversampled.
3) Consecutive samples of an oversampled signal are more correlated than those of a critically sampled signal, and DPCM can take advantage of that.

This study includes computer simulations using two different images to exhibit these three phenomena.

3. "A CELP speech coder"
C. Mitrpant

In this project, we simulate a Code Excited Linear Predictive (CELP) speech coder. The system employs codewords drawn from a normal distribution. It utilises the analysis-by-synthesis technique, with a short-term predictor and a pitch predictor as its main components. The correlations between adjacent signal samples are captured in the short-term predictor coefficients, and the correlations due to the quasi-periodicity of speech are captured in the pitch predictor. A weighted mean square error criterion is used to account for human auditory perception. We use the system to examine the effects on signal-to-noise ratio of various system parameters, such as the shape of the window applied to signal segments and the number of coefficients in the two predictors. Characteristics of the codebooks used are also investigated.

4. "Perceptual audio compression with transform and subband coding"
Daoyuan Ren

A perceptual entropy coder is designed to compress digital audio signals originally sampled at 11.025 kHz. The first scheme uses an FFT to calculate 20 critical-band spectral masking thresholds for each block of 256 samples. The phases of the FFT spectral lines are quantized using scalar quantization. The magnitudes of the FFT spectral lines are thresholded, scaled, and quantized using scalar quantization. The second scheme uses an 8-band constant-bandwidth subband decomposition of each 256-sample block. The spectral masking thresholds are used to determine the threshold and scaling factors in each subband. Each subband signal is then scalar-quantized.
The bit allocation strategy is fixed in both schemes but can be made adaptive. The subjective compression quality and SNR of the two schemes are compared.

5. "VQ's design via genetic algorithms"
S. Choi and W.K. Ng

A vector quantizer design using genetic algorithms (GA) coupled with the conventional generalized Lloyd algorithm (GLA) is presented. Initially, a finite number of codebooks, called chromosomes, is selected. Then, each codebook is reproduced iteratively by GA or GLA. The choice between GA and GLA for a codebook at a given iteration is determined by a selection rule. In a GA step, chromosomes are selected, mated, crossed over, and mutated. Experimental results for several alternative approaches are given and compared for the quantization of Gauss-Markov processes, speech, and images. The performance measure is signal-to-noise ratio (SNR). In most cases, the proposed hybrid Genetic Generalized Lloyd Algorithm (GGLA) design methods yield some performance improvement compared with the conventional GLA.

6. "Optimized bitrate allocations for the FBI's standardized wavelet subband coder for fingerprint images"
Bob Kidd

In this project I propose to investigate the effects of optimized bitrate allocation on the FBI's WSQ (Wavelet Scalar Quantization) standard for fingerprint image compression. In earlier work I have shown that JPEG with optimized bitrate allocation (equivalently, optimized quantization matrix design) performed as well as WSQ in both RMS SNR and subjective image quality. My results contrasted with those of the FBI, which chose WSQ over JPEG based on ostensibly superior SNR and subjective image quality. Characteristics of the typical subband images generated by the DCT and the wavelet transform, however, indicated that the wavelet transform was better suited to fingerprint images than the DCT. This is not surprising, since the wavelet filters were optimized for fingerprints.
These results led to the hypothesis that, despite its superior transform, WSQ could not outperform JPEG because of sub-optimal allocation of bits among its 64 subbands. In this project I propose to adapt our JPEG quantizer design algorithm to WSQ. I will then compare the RMS SNR performance of unoptimized JPEG, optimized JPEG, FBI-standard WSQ, and optimized WSQ at the same average bitrates. In addition, I will ask human subjects to compare the quality of FBI-standard WSQ and optimized WSQ in subjective forced-choice experiments. Results will be analyzed statistically to confirm or reject the hypothesis that sub-optimal quantization accounted for the failure of WSQ to outperform JPEG on fingerprint images. My prediction is that optimized WSQ will outperform both FBI-standard WSQ and optimized JPEG in SNR and subjective image quality.

7. "Visual pattern image coding"
Victor W. Cheng, Chia-Ning Peng

Visual pattern image coding (VPIC) is a framework for digital image compression. VPIC codes the image to be transmitted using a small set of visual patterns, which are localized subimages containing visually important information. For a 512*512 gray-level image such as Lena, the first approach uses VPIC with 4*4 subimages. Subimage sets of different sizes were examined; as expected, larger subimage sets always achieve smaller m.s.e. and better perceptual quality. Also, when the size of the subimage set is fixed, a source-related approach for choosing the best subimages improves encoding quality without increasing the rate. One way to decrease the coding rate without greatly increasing the m.s.e. is differential coding. Within tolerable visual quality, differential coding reduces the coding rate by 0.19 bits/pixel for the image "Lena". Another way to decrease the coding rate is to use subimages of different sizes; i.e., in addition to 4*4 subimages, 8*8 subimages, 16*16 subimages, or a hybrid mode are also considered.
Though they achieve extremely low rates, 8*8 and 16*16 subimages no longer give good perceptual quality, even when a great variety of image patterns is used. Hybrid mode is a good way to maintain acceptable visual quality, but the tradeoff is that a fairly low coding rate cannot be attained unless a more complicated algorithm is adopted. VPIC can produce a recognizable coded image even at a low rate. However, the disadvantage of this method is a more noticeable blocking effect, especially when the rate is low.
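
As a rough illustration of the block-pattern idea in abstract 7, the sketch below codes one 4*4 block as a rounded block mean plus the index of the closest pattern in a small set, and computes the resulting fixed rate in bits/pixel. It is a simplified stand-in and not the authors' VPIC implementation: the function names, the five-pattern edge set, the step size of 20, and the 8-bit mean are all assumptions made for illustration.

```python
import math


def rate_bits_per_pixel(block_size, num_patterns, mean_bits):
    """Fixed-rate cost of coding one block as (pattern index, quantized mean)."""
    index_bits = math.ceil(math.log2(num_patterns))
    return (index_bits + mean_bits) / (block_size * block_size)


def make_patterns(step=20):
    """Hypothetical zero-mean 4*4 pattern set: flat, plus four step edges."""
    flat = [0] * 16
    vertical = [(-step if col < 2 else step) for row in range(4) for col in range(4)]
    horizontal = [(-step if row < 2 else step) for row in range(4) for col in range(4)]
    return [flat, vertical, horizontal,
            [-v for v in vertical], [-h for h in horizontal]]


def encode_block(block, patterns):
    """Return (rounded mean, index of the pattern nearest the mean-removed block)."""
    mean = sum(block) / len(block)
    residual = [pixel - mean for pixel in block]

    def squared_error(pattern):
        return sum((r - q) ** 2 for r, q in zip(residual, pattern))

    best = min(range(len(patterns)), key=lambda i: squared_error(patterns[i]))
    return round(mean), best


def decode_block(mean, index, patterns):
    """Rebuild the block by adding the block mean back to the chosen pattern."""
    return [mean + q for q in patterns[index]]


patterns = make_patterns()
# A 4*4 block (row-major) with a vertical edge: dark left half, bright right half.
block = [100, 100, 140, 140] * 4
mean, index = encode_block(block, patterns)
print(mean, index)                        # 120 1
print(decode_block(mean, index, patterns) == block)  # True
print(rate_bits_per_pixel(4, 5, 8))       # 0.6875
print(rate_bits_per_pixel(8, 5, 8))       # 0.171875
```

Under these assumed parameters, moving from 4*4 to 8*8 blocks cuts the rate by a factor of four, which is consistent with the abstract's remark that larger subimages reach much lower rates; the sketch says nothing about the accompanying loss in perceptual quality.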