Waveform Interpolative Speech Coding

Signal Compression Laboratory Research Project

 

Researcher: Oded Gottesman
Faculty: Prof. Allen Gersho
Research Focus: We have developed a 2.8 kbps Enhanced Waveform Interpolative (EWI) speech coder [1-15], which uses the slowly-evolving waveform (SEW) and rapidly-evolving waveform (REW) decomposition. We studied new features and improvements and incorporated them into the EWI coder paradigm to improve its performance. The new features include a novel way to model phase dispersion [1, 4-6], SEW Analysis-by-Synthesis (AbS) optimization [1, 4, 5], improved REW modeling, improved synthesis waveform interpolation, switched-predictive gain VQ [1, 4, 5], a special pitch search for transitions and onsets [2, 3], REW parametrization with perceptually-weighted AbS VQ [1-3], and the Dual-Predictive AbS SEW VQ [1, 3]. We are exploring novel ways to improve the coder's performance at 2.8-4 kbps, aiming for toll quality and for quality that matches G.729 at 8.0 kbps. Later, we plan to reduce the bit rate to 2 kbps, with the goals of surpassing MPEG-4 at 2 kbps and MELP at 2.4 kbps, and of matching G.723.1 at 5.3 kbps.
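As a rough illustration of the SEW/REW decomposition mentioned above, the sketch below splits a sequence of aligned characteristic waveforms into a smooth part and a residual. It is a minimal sketch under stated assumptions, not the EWI coder's implementation: the function name `decompose_sew_rew`, the `window` parameter, and the use of a plain moving average as the low-pass filter along the evolution axis are all hypothetical choices made for illustration.

```python
def decompose_sew_rew(waveforms, window=5):
    """Split aligned characteristic waveforms into SEW and REW parts.

    waveforms: list of equal-length lists, one per extraction instant.
    window: odd moving-average length along the evolution axis
            (an assumed stand-in for the coder's real low-pass filter).
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    n = len(waveforms)
    half = window // 2
    sew, rew = [], []
    for t in range(n):
        # Average neighbouring waveforms; the window shrinks at the edges.
        lo, hi = max(0, t - half), min(n, t + half + 1)
        count = hi - lo
        smooth = [
            sum(waveforms[k][i] for k in range(lo, hi)) / count
            for i in range(len(waveforms[t]))
        ]
        sew.append(smooth)
        # The REW is whatever the low-pass (SEW) branch leaves behind.
        rew.append([c - s for c, s in zip(waveforms[t], smooth)])
    return sew, rew
```

With this construction the two parts always sum back to the input waveforms. A sequence of identical waveforms (a perfectly periodic signal) ends up entirely in the SEW, while waveforms that change sign from one instant to the next leave most of their energy in the REW.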

We have implemented and tested a fully quantized simulation of the new coder. Without quantization, at the modeling stage, the coder achieves toll quality. Subjective test results indicate that the 2.8 kbps EWI coder slightly exceeds the performance of the G.723.1 coder at 6.3 kbps, and is therefore very close to toll quality, at least under clean speech conditions [1]. Additional subjective tests indicate that the quality of the 2.8 kbps EWI coder surpasses that of MPEG-4 at 4 kbps and of G.723.1 at 5.3 kbps [1-5].

 REFERENCES

[1] O. Gottesman and A. Gersho, "Enhanced Waveform Interpolative Coding at Low Bit Rate," IEEE Trans. on Speech and Audio Processing, vol. 9, no. 8, November 2001.

[2] O. Gottesman and A. Gersho, "Enhancing Waveform Interpolative Coding with Weighted REW Parametric Quantization," in IEEE Workshop on Speech Coding Proceedings, pp. 50-52, September 2000, Wisconsin, USA.

[3] O. Gottesman and A. Gersho, "High Quality Enhanced Waveform Interpolative Coding at 2.8 kbps," IEEE ICASSP'2000, June 5-9, 2000, Istanbul, Turkey.

[4] O. Gottesman and A. Gersho, "Enhanced Analysis-by-Synthesis Waveform Interpolative Coding at 4 kbps," EUROSPEECH'99, 1999, Hungary.

[5] O. Gottesman and A. Gersho, "Enhanced Waveform Interpolative Coding at 4 kbps," IEEE Workshop on Speech Coding Proceedings, pp. 90-92, 1999, Finland.

[6] O. Gottesman, "Dispersion Phase Vector Quantization For Enhancement of Waveform Interpolative Coder," IEEE ICASSP'99, vol. 1, pp. 269-272, 1999.

[7] W. B. Kleijn, "Continuous Representations in Linear Predictive Coding," IEEE ICASSP'91, pp. 201-203, 1991.

[8] Y. Shoham, "High Quality Speech Coding at 2.4 to 4.0 kbps Based on Time-Frequency-Interpolation," IEEE ICASSP'93, Vol. II, pp. 167-170, 1993.

[9] I. S. Burnett and R. J. Holbeche, "A Mixed Prototype Waveform/Celp Coder for Sub 3 kb/s," IEEE ICASSP'93, Vol. II, pp. 175-178, 1993.

[10] W. B. Kleijn and J. Haagen, "Transformation and Decomposition of The Speech Signal for Coding," IEEE Signal Processing Letters, Vol. 1, No. 9, pp. 136-138, 1994.

[11] W. B. Kleijn and J. Haagen, "Speech Coder Based on Decomposition of Characteristic Waveforms," IEEE ICASSP'95, pp. 508-511, 1995.

[12] I. S. Burnett and G. J. Bradley, "New Techniques for Multi-Prototype Waveform Coding at 2.84 kb/s," IEEE ICASSP'95, pp. 261-263, 1995.

[13] I. S. Burnett and G. J. Bradley, "Low Complexity Decomposition and Coding of Prototype Waveforms," IEEE Workshop on Speech Coding for Telecom., pp. 23-24, 1995.

[14] W. B. Kleijn and J. Haagen, "Waveform Interpolation for Coding and Synthesis," in Speech Coding and Synthesis, W. B. Kleijn and K. K. Paliwal, Eds., Elsevier Science B. V., Chapter 5, pp. 175-207, 1995.

[15] W. B. Kleijn, Y. Shoham, D. Sen, and R. Haagen, "A Low-Complexity Waveform Interpolation Coder," IEEE ICASSP'96, pp. 212-215, 1996.

Award:

Ericsson-Nokia Best Paper Award

Presentation:

Enhanced Waveform Interpolative Coding at 2.8 kbps

Demo:

Enhanced Waveform Interpolative Coding at 2.8 kbps