Research
Focus: |
We have developed a 2.8 kbps
Enhanced Waveform Interpolative (EWI) speech coder [1-15], which uses the slowly-evolving
waveform (SEW) and rapidly-evolving waveform (REW) decomposition. New features and
improvements were studied and incorporated in the EWI coder paradigm to improve its
performance. Among the new features are a novel way to model phase dispersion [1, 4-6],
SEW Analysis-by-Synthesis optimization [1, 4, 5], improved REW modeling, improved
synthesis waveform interpolation, switched-predictive gain VQ [1, 4, 5], special pitch
search for transitions and onsets [2, 3], REW parametrization and its
perceptually-weighted AbS VQ [1-3], and the Dual-Predictive AbS SEW VQ [1, 3]. We are
exploring novel ways to improve the 2.8-4 kbps coder's performance, targeting
toll quality and aiming to match G.729 at 8.0 kbps. Later, we plan to reduce the bit rate to 2
kbps, to surpass MPEG-4 at 2 kbps and MELP at 2.4 kbps, and to match G.723.1 at 5.3 kbps. We
have implemented and tested a fully quantized simulation of the new coder. At the modeling
stage, without quantization, the coder achieves toll quality. Subjective test results
indicate that the 2.8 kbps EWI coder slightly exceeds the G.723.1 coder's performance at
6.3 kbps and is therefore very close to toll quality, at least under clean speech
conditions [1]. Additional subjective tests indicate that the quality of the 2.8 kbps EWI
coder surpasses that of MPEG-4 at 4 kbps and G.723.1 at 5.3 kbps [1-5].
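
To illustrate the SEW/REW decomposition mentioned above, the following Python sketch splits a sequence of pitch-aligned characteristic waveforms into a slowly-evolving and a rapidly-evolving part. It is only a minimal illustration under stated assumptions: the function name `decompose_sew_rew`, the `window` parameter, and the use of a simple moving average as the low-pass filter along the evolution axis are hypothetical choices made here, not the filters or structures of the EWI coder itself.

```python
def decompose_sew_rew(cws, window=5):
    """Split aligned characteristic waveforms into SEW and REW parts.

    cws:    list of waveforms, one per extraction instant, each a list of
            floats of equal length (assumed to be pitch-aligned already).
    window: odd length of a moving average along the evolution axis,
            used here as a stand-in low-pass filter (an assumption).
    """
    if window % 2 == 0:
        raise ValueError("window must be odd")
    n = len(cws)
    half = window // 2
    sew, rew = [], []
    for t in range(n):
        # The averaging span shrinks near the start and end of the sequence.
        lo, hi = max(0, t - half), min(n, t + half + 1)
        span = cws[lo:hi]
        # Average each waveform sample across neighbouring instants.
        smooth = [sum(col) / len(span) for col in zip(*span)]
        sew.append(smooth)
        # The REW is the remainder, so SEW + REW reproduces the input.
        rew.append([c - s for c, s in zip(cws[t], smooth)])
    return sew, rew


# Toy input: the first sample is constant over time, the second alternates.
cws = [[1.0, 0.5 + 0.1 * (-1) ** t] for t in range(8)]
sew, rew = decompose_sew_rew(cws, window=3)
```

In this toy input the constant component ends up entirely in the SEW, while the alternating component leaves a nonzero REW. A real coder would use a properly designed low-pass/high-pass filter pair rather than a moving average.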
REFERENCES
[1] O. Gottesman and A. Gersho, "Enhanced Waveform Interpolative
Coding at Low Bit Rate," IEEE Trans. on Speech and Audio
Processing, vol. 9, no. 8, November 2001.
[2] O. Gottesman and A. Gersho, "Enhancing Waveform Interpolative
Coding with Weighted REW Parametric Quantization," in IEEE Workshop on
Speech Coding Proceedings, pp. 50-52, September 2000, Wisconsin, USA.
[3] O. Gottesman and A. Gersho, "High Quality Enhanced Waveform
Interpolative Coding at 2.8 kbps," IEEE ICASSP'2000, June 5-9, 2000,
Istanbul, Turkey.
[4] O. Gottesman and A. Gersho, "Enhanced Analysis-by-Synthesis
Waveform Interpolative Coding at 4 kbps," EUROSPEECH'99, 1999, Hungary.
[5] O. Gottesman and A. Gersho, "Enhanced Waveform Interpolative
Coding at 4 kbps," IEEE Workshop on Speech Coding Proceedings, pp. 90-92,
1999, Finland.
[6] O. Gottesman, "Dispersion Phase Vector Quantization For
Enhancement of Waveform Interpolative Coder," IEEE ICASSP'99, vol. 1, pp.
269-272, 1999.
[7] W. B. Kleijn, "Continuous Representations in Linear Predictive
Coding," IEEE ICASSP'91, pp. 201-203, 1991.
[8] Y. Shoham, "High Quality Speech Coding at 2.4 to 4.0 kbps Based
on Time-Frequency-Interpolation," IEEE ICASSP'93, Vol. II, pp. 167-170,
1993.
[9] I. S. Burnett and R. J. Holbeche, "A Mixed Prototype
Waveform/CELP Coder for Sub 3 kb/s," IEEE ICASSP'93, Vol. II, pp. 175-178,
1993.
[10] W. B. Kleijn and J. Haagen, "Transformation and Decomposition
of the Speech Signal for Coding," IEEE Signal Processing Letters, Vol. 1,
No. 9, pp. 136-138, 1994.
[11] W. B. Kleijn and J. Haagen, "Speech Coder Based on
Decomposition of Characteristic Waveforms," IEEE ICASSP'95, pp. 508-511,
1995.
[12] I. S. Burnett and G. J. Bradley, "New Techniques for
Multi-Prototype Waveform Coding at 2.84 kb/s," IEEE ICASSP'95, pp. 261-263,
1995.
[13] I. S. Burnett and G. J. Bradley, "Low Complexity Decomposition
and Coding of Prototype Waveforms," IEEE Workshop on Speech Coding for
Telecom., pp. 23-24, 1995.
[14] W. B. Kleijn and J. Haagen, "Waveform Interpolation for Coding
and Synthesis," in Speech Coding and Synthesis, W. B. Kleijn and K. K. Paliwal, Eds.,
Elsevier Science B. V., Chapter 5, pp. 175-207, 1995.
[15] W. B. Kleijn, Y. Shoham, D. Sen, and R. Hagen, "A
Low-Complexity Waveform Interpolation Coder," IEEE ICASSP'96, pp. 212-215,
1996. |