The objective of this project is to develop a comprehensive, unifying framework and corresponding universal methodologies for the coding of aural signals. The challenges stem from the combination of elusive perceptual criteria, complex signal structures, and great diversity in signal type, operational setting, and complexity and delay requirements. Historical efforts were tailored to narrowly defined signal types and scenarios, such as linear prediction for low-rate speech communication versus transform-based music coding for storage and streaming. However, there is a growing realization that these approaches are insufficient to handle the heterogeneous aural signals and network settings encountered in many real-world applications, as evidenced in particular by recent major initiatives of the multimedia and networking industries demanding joint speech-audio coding standardization.
The research formalizes the tradeoffs that underlie universal aural signal coding, and develops a unifying framework and methodologies to enable efficient optimization of resource-scalable coding under heterogeneous signal and network scenarios. The main thrusts of the project are: i) Development of a unifying resource-scalable framework, coupled with effective perceptual distortion criteria, which covers the continuous gamut of aural signal types and networking scenarios, and is scalable in bit rate, encoding/decoding complexity, delay, etc.; ii) Theoretical analysis of rate-(perceptual) distortion performance limits within such unified compression paradigms; iii) A new class of universal methodologies and effective optimization algorithms for efficient coder design and resource allocation within this unifying framework.
This project is sponsored by the National Science Foundation under grant CCF-0917230. The award is funded under the American Recovery and Reinvestment Act of 2009 (Public Law 111-5).
- Vinay Melkote (Signal Compression Lab), PhD, June 2010, titled "Optimal delayed decisions in encoding and decoding of audio signals and general sources".
- Tejaswi Nanjundaswamy (Signal Compression Lab), PhD, March 2013, titled "Advances in audio coding and networking by effective exploitation of long term correlations".
- Ying-Yi Li (Vivonets Lab, ECE, UCSB), PhD, December 2012, titled "Low Complexity Multimode Tree Coding and Practical Rate Distortion Bounds for Speech".
- Emmanuel Ravelli (Signal Compression Lab)