February 6, 2025
12:00 PM - 1:00 PM
Speaker: Shirin Saeedi Bidokhti (UPenn)
When/where: Thursday, February 6th, 12-1 pm, AGH 414
Title: Learning-Based Data Compression: Fundamental limits and Algorithms
Abstract: Data-driven methods have been the driving force of many scientific disciplines in the past decade, relying on huge amounts of empirical, experimental, and scientific data. Working with big data is impossible without data compression techniques that reduce the dimension and size of the data for storage and communication purposes and effectively denoise it for efficient and accurate processing. In the past decade, learning-based compressors such as nonlinear transform coding (NTC) have shown great success in the task of compression by using neural networks to map a high-dimensional source onto a lower-dimensional representative latent space and compressing in that latent space. Despite this success, it is unknown how the rate-distortion performance of such compressors compares with the optimal limits of compression (known as the rate-distortion function) that information theory characterizes. It is also unknown how advances in the field of information theory translate to practice in the paradigm of deep learning.
In the first part of the talk, we develop neural estimation methods to compute the rate-distortion function of high-dimensional real-world datasets. Using our estimate, and through experiments, we show that the rate-distortion performance achieved by NTC compressors is within several bits of the rate-distortion function for real-world datasets such as MNIST. We then ask whether this gap can be closed using ideas from information theory. In particular, by incorporating lattice coding in the latent domain, we propose lattice transform coding (LTC) as a novel framework for neural compression. LTC provides significant improvement over the state of the art on synthetic and real-world sources.
Bio: Shirin Saeedi Bidokhti is an assistant professor in the Department of Electrical and Systems Engineering at the University of Pennsylvania (UPenn). She received her M.Sc. and Ph.D. degrees in Computer and Communication Sciences from the Swiss Federal Institute of Technology (EPFL). Prior to joining UPenn, she was a postdoctoral scholar at Stanford University and the Technical University of Munich. She has also held short-term visiting positions at ETH Zurich, University of California at Los Angeles, and the Pennsylvania State University. Her research interests broadly include the design and analysis of network strategies that are scalable, practical, and efficient for use in Internet of Things (IoT) applications, information transfer on networks, as well as data compression techniques for big data. She is a recipient of the 2023 Communications Society & Information Theory Society Joint Paper Award, the 2022 IT Society Goldsmith Lecturer Award, the 2021 NSF-CAREER award, the 2019 NSF-CRII Research Initiative award, and the prospective researcher and advanced postdoctoral fellowships from the Swiss National Science Foundation.