Audio coding is the science behind the art of immersive audio. Marina Bosi is a pioneer in the field, having contributed important work to the masterful rendering of immersive audio in portable and lossless formats for audiophiles to enjoy across systems. We are excited to share her perspective on her past work and the future of the industry.
You are renowned for your work in audio coding; for those who aren’t familiar with the field, how would you explain it?
Audio coding is about efficiently representing sounds for transmission or storage. Think about streaming music to your phone or playing audio files stored there; in either case you are playing music that was converted into bits, compressed, delivered to your phone, and decoded to produce the sound you hear. A typical audio coder is a device that takes in an analog audio signal (e.g. music) and transforms it into a digital representation that can be transmitted or stored before being transformed back into an analog signal for the listener to enjoy.
The key challenge in early endeavors was taking into account the perceptual limits of human hearing in a way that would allow the data to be compressed beyond what is achievable using lossless coders, so that you could get the file sizes/data rates down to workable levels without degrading the signal in ways you can hear.
Success back then led to the ubiquity of digital music in our pockets, on devices throughout our homes, and in practically every place we go. However, at times people pushed those technologies beyond their intended use, which led to many people going for quantity over quality. Luckily, storage and bandwidth have expanded recently to the point where most people can have high-quality music with them wherever they go without too much compression.
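To put rough numbers on the data rates mentioned above, here is a minimal sketch in Python. The figures are illustrative assumptions, not taken from the interview: uncompressed CD-style stereo audio (44.1 kHz, 16 bits per sample) and a 128 kbit/s target of the kind commonly used with perceptual coders such as MP3 or AAC. The function names are hypothetical.

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels):
    """Bit rate in bits per second of uncompressed PCM audio."""
    return sample_rate_hz * bits_per_sample * channels


def compression_ratio(pcm_bps, coded_bps):
    """How many times smaller the coded stream is than the PCM stream."""
    return pcm_bps / coded_bps


def megabytes_per_minute(bps):
    """Storage needed for one minute of audio, in megabytes (10^6 bytes)."""
    return bps * 60 / 8 / 1_000_000


# Assumed example values (not from the interview).
cd_rate = pcm_bit_rate(44_100, 16, 2)   # CD-style stereo PCM
coded_rate = 128_000                    # assumed perceptual-coder target

ratio = compression_ratio(cd_rate, coded_rate)

print(f"PCM:   {cd_rate} bit/s, {megabytes_per_minute(cd_rate):.2f} MB/min")
print(f"Coded: {coded_rate} bit/s, {megabytes_per_minute(coded_rate):.2f} MB/min")
print(f"Ratio: about {ratio:.1f} to 1")
```

Under these assumptions the coded stream is roughly eleven times smaller than the uncompressed one, which is the kind of reduction that made portable and streamed music practical.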
What would you tell students and young coders to get them more involved in immersive audio in their careers? Is there a particular challenge in the field scholars should be focusing on?
I would say that immersive audio has always been a pursuit for artists and scientists but nowadays the technology is advanced enough, affordable enough, and widespread enough to make it accessible to the wider public. Digital signal processing technology, our understanding of how we perceive sound in space, and digital transmission and storage capabilities have reached the point where what seemed almost impossible in the past is now a reality. Think, for example, of the relatively new UHDTV (ultra-high-definition television) formats where immersive audio is an important part of the experience. While it may seem a bit of an abstract concept for the home, once you experience immersive audio it opens a new way of thinking about and enjoying sound.
The challenge is getting ubiquitous high-quality immersive audio. To meet that challenge, the task is finding the right balance between recording, distribution, and reproduction technologies, where this chain can accommodate a variety of “real-life” situations.
You entered the world of academia to share your skills with the next generation, and even developed Stanford's first course in digital audio coding at the Center for Computer Research in Music and Acoustics (CCRMA). What have been the most rewarding parts of teaching?
Many rewards: first, by teaching I continue to gain a better understanding of sound; second, the students in my class(es) are such a vital motivation and, each week, I find interacting with them a joy; third, the hope that my continued efforts in the field will be transmitted to and amplified by the next generation.
Who are some artists or engineers who have shaped your work?
So many! Starting from listening to and studying the many accomplishments of J.S. Bach, to meeting and working with composers like Pierre Boulez, Luciano Berio, and John Chowning – each one trying to move the field forward using all the technologies available to them. So many artists and engineers I’ve had the privilege of knowing and working with over the years. Interestingly, most were not exactly in my specific field because, when I started working in it 30 years ago, the field of perceptual audio coding didn’t really exist. Through a combination of the hard work (of many of my colleagues and myself), supporting technology developments in other areas, and market needs, my field almost magically appeared. It has been so exciting to see an entire field coalesce from wraithlike threads in a diverse set of related areas.
When did you first begin coding digital audio in immersive formats?
My dream when I started in this field was to create a spherical concert hall (much along the lines of what K. Stockhausen created with the Technical University of Berlin for the 1970 Osaka World Expo). Hence, I’ve always been working toward immersive audio achievements.
One of my first jobs after I left the Institute for Research and Coordination in Acoustics and Music in Paris (where I worked on my dissertation), was to work with the Italian composer Luciano Berio to help distribute music in a new venue, the Lingotto, in Turin, Italy. Because of that assignment, I came to CCRMA to better understand how John Chowning used sound spatialization in his piece Turenas. I ended up creating with David Zicarelli a real-time program that allowed the composer/performer to move sound sources in space by moving a computer mouse on an interface that represented the performing environment. So, working on immersive audio was why I moved to America!
Then while at Dolby Labs I was part of the research team that created the 5.1-channel Dolby Digital format. The 5.1-channel surround sound systems associated with home theater film and television were an important step in bringing immersive audio into regular consumer homes. I’d say this was my first (but not last!) opportunity to really code digital audio aimed at an immersive audio format.
What do you envision as the future of immersive audio now that the world is increasingly more “remote”?
Very good question! I think we are all racing to find remote solutions that satisfy our needs for sound/music. I believe this quest will enable new technologies to emerge and will create opportunities that we could not have imagined before. Lately (and in part “inspired” by the current COVID-19 restrictions), I am working at Stanford’s CCRMA on several projects related to enabling long-distance music playing. Musicians trying to teach lessons, play chamber music, and perform with others are faced with daunting obstacles of latency (i.e. delay), intermittency, and low sound quality in the tools readily available to me and others. For this reason, I am working with other researchers at CCRMA and worldwide to develop easy-to-use, easy-to-deploy options for long-distance music playing.
In your opinion, what is one of the most exciting developments in audio technology?
Whether the general public is aware of it or not, I think perceptual audio coding has revolutionized the way we consume audio. Most of the music/audio we consume nowadays goes through a perceptual audio coding stage (think of streaming: MP3 or AAC; mobile: AAC; television and DVD: Dolby Digital; etc.).
We are about to witness another major shift in the way we “consume/produce” music. The way we go to a concert (remotely), the way that we engineer/master music (with players in different locations), the way we provide music lessons, orchestra rehearsal, etc., are migrating from a “live in person” occurrence to a “live over the internet” affair. My concern is that while doing this we continue to aim for high-quality audio. Nowadays we have the technology to provide both the musicians and the public with an enjoyable music/sound experience, including immersive audio, over the internet. We just have to connect the dots in a responsible way!
You have a varied and impressive portfolio, including projects at Dolby and Digital Theater Systems, recognition from both the French and Italian governments, a term as president and board member of the Audio Engineering Society, and the publishing of a groundbreaking textbook on digital audio coding. What is your next dream venture?
My next dream venture is to continue to share my hard-won experience with my students and to create new music and immersive audio possibilities over the internet.