Adobe unveils Project Sound Lift, an innovative AI-powered audio tool

Mon, 20th Nov 2023

Adobe has announced an innovative AI-powered technology named Project Sound Lift, designed to improve audio quality in videos by separating speech recordings into individual tracks for voices, non-speech sounds, and other background noise. This one-click solution allows users to easily manipulate audio recordings across different scenarios, utilising the power of AI to enhance, transform, and control speech and sound independently.

Poor audio quality has historically been a significant challenge for video creators. Wind interference, poorly placed microphones, crowd noise and other unwanted sounds can make footage unusable. This new tool, based on advances in AI, aims to remove these audio processing obstacles and make high-quality audio production attainable for everyone.

Specifically, Project Sound Lift incorporates Adobe's Enhance Speech technology, which is now available in Adobe applications like Premiere Pro. It empowers creators to produce and control studio-quality audio content in a completely new way. Developed by leading speech AI researchers at Adobe Research, this technology was shown to the public for the first time at Adobe's Sneaks showcase at MAX in Japan.

The Sneaks showcase is an event where Adobe's engineers and research scientists provide a sneak peek at their most promising ideas and technologies. These exciting developments have the potential to become significant elements of Adobe's wide range of products, used and trusted by millions of users globally.

Historically, audio AI models have been limited in their effectiveness. They usually require clean, distinct input sounds, such as a single speaker or sound event without background noise or echoes. This is rarely achievable in real-world recordings, which can contain noise, reverb, multiple speakers and other sound events that are impossible to control. These limitations have hindered the application of audio AI to everyday recordings and made it harder for non-experts to use audio tools that are often complex.

Project Sound Lift, however, can separate voices and ambient sounds in recordings of everyday scenes. It can split speech, applause, laughter, music, and other sounds into distinct tracks. Each track can then be individually controlled to enhance the quality and content of the video.

Project Sound Lift can also separate speakers who are talking at the same time and isolate them from background noise. This is particularly useful when recording at a public event such as the busy Adobe MAX conference. By reducing the volume of background speakers while keeping the ambient sound, each on-camera speaker's voice can be clearly heard without losing the atmosphere of the environment.
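
To illustrate what controlling each track individually means in practice, here is a minimal Python sketch of remixing tracks that have already been separated. It is not Adobe's implementation or API. The track names, the sample values and the remix function are hypothetical, and the separation step itself, which is the hard part handled by the AI model, is assumed to have been done already.

```python
from typing import Dict, List


def remix(tracks: Dict[str, List[float]], gains: Dict[str, float]) -> List[float]:
    """Scale each separated track by its gain and sum them sample by sample.

    Hypothetical helper for illustration only; samples are floats in [-1.0, 1.0].
    """
    length = max((len(samples) for samples in tracks.values()), default=0)
    mix = [0.0] * length
    for name, samples in tracks.items():
        gain = gains.get(name, 1.0)  # tracks without a listed gain are left unchanged
        for i, sample in enumerate(samples):
            mix[i] += gain * sample
    # Clamp the sum so the result stays within the valid sample range.
    return [max(-1.0, min(1.0, value)) for value in mix]


# Made-up sample values standing in for three separated tracks.
tracks = {
    "on_camera_speaker": [0.5, 0.25, -0.5, 0.0],
    "background_speakers": [0.5, -0.5, 0.5, 0.25],
    "ambience": [0.125, 0.125, -0.125, 0.125],
}

# Turn the background speakers down to a quarter of their level,
# keeping the on-camera speaker and the ambience as recorded.
mixed = remix(tracks, {"background_speakers": 0.25})
print(mixed)
```

Because each source sits on its own track, lowering the background speakers does not touch the ambience, which is what preserves the atmosphere of the scene.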

In addition to enhancing audio quality, Project Sound Lift offers a platform for creativity. For example, it can not only separate a speaker's audio track from background street noise, but also apply voice modulation techniques that transform a human voice into a whimsical robot-like sound, giving video creators a new degree of creative control.
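
Adobe has not said which modulation technique Project Sound Lift uses. As a rough illustration of the idea, the Python sketch below applies ring modulation, a long-established way of producing a robot-like voice, to an isolated speech track. The function name, the 50 Hz carrier and the sample values are assumptions chosen for the example.

```python
import math
from typing import List


def ring_modulate(samples: List[float], sample_rate: int, carrier_hz: float = 50.0) -> List[float]:
    """Multiply each sample by a sine-wave carrier (ring modulation).

    Illustrative only; not the technique Adobe has confirmed using.
    """
    return [
        sample * math.sin(2.0 * math.pi * carrier_hz * n / sample_rate)
        for n, sample in enumerate(samples)
    ]


# Made-up samples standing in for a speech track already separated from street noise.
voice = [0.5, 0.5, 0.5, 0.5]
robot_voice = ring_modulate(voice, sample_rate=200, carrier_hz=50.0)
print(robot_voice)
```

Applying the effect to the separated voice track alone leaves the street noise untouched, so the two can be mixed back together afterwards.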
