Practical Convolution in Digital Audio

By Bjarki Kaikumo, Helsinki Sept. 2004

What is convolution?

Traditionally, digital reverbs have been built from DSP algorithms, usually a series of allpass and comb filters. Convolution is another, less well-known method: one signal is analyzed and its spectrum is applied to a second signal, producing a blend or morph of the two. 3D-audio simulations also use convolution, with HRTFs (Head-Related Transfer Functions), to approximate the placement of a sound source. There are various applications for convolution, both useful and esoteric, and we'll look at some here.

Convolution theory and reverbs

Reverb designers can use convolution to "sample" a given space, such as a concert hall or recording room, and then apply that sample to any signal, giving the new signal the characteristics (reverb) of the hall. This is done by recording the room's impulse response, that is, its response to a very short sound that contains energy at all frequencies (typically a starter pistol or a short burst of white noise), removing the impulse itself (leaving only the reverb), and then convolving the result with a new sound.
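
As a rough illustration of that last step, here is a minimal Python sketch of convolving a "dry" sound with a room response. The sample values in dry and room_ir are made up for the example (a real impulse response would be thousands of samples long), and the function name convolve is simply our own choice.

```python
def convolve(signal, impulse_response):
    """Direct convolution: each input sample triggers a scaled copy of the IR."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[n + k] += x * h
    return out

# Hypothetical toy data, not measured from any real room.
dry = [1.0, 0.5, 0.25]
room_ir = [1.0, 0.0, 0.0, 0.5, 0.0, 0.25]  # direct sound plus two decaying echoes

wet = convolve(dry, room_ir)
print(wet)  # the dry sound followed by quieter, overlapping copies of itself
```

Note that the output is longer than the input: the reverb tail rings on after the dry sound has ended.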

An impulse response is the response of a system, such as a room or a filter, to an impulse: ideally a very short click that contains energy at all frequencies. Direct convolution is a sample-by-sample operation on two signals: every sample of one signal is multiplied by every sample of the other, and the products are summed at the appropriate time offsets. For long signals this requires enormous computational power and is therefore impractical in real-time applications. Instead, convolution software usually implements an analysis/resynthesis process called spectrum multiplication, which is mathematically equivalent to direct convolution provided both signals are zero-padded to the length of the result. First, the spectra of the two signals are computed with a Fast Fourier Transform (FFT); then the two spectra are multiplied; finally, the product is resynthesized with an inverse FFT. The effect of multiplying two spectra is called spectral intersection: frequencies that are strong in both signals are reinforced, while frequencies that are weak in either one are attenuated, just as they are in natural reverb.
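
The equivalence between the two routes can be checked with a short Python sketch. It is only a sketch under some assumptions: the small recursive radix-2 FFT below is written out by hand so the example needs nothing beyond the standard library (real software would use an optimized FFT), and the function names fft, fft_convolve and direct_convolve are our own.

```python
import cmath

def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def fft_convolve(a, b):
    """Spectrum multiplication: FFT both signals, multiply, inverse FFT."""
    out_len = len(a) + len(b) - 1
    size = 1
    while size < out_len:          # zero-pad to a power of two >= result length
        size *= 2
    fa = fft([complex(v) for v in a] + [0j] * (size - len(a)))
    fb = fft([complex(v) for v in b] + [0j] * (size - len(b)))
    product = [p * q for p, q in zip(fa, fb)]
    result = fft(product, inverse=True)
    return [v.real / size for v in result[:out_len]]  # normalize the inverse

def direct_convolve(a, b):
    """Sample-by-sample convolution: len(a) * len(b) multiplications."""
    out = [0.0] * (len(a) + len(b) - 1)
    for n, x in enumerate(a):
        for k, h in enumerate(b):
            out[n + k] += x * h
    return out

a = [1.0, 0.5, 0.25]
b = [1.0, 0.0, 0.0, 0.5, 0.0, 0.25]
fast = fft_convolve(a, b)
slow = direct_convolve(a, b)
print(max(abs(p - q) for p, q in zip(fast, slow)))  # tiny rounding error only
```

The direct version needs len(a) * len(b) multiplications, so one second of 44.1 kHz audio against a two-second impulse response already costs roughly 3.9 billion of them; the FFT route grows far more slowly with signal length, which is why it is preferred for long reverbs.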

Real-time software includes Emagic's Space Designer and Audio Ease's Altiverb, which uses the AltiVec vector unit of the Mac's G4/G5 processors. Hardware-based solutions include Yamaha's SREV1, which uses 32 DSP chips, and Sony's DRE-S777, which uses similar techniques.

Non-real-time convolvers can be found in Peak, Sound Forge (where it is called Acoustic Mirror) and SoundHack. On the hardware front, perhaps the first convolver was in E-mu's samplers, where it was called Transform Multiplication.

Convolution in 3D-audio

In applications such as video-game sound, 3D audio effects are often created by convolving audio streams with HRTFs to simulate acoustic spaces on headphones or small stereo speakers. Binaural (dummy-head) recordings rest on the same principle: they capture how sound waves are shaped by the head, ears and shoulders on arrival, and these are the cues the human brain uses to localise sounds. On its own, however, HRTF processing is an anechoic (acoustically dead) process, representing a rather unnatural environment that is free from all sound reflections and reverberation. In modern video games, where multiple streams of audio and music play at the same time, a real-time convolver is currently impractical, but we can expect to see one in future generations of consoles and computers, where convolution is an attractive option for 5.1 playback.
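
To make the idea concrete, here is a minimal Python sketch of binaural placement: a mono source is convolved once per ear with a head-related impulse response (HRIR, the time-domain counterpart of an HRTF). The two HRIRs below are invented toy values that only mimic the basic cues (the far ear hears the sound later and quieter); measured HRIRs are much longer and more detailed. The names convolve and binaural_render are our own.

```python
def convolve(signal, impulse_response):
    """Direct convolution of a signal with an impulse response."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[n + k] += x * h
    return out

def binaural_render(mono, hrir_left, hrir_right):
    """Return (left, right) channels: one convolution per ear."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)

# Hypothetical toy HRIRs for a source on the listener's left:
# the left ear gets the sound immediately, the right ear two
# samples later and at half the level.
hrir_left = [1.0, 0.0, 0.0]
hrir_right = [0.0, 0.0, 0.5]

mono = [1.0, -1.0]
left, right = binaural_render(mono, hrir_left, hrir_right)
print(left)
print(right)
```

Every sound source in a scene needs its own pair of convolutions, which hints at why many simultaneous streams quickly become expensive.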

Musical and creative applications

For sound designers working in music and multimedia, convolution is a good tool for the palette. The creative possibilities are endless, from combining different instruments into a new (and usually weird) sound to special effects for films and games.

To realise this, let's look again at the idea of the impulse response (IR). An impulse response is essentially what comes out of a filter when an impulse is fed into it. As stated earlier, any system can be considered a filter, from the obvious, such as a synthesizer filter, to the less obvious, such as the body of an acoustic instrument. An instrument body usually resonates in a narrow frequency range and so acts as a filter. We can, for instance, record the IR of an analogue synth filter and apply it to a signal. This would of course capture only one fixed setting, without the sweepable controls. Or how about convolving some snare hits with a ride cymbal hit? Or, in a network game, convolving a player's voice with a character's?
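
The filter-capturing idea can be tried out in a few lines of Python. This is a sketch under stated assumptions: a simple digital one-pole lowpass (our own one_pole_lowpass function, with an arbitrary coeff setting) stands in for the analogue synth filter, and its infinitely long response is truncated to 16 samples.

```python
def convolve(signal, impulse_response):
    """Direct convolution of a signal with an impulse response."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[n + k] += x * h
    return out

def one_pole_lowpass(signal, coeff=0.5):
    """Stand-in filter: y[n] = coeff * x[n] + (1 - coeff) * y[n-1]."""
    out, prev = [], 0.0
    for x in signal:
        prev = coeff * x + (1.0 - coeff) * prev
        out.append(prev)
    return out

# "Record" the filter: feed it a single-sample click and keep what comes out.
impulse = [1.0] + [0.0] * 15
ir = one_pole_lowpass(impulse)      # a snapshot of this one coeff setting only

# Applying the captured IR by convolution matches running the filter itself.
signal = [1.0, 0.0, -1.0, 0.5]
filtered = one_pole_lowpass(signal)
convolved = convolve(signal, ir)[:len(signal)]
print(filtered)
print(convolved)
```

Changing coeff would require recording a new IR, which is the "no sweepable controls" limitation in practice. The same convolve function could equally be fed two recorded sounds, such as a snare hit and a cymbal hit, in place of signal and ir.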

