PhaVoRIT - A Phase Vocoder for Real-Time Interactive Time-Stretching

In recent years, interaction with time-based media such as audio and video has been subject to increasing research efforts. Current research aims to control the speed of time-based media in real-time, while maintaining the integrity of the content. Even though this process of time-stretching has been achieved for many types of media, playback speed control has proved to be particularly difficult for digitally recorded audio. Current solutions tend to alter sound characteristics and do not preserve pitch, timbre, microtiming, and transients.
A well-established tool for this kind of audio time-stretching is the Phase Vocoder (Flanagan:1966) algorithm. It modifies the phase spectra of subsequent short-time Fourier transformed signal blocks to obtain a continuous signal curve when overlap-adding those blocks with altered time spacing. This transfers the distribution of frequency components over time, which make up the sound, to a finer or coarser timescale. Although the Phase Vocoder changes the playback speed of pre-recorded audio data without changing the pitch, it introduces considerable artifacts (commonly referred to as "transient smearing" and "reverberation"), limiting its application to recordings of instrumental or vocal music.
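As a rough illustration of the phase modification described above, the following minimal Python sketch shows the textbook phase propagation step for a single frequency bin. It is not taken from PhaVoRIT: the function names (`wrap_phase`, `instantaneous_frequency`, `propagate_phase`) and all parameters are assumptions chosen for illustration, and a complete phase vocoder would additionally need windowing, the short-time Fourier transform, and overlap-add resynthesis.

```python
import math


def wrap_phase(phi):
    """Map an angle in radians to the interval [-pi, pi)."""
    return (phi + math.pi) % (2.0 * math.pi) - math.pi


def instantaneous_frequency(prev_phase, curr_phase, bin_index, n_fft, analysis_hop):
    """Estimate the frequency (radians per sample) of the sinusoid in one bin.

    prev_phase and curr_phase are the analysis phases of the same bin in two
    successive blocks that lie analysis_hop samples apart.
    """
    # Centre frequency of the bin in radians per sample.
    omega_k = 2.0 * math.pi * bin_index / n_fft
    # Deviation of the measured phase advance from the advance expected
    # for a sinusoid lying exactly at the bin centre.
    deviation = wrap_phase(curr_phase - prev_phase - omega_k * analysis_hop)
    return omega_k + deviation / analysis_hop


def propagate_phase(prev_synth_phase, inst_freq, synthesis_hop):
    """Advance the synthesis phase over the altered block spacing."""
    return wrap_phase(prev_synth_phase + inst_freq * synthesis_hop)


# Hypothetical example: a sinusoid between bins 10 and 11 of a 64-point
# transform, stretched by a factor of two (synthesis hop = 2 * analysis hop).
omega = 2.0 * math.pi * 10.3 / 64
prev_phase = 0.5
curr_phase = wrap_phase(prev_phase + omega * 16)
freq = instantaneous_frequency(prev_phase, curr_phase, 10, 64, 16)
synth_phase = propagate_phase(0.5, freq, 32)
```

In this sketch the stretch factor is the ratio of synthesis hop to analysis hop. Because each bin's phase is advanced at its estimated frequency rather than at the measured rate, the overlapping blocks stay continuous at the new spacing and the pitch is unchanged.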
This thesis gives a comprehensive introduction to the concepts and the state of the art of audio time-stretching technology. We propose four new methods to further improve the audio quality of Phase Vocoder-based time-stretching:
- Multiresolution peak-picking for full bass and highs without unwanted overtones.
- Silent passage phase reset for increased robustness and stronger phase coherence.
- Sinusoidal trajectory heuristics for clear note onsets and semantically sound phase propagation.
- For the detection and processing of transients, we extended a simplified version of Röbel's algorithm (Röbel:2003).
If you are interested in obtaining PhaVoRIT, please contact Prof. Dr. Borchers.
- Download the thesis paper
- karrer_thesis.pdf.zip (17.81 Mb) (internal access only)
- Download the sound examples (27.7 Mb zip)