TimeScalingListeningTest

TIME-SCALE MODIFICATIONS BASED ON A FULL-BAND ADAPTIVE HARMONIC MODEL

George P. Kafentzis, Gilles Degottex, Olivier Rosec, and Yannis Stylianou

Abstract - This paper presents a simple method for time-scale modification of speech based on a recently proposed model for AM-FM decomposition of speech signals, referred to as the adaptive Harmonic Model (aHM). A full-band speech analysis/synthesis system is built on the aHM representation, without the need to separate a deterministic and/or a stochastic component from the speech signal. The aHM models speech as a sum of harmonically related sinusoids that adapt to the local characteristics of the signal and provide accurate instantaneous amplitude, frequency, and phase trajectories. Because of its high-quality representation and reconstruction of speech, aHM can provide high-quality time-scale modifications. Informal listening tests show that the synthetic time-scaled waveforms are natural and free of some common artifacts encountered in other state-of-the-art models, such as “metallic quality”, chorusing, or musical noise.
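To make the idea of time-scaling a harmonic representation more concrete, here is a minimal sketch in Python. It is not the aHM analysis/synthesis system described in the abstract: it only illustrates the general principle of stretching frame-wise amplitude and frequency trajectories along the time axis and re-integrating the phase, so that duration changes while pitch is preserved. The function names (`interp`, `time_scale_harmonic`), the frame-wise parameter layout, and the convention that the output duration equals `rho` times the input duration are all assumptions made for illustration. Measured instantaneous phases, which the aHM does provide, are ignored here.

```python
import math

def interp(track, pos):
    """Linearly interpolate `track` at fractional index `pos`, clamped to the ends."""
    pos = min(max(pos, 0.0), len(track) - 1.0)
    i = int(pos)
    if i >= len(track) - 1:
        return track[-1]
    frac = pos - i
    return (1.0 - frac) * track[i] + frac * track[i + 1]

def time_scale_harmonic(f0_track, amp_tracks, hop, fs, rho):
    """Synthesize a time-scaled signal from frame-wise harmonic parameters.

    f0_track   : fundamental frequency in Hz, one value per analysis frame
    amp_tracks : amp_tracks[k][m] is the amplitude of harmonic k+1 at frame m
    hop        : analysis hop size in samples
    fs         : sampling rate in Hz
    rho        : time-scale factor (assumed: output duration = rho * input duration)
    """
    n_in = (len(f0_track) - 1) * hop
    n_out = int(round(rho * n_in))
    out = []
    phase = 0.0  # running phase of the fundamental, in radians
    for n in range(n_out):
        # Map the output sample back to a (fractional) analysis frame index.
        pos = (n / rho) / hop
        f0 = interp(f0_track, pos)
        # Integrate the unscaled frequency so that pitch is unchanged.
        phase += 2.0 * math.pi * f0 / fs
        sample = 0.0
        for k, amps in enumerate(amp_tracks, start=1):
            if k * f0 < fs / 2.0:  # skip harmonics above the Nyquist frequency
                sample += interp(amps, pos) * math.cos(k * phase)
        out.append(sample)
    return out

# Hypothetical input: 11 frames of a steady 100 Hz tone with two harmonics.
f0 = [100.0] * 11
amps = [[1.0] * 11, [0.5] * 11]
slow = time_scale_harmonic(f0, amps, hop=80, fs=8000, rho=2.0)
fast = time_scale_harmonic(f0, amps, hop=80, fs=8000, rho=0.5)
print(len(slow), len(fast))
```

With these inputs the 800-sample original is stretched to 1600 samples and compressed to 400 samples, while the 100 Hz period (80 samples at 8 kHz) stays the same in both outputs.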

Thank you for your time!

The goal of this test is to compare the perceptual quality of time-scaled reconstructions of speech recordings, produced by several algorithms at several time-scale factors.

You will listen to several artificially time-scaled speech waveforms. The time-scale factor varies from 0.5 to 6. Quality is judged by the absence of artifacts such as chorusing, buzziness, whispering, or a metallic/robotic voice.

First, play the original sound (labeled Original) as many times as you want. Then play the other sounds, again as many times as you want, and select the time-scaled speech signal that has the highest quality. The original signal is given as a reference.

Recommendations

If there is any technical problem with a sound, select Prob.
Use headphones. Do not use earphones or speakers.
Verify that the sound is loud enough to hear the details properly.
Do the test in a quiet place.
Take the time to listen!
Please do not stop a sound before it finishes!
Please do not play audio files simultaneously!
Before starting the test, do not hesitate to ask me any questions.

The time-scaled sounds listed below use factors ranging from 0.5 to 6, in the order shown.

The test

Please select the sound that is most naturally time-scaled relative to the Original.

FCnas_magnifique
Original	WSOLA	HNM	aHM	aHNM	aHNM + Fmax=5500Hz	Prob

0.5
0.8
1.2
1.5
2.0
2.5
3.0
4.0
5.0
6.0

Page last modified on November 19, 2018, at 05:47 PM