ENRICH

INTERSPEECH2020 Demo: Neural Based Speech Enrichment using Greek Harvard Corpus

Mr. Muhammed Shifas PV
Speech Signal Processing Lab (SSPL)
University of Crete (UoC), Greece

Email: shifaspv@csd.uoc.gr


Abstract :- Speech heard in noise is hard to understand because the noise masks it. One could raise the volume of the playback system, for example by repeatedly pressing the volume button on a TV remote, but the loudness can then become painful. My research focuses on developing models that improve speech intelligibility without touching the volume button, that is, while keeping the loudness level unchanged. To this end, we have developed a neural model (wSSDRC) trained to produce intelligibility equivalent to that of the signal processing model (SSDRC). The model was trained on the Greek Harvard (GrHarvard) corpus [1]. A TensorFlow implementation of the model is publicly available at this link. A few samples from the trained model are presented below.


Listen to the samples in quiet

Plain speech SSDRC wSSDRC


Listen to the same samples in speech-shaped noise (SSN) at a level of -7 dB

Plain speech SSDRC wSSDRC

* Samples are provided with the speaker's consent.
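
For readers who want to reproduce a noisy listening condition like the one above, the following is a minimal sketch of mixing a speech signal with noise at a target signal-to-noise ratio. It rests on assumptions that this page does not confirm: that the "-7 dB" figure denotes an SNR computed from mean signal power, and that the noise is simply scaled and added to the speech. The function name mix_at_snr is hypothetical and is not taken from the released implementation, and the white noise used in the usage lines is only a stand-in for real speech-shaped noise.

```python
import numpy as np


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise power ratio equals `snr_db`, then add it.

    Assumption: SNR is defined on mean power over the whole signal.
    Returns the mixture and the gain that was applied to the noise.
    """
    speech = np.asarray(speech, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)[: len(speech)]
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")

    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    if noise_power == 0.0:
        raise ValueError("noise has zero power")

    # Target noise power is speech_power / 10^(snr_db / 10).
    gain = np.sqrt(speech_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return speech + gain * noise, gain


# Usage with synthetic signals (placeholders for a real utterance and real SSN).
rng = np.random.default_rng(0)
speech = rng.standard_normal(16000)
noise = rng.standard_normal(16000)
mixture, noise_gain = mix_at_snr(speech, noise, snr_db=-7.0)
```

Only the noise is scaled here, so the speech level stays fixed across conditions; this matches the idea of comparing plain and modified speech without changing their level.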

[1] A. Sfakianaki, "Designing a Modern Greek sentence corpus for audiological and speech technology research," in Proc. 14th International Conference on Greek Linguistics (ICGL14), 2019 (in press). https://www.csd.uoc.gr/~asfakianaki/GrH.html

Acknowledgment: This work was funded by the E.U. Horizon 2020 Grant Agreement 675324, Marie Sklodowska-Curie Innovative Training Network, ENRICH.