We will present novel ideas for building an end-to-end speaker recognition system based on deep learning. The approach aims to model both the speaker and the phonetic information in a speech utterance through specific hidden representations of a deep neural network. Its performance is measured on a standard task (RSR 2015) and compared with that of conventional speaker recognition systems. A large relative improvement of about 50% in equal error rate has been observed for a fixed-phrase condition.


End-to-end approach for recognizing speakers from audio
Subhadeep Dey, Idiap Research Institute
19 April 2018 · 11:09 a.m.