Identity Vector Extraction Using Shared Mixture of PLDA for Short-Time Speaker Recognition

WANG Wenchao; XU Ji; YAN Yonghong

doi:10.1049/cje.2018.06.005

Citation: WANG Wenchao, XU Ji, YAN Yonghong. Identity Vector Extraction Using Shared Mixture of PLDA for Short-Time Speaker Recognition[J]. Chinese Journal of Electronics, 2019, 28(2): 357-363. DOI: 10.1049/cje.2018.06.005

Abstract

State-of-the-art speaker recognition systems degrade rapidly when dealing with short-time utterances. Identity vectors (i-vectors) extracted from short utterances are known to carry large uncertainties, and the standard probabilistic linear discriminant analysis (PLDA) method cannot exploit this uncertainty to reduce the effect of duration variation. In this work, we use a shared mixture of PLDA (SM-PLDA) to remodel the i-vectors using their uncertainties. SM-PLDA is an improved generative model with a shared intrinsic factor, which can be regarded as an identity vector containing speaker identification information and can itself be modeled by PLDA. Performance is evaluated using both the equal error rate and the minimum detection cost function. Experiments on the National Institute of Standards and Technology (NIST) Speaker Recognition Evaluation (SRE) 2010 extended tasks show that the proposed method achieves significant improvements over i-vector/PLDA and several other advanced methods.
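
For readers unfamiliar with the two evaluation metrics named in the abstract, the following is a minimal sketch of how an equal error rate and a normalized minimum detection cost can be computed from verification scores. It is not the authors' evaluation code. The function names, the toy scores, and the cost parameters (`p_target`, `c_miss`, `c_fa`) are assumptions made for illustration; the abstract does not state which operating point was used.

```python
def error_rates(target_scores, nontarget_scores):
    """Return (threshold, miss_rate, false_alarm_rate) for each candidate threshold."""
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    thresholds.append(float("inf"))  # operating point that rejects every trial
    points = []
    for t in thresholds:
        # A trial is accepted when its score is >= t.
        p_miss = sum(s < t for s in target_scores) / len(target_scores)
        p_fa = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        points.append((t, p_miss, p_fa))
    return points


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate EER: average of the two rates where they are closest."""
    points = error_rates(target_scores, nontarget_scores)
    _, p_miss, p_fa = min(points, key=lambda p: abs(p[1] - p[2]))
    return (p_miss + p_fa) / 2


def min_dcf(target_scores, nontarget_scores, p_target=0.001, c_miss=1.0, c_fa=1.0):
    """Minimum detection cost over all thresholds, normalized by the trivial-system cost.

    The default p_target, c_miss, and c_fa are illustrative assumptions.
    """
    points = error_rates(target_scores, nontarget_scores)
    costs = [c_miss * p_target * pm + c_fa * (1 - p_target) * pf for _, pm, pf in points]
    trivial = min(c_miss * p_target, c_fa * (1 - p_target))
    return min(costs) / trivial


# Toy scores (hypothetical): higher means "more likely the same speaker".
targets = [2.0, 1.5, 0.5, 3.0]
nontargets = [0.0, 1.0, -1.0, 0.7]
eer = equal_error_rate(targets, nontargets)
dcf = min_dcf(targets, nontargets)
print(f"EER = {eer:.3f}, minDCF = {dcf:.3f}")
```

The sketch scans every observed score as a threshold, which is quadratic in the number of trials; that is acceptable for a small illustration, but a sorted cumulative count would be preferable for a full evaluation trial list.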