Speaker gender identification based on audio fractal dimension and pitch feature.

Zhenhua WANG; Cuirong YANG; Wei WU; Yingle FAN

Authors: Zhenhua WANG¹; Cuirong YANG; Wei WU; Yingle FAN
Author Information

1. Biomedical Engineering & Instrument Institute, Hangzhou Dianzi University, Hangzhou 310018, China. zhenhua0987@eyou.com
Publication Type: Journal Article
MeSH: Algorithms; Artificial Intelligence; Biometry/methods; Humans; Nonlinear Dynamics; Pattern Recognition, Automated/methods; Pitch Discrimination; Sex Characteristics; Signal Processing, Computer-Assisted; Speech; Speech Acoustics; Voice
From: Journal of Biomedical Engineering 2008;25(4):805-810
Country: China
Language: Chinese
Abstract: Automatic speaker gender identification based on voice features is an important task in voice processing and analysis. This paper adds non-linear parameters, such as fractal dimension, to the feature space in order to improve on the description of speaker gender given by conventional linear parameters alone. Pitch is extracted using the lifting scheme, and the audio fractal dimension is computed. Then, based on Takens' theorem, the time-delay method is used to reconstruct the phase space of the fractal dimension sequence, and the complexity of the fractal dimension is obtained by calculating its approximate entropy. A three-dimensional feature vector, consisting of the pitch, the fractal dimension, and the fractal dimension complexity, is then used for speaker gender identification. Experimental results show that, compared with using a single linear parameter such as pitch, adding the non-linear parameters makes the method more accurate and robust, and thus provides a new approach to speaker gender identification.
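
The phase-space reconstruction and approximate entropy steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the embedding dimension, the delay, or the tolerance, so the defaults below (dim=2, tau=1, tolerance of 0.2 times the standard deviation) are assumptions taken from common practice, and the function names are hypothetical. The input is any numeric sequence, for example a per-frame fractal dimension sequence computed elsewhere.

```python
import math


def delay_embed(series, dim, tau=1):
    """Time-delay embedding: vectors [x[i], x[i+tau], ..., x[i+(dim-1)*tau]]."""
    n = len(series) - (dim - 1) * tau
    if n <= 0:
        raise ValueError("series too short for this dim and tau")
    return [[series[i + k * tau] for k in range(dim)] for i in range(n)]


def _phi(series, dim, tau, r):
    """Mean log-fraction of delay vectors within Chebyshev distance r."""
    vectors = delay_embed(series, dim, tau)
    n = len(vectors)
    total = 0.0
    for a in vectors:
        # Self-matches are counted, so count >= 1 and the log is defined.
        count = sum(
            1 for b in vectors
            if max(abs(p - q) for p, q in zip(a, b)) <= r
        )
        total += math.log(count / n)
    return total / n


def approximate_entropy(series, dim=2, tau=1, r_factor=0.2):
    """Approximate entropy: phi(dim) - phi(dim + 1).

    dim, tau, and r_factor are assumed defaults, not values from the paper.
    """
    mean = sum(series) / len(series)
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / len(series))
    r = r_factor * std
    return _phi(series, dim, tau, r) - _phi(series, dim + 1, tau, r)


# Usage: a regular sequence scores lower than an irregular one.
periodic = [0.0, 1.0] * 50
chaotic = [0.3]
for _ in range(199):
    chaotic.append(4.0 * chaotic[-1] * (1.0 - chaotic[-1]))  # logistic map

apen_periodic = approximate_entropy(periodic)
apen_chaotic = approximate_entropy(chaotic)
```

In a setup like the one the abstract describes, the resulting value would form the third component of the feature vector, alongside the pitch and the fractal dimension itself. The pairwise comparison is quadratic in the sequence length, which is acceptable for short per-utterance sequences.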