Voice Activity Detection Method Based on Multi-valued Coarse-graining Lempel-Ziv Complexity


Huan Zhao, Gangjin Wang, Cheng Xu, Fei Yu




One of the key issues in practical speech processing is to locate precisely endpoints of the input utterance to be free of non- speech regions. Although lots of studies have been performed to solve this problem, the operation of existing voice activity detection (VAD) algorithms is still far away from ideal. This paper proposes a novel robust feature for VAD method that is based on multi-valued coarse- graining Lempel-Ziv Complexity (MLZC), which is an improved algorithm of the binary coarse-graining Lempel-Ziv Complexity (BLZC). In addition, we use fuzzy c-Means clustering algorithm and the Bayesian information criterion algorithm to estimate the thresholds of the MLZC characteristic, and adopt the dual-thresholds method for VAD. Experimental results on the TIMIT continuous speech database show that at low SNR environments, the detection performance of the proposed MLZC method is superior to the VAD in GSM ARM, G.729 and BLZC method.