now publishers - A sampling-based speaker clustering using utterance-oriented Dirichlet process mixture model and its evaluation on large-scale data

APSIPA Transactions on Signal and Information Processing > Vol 4 > Issue 1

Naohiro Tawara, Waseda University, Japan, tawara@pcl.cs.waseda.ac.jp; Tetsuji Ogawa, Waseda University, Japan; Shinji Watanabe, Mitsubishi Electric Research Laboratories, USA; Atsushi Nakamura, Nagoya City University, Japan; Tetsunori Kobayashi, Waseda University, Japan

Suggested Citation

Naohiro Tawara, Tetsuji Ogawa, Shinji Watanabe, Atsushi Nakamura and Tetsunori Kobayashi (2015), "A sampling-based speaker clustering using utterance-oriented Dirichlet process mixture model and its evaluation on large-scale data", APSIPA Transactions on Signal and Information Processing: Vol. 4: No. 1, e16. http://dx.doi.org/10.1017/ATSIP.2015.19

Publication Date: 28 Oct 2015

Keywords

Sampling approach, Non-parametric Bayesian model, Gibbs sampling, Utterance-oriented Dirichlet process mixture model, Speaker clustering

Journal details

Open Access

This is published under the terms of the Creative Commons Attribution licence.

In this article:

Abstract

An infinite mixture model is applied to model-based speaker clustering with sampling-based optimization, which makes it possible to estimate the number of speakers. For this purpose, a non-parametric Bayesian modeling framework is implemented with Markov chain Monte Carlo sampling and incorporated into the utterance-oriented speaker model. The proposed model is called the utterance-oriented Dirichlet process mixture model (UO-DPMM). The present paper demonstrates that UO-DPMM can be applied successfully to large-scale data and outperforms conventional hierarchical agglomerative clustering, especially for large numbers of utterances.

DOI:10.1017/ATSIP.2015.19

I. INTRODUCTION
II. UTTERANCE-ORIENTED MIXTURE MODEL FOR FINITE SPEAKERS
III. UTTERANCE-ORIENTED MIXTURE MODEL FOR INFINITE SPEAKERS
IV. SPEAKER CLUSTERING EXPERIMENTS
V. DISCUSSION
VI. CONCLUSION AND FUTURE WORK
