APSIPA Transactions on Signal and Information Processing > Vol 5 > Issue 1

Nested Gibbs sampling for mixture-of-mixture model and its application to speaker clustering

Naohiro Tawara, Waseda University, Japan, tawara@pcl.cs.waseda.ac.jp; Tetsuji Ogawa, Waseda University, Japan; Shinji Watanabe, Mitsubishi Electric Research Laboratories, USA; Tetsunori Kobayashi, Waseda University, Japan

Suggested Citation

Naohiro Tawara, Tetsuji Ogawa, Shinji Watanabe and Tetsunori Kobayashi (2016), "Nested Gibbs sampling for mixture-of-mixture model and its application to speaker clustering", APSIPA Transactions on Signal and Information Processing: Vol. 5: No. 1, e16. http://dx.doi.org/10.1017/ATSIP.2016.15

Publication Date: 31 Aug 2016

Keywords

Fully Bayesian approach, Markov chain Monte Carlo, Nested Gibbs sampling, Mixture-of-mixture model, Speaker clustering

Open Access

This is published under the terms of the Creative Commons Attribution licence.

Abstract

This paper proposes a novel model estimation method that uses nested Gibbs sampling to estimate a mixture-of-mixture model, in which the distribution of each component is itself represented by a mixture model. This model is suitable for analyzing multilevel data, such as videos and acoustic signals, which are composed of frame-wise observations. Deterministic procedures, such as the expectation–maximization algorithm, have been employed to estimate these kinds of models, but this approach often suffers from a large bias when the amount of data is limited. To avoid this problem, we introduce a Markov chain Monte Carlo-based model estimation method. In particular, we aim to identify a suitable sampling method for mixture-of-mixture models. Gibbs sampling is a possible approach, but it can easily become trapped in local optima when each component is represented by a multi-modal distribution. Thus, we propose a novel Gibbs sampling method, called “nested Gibbs sampling,” which represents the lower-level (fine) data structure based on elemental mixture distributions and the higher-level (coarse) data structure based on mixture-of-mixture distributions. We applied this method to a speaker clustering problem and conducted experiments under various conditions. The results demonstrated that the proposed method outperformed conventional sampling-based, variational Bayesian, and hierarchical agglomerative methods.

DOI:10.1017/ATSIP.2016.15

I. INTRODUCTION
II. FORMULATION
III. MODEL INFERENCE BASED ON FULLY BAYESIAN APPROACH
IV. IMPLEMENTATION OF MCMC-BASED MODEL ESTIMATION
V. SPEAKER CLUSTERING EXPERIMENTS
VI. CONCLUSION AND FUTURE WORK
