Skip to main content Skip to navigation

Jurgen van Gael: Beam Sampling for Infinite Hidden Markov Models


The Infinite Hidden Markov Model (iHMM) [1,2] is an extension of the classical Hidden Markov Model widely used in machine learning and bioinformatics. As a tool to model sequential data, Hidden Markov Models suffer from the need to specify the number of hidden states. Although model selection and model averaging are widely used in this context, the Infinite Hidden Markov Model offers a nonparametric alternative. The core idea of the iHMM is to use Dirichlet Processes to define the distribution of the rows of a Markov Model transition matrix. As such, the number of used states can automatically be adapted during learning; or can be integrated over for prediction. Until now, the Gibbs sampler was the only known inference algorithm for the iHMM. This is unfortunate as the Gibbs sampler is known to be weak for strongly correlated data; which is often the case in sequential or time series data. Moreover, it is suprising that we have powerful inference algorithms for finite HMM's (the forward-backward or Baum-Welch dynamic programming algorithms) but cannot apply these methods for the iHMM. In this work, we propose a method called the “Beam Sampler” which combines ideas from slice sampling and dynamic programming for inference in the iHMM. We show that the beam sampler has some interesting properties such as: (1) it is less susceptible to strong correlations in the data than the Gibbs sampler, (2) it can handle non-conjugacy in the model more easily than the Gibbs sampler. We also show that the scope of the beam sampler idea goes beyond training the Infinite Hidden Markov Model, but can also be used to efficiently train finite HMM's.


[1] MJ Beal, Z Ghahramani, CE Rasmussen, The infinite hidden Markov model, Advances in Neural Information Processing Systems, 2002.

[2] YW Teh, MI Jordan, MJ Beal, DM Blei, Hierarchical Dirichlet Processes, Journal of the American Statistical Association, 2006.