
Clustering high-dimensional directional data with the Watson EM algorithm

Features

Machine learning applications often involve data that can be treated as unit vectors on a d-dimensional hypersphere, that is, data that are directional in nature. The embeddings produced by spectral clustering techniques are one example of directional data, and they can take different shapes on the hypersphere depending on the structure of the original data. Other examples of directional data include text and some sub-domains of bioinformatics.
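As a minimal sketch of what "directional" means in practice, the snippet below projects arbitrary feature rows onto the unit hypersphere by dividing each row by its Euclidean norm. The helper name `to_unit_sphere` and the `eps` guard are illustrative assumptions, not part of the paper's code.

```python
import numpy as np

def to_unit_sphere(X, eps=1e-12):
    """Scale each row of X to unit Euclidean length (hypothetical helper).

    eps is an assumed guard that avoids division by zero for all-zero rows.
    """
    X = np.asarray(X, dtype=float)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return X / np.maximum(norms, eps)

# Example: three 2-D points mapped onto the unit circle.
points = to_unit_sphere([[3.0, 4.0], [0.0, 2.0], [1.0, 1.0]])
```

After this step only the direction of each row is kept, which is the setting that directional distributions such as von Mises-Fisher and Watson are defined for.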

The Watson distribution for directional data has a tractable form and more modeling capability than the simpler von Mises-Fisher distribution. In everyday statistical practice it has mostly been used in low-dimensional cases (up to about 4 dimensions). To use a mixture of Watson distributions as a generative model on a hypersphere, one has to be able to estimate the parameters efficiently, which is tricky because the distribution involves the Kummer function, which has to be approximated numerically. The model also lets us explain how to choose the right embedding dimension for spectral clustering. We analyze the algorithm on a generated example and demonstrate its superiority over existing algorithms through results on real datasets.
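To make the role of the Kummer function concrete, here is a hedged sketch of the Watson log-density in its standard textbook form, log W(x; mu, kappa) = log Gamma(d/2) - log 2 - (d/2) log pi - log M(1/2, d/2, kappa) + kappa (mu^T x)^2, where M is Kummer's confluent hypergeometric function. The function name `watson_log_pdf` is an assumption made for illustration, and the direct call to `scipy.special.hyp1f1` is a naive evaluation, not the approximation scheme used in the paper.

```python
import numpy as np
from scipy.special import gammaln, hyp1f1

def watson_log_pdf(X, mu, kappa):
    """Log-density of a Watson distribution on the unit sphere in R^d.

    X: (n, d) array of unit vectors, mu: (d,) unit vector, kappa: real scalar.
    Naive sketch: hyp1f1 is evaluated directly and may overflow or lose
    accuracy when kappa or d is large.
    """
    X = np.atleast_2d(np.asarray(X, dtype=float))
    d = X.shape[1]
    # Log of the normalizing constant, which contains the Kummer function M.
    log_c = (gammaln(d / 2.0) - np.log(2.0) - (d / 2.0) * np.log(np.pi)
             - np.log(hyp1f1(0.5, d / 2.0, kappa)))
    # The density depends on x only through (mu^T x)^2, so x and -x
    # receive the same value.
    return log_c + kappa * (X @ mu) ** 2
```

With kappa = 0 the expression reduces to the uniform density on the sphere. The direct evaluation of M becomes unreliable as the dimension and concentration grow, which is one way to see why efficient parameter estimation in high dimensions needs extra care.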

The flexibility of the Watson distributions can lead to better performance, as illustrated with this toy example.

Label assignment using vMF and Watson distributions
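For readers who want to see how label assignment under a Watson mixture might look, the following is a minimal sketch of the generic mixture-model rule: compute each component's posterior responsibility for every point and pick the largest. The names `watson_log_pdf` and `assign_labels`, and the parameter values in the usage example, are assumptions for illustration; this is not the paper's MATLAB implementation.

```python
import numpy as np
from scipy.special import gammaln, hyp1f1, logsumexp

def watson_log_pdf(X, mu, kappa):
    """Naive Watson log-density on the unit sphere (illustrative helper)."""
    d = X.shape[1]
    log_c = (gammaln(d / 2.0) - np.log(2.0) - (d / 2.0) * np.log(np.pi)
             - np.log(hyp1f1(0.5, d / 2.0, kappa)))
    return log_c + kappa * (X @ mu) ** 2

def assign_labels(X, mus, kappas, weights):
    """Hard labels and soft responsibilities under a Watson mixture.

    mus: list of unit mean axes, kappas: list of concentrations,
    weights: mixing proportions that sum to 1 (all hypothetical inputs).
    """
    X = np.atleast_2d(np.asarray(X, dtype=float))
    # log(weight_k) + log W(x | mu_k, kappa_k) for every point and component.
    log_joint = np.stack(
        [np.log(w) + watson_log_pdf(X, np.asarray(m, dtype=float), k)
         for m, k, w in zip(mus, kappas, weights)],
        axis=1,
    )
    # Normalize in log space for numerical stability.
    resp = np.exp(log_joint - logsumexp(log_joint, axis=1, keepdims=True))
    return resp.argmax(axis=1), resp

# Usage: two components with axes along e1 and e2 in three dimensions.
X = np.array([[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
labels, resp = assign_labels(
    X, mus=[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], kappas=[20.0, 20.0], weights=[0.5, 0.5]
)
```

In this sketch the antipodal points (1, 0, 0) and (-1, 0, 0) land in the same component, because the Watson density is symmetric about its axis, whereas a von Mises-Fisher component centered at (1, 0, 0) would give them very different densities.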

The algorithm's strong point is its ability to deal with unit-length data distributed on a hypersphere. It is not yet very fast, despite the tricks used.

Papers

Mixture of Watson Distributions: A Generative Model for Hyperspherical Embeddings

Code

The MATLAB code from our paper will soon be available here.

Blog

Consider checking out my blog

Email me