Generalized Identifiability Bounds for Mixture Models with Grouped Samples

Robert A. Vandermeulen

René Saitenmacher

February 19, 2024

Recent work has shown that finite mixture models with m components are identifiable, while making no assumptions on the mixture components, so long as one has access to groups of samples of size 2m - 1 which are known to come from the same mixture component. In this work we generalize that result and show that, if every subset of k mixture components of a mixture model is linearly independent, then that mixture model is identifiable with only (2m - 1)/(k - 1) samples per group. We further show that this value cannot be improved. We prove an analogous result for a stronger form of identifiability known as “determinedness” along with a corresponding lower bound. This independence assumption almost surely holds if mixture components are chosen randomly from a k-dimensional space. We describe some implications of our results for multinomial mixture models and topic modeling.
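
To make the bound in the abstract concrete, the short sketch below evaluates (2m - 1)/(k - 1) for a few values of m and k. The function name `min_group_size` is hypothetical, and rounding up to a whole number of samples is an assumption made here for illustration; the abstract states the bound only as a ratio.

```python
def min_group_size(m: int, k: int) -> int:
    """Smallest integer n with n >= (2m - 1)/(k - 1).

    m: number of mixture components.
    k: size of the component subsets assumed to be linearly independent.
    Rounding up to an integer is an assumption of this sketch.
    """
    if m < 1:
        raise ValueError("m must be at least 1")
    if k < 2:
        raise ValueError("k must be at least 2 so that k - 1 is positive")
    # Integer ceiling division, avoiding floating-point rounding.
    return -(-(2 * m - 1) // (k - 1))


if __name__ == "__main__":
    for m, k in [(5, 2), (5, 3), (5, 4)]:
        print(f"m={m}, k={k}: group size {min_group_size(m, k)}")
```

With k = 2 the expression reduces to 2m - 1, the group size from the earlier result cited in the abstract; larger k lowers the required group size.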

https://doi.org/10.1109/TIT.2024.3367433

