Authors:
Franzolini, Beatrice,
Rebaudo, Giovanni
Title:
Entropy regularization in probabilistic clustering
Journal:
Statistical methods & applications : Journal of the Italian Statistical Society
Year:
2024 - Volume:
33 - Issue:
1 - First page:
37 - Last page:
60
Bayesian nonparametric mixture models are widely used to cluster observations. However, one major drawback of the approach is that the estimated partition often presents unbalanced cluster frequencies, with only a few dominating clusters and a large number of sparsely populated ones. This feature translates into results that are often uninterpretable unless we are willing to ignore a substantial number of observations and clusters. Interpreting the posterior distribution as a penalized likelihood, we show how this imbalance can be explained as a direct consequence of the cost functions involved in estimating the partition. In light of our findings, we propose a novel Bayesian estimator of the clustering configuration. The proposed estimator is equivalent to a post-processing procedure that reduces the number of sparsely populated clusters and enhances interpretability. The procedure takes the form of an entropy regularization of the Bayesian estimate. While being computationally convenient compared with alternative strategies, it is also theoretically justified as a correction to the Bayesian loss function used for point estimation and, as such, can be applied to any posterior distribution of clusters, regardless of the specific model used.
SICI: 1618-2510(2024)33:1<37:ERIPC>2.0.ZU;2-R