Authors: Franzolini, Beatrice; Rebaudo, Giovanni
Title: Entropy regularization in probabilistic clustering
Journal: Statistical methods & applications : Journal of the Italian Statistical Society
Year: 2024 - Volume: 33 - Issue: 1 - First page: 37 - Last page: 60

Bayesian nonparametric mixture models are widely used to cluster observations. However, one major drawback of the approach is that the estimated partition often has unbalanced cluster frequencies, with only a few dominating clusters and a large number of sparsely populated ones. As a result, the estimates are often uninterpretable unless one is willing to ignore a substantial number of observations and clusters. Interpreting the posterior distribution as a penalized likelihood, we show how the imbalance can be explained as a direct consequence of the cost functions involved in estimating the partition. In light of our findings, we propose a novel Bayesian estimator of the clustering configuration. The proposed estimator is equivalent to a post-processing procedure that reduces the number of sparsely populated clusters and enhances interpretability. The procedure takes the form of entropy regularization of the Bayesian estimate. Besides being computationally convenient compared with alternative strategies, it is also theoretically justified as a correction to the Bayesian loss function used for point estimation and, as such, can be applied to any posterior distribution of clusters, regardless of the specific model used.
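
The idea of entropy-regularizing a Bayesian point estimate of the partition can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the abstract: the Binder loss as the base loss, a penalty equal to a weight `lam` times the Shannon entropy of the cluster frequencies, and a search restricted to the partitions that appear among the posterior draws. The function names and the toy draws are hypothetical, and the estimator defined in the article may differ.

```python
import numpy as np

def partition_entropy(labels):
    """Shannon entropy (in nats) of the cluster frequencies of one partition."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log(p)).sum())

def coclustering(labels):
    """n x n matrix with 1.0 where two observations share a cluster."""
    labels = np.asarray(labels)
    return (labels[:, None] == labels[None, :]).astype(float)

def entropy_regularized_estimate(samples, lam):
    """Pick the sampled partition minimizing
    (expected Binder loss) + lam * (partition entropy).

    Assumption: candidates are limited to the posterior draws themselves.
    """
    samples = np.asarray(samples)
    # Posterior similarity matrix: estimated co-clustering probabilities.
    psm = np.mean([coclustering(s) for s in samples], axis=0)
    upper = np.triu_indices(samples.shape[1], k=1)  # each pair counted once
    best, best_score = None, np.inf
    for cand in samples:
        binder = np.abs(coclustering(cand)[upper] - psm[upper]).sum()
        score = binder + lam * partition_entropy(cand)
        if score < best_score:
            best, best_score = cand, score
    return best

# Hypothetical posterior draws for six observations: two draws keep
# the last observation in its own cluster, one draw merges it.
toy_samples = [
    [0, 0, 0, 1, 1, 2],
    [0, 0, 0, 1, 1, 2],
    [0, 0, 0, 1, 1, 1],
]
unpenalized = entropy_regularized_estimate(toy_samples, lam=0.0)
penalized = entropy_regularized_estimate(toy_samples, lam=5.0)
```

In this toy setting the unpenalized estimate keeps the singleton cluster, while the penalized one absorbs it into a larger cluster, which mirrors the post-processing behavior described above. The value `lam=5.0` is arbitrary and chosen only to make the effect visible.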




SICI: 1618-2510(2024)33:1<37:ERIPC>2.0.ZU;2-R
