A proposal for supervised clustering with Dirichlet Process using labels

Billy Peralta, Alberto Caro, Alvaro Soto

Research output: Article

3 Citations (Scopus)

Abstract

Supervised clustering is an emerging area of machine learning in which the goal is to find class-uniform clusters. However, typical state-of-the-art algorithms use a fixed number of clusters. In this work, we propose a variation of non-parametric Bayesian modeling for supervised clustering. Our approach models the clusters as a mixture of Gaussians, with a constraint that encourages clusters of points with the same label. To estimate the number of clusters, we assume a priori a countably infinite number of clusters, using a variation of the Dirichlet Process model over the prior distribution. In our experiments, we show that our technique typically outperforms other clustering techniques.
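As a rough illustration of the kind of prior the abstract describes, the sketch below draws cluster assignments from a Chinese restaurant process, a standard sequential view of the Dirichlet Process in which the number of clusters is not fixed in advance. The `bonus` parameter, the weighting formula, and the `purity` helper are assumptions made for this example only; they are not the model from the paper, and the Gaussian likelihood and the inference procedure are left out entirely.

```python
import random
from collections import Counter


def label_aware_crp(labels, alpha=1.0, bonus=0.0, seed=0):
    """Sequentially assign points to clusters with a Chinese restaurant process.

    alpha: concentration parameter (> 0); larger values open new clusters
           more often.
    bonus: hypothetical extra weight (>= 0) for clusters whose members share
           the incoming point's label; bonus=0 recovers the plain CRP.
           This weighting is an illustrative assumption, not the paper's model.
    """
    rng = random.Random(seed)
    assignments = []      # cluster index chosen for each point
    cluster_labels = []   # one Counter of label counts per cluster
    for label in labels:
        weights = []
        for counts in cluster_labels:
            size = sum(counts.values())
            same = counts[label] / size  # fraction of members with this label
            weights.append(size * (1.0 + bonus * same))
        weights.append(alpha)            # weight of opening a new cluster
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(cluster_labels):
            cluster_labels.append(Counter())
        cluster_labels[k][label] += 1
        assignments.append(k)
    return assignments


def purity(assignments, labels):
    """Fraction of points whose label is the majority label of their cluster."""
    by_cluster = {}
    for k, label in zip(assignments, labels):
        by_cluster.setdefault(k, Counter())[label] += 1
    majority = sum(c.most_common(1)[0][1] for c in by_cluster.values())
    return majority / len(labels)


# Toy data: 300 points with three evenly interleaved class labels.
labels = [i % 3 for i in range(300)]
plain = label_aware_crp(labels, alpha=2.0, bonus=0.0, seed=1)
biased = label_aware_crp(labels, alpha=2.0, bonus=50.0, seed=1)
print("plain CRP:  ", len(set(plain)), "clusters, purity", purity(plain, labels))
print("label-aware:", len(set(biased)), "clusters, purity", purity(biased, labels))
```

The number of clusters in each run is a random outcome of the process rather than an input. Comparing the two printed lines gives a feel for how a label-dependent weight can push a partition toward class-uniform clusters.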

Original language: English
Pages (from-to): 52-57
Number of pages: 6
Journal: Pattern Recognition Letters
Volume: 80
DOI: 10.1016/j.patrec.2016.05.019
Status: Published - 1 Sep 2016

Fingerprint

Learning systems
Labels
Experiments

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Computer Vision and Pattern Recognition
  • Artificial Intelligence

Cite this

@article{8cad525f2de1483cb2de15919ede5502,
title = "A proposal for supervised clustering with Dirichlet Process using labels",
abstract = "Supervised clustering is an emerging area of machine learning, where the goal is to find class-uniform clusters. However, typical state-of-the-art algorithms use a fixed number of clusters. In this work, we propose a variation of a non-parametric Bayesian modeling for supervised clustering. Our approach consists of modeling the clusters as a mixture of Gaussians with the constraint of encouraging clusters of points with the same label. In order to estimate the number of clusters, we assume a-priori a countably infinite number of clusters using a variation of Dirichlet Process model over the prior distribution. In our experiments, we show that our technique typically outperforms the results of other clustering techniques.",
keywords = "Clustering, Dirichlet Process, Supervised clustering",
author = "Billy Peralta and Alberto Caro and Alvaro Soto",
year = "2016",
month = "9",
day = "1",
doi = "10.1016/j.patrec.2016.05.019",
language = "English",
volume = "80",
pages = "52--57",
journal = "Pattern Recognition Letters",
issn = "0167-8655",
publisher = "Elsevier",
}

A proposal for supervised clustering with Dirichlet Process using labels. / Peralta, Billy; Caro, Alberto; Soto, Alvaro.

In: Pattern Recognition Letters, Vol. 80, 01.09.2016, p. 52-57.

TY - JOUR

T1 - A proposal for supervised clustering with Dirichlet Process using labels

AU - Peralta, Billy

AU - Caro, Alberto

AU - Soto, Alvaro

PY - 2016/9/1

Y1 - 2016/9/1

N2 - Supervised clustering is an emerging area of machine learning, where the goal is to find class-uniform clusters. However, typical state-of-the-art algorithms use a fixed number of clusters. In this work, we propose a variation of a non-parametric Bayesian modeling for supervised clustering. Our approach consists of modeling the clusters as a mixture of Gaussians with the constraint of encouraging clusters of points with the same label. In order to estimate the number of clusters, we assume a-priori a countably infinite number of clusters using a variation of Dirichlet Process model over the prior distribution. In our experiments, we show that our technique typically outperforms the results of other clustering techniques.

AB - Supervised clustering is an emerging area of machine learning, where the goal is to find class-uniform clusters. However, typical state-of-the-art algorithms use a fixed number of clusters. In this work, we propose a variation of a non-parametric Bayesian modeling for supervised clustering. Our approach consists of modeling the clusters as a mixture of Gaussians with the constraint of encouraging clusters of points with the same label. In order to estimate the number of clusters, we assume a-priori a countably infinite number of clusters using a variation of Dirichlet Process model over the prior distribution. In our experiments, we show that our technique typically outperforms the results of other clustering techniques.

KW - Clustering

KW - Dirichlet Process

KW - Supervised clustering

UR - http://www.scopus.com/inward/record.url?scp=84973861004&partnerID=8YFLogxK

U2 - 10.1016/j.patrec.2016.05.019

DO - 10.1016/j.patrec.2016.05.019

M3 - Article

VL - 80

SP - 52

EP - 57

JO - Pattern Recognition Letters

JF - Pattern Recognition Letters

SN - 0167-8655

ER -