Mixture of experts with entropic regularization for data classification

Billy Peralta, Ariel Saavedra, Luis Caro, Alvaro Soto

Research output: Article

Abstract

Today, there is growing interest in the automatic classification of a variety of tasks, such as weather forecasting, product recommendations, intrusion detection, and people recognition. "Mixture-of-experts" is a well-known classification technique: a probabilistic model consisting of local expert classifiers weighted by a gate network, typically based on softmax functions, which allows it to learn complex patterns in the data. In this scheme, a data point is influenced by only one expert; as a result, the training process can be misguided on real datasets in which complex data need to be explained by multiple experts. In this work, we propose a variant of the regular mixture-of-experts model in which the classification cost is penalized by the Shannon entropy of the gating network in order to avoid a "winner-takes-all" output of the gating network. Experiments on several real datasets show the advantage of our approach, with improvements in mean accuracy of 3-6% on some datasets. In future work, we plan to embed feature selection into this model.
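For readers who want a concrete picture of the objective sketched in the abstract, the snippet below is a minimal NumPy illustration: it scores a softmax-gated mixture of linear experts with a cross-entropy classification cost plus a Shannon-entropy term on the gating distribution. The function names, parameter shapes, the penalty weight lam, and the sign convention (subtracting the gate entropy so that less peaked gates are rewarded) are assumptions made for this sketch; the paper's exact objective and training procedure are given in the article itself.

    import numpy as np

    def softmax(z, axis=-1):
        # Numerically stable softmax along the given axis.
        z = z - z.max(axis=axis, keepdims=True)
        e = np.exp(z)
        return e / e.sum(axis=axis, keepdims=True)

    def entropy_regularized_moe_loss(X, y_onehot, gate_W, expert_Ws, lam=0.1):
        # Illustrative shapes: X (n, d) inputs, y_onehot (n, c) labels,
        # gate_W (d, k) gate weights, expert_Ws (k, d, c) expert weights.
        gates = softmax(X @ gate_W)                                     # (n, k) gating weights per sample
        expert_probs = softmax(np.einsum('nd,kdc->nkc', X, expert_Ws))  # (n, k, c) per-expert class probabilities
        mix = np.einsum('nk,nkc->nc', gates, expert_probs)              # (n, c) mixture prediction
        cross_entropy = -np.mean(np.sum(y_onehot * np.log(mix + 1e-12), axis=1))
        gate_entropy = -np.mean(np.sum(gates * np.log(gates + 1e-12), axis=1))
        # Subtracting the gate entropy rewards less peaked gating distributions,
        # i.e. it pushes the model away from a winner-takes-all gate (assumed sign convention).
        return cross_entropy - lam * gate_entropy

In use, one would minimize this quantity over gate_W and expert_Ws with any gradient-based optimizer; larger values of lam spread responsibility across more experts, which is the behavior the entropic regularizer is meant to encourage.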

Original language: English
Article number: 190
Journal: Entropy
Volume: 21
Issue: 2
DOI: 10.3390/e21020190
Status: Published - 1 Feb 2019

Fingerprint

weather forecasting
classifiers
recommendations
education
entropy
costs
output
products

ASJC Scopus subject areas

  • Physics and Astronomy (all)

Cite this

Peralta, Billy ; Saavedra, Ariel ; Caro, Luis ; Soto, Alvaro. / Mixture of experts with entropic regularization for data classification. In: Entropy. 2019 ; Vol. 21, No. 2.
@article{b81af015f6fd43ef89bff1d61ce3dade,
title = "Mixture of experts with entropic regularization for data classification",
abstract = "Today, there is growing interest in the automatic classification of a variety of tasks, such as weather forecasting, product recommendations, intrusion detection, and people recognition. {"}Mixture-of-experts{"} is a well-known classification technique: a probabilistic model consisting of local expert classifiers weighted by a gate network, typically based on softmax functions, which allows it to learn complex patterns in the data. In this scheme, a data point is influenced by only one expert; as a result, the training process can be misguided on real datasets in which complex data need to be explained by multiple experts. In this work, we propose a variant of the regular mixture-of-experts model in which the classification cost is penalized by the Shannon entropy of the gating network in order to avoid a {"}winner-takes-all{"} output of the gating network. Experiments on several real datasets show the advantage of our approach, with improvements in mean accuracy of 3-6{\%} on some datasets. In future work, we plan to embed feature selection into this model.",
keywords = "Classification, Entropy, Mixture-of-experts, Regularization",
author = "Billy Peralta and Ariel Saavedra and Luis Caro and Alvaro Soto",
year = "2019",
month = "2",
day = "1",
doi = "10.3390/e21020190",
language = "English",
volume = "21",
journal = "Entropy",
issn = "1099-4300",
publisher = "Multidisciplinary Digital Publishing Institute (MDPI)",
number = "2",

}

Mixture of experts with entropic regularization for data classification. / Peralta, Billy; Saavedra, Ariel; Caro, Luis; Soto, Alvaro.

In: Entropy, Vol. 21, No. 2, 190, 01.02.2019.

Research output: Article

TY - JOUR

T1 - Mixture of experts with entropic regularization for data classification

AU - Peralta, Billy

AU - Saavedra, Ariel

AU - Caro, Luis

AU - Soto, Alvaro

PY - 2019/2/1

Y1 - 2019/2/1

N2 - Today, there is growing interest in the automatic classification of a variety of tasks, such as weather forecasting, product recommendations, intrusion detection, and people recognition. "Mixture-of-experts" is a well-known classification technique: a probabilistic model consisting of local expert classifiers weighted by a gate network, typically based on softmax functions, which allows it to learn complex patterns in the data. In this scheme, a data point is influenced by only one expert; as a result, the training process can be misguided on real datasets in which complex data need to be explained by multiple experts. In this work, we propose a variant of the regular mixture-of-experts model in which the classification cost is penalized by the Shannon entropy of the gating network in order to avoid a "winner-takes-all" output of the gating network. Experiments on several real datasets show the advantage of our approach, with improvements in mean accuracy of 3-6% on some datasets. In future work, we plan to embed feature selection into this model.

AB - Today, there is growing interest in the automatic classification of a variety of tasks, such as weather forecasting, product recommendations, intrusion detection, and people recognition. "Mixture-of-experts" is a well-known classification technique: a probabilistic model consisting of local expert classifiers weighted by a gate network, typically based on softmax functions, which allows it to learn complex patterns in the data. In this scheme, a data point is influenced by only one expert; as a result, the training process can be misguided on real datasets in which complex data need to be explained by multiple experts. In this work, we propose a variant of the regular mixture-of-experts model in which the classification cost is penalized by the Shannon entropy of the gating network in order to avoid a "winner-takes-all" output of the gating network. Experiments on several real datasets show the advantage of our approach, with improvements in mean accuracy of 3-6% on some datasets. In future work, we plan to embed feature selection into this model.

KW - Classification

KW - Entropy

KW - Mixture-of-experts

KW - Regularization

UR - http://www.scopus.com/inward/record.url?scp=85061968315&partnerID=8YFLogxK

U2 - 10.3390/e21020190

DO - 10.3390/e21020190

M3 - Article

AN - SCOPUS:85061968315

VL - 21

JO - Entropy

JF - Entropy

SN - 1099-4300

IS - 2

M1 - 190

ER -