|Title||Robust clustering of data collected via crowdsourcing|
|Publication Type||Conference Paper|
|Year of Publication||2017|
|Authors||Pagès-Zamora, A, Giannakis, G, López-Valcarce, R, Gimenez-Febrer, P|
|Conference Name||IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)|
Crowdsourcing approaches rely on the collective effort of multiple individuals to solve problems that require analysis of large data sets in a timely and accurate manner. The inexperience of participants, or annotators, motivates the use of robust techniques. Focusing on clustering setups, the data provided by all annotators are modeled here as a mixture of Gaussian components plus a uniformly distributed random variable that captures outliers. The proposed algorithm is based on the expectation-maximization algorithm and jointly provides soft assignments of data to clusters, ratings of annotators according to their performance, and an estimate of the number of Gaussian components in the non-Gaussian/Gaussian mixture model.
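To make the modeling idea concrete, the following is a minimal sketch of expectation-maximization for a mixture of Gaussian components plus one uniform outlier component. It is not the authors' algorithm: it omits the annotator ratings and the estimation of the number of components, and it assumes a fixed, known number of Gaussians `K`. The function name `em_gmm_uniform`, the choice of the data bounding box as the support of the uniform density, the random initialization, and the small regularization constants are all assumptions made for illustration.

```python
import numpy as np

def em_gmm_uniform(X, K, n_iter=100, seed=0):
    """EM for K Gaussians plus one uniform outlier component (illustrative sketch)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    N, d = X.shape

    # Assumption: outliers are uniform over the bounding box of the data,
    # and every dimension has a nonzero range.
    u_density = 1.0 / np.prod(X.max(axis=0) - X.min(axis=0))

    # Assumed initialization: random data points as means, the sample
    # covariance for every component, and equal mixing weights.
    means = X[rng.choice(N, size=K, replace=False)]
    base_cov = np.cov(X.T).reshape(d, d) + 1e-6 * np.eye(d)
    covs = np.array([base_cov.copy() for _ in range(K)])
    weights = np.full(K + 1, 1.0 / (K + 1))  # last entry: uniform component

    for _ in range(n_iter):
        # E-step: soft assignment of each point to each component.
        dens = np.empty((N, K + 1))
        for k in range(K):
            diff = X - means[k]
            inv = np.linalg.inv(covs[k])
            maha = np.einsum("ni,ij,nj->n", diff, inv, diff)
            norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(covs[k]))
            dens[:, k] = weights[k] * np.exp(-0.5 * maha) / norm
        dens[:, K] = weights[K] * u_density
        resp = dens / dens.sum(axis=1, keepdims=True)

        # M-step: update mixing weights, means, and covariances.
        Nk = resp.sum(axis=0) + 1e-12  # guard against empty components
        weights = Nk / Nk.sum()
        for k in range(K):
            means[k] = resp[:, k] @ X / Nk[k]
            diff = X - means[k]
            covs[k] = (resp[:, k, None] * diff).T @ diff / Nk[k]
            covs[k] += 1e-6 * np.eye(d)  # keep covariances invertible

    return weights, means, covs, resp
```

In this sketch, each row of `resp` holds the soft assignment of one data point, with the last column giving the posterior probability that the point is an outlier. Because the uniform component has no parameters other than its mixing weight, points far from every cluster are absorbed by it rather than distorting the Gaussian means and covariances.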