Fit Gaussian mixture model with weighted observations

Question

Wolfgang Schwanghart on 23 Nov 2019

Direct link to this question

https://it.mathworks.com/matlabcentral/answers/492759-fit-gaussian-mixture-model-with-weighted-observations

Answered: Omkar Mulekar on 5 Jun 2020

Hi everyone. Looking at the documentation for fitgmdist, I cannot find an option to weight observations. Is there a reason for this? Many functions in the Statistics and Machine Learning Toolbox support observation weights. Does anyone have an idea how to include weights, or can anyone point me to an alternative?

3 Comments

Wolfgang Schwanghart on 26 Nov 2019

This could be an option...

Adam Danz on 26 Nov 2019

If you end up giving that a try, keep in mind that the weights must be converted to integers and, depending on how that is done, this could vastly increase the number of data points. Feel free to pull me in if you decide to go down this route and get stuck.

In a sense, duplicating values in the data being fit strengthens their representation in the fit, which acts much like weighting.


Answer 1

Kaashyap Pappu on 26 Nov 2019


The function fitgmdist fits a distribution to a given data set. The points in such a data set generally belong to the same class, so a ‘weight’ parameter is not needed: you are essentially just fitting a distribution model to the given data.

Functions such as fitcknn and fitcsvm accept weights because they fit classification models. Weights become essential when the training data contain several classes in unequal proportions (class imbalance). The weights are used to account for this imbalance, which is why they are input arguments of those functions.

Hope this helps!


Answer 2

Jeff Miller on 26 Nov 2019


It's not exactly clear (to me either) what it means to weight the different observations in this context, but maybe you have something like this in mind:

You have observations X(1:n) with weights W(1:n). Let sumW = sum(W).

Make a new dataset Y with (say) 10000 observations consisting of

round(W(1)/sumW*10000) copies of X(1)

round(W(2)/sumW*10000) copies of X(2)

and so on: in general, round(W(i)/sumW*10000) copies of X(i).

Now use fitgmdist with Y. Every Y value will be weighted equally, but the different X's will have weights approximately proportional to their original W values, because their counts in Y will be in those proportions.

I hope that is clear.
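The replication scheme described above could be sketched roughly as follows. This is only an illustration under stated assumptions: X and W are synthetic stand-ins, the target size N = 10000 is the value used in the answer, and the 'Replicates' and 'RegularizationValue' settings are choices made here to keep the fit stable with many duplicated points.

```matlab
% Minimal sketch of the replication scheme; X and W are synthetic stand-ins.
rng(1);                                   % reproducible example
X = [randn(200,1); 5 + randn(200,1)];     % observations (n-by-1)
W = rand(size(X));                        % hypothetical nonnegative weights

N = 10000;                                % target size of the replicated data set
counts = round(W / sum(W) * N);           % integer number of copies per observation
Y = repelem(X, counts);                   % counts(i) copies of X(i), stacked

% Rounding means numel(Y) is only approximately N, and observations with
% very small weights can end up with zero copies.

% Two components; a small regularization value (an assumption made here)
% guards against a component collapsing onto a block of identical points.
gm = fitgmdist(Y, 2, 'Replicates', 5, 'RegularizationValue', 1e-6);
disp(gm.mu)                               % fitted component means
disp(gm.ComponentProportion)              % fitted mixing proportions
```

Because Y contains many duplicates, its size overstates the amount of independent information, so the log-likelihood, AIC, and BIC reported for this fit should not be compared with those of a fit to the original X.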

3 Comments

Wolfgang Schwanghart on 26 Nov 2019

Thanks for your answer. Let me give some context: the data that I am working with are points with attributes that are located on a linear network. Imagine a road network where the points are accidents. I am interested in where these accidents occur, for example as a function of distance from crossroads, so I calculate this distance along the network. A non-parametric estimate of how accidents depend on distance can be obtained with a kernel density estimator, but this estimate needs to account for the frequency distribution of distances in the network itself. A nice solution to such a nonparametric estimate is shown here and relies on relative risk.

Now let's assume that I see two humps in the risk-normalized kernel densities and I'd like to fit a Gaussian mixture model with two components to the observed distances. I think that I should also account for the relative risk, which reflects that some distances occur more often than others. My idea was to do this by weighting the observations with their inverse relative risks. But maybe I am wrong ...

Jeff Miller on 27 Nov 2019

Edited: Jeff Miller on 27 Nov 2019

What about generating a lot of pseudo-observations from the risk normalized kernel densities and then fitting the gmm to those?
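One way this suggestion might be sketched is shown below. It assumes that the risk normalization can be expressed as one weight per observation, which the thread does not confirm. The distances d, the weights w, and the number of pseudo-observations M are hypothetical. Sampling from a Gaussian kernel density estimate is done here by picking a data point with probability proportional to its weight and adding Gaussian noise scaled by the bandwidth.

```matlab
% Sketch: draw pseudo-observations from a weighted kernel density estimate
% and fit the GMM to them. d and w are synthetic stand-ins.
rng(2);
d = [1 + 0.3*randn(300,1); 3 + 0.5*randn(300,1)];   % hypothetical distances
w = 1 ./ (0.5 + d.^2);                              % hypothetical positive weights
w = w / sum(w);                                     % normalize to sum to 1

% Third output of ksdensity is the bandwidth of the (default, normal) kernel.
[~, ~, bw] = ksdensity(d, 'Weights', w);

% Draw from the weighted KDE: choose point i with probability w(i),
% then perturb it with Gaussian noise of standard deviation bw.
M = 20000;                                          % number of pseudo-observations
idx = randsample(numel(d), M, true, w);
pseudo = d(idx) + bw * randn(M, 1);

gm = fitgmdist(pseudo, 2, 'Replicates', 5);         % two components, as in the thread
disp(gm.mu)
```

As with replication, M is arbitrary and larger than the real sample, so fit statistics that depend on the sample size are not meaningful for this fit.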

Wolfgang Schwanghart on 27 Nov 2019

Yes, this was my initial thought, and Adam Danz (see above) also came up with the idea. However, after giving the whole approach some thought, I think that the weighting scheme may not lead to the desired results. Rather, I think we should normalize the probability density function obtained from pdf(gmm,distance) by, let's say, a kernel density estimate of the distance values. I guess this will become increasingly difficult for models with many variables.
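A rough sketch of that normalization is given below. It rests on one reading of the comment, namely that the kernel density estimate is taken over the distances available on the network (the background), while the GMM is fitted to the distances of the observed events. Both data sets, dNet and dObs, and the evaluation grid are hypothetical.

```matlab
% Sketch: divide the fitted GMM density by a KDE of the background distances.
rng(3);
dNet = 10 * rand(5000, 1).^2;                          % hypothetical network distances
dObs = [1 + 0.3*randn(300,1); 4 + 0.6*randn(300,1)];   % hypothetical event distances

gm  = fitgmdist(dObs, 2, 'Replicates', 5);   % unweighted two-component fit
xi  = linspace(0.1, 6, 200)';                % evaluation grid (column vector)
fEv = pdf(gm, xi);                           % GMM density of event distances
fBg = ksdensity(dNet, xi);                   % KDE of background distances on the grid

% Density ratio on the grid; max(., eps) guards against division by zero
% where the background density underflows.
ratio = fEv ./ max(fBg, eps);
plot(xi, ratio)
xlabel('distance'), ylabel('GMM density / background density')
```

The ratio is not a probability density itself, and it becomes unstable wherever the background density is close to zero, so the grid should stay within the range that the background data cover well.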


Answer 3

Omkar Mulekar on 5 Jun 2020


There seems to be an answer in this paper:

https://arxiv.org/pdf/1509.01509.pdf

They talk about a couple of methods for EM using weighted data. See if it's useful for you!
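For orientation, here is a sketch of what an EM iteration with observation weights can look like. It is not taken from the linked paper and may differ from the methods described there. It is the generic weighted-likelihood variant, in which each observation's log-likelihood term is multiplied by its weight, so the E-step is unchanged and every sum in the M-step is weighted. The data x, the weights w, the initialization, and the stopping rule are assumptions made for illustration. The script is restricted to one dimension and relies on implicit expansion (R2016b or later).

```matlab
% Sketch of a weighted-likelihood EM for a 1-D, K-component Gaussian mixture.
rng(4);
x = [randn(300,1); 4 + randn(300,1)];    % synthetic data (n-by-1)
w = 0.5 + rand(size(x));                 % hypothetical positive weights (n-by-1)
K = 2;

% Simple initialization: means spread over the data range, common variance,
% equal mixing proportions (all 1-by-K row vectors).
mu     = linspace(min(x), max(x), K);
sigma2 = var(x) * ones(1, K);
p      = ones(1, K) / K;

llOld = -Inf;
for iter = 1:500
    % E-step: responsibilities r(i,k), computed as in the unweighted case
    dens = p .* exp(-(x - mu).^2 ./ (2*sigma2)) ./ sqrt(2*pi*sigma2);  % n-by-K
    r    = dens ./ sum(dens, 2);

    % Weighted log-likelihood at the current parameters; stop when it stalls
    ll = sum(w .* log(sum(dens, 2)));
    if ll - llOld < 1e-8, break; end
    llOld = ll;

    % M-step: every sum over observations carries the weight w(i)
    wr     = w .* r;                                % n-by-K
    Nk     = sum(wr, 1);                            % weighted mass per component
    mu     = sum(wr .* x, 1) ./ Nk;
    sigma2 = sum(wr .* (x - mu).^2, 1) ./ Nk;
    p      = Nk / sum(w);
end

disp([mu; sigma2; p])   % rows: means, variances, mixing proportions
```

With all weights equal this reduces to ordinary EM, and with integer weights it maximizes the same likelihood as replicating each observation w(i) times, without building the enlarged data set.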
