Parameter Estimation for Mixture Models Given Grouped Data
|Other Titles:||Schätzen der Parameter einer Mischverteilung bei gruppierten Daten|
|Authors:||Wengrzik, Joanna|
|Supervisor:||Timm, Jürgen|
|1. Expert:||Timm, Jürgen|
|2. Expert:||Brannath, Werner|
|Abstract:||
Finite mixture models are increasingly used to model heterogeneous data in practical situations where the data can be viewed as arising from two or more subpopulations (components). Decomposing a mixture into its components leads to the problem of estimating the mixture's parameters. Maximum likelihood estimation is a useful tool for obtaining these estimates. However, because the likelihood equations of a finite mixture cannot be solved analytically, a numerical procedure is necessary. The iterative Expectation Maximization (EM) algorithm provides a convenient way to solve a likelihood equation when no closed-form solution exists. In addition, this work deals with observations that are grouped into intervals: the observations arise from a mixture distribution, but only the number of observations falling into each previously specified interval is known, not the individual values. To estimate the parameters of a mixture from such grouped observations, several EM-based methods are investigated: two known algorithms and four newly introduced ones. A simulation study compares the estimation approaches for various two-component Gaussian mixtures. The new, simple methods that circumvent the grouping structure achieved much better results than the more complex algorithms. A distinction must be made between mixtures with highly overlapping components and mixtures with well-separated components. Highly overlapping mixtures cause failures in a larger number of samples, and their estimates differ distinctly from the true values. Algorithms that consist of more than one iteration procedure are particularly affected. Furthermore, all considered methods perform almost comparably when the interval width is small and/or the sample size is large. In contrast, situations where the interval width is large and/or the sample size is small are handled best by the newly proposed algorithms. Finally, a new technique for obtaining suitable starting values is proposed.
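To make the setting concrete, the following is a minimal sketch of EM for a two-component Gaussian mixture when only interval counts are available. It sidesteps the grouping by placing every observation at the midpoint of its interval and weighting each midpoint by its count. The abstract does not say how the thesis's algorithms work, so this midpoint substitution, the function name `em_two_gaussians_binned`, the starting values, and the simulated data are all assumptions made for illustration and may not correspond to any of the six methods studied.

```python
import numpy as np

def em_two_gaussians_binned(edges, counts, n_iter=200):
    """Count-weighted EM for a two-component Gaussian mixture.

    Assumption: each interval is represented by its midpoint, which
    ignores the within-interval spread of the observations.
    """
    edges = np.asarray(edges, dtype=float)
    counts = np.asarray(counts, dtype=float)
    x = 0.5 * (edges[:-1] + edges[1:])   # interval midpoints
    n = counts.sum()

    # Crude starting values (hypothetical, not the thesis's technique):
    # place the two means one standard deviation either side of the mean.
    mean = np.average(x, weights=counts)
    sd = np.sqrt(np.average((x - mean) ** 2, weights=counts))
    pi = np.array([0.5, 0.5])
    mu = np.array([mean - sd, mean + sd])
    sigma = np.array([sd, sd])

    for _ in range(n_iter):
        # E-step: posterior component probabilities at each midpoint.
        z = (x[:, None] - mu) / sigma
        dens = np.exp(-0.5 * z ** 2) / (sigma * np.sqrt(2.0 * np.pi))
        weighted = pi * dens
        resp = weighted / (weighted.sum(axis=1, keepdims=True) + 1e-300)

        # M-step: updates weighted by the interval counts.
        w = resp * counts[:, None]
        nk = w.sum(axis=0)
        pi = nk / n
        mu = (w * x[:, None]).sum(axis=0) / nk
        var = (w * (x[:, None] - mu) ** 2).sum(axis=0) / nk
        sigma = np.maximum(np.sqrt(var), 1e-6)  # guard against collapse

    return pi, mu, sigma

# Simulated example: well-separated components, narrow intervals.
rng = np.random.default_rng(0)
sample = np.concatenate([rng.normal(0.0, 1.0, 600),
                         rng.normal(5.0, 1.0, 400)])
edges = np.arange(-5.0, 10.5, 0.5)
counts, _ = np.histogram(sample, bins=edges)
pi, mu, sigma = em_two_gaussians_binned(edges, counts)
```

With narrow intervals and well-separated components, as in this example, the midpoint approximation loses little information. With wide intervals or strongly overlapping components it can be expected to degrade, and that is the regime in which the abstract reports that the choice of method matters.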
|Keywords:||Mixture Models, EM-Algorithm, Grouped Data|
|Issue Date:||31-May-2012|
|URN:||urn:nbn:de:gbv:46-00102649-12|
|Institution:||Universität Bremen|
|Faculty:||FB3 Mathematik/Informatik|
|Appears in Collections:||Dissertationen|