Parameter Estimation for Mixture Models Given Grouped Data

Wengrzik, Joanna

Citation link (URN)

https://nbn-resolving.de/urn:nbn:de:gbv:46-00102649-12

Publication date

2012-05-31

Authors

Wengrzik, Joanna

Supervisor

Timm, Jürgen

Reviewer

Brannath, Werner

Abstract

Finite mixture models are increasingly used to model heterogeneous data in many practical situations where the data can be viewed as arising from two or more subpopulations (components). Decomposing a mixture into its components leads to the problem of estimating the mixture parameters. Maximum likelihood estimation is a useful tool for obtaining these estimates. However, because the likelihood equations of a finite mixture cannot be solved analytically, a numerical procedure is necessary. The iterative Expectation Maximization (EM) algorithm provides a convenient way to solve a likelihood equation when no closed-form solution exists. This work additionally deals with observations that are grouped into intervals: the observations arise from a mixture distribution, but only the number of observations falling into each previously specified interval is known, not the individual observations. To estimate the parameters of a mixture from such grouped observations, several EM-based methods are investigated: two already known algorithms and four newly introduced ones. A simulation study compares the different estimation approaches for various two-component Gaussian mixtures. It finds that the new, simple methods that circumvent the grouping structure achieve much better results than the more complex algorithms. A distinction must be made between mixtures with highly overlapping components and mixtures with well-separated components. The former cause failures in a higher number of samples, and their estimates differ distinctly from the true values. Algorithms that consist of more than one iteration procedure are particularly affected. Furthermore, all considered methods perform almost comparably when the interval width is small and/or the sample size is large. In contrast, situations where the interval width is large and/or the sample size is small are handled best by the newly proposed algorithms. Finally, a new technique for obtaining suitable starting values is proposed.
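
To make the grouped-data setting concrete, the following is a minimal Python sketch, not taken from the dissertation. It assumes a two-component Gaussian mixture observed only as interval counts. `grouped_loglik` evaluates the multinomial log-likelihood of the counts, and `midpoint_em` is one hypothetical way to sidestep the grouping: it runs ordinary EM on the interval midpoints, weighted by the counts. Whether this matches any of the algorithms studied in the thesis is not stated in the abstract. All function names, the renormalization over the covered range, and the demo parameters are assumptions made for illustration.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mu, sigma):
    """CDF of N(mu, sigma^2) at x, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def bin_probs(edges, p, mu1, s1, mu2, s2):
    """Mixture probability of each interval, renormalized over the covered
    range (assumption of this sketch: nothing is observed outside the edges)."""
    cdf = [p * normal_cdf(e, mu1, s1) + (1.0 - p) * normal_cdf(e, mu2, s2)
           for e in edges]
    total = cdf[-1] - cdf[0]
    return [(hi - lo) / total for lo, hi in zip(cdf[:-1], cdf[1:])]


def grouped_loglik(edges, counts, p, mu1, s1, mu2, s2):
    """Multinomial log-likelihood of the counts, up to an additive constant."""
    probs = bin_probs(edges, p, mu1, s1, mu2, s2)
    # Floor the probabilities so log() stays finite for far-away bins.
    return sum(n * math.log(max(q, 1e-300))
               for n, q in zip(counts, probs) if n > 0)


def midpoint_em(edges, counts, p, mu1, s1, mu2, s2, iters=200, min_sigma=1e-3):
    """Ordinary two-component Gaussian EM on midpoints weighted by counts.
    No guard against an empty component; this is a sketch only."""
    mids = [(lo + hi) / 2.0 for lo, hi in zip(edges[:-1], edges[1:])]
    n_total = float(sum(counts))
    for _ in range(iters):
        # E-step: posterior probability that a midpoint belongs to component 1.
        resp = []
        for x in mids:
            d1 = p * normal_pdf(x, mu1, s1)
            d2 = (1.0 - p) * normal_pdf(x, mu2, s2)
            resp.append(d1 / (d1 + d2) if d1 + d2 > 0.0 else 0.5)
        # M-step: count-weighted updates of weight, means, standard deviations.
        w1 = [n * r for n, r in zip(counts, resp)]
        w2 = [n * (1.0 - r) for n, r in zip(counts, resp)]
        n1, n2 = sum(w1), sum(w2)
        p = n1 / n_total
        mu1 = sum(w * x for w, x in zip(w1, mids)) / n1
        mu2 = sum(w * x for w, x in zip(w2, mids)) / n2
        var1 = sum(w * (x - mu1) ** 2 for w, x in zip(w1, mids)) / n1
        var2 = sum(w * (x - mu2) ** 2 for w, x in zip(w2, mids)) / n2
        s1 = max(min_sigma, math.sqrt(var1))
        s2 = max(min_sigma, math.sqrt(var2))
    return p, mu1, s1, mu2, s2


# Demo with made-up parameters: expected counts of 1000 draws in 32 bins.
edges = [-6.0 + 0.5 * i for i in range(33)]
true_params = (0.4, 0.0, 1.0, 5.0, 1.2)  # p, mu1, sigma1, mu2, sigma2
counts = [round(1000 * q) for q in bin_probs(edges, *true_params)]
start = (0.5, -1.0, 1.0, 4.0, 1.0)
fit = midpoint_em(edges, counts, *start)
print("estimate:", [round(v, 3) for v in fit])
print("log-likelihood gain:",
      grouped_loglik(edges, counts, *fit) - grouped_loglik(edges, counts, *start))
```

Replacing each interval by its midpoint slightly inflates the variance estimates, and the effect grows with the interval width. This is one reason why wide intervals are the harder case.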

Keywords

Mixture Models; EM-Algorithm; Grouped Data

Institution

Universität Bremen

Department

Fachbereich 03: Mathematik/Informatik (FB 03)

Document type

Dissertation

Secondary publication

No

Language

English

Files

Name

00102649-1.pdf

Size

3.49 MB

Format

Adobe PDF

Checksum

MD5: 8b4b249146ddb616d60f064ebf870aa6