Trennung von mehrdimensionalen Mischverteilungen bei heterogenen Grundgesamtheiten
|Other Titles:||Separation of multivariate mixture distributions in heterogeneous populations|
|Authors:||Hagedorn, Heiko|
|Supervisor:||Timm, Jürgen|
|1. Expert:||Timm, Jürgen|
|2. Expert:||Brannath, Werner|
|Abstract:||
In medicine, laboratory values are usually determined from blood or urine samples to diagnose and confirm diseases or to measure the success of therapy. The interpretation of laboratory values is generally based on reference values, which are determined according to standardized methods recommended by the International Federation of Clinical Chemistry and Laboratory Medicine (IFCC). According to these recommendations, reference values for a predefined population have to be established in a study of subjects classified as healthy. Since such studies are time-consuming and costly and, among other things, place high demands on the choice of the reference population, there are approaches that determine reference values from existing laboratory data sets, which contain a mixture of pathological and non-pathological values. The distribution-based approach chosen in this work separates the non-pathological distribution from the overall distribution using a data cutout under certain assumptions. The parameters are determined numerically by maximum likelihood (ML) estimation. To assess the goodness of fit, a suitable test for truncated multivariate normal distributions is derived using Monte Carlo simulation. The ML estimation and the test procedure are combined, using optimality criteria, into an algorithm that optimises the truncation point and thus the data cutout. The considerations are based on an existing procedure for univariate distributions and are extended to multivariate normal distributions in order to account for the correlation structure of the laboratory values being separated and thus to obtain more accurate reference limits and multidimensional reference ranges. Various optimality criteria and selection methods for truncation points are considered.
The developed procedure is first examined on simulated data sets with respect to various parameters such as overlap and is then put to a practical test on data from Hannover Medical School. Implementation and evaluation were performed in the programming language and environment R (versions 2.8.0 to 2.12.1). The developed method was found to be well applicable to multidimensional data, taking into account the correlation structure of the laboratory parameters. The different optimality criteria deliver similar results if the central assumption is fulfilled, namely that the truncated area contains mainly non-pathological values and that these values are approximately normally distributed. However, the HZ criterion stands out clearly from the other optimality criteria in terms of effectiveness and provides more stable results, especially when there are deviations from the central assumption.
|Keywords:||mixture, truncation, multivariate normal distribution, reference limits, reference range, overlap, laboratory values|
|Issue Date:||12-Dec-2011|
|URN:||urn:nbn:de:gbv:46-00102376-18|
|Institution:||Universität Bremen|
|Faculty:||FB3 Mathematik/Informatik|
|Appears in Collections:||Dissertationen|
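The core idea in the abstract, estimating the non-pathological distribution by ML from a truncated data cutout, can be illustrated with a minimal sketch. It is not the thesis's implementation: it is written in Python rather than R, covers only the univariate case, and uses fixed truncation points instead of optimising them or applying a goodness-of-fit test. The function name `truncated_normal_mle`, the simple compass search, and the simulated mixture are all assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist


def truncated_normal_mle(values, lower, upper, tol=1e-6, max_iter=100_000):
    """Fit N(mu, sigma^2) by ML to the observations inside [lower, upper].

    Hypothetical helper: univariate only, truncation points are given.
    """
    kept = [v for v in values if lower <= v <= upper]
    n = len(kept)
    if n < 2:
        raise ValueError("need at least two observations inside the cutout")
    m1 = sum(kept) / n                 # mean of x in the cutout
    m2 = sum(v * v for v in kept) / n  # mean of x^2 in the cutout

    def avg_loglik(mu, log_sigma):
        # Average truncated-normal log-likelihood, up to an additive constant.
        sigma = math.exp(log_sigma)
        dist = NormalDist(mu, sigma)
        mass = dist.cdf(upper) - dist.cdf(lower)  # probability of the cutout
        if mass <= 0.0:
            return -math.inf
        sq = m2 - 2.0 * mu * m1 + mu * mu         # mean of (x - mu)^2
        return -log_sigma - sq / (2.0 * sigma * sigma) - math.log(mass)

    # Start from the naive moments of the truncated sample.
    mu = m1
    log_sigma = 0.5 * math.log(max(m2 - m1 * m1, 1e-12))
    best = avg_loglik(mu, log_sigma)
    step = 0.5
    # Compass search: try a step along each axis, halve the step if none helps.
    for _ in range(max_iter):
        if step < tol:
            break
        moved = False
        for d_mu, d_ls in ((step, 0.0), (-step, 0.0), (0.0, step), (0.0, -step)):
            cand = avg_loglik(mu + d_mu, log_sigma + d_ls)
            if cand > best:
                mu, log_sigma, best = mu + d_mu, log_sigma + d_ls, cand
                moved = True
                break
        if not moved:
            step /= 2.0
    return mu, math.exp(log_sigma), n


# Simulated mixture (assumed values): mostly non-pathological, some pathological.
rng = random.Random(1)
healthy = [rng.gauss(5.0, 1.0) for _ in range(5000)]
pathological = [rng.gauss(9.0, 1.5) for _ in range(1000)]

mu_hat, sigma_hat, n_used = truncated_normal_mle(healthy + pathological, 3.0, 6.5)
fitted = NormalDist(mu_hat, sigma_hat)
print(f"mu = {mu_hat:.3f}, sigma = {sigma_hat:.3f}, n in cutout = {n_used}")
print(f"reference limits: {fitted.inv_cdf(0.025):.3f} to {fitted.inv_cdf(0.975):.3f}")
```

The cutout [3.0, 6.5] is chosen by hand so that it contains mainly non-pathological values, which mirrors the central assumption stated in the abstract. The reference limits are taken here as the 2.5% and 97.5% quantiles of the fitted normal distribution, a common convention that the record itself does not specify.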
checked on Oct 25, 2020