Citation link: https://doi.org/10.26092/elib/16

Model Selection and Evaluation in Supervised Machine Learning


File: 2020_Westphal_thesis.pdf
Description: Dissertation_Westphal
Size: 8.51 MB
Format: Adobe PDF
Authors: Westphal, Max
Supervisor: Brannath, Werner
1. Expert: Brannath, Werner
2. Expert: Zapf, Antonia
Abstract: 
In this thesis, we propose new model evaluation strategies for supervised machine learning. Our main goal is to reliably and efficiently infer the generalization performance of one or multiple prediction models based on limited data. So far, a strict separation of model selection and performance assessment has been recommended. While this approach is valid, it lacks flexibility, as a flawed model selection usually cannot be corrected without compromising the statistical inference. We suggest evaluating multiple promising models on the test dataset, thereby taking more observations into account for the final selection process. We employ a parametric simultaneous test procedure to adjust the inferences (test decisions, point estimates) for multiple comparisons. We extend this method to enable a simultaneous evaluation of multiple binary classifiers with regard to sensitivity and specificity as co-primary endpoints. In both cases, approximate control of the family-wise error rate is guaranteed. Besides this established frequentist procedure, we propose a new multivariate Beta-binomial model for the analysis of multiple proportions with a general correlation structure. This Bayesian approach allows prior knowledge to be incorporated into the inference task. Finally, we derive a new decision rule for subset selection problems. Our method is developed in the framework of Bayesian decision theory by employing a novel utility function. Compared to previous approaches, this method is computationally more complex but hyperparameter-free. We illustrate in extensive simulation studies that our framework can improve the expected final model performance and the statistical power, i.e. the probability of correctly identifying a sufficiently good model. While unbiased point estimation is no longer possible, the selection-induced bias can be corrected in a conservative manner. The family-wise error rate is controlled under realistic parameter configurations, given that a moderate number of observations is available. We conclude that the test data can be used for model selection when suitable adjustments for multiple comparisons are applied. This increases flexibility and statistical efficiency compared to traditional approaches. Our framework can help to prevent the deployment of flawed models in sensitive and large-scale applications while at the same time reliably identifying truly capable solutions.
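To make the central idea concrete, the following is a minimal, hypothetical sketch (not taken from the thesis) of evaluating several candidate models on a shared test set with an adjustment for multiple comparisons. It uses a simple Bonferroni correction on Wald-type lower confidence bounds; the thesis employs a sharper parametric simultaneous test procedure. The test-set size, latent accuracies, and acceptance threshold below are assumed purely for illustration.

    # Hypothetical sketch: simultaneous evaluation of several models on one
    # test set with a Bonferroni multiplicity adjustment (the thesis uses a
    # sharper parametric simultaneous test). All numbers are assumed.
    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(0)
    n_test = 400                               # assumed test-set size
    true_acc = np.array([0.82, 0.85, 0.84])    # assumed latent accuracies
    # Simulated correct/incorrect indicators for 3 candidate models:
    correct = rng.binomial(1, true_acc, size=(n_test, 3))

    acc_hat = correct.mean(axis=0)             # observed test accuracies
    alpha, m = 0.05, correct.shape[1]
    z = norm.ppf(1 - alpha / m)                # Bonferroni-adjusted quantile
    se = np.sqrt(acc_hat * (1 - acc_hat) / n_test)
    lower = acc_hat - z * se                   # simultaneous lower bounds

    threshold = 0.80                           # assumed minimal acceptable accuracy
    for k in range(m):
        ok = "sufficient" if lower[k] > threshold else "not shown sufficient"
        print(f"model {k}: acc={acc_hat[k]:.3f}, lower bound={lower[k]:.3f} -> {ok}")

The Bayesian building block can likewise be illustrated with a conjugate Beta-binomial update for a single proportion; the thesis generalizes this to a multivariate Beta-binomial model with a general correlation structure, which this toy example does not capture. Prior pseudo-counts and data are again assumed.

    # Toy conjugate Beta-binomial update for one accuracy (assumed numbers);
    # the multivariate, correlated model of the thesis is not shown here.
    from scipy.stats import beta

    a0, b0 = 2.0, 2.0          # assumed prior pseudo-counts
    successes, n = 332, 400    # assumed correct predictions on the test set
    post = beta(a0 + successes, b0 + n - successes)
    print("posterior mean:", round(post.mean(), 3))
    print("95% credible interval:", post.interval(0.95))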
Keywords: artificial intelligence; Bayesian inference; bias; classification; co-primary endpoints; decision theory; diagnosis; diagnostic accuracy; hypothesis testing; multiple comparisons; performance assessment; predictive modelling; prognosis; regulation; subset selection; uncertainty quantification
Issue Date: 16-Dec-2019
Type: Dissertation
Secondary publication: no
DOI: 10.26092/elib/16
URN: urn:nbn:de:gbv:46-elib42319
Institution: Universität Bremen 
Faculty: Fachbereich 03: Mathematik/Informatik (FB 03) 
Appears in Collections: Dissertationen

  


This item is licensed under a Creative Commons Attribution 3.0 Germany (CC BY 3.0 DE) License.
