The role of visual grounding in visual question answering generalization and shortcut learning
File | Description | Size | Format |
---|---|---|---|
Dissertation_Daniel_Reich_The_Role_of_VG_in_VQA_Generalization_and_Shortcut_Learning.pdf | Dissertation Daniel Reich: The Role of VG in VQA Generalization and Shortcut Learning | 10.36 MB | Adobe PDF |
Authors: | Reich, Daniel |
Supervisor: | Schultz, Tanja; Putze, Felix |
1. Expert: | Schultz, Tanja |
Experts: | Stiefelhagen, Rainer |
Abstract: | We set out on this thesis’ journey with a conceptual idea of what Visual Grounding’s role and impact in VQA should be, but soon learn that - apart from the interference of shortcut learning with these expectations - there are also other fundamental limitations of contemporary Visual Grounding research that impede an in-depth analysis, namely a lack of well-defined procedures to measure Visual Grounding in VQA and properly evaluate its impact. We address these limitations with our contributions and in the process gain a much clearer understanding of why the impact of Visual Grounding on VQA performance has been difficult to grasp and how its role can be better highlighted with appropriately designed evaluation scenarios. Finally, in the last chapter, this thesis culminates in the definition of a theoretical model (VGR) that clearly describes the role of Visual Grounding in the context of VQA generalization and shortcut learning, thereby marking the end of our journey. |
Keywords: | Visual Question Answering; Visual Grounding; Shortcut Learning |
Issue Date: | 6-Jun-2024 |
Type: | Dissertation |
DOI: | 10.26092/elib/3092 |
URN: | urn:nbn:de:gbv:46-elib80583 |
Institution: | Universität Bremen |
Faculty: | Fachbereich 03: Mathematik/Informatik (FB 03) |
Appears in Collections: | Dissertationen |
This item is licensed under a Creative Commons License