The role of visual grounding in visual question answering generalization and shortcut learning
File | Description | Size | Format | |
---|---|---|---|---|
Dissertation_Daniel_Reich_The_Role_of_VG_in_VQA_Generalization_and_Shortcut_Learning.pdf | Dissertation Daniel Reich: The Role of VG in VQA Generalization and Shortcut Learning | 10.36 MB | Adobe PDF | View |
Author: | Reich, Daniel | Supervisors: | Schultz, Tanja; Putze, Felix |
First reviewer: | Schultz, Tanja | Additional reviewers: | Stiefelhagen, Rainer | Abstract: | We set out on this thesis’ journey with a conceptual idea of what Visual Grounding’s role and impact in VQA should be, but soon learn that, apart from the interference of shortcut learning with these expectations, there are also other fundamental limitations of contemporary Visual Grounding research that impede an in-depth analysis, namely a lack of well-defined procedures to measure Visual Grounding in VQA and properly evaluate its impact. We address these limitations with our contributions and in the process gain a much clearer understanding of why the impact of Visual Grounding on VQA performance has been difficult to grasp and how its role can be better highlighted with appropriately designed evaluation scenarios. Finally, in the last chapter, this thesis culminates in the definition of a theoretical model (VGR) that clearly describes the role of Visual Grounding in the context of VQA generalization and shortcut learning, thereby marking the end of our journey. |
Keywords: | Visual Question Answering; Visual Grounding; Shortcut Learning | Publication date: | 6-Jun-2024 | Document type: | Dissertation | DOI: | 10.26092/elib/3092 | URN: | urn:nbn:de:gbv:46-elib80583 | Institution: | Universität Bremen | Faculty: | Fachbereich 03: Mathematik/Informatik (FB 03) |
Appears in collections: | Dissertations |
This resource was published under the following copyright terms: Creative Commons license