The role of visual grounding in visual question answering generalization and shortcut learning
Publication date
2024-06-06
Authors
Supervisors
Reviewers
Abstract
We set out on this thesis’ journey with a conceptual idea of what the role and impact of Visual Grounding in VQA should be, but soon learn that, apart from the interference of shortcut learning with these expectations, other fundamental limitations of contemporary Visual Grounding research also impede an in-depth analysis, namely a lack of well-defined procedures to measure Visual Grounding in VQA and to properly evaluate its impact. We address these limitations with our contributions and, in the process, gain a much clearer understanding of why the impact of Visual Grounding on VQA performance has been difficult to grasp and how its role can be better highlighted with appropriately designed evaluation scenarios. Finally, in the last chapter, this thesis culminates in the definition of a theoretical model (VGR) that clearly describes the role of Visual Grounding in the context of VQA generalization and shortcut learning, thereby marking the end of our journey.
Keywords
Visual Question Answering; Visual Grounding; Shortcut Learning
Institution
Department
Document type
Dissertation
Language
English
Files
Name
Dissertation_Daniel_Reich_The_Role_of_VG_in_VQA_Generalization_and_Shortcut_Learning.pdf
Description
Dissertation Daniel Reich: The Role of VG in VQA Generalization and Shortcut Learning
Size
10.12 MB
Format
Adobe PDF
Checksum
(MD5): 93c54ed1d41166d52f7c81f28d161473