
Citation link: http://nbn-resolving.de/urn:nbn:de:gbv:46-00104531-10
OpenAccess
 

Analysis and Modeling of Visual Invariance for Object Recognition and Spatial Cognition


File: 00104531-1.pdf
Size: 34.18 MB
Format: Adobe PDF
Other Titles: Analyse und Modellierung visueller Invarianz zur Objekterkennung und Raumkognition
Authors: Eberhardt, Sven 
Supervisor: Schill, Kerstin
1. Expert: Schill, Kerstin
2. Expert: Fahle, Manfred
Abstract: 
The human visual system is unmatched by machine imitations in its universal ability to perform a great number of complex tasks, such as object detection, tracking and categorization, scene perception and localization, seemingly effortlessly and instantly. It can quickly adapt to novel problems, learn concepts from few samples, build and reason on abstract representations, and merge information from multiple senses. Of particular interest is processing in the ventral stream of the human visual cortex, because it solves a multitude of complex scene analysis tasks in under 200 milliseconds. Here, a functional, dataset-driven analysis approach is followed. Feature outputs from several specialized vision models, including Textons, Gist, HMax, SIFT and Spatial Pyramids, are analyzed for their diagnosticity on a number of tasks typically attributed to human ventral stream processing. A strong dissociation in performance between models and tasks, dependent on invariance properties, is found. From these findings, a conceptual space is proposed into which both vision models and associated task requirements are placed based on local and global invariance dimensions. Following this concept, a general-purpose, hierarchical vision model is suggested in which specialization is realized as tuning of receptive field ranges and the proper task-dependent weights. As an example application of this conceptual space, the special task of vision-based localization is introduced within a classification framework. Place categorization in several contexts, including indoor, outdoor and virtual world environments, is sorted into the conceptual space of vision requirements. From this, a universal descriptor called Signature of a Place is introduced, which outperforms baseline models on all localization tasks. Correlation with human performance is tested, yielding an orthogonal result.
The question of self-organized learning in hierarchical systems is analyzed, and a novel approach utilizing cross-modal feature training between visual and auditory cues in a deep learning hierarchy is presented. The model is able to generate audio predictions from video input and to explain previous human psychophysics results on multi-modal difference thresholds.
Keywords: dissertation, vision, visual system, localization, spatial cognition, object recognition, deep learning, modeling, image processing
Issue Date: 29-May-2015
Type: Dissertation
URN: urn:nbn:de:gbv:46-00104531-10
Institution: Universität Bremen 
Faculty: FB3 Mathematik/Informatik 
Appears in Collections: Dissertationen

  
