Citation link (URN)
https://nbn-resolving.de/urn:nbn:de:gbv:46-00103656-17

Learning the Structure of Continuous Markov Decision Processes

Publication date
2014-02-21
Authors
Metzen, Jan Hendrik  
Supervisor
Kirchner, Frank  
Reviewer
Kreowski, Hans-Jörg  
Abstract
There is growing interest in artificial intelligent agents that can operate autonomously for extended periods of time in complex environments and fulfill a variety of different tasks. Such agents will face problems during their lifetime that may not be foreseeable at the time of their deployment. Thus, the capacity for lifelong learning of new behaviors is an essential prerequisite for agents of this kind, as it enables them to deal with unforeseen situations. However, learning every complex behavior from scratch would be cumbersome for the agent. It is more plausible to consider behavior to be modular and to let the agent acquire a set of reusable building blocks for behavior, the so-called skills. Once acquired, these skills might facilitate fast learning and adaptation of behavior to new situations. This work focuses on computational approaches for skill acquisition, namely which kinds of skills should be acquired and how to acquire them. The former is commonly denoted as "skill discovery" and the latter as "skill learning". The main contribution of this thesis is a novel incremental skill acquisition approach that is suited for lifelong learning. In this approach, the agent incrementally learns a graph-based representation of a domain and exploits certain properties of this graph, such as its bottlenecks, for skill discovery. This thesis proposes a novel approach for learning a graph-based representation of continuous domains based on formalizing the problem as a probabilistic generative model. Furthermore, a new incremental agglomerative clustering approach for identifying bottlenecks of such graphs is presented. In addition, the thesis proposes a novel intrinsic motivation system that enables an agent to intelligently allocate time between skill discovery and skill learning in developmental settings, where the agent is not constrained by external tasks.
The results of this thesis show that the resulting skill acquisition approach is suited for continuous domains and can deal with domain stochasticity and different explorative behavior of the agent. The acquired skills are reusable and versatile and can be used in multi-task and lifelong learning settings in high-dimensional problems.
Keywords
Reinforcement Learning; Skill Discovery; Skill Acquisition; Intrinsic Motivation; Hierarchical Reinforcement Learning; Graph
Institution
Universität Bremen  
Department
Fachbereich 03: Mathematik/Informatik (FB 03)  
Document type
Dissertation
Secondary publication
No
Language
English
Files
Name
00103656-1.pdf
Size
13.9 MB
Format
Adobe PDF
Checksum
(MD5): 1e15975d90657dfd3b1a6b9f86fbe003
