

Disagreement options

Book contribution - Book abstract, conference contribution

Subtitle: task adaptation through temporally extended actions
Embodied AI, which learns through interaction with a physical environment, typically requires a large number of environment interactions to learn how to solve new tasks. Training can be done in parallel, using simulated environments. However, once an agent is deployed, e.g., in a real-world setting, it is not yet clear how it can quickly adapt its knowledge to solve new tasks. In this paper, we propose a novel Hierarchical Reinforcement Learning (HRL) method that allows an agent, when confronted with a novel task, to switch between exploiting prior knowledge through temporally extended actions and exploring the environment. We resolve this trade-off by utilizing the disagreement between the action distributions of selected, previously acquired policies. Relevant prior tasks are selected by measuring the cosine similarity of their attached natural language goals in a pre-trained word embedding. We analyze the resulting temporal abstractions and experimentally demonstrate their effectiveness in different environments. We show that our method is capable of solving new tasks using only a fraction of the environment interactions required when learning the task from scratch.
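The two mechanisms named in the abstract, selecting prior tasks by cosine similarity of goal embeddings and switching between exploitation and exploration based on policy disagreement, can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not say how disagreement is computed, so the mean pairwise total variation distance used here is an assumption, and the function names, thresholds, and toy vectors are all hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def select_prior_tasks(new_goal, prior_goals, threshold=0.5):
    """Return names of prior tasks whose goal embedding is close to the new goal.

    `prior_goals` maps a task name to its goal embedding; `threshold` is a
    hypothetical cut-off, not a value taken from the paper.
    """
    return [
        name
        for name, embedding in prior_goals.items()
        if cosine_similarity(new_goal, embedding) >= threshold
    ]


def disagreement(action_dists):
    """Mean pairwise total variation distance between discrete action distributions.

    Assumption: the paper's actual disagreement measure is not given in the abstract.
    """
    n = len(action_dists)
    if n < 2:
        return 0.0
    total, pairs = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            total += 0.5 * sum(
                abs(p - q) for p, q in zip(action_dists[i], action_dists[j])
            )
            pairs += 1
    return total / pairs


def choose_mode(action_dists, max_disagreement=0.3):
    """Exploit prior policies when they agree on what to do; otherwise explore."""
    return "exploit" if disagreement(action_dists) <= max_disagreement else "explore"


# Toy example with made-up 2-D "embeddings" and two-action policies.
prior = {"open door": [1.0, 0.1], "pick up key": [0.0, 1.0]}
relevant = select_prior_tasks([1.0, 0.0], prior)
mode = choose_mode([[0.9, 0.1], [0.8, 0.2]])
```

In this sketch, low disagreement among the selected policies is read as a sign that prior knowledge transfers to the current state, so the agent keeps following it; high disagreement triggers exploration instead.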
Book: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD), September 13-17, 2021
Pages: 190 - 205
Year of publication: 2021
Keywords: P1 Proceeding