Publication

A Framework for Flexibly Guiding Learning Agents

Journal contribution - Journal article

Reinforcement Learning (RL) enables artificial agents to learn through direct interaction with the environment. However, it usually does not scale well to large problems because of its sample inefficiency. Reward shaping is a well-established approach that allows for more efficient learning by incorporating domain knowledge into RL agents via supplementary rewards. In this work, we propose a novel methodology that automatically generates reward shaping functions from user-provided Linear Temporal Logic on finite traces (LTLf) formulas. In our work, LTLf serves as a rich language that allows the user to communicate domain knowledge to the learning agent. In both single-agent and multi-agent settings, we demonstrate that our approach performs at least as well as the baseline approach while providing essential advantages in terms of flexibility and ease of use. We elaborate on some of these advantages empirically by demonstrating that our approach can handle domain knowledge with different levels of accuracy and gives the user the flexibility to express aspects of uncertainty in the provided advice.
Journal: Neural Comput Appl
ISSN: 0941-0643
Volume: 2022
Pages: 1-17
Year of publication: 2022
Keywords: Reinforcement Learning, Reward Shaping, Linear Temporal Logic on finite traces, Multi-agent Systems
Accessibility: Open