
Patent

Exploring an unexplored domain by parallel reinforcement

Example embodiments describe a computer-implemented method for exploring, by a table-based parallel reinforcement learning (PRL) algorithm, an unexplored domain (100) comprising a plurality of agents (110-114) and states, the unexplored domain (100) being represented by a state-action space (101, 102). The method comprises the following steps, performed by one or more of the plurality of agents (110): receiving (510) an assigned partition (200) of the state-action space represented by a table; executing (511), during a plurality of episodes, actions for states within the partition (200), wherein an action transits a state; granting (512) a reward to a transited state; exchanging (513) state-action values with other agents of the plurality of agents (111-114) in the domain (100); and updating (514) the table.
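The five steps named in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the six-state chain domain, the reward of 1.0 at the goal state, the constants `GAMMA` and `ALPHA`, the tabular Q-learning update rule, and all names (`Agent`, `step`, `exchange`, `explore`) are assumptions made for illustration. The agents also run one after another in a plain loop here, which only stands in for true parallel execution, and the exchange is modelled as a simple broadcast of every agent's table to every other agent.

```python
from itertools import product

# Hypothetical toy domain (an assumption, not from the patent):
# a chain of states 0..5 with the goal at state 5.
N_STATES = 6
GOAL = N_STATES - 1
ACTIONS = (-1, +1)        # step left or step right
GAMMA, ALPHA = 0.9, 1.0   # discount factor; ALPHA=1 suits a deterministic domain


def step(state, action):
    """Transit the state and grant a reward to the state that is reached."""
    nxt = min(max(state + action, 0), GOAL)
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward


class Agent:
    def __init__(self, partition):
        self.partition = partition                   # assigned (state, action) pairs
        self.table = {sa: 0.0 for sa in partition}   # this agent's slice of the table
        self.known = {}                              # values received from other agents

    def value(self, state):
        """Best known value of a state, from own and exchanged entries."""
        if state == GOAL:
            return 0.0  # terminal state: no future return
        return max(
            self.table.get((state, a), self.known.get((state, a), 0.0))
            for a in ACTIONS
        )

    def run_episode(self):
        """Execute every action in the partition once and update the table."""
        for state, action in self.partition:
            nxt, reward = step(state, action)
            target = reward + GAMMA * self.value(nxt)
            old = self.table[(state, action)]
            self.table[(state, action)] = old + ALPHA * (target - old)


def exchange(agents):
    """Broadcast each agent's state-action values to all other agents."""
    for agent in agents:
        for other in agents:
            if other is not agent:
                agent.known.update(other.table)


def explore(n_agents=2, episodes=20):
    # Partition the state-action space (non-terminal states only) into chunks.
    pairs = list(product(range(GOAL), ACTIONS))
    chunk = -(-len(pairs) // n_agents)  # ceiling division
    agents = [Agent(pairs[i * chunk:(i + 1) * chunk]) for i in range(n_agents)]
    for _ in range(episodes):
        for agent in agents:   # sequential stand-in for parallel execution
            agent.run_episode()
        exchange(agents)
    merged = {}
    for agent in agents:
        merged.update(agent.table)
    return merged


if __name__ == "__main__":
    q = explore()
    for s in range(GOAL):
        best = max(ACTIONS, key=lambda a: q[(s, a)])
        print(f"state {s}: best action {best:+d}")
```

In this sketch each agent only ever writes to its own partition, so no two agents update the same table entry; values for states outside the partition are read from the copies received during the exchange step.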
Patent publication number: EP3637256
Year of application: 2020
Year granted: 2021
Year of publication: 2020
Status: Applied for
Technology domains: Computer technology
Validated for IOF key: Yes
Assigned to: Associatie Universiteit & Hogescholen Antwerpen