Exploring an unexplored domain by parallel reinforcement
University of Antwerp
Example embodiments describe a computer-implemented method for exploring, by a table-based parallel reinforcement learning (PRL) algorithm, an unexplored domain (100) comprising a plurality of agents (110-114) and states, the unexplored domain (100) being represented by a state-action space (101, 102), the method comprising the following steps performed by one or more of the plurality of agents (110): receiving (510) an assigned partition (200) of the state-action space represented by a table; and executing (511), during a plurality of episodes, actions for states within the partition (200), wherein ...
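The two recited steps, receiving an assigned partition of the table and executing actions for states within that partition over several episodes, can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration and is not taken from the abstract: the 1-D chain domain, the reward, the Q-learning update rule, the contiguous partitioning by state, the choice to end an episode when an agent leaves its partition, and all names (`make_partitions`, `explore_partition`, `run_all`).

```python
import random
from concurrent.futures import ThreadPoolExecutor

N_STATES = 12        # hypothetical 1-D chain domain; the last state is the goal
ACTIONS = (-1, +1)   # move left / move right
GOAL = N_STATES - 1


def step(state, action):
    """Toy environment dynamics (an assumption, not from the source)."""
    nxt = min(max(state + action, 0), GOAL)
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward


def make_partitions(n_states, n_agents):
    """Split the states into contiguous blocks, at most one per agent."""
    size = -(-n_states // n_agents)  # ceiling division
    return [range(i, min(i + size, n_states)) for i in range(0, n_states, size)]


def explore_partition(partition, seed=0, episodes=500, max_steps=50,
                      alpha=0.5, gamma=0.9, epsilon=0.2):
    """One agent: receive a partition, then run episodes inside it."""
    rng = random.Random(seed)
    states = list(partition)
    # The agent's table only holds entries for its assigned partition.
    table = {(s, a): 0.0 for s in states for a in ACTIONS}
    for _ in range(episodes):
        state = rng.choice(states)  # every episode starts inside the partition
        for _ in range(max_steps):
            if state == GOAL:
                break
            # Epsilon-greedy action selection over the agent's own table.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: table[(state, a)])
            nxt, reward = step(state, action)
            # Q-learning update (assumed rule); states outside the
            # partition are valued at 0 because this agent has no entry.
            best_next = max(table.get((nxt, a), 0.0) for a in ACTIONS)
            table[(state, action)] += alpha * (
                reward + gamma * best_next - table[(state, action)])
            if nxt not in partition:
                break  # assumption: leaving the partition ends the episode
            state = nxt
    return table


def run_all(n_agents=3):
    """Run one agent per partition concurrently; return their tables."""
    partitions = make_partitions(N_STATES, n_agents)
    seeds = range(len(partitions))
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        return list(pool.map(explore_partition, partitions, seeds))


if __name__ == "__main__":
    for agent_id, table in enumerate(run_all()):
        visited = sum(1 for v in table.values() if v != 0.0)
        print(f"agent {agent_id}: {len(table)} entries, {visited} non-zero")
```

Because each agent in this sketch writes only to the table entries of its own partition, the agents share no mutable state and need no locking. How the partial tables are combined, and what the elided "wherein" clause requires, is not covered here.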