Generating reward structures on a parameterized distribution of dynamics tasks

Leite, Abe; Izquierdo, Eduardo J.

doi:10.1162/isal_a_00466

In order to make lifelike, versatile learning adaptive in the artificial domain, one needs a very diverse set of behaviors to learn. We propose a parameterized distribution of classic control-style tasks with minimal information shared between tasks. We discuss what makes a task trivial and offer a basic metric, time in convergence, that measures triviality. We then investigate analytic and empirical approaches to generating reward structures for tasks based on their dynamics in order to minimize triviality. Contrary to our expectations, populations evolved on reward structures that incentivized the most stable locations in state space spend the least time in convergence as we have defined it, because of the outsized importance our metric assigns to behavior fine-tuning in these contexts. This work paves the way towards an understanding of which task distributions enable the development of learning.

Award ID(s): 1845322

PAR ID: 10286536

Author(s) / Creator(s): Leite, Abe; Izquierdo, Eduardo J.

Editor(s): Cejkova, Jitka; Holler, Silvia; Soros, Lisa; Witkowski, Olaf

Date Published: 2021-07-19

Journal Name: Artificial Life Conference Proceedings

Volume: Alife 2021

Issue: 2021

Page Range / eLocation ID: 118-127

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Conference Paper:
https://doi.org/10.1162/isal_a_00466
