PARAMETRIZED HIERARCHICAL PROCEDURES FOR NEURAL PROGRAMMING

Fox, R.; Shin, R.; Krishnan, S.; Goldberg, K.; Song, D.; Stoica, I.

Neural programs are highly accurate and structured policies that perform algorithmic tasks by controlling the behavior of a computation mechanism. Despite the potential to increase the interpretability and the compositionality of the behavior of artificial agents, it remains difficult to learn from demonstrations neural networks that represent computer programs. The main challenges that set algorithmic domains apart from other imitation learning domains are the need for high accuracy, the involvement of specific structures of data, and the extremely limited observability. To address these challenges, we propose to model programs as Parametrized Hierarchical Procedures (PHPs). A PHP is a sequence of conditional operations, using a program counter along with the observation to select between taking an elementary action, invoking another PHP as a sub-procedure, and returning to the caller. We develop an algorithm for training PHPs from a set of supervisor demonstrations, only some of which are annotated with the internal call structure, and apply it to efficient level-wise training of multi-level PHPs. We show in two benchmarks, NanoCraft and long-hand addition, that PHPs can learn neural programs more accurately from smaller amounts of both annotated and unannotated demonstrations.
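
To make the execution model concrete, the following is a minimal Python sketch of how a PHP, as described above, might be interpreted: each procedure maps its program counter and the current observation to one of three operations (take an elementary action, call a sub-procedure, or return to the caller). It is an illustration under stated assumptions, not the authors' implementation. The names `PHP`, `run`, `root_step`, and `move_step`, the tuple encoding of operations, and the toy environment are all hypothetical, and the hand-written step rules stand in for what would be learned, parametrized models.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# An operation is ("act", action), ("call", procedure_name), or ("return", None).
Op = Tuple[str, Optional[str]]


@dataclass
class PHP:
    """Hypothetical container for one procedure."""
    name: str
    # Maps (program counter, observation) to (operation, next program counter).
    # Hand-written here; assumed to be a learned model in practice.
    step: Callable[[int, int], Tuple[Op, int]]


def run(procedures: Dict[str, PHP], root: str,
        observe: Callable[[], int], act: Callable[[str], None],
        max_steps: int = 1000) -> List[str]:
    """Execute the root procedure with an explicit call stack; return the actions taken."""
    stack: List[Tuple[str, int]] = [(root, 0)]  # frames: (procedure name, program counter)
    trace: List[str] = []
    for _ in range(max_steps):
        if not stack:
            break
        name, pc = stack.pop()
        (kind, arg), next_pc = procedures[name].step(pc, observe())
        if kind == "return":
            continue  # drop this frame; control goes back to the caller
        stack.append((name, next_pc))  # resume this procedure later at next_pc
        if kind == "act":
            act(arg)
            trace.append(arg)
        elif kind == "call":
            stack.append((arg, 0))  # the sub-procedure starts at counter 0
        else:
            raise ValueError(f"unknown operation: {kind}")
    return trace


# Toy environment (an assumption for illustration): a position on a line.
env = {"pos": 0}


def observe() -> int:
    return env["pos"]


def act(action: str) -> None:
    if action == "right":
        env["pos"] += 1


def move_step(pc: int, obs: int) -> Tuple[Op, int]:
    # Counter 0: take one elementary action; counter 1: return to the caller.
    if pc == 0:
        return ("act", "right"), 1
    return ("return", None), pc


def root_step(pc: int, obs: int) -> Tuple[Op, int]:
    # Counter 0: call "move" until position 2 is observed, then act "mark".
    if pc == 0:
        if obs < 2:
            return ("call", "move"), 0
        return ("act", "mark"), 1
    return ("return", None), pc


procedures = {"root": PHP("root", root_step), "move": PHP("move", move_step)}
trace = run(procedures, "root", observe, act)
print(trace)
```

In this toy run the root procedure stays at program counter 0 and keeps calling `move` until the observation changes, which illustrates how the counter and the observation jointly select the next operation. The explicit stack of (procedure, counter) frames corresponds to the internal call structure that, per the abstract, only some demonstrations are annotated with.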
