Path-Normalized Optimization of Recurrent Neural Networks with ReLU Activations

Neyshabur, Behnam; Wu, Yuhuai; Salakhutdinov, Ruslan; Srebro, Nathan

Citation Details

We investigate the parameter-space geometry of recurrent neural networks (RNNs) and develop an adaptation of the path-SGD optimization method, attuned to this geometry, that can learn plain RNNs with ReLU activations. On several datasets that require capturing long-term dependency structure, we show that path-SGD can significantly improve the trainability of ReLU RNNs compared to RNNs trained with SGD, even with various recently suggested initialization schemes.

Award ID(s): 1302662

PAR ID: 10025961

Author(s) / Creator(s): Neyshabur, Behnam; Wu, Yuhuai; Salakhutdinov, Ruslan; Srebro, Nathan

Date Published: 2016-05-23

Journal Name: arXiv.org

ISSN: 2331-8422

Page Range / eLocation ID: arXiv:1605.07154v1 [cs.LG]

Sponsoring Org: National Science Foundation
