Second-order stagewise backpropagation for Hessian-matrix analyses and investigation of negative curvature

Eiji Mizutani; Stuart E. Dreyfus

doi:10.1016/j.neunet.2007.12.038

2008 Special Issue

Source

Neural Networks, vol. 21, no. 2-3 (2008), pp. 193-203

Abstract

Multi-stage feed-forward neural network (NN) learning with sigmoidal-shaped hidden-node functions is an implicitly constrained optimization problem featuring negative curvature. Our analyses of the Hessian matrix H of the sum-squared-error measure highlight the following intriguing findings: at an early stage of learning, H tends to be indefinite and much better-conditioned than the Gauss–Newton Hessian J^T J; the NN structure influences the indefiniteness and rank of H; and exploiting negative curvature leads to effective learning. All of these findings can be confirmed numerically owing to our stagewise second-order backpropagation; this systematic procedure exploits the NN’s “layered symmetry” to compute H efficiently, making exact Hessian evaluation feasible for fairly large practical problems.
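The contrast between H and the Gauss–Newton Hessian can be illustrated on a toy problem. The following is a minimal sketch, not the stagewise second-order backpropagation procedure of the paper: it assumes a hypothetical one-hidden-node network y = v * tanh(w * x) with a single training pair (x, t), and it uses the standard identity H = J^T J + r * (second derivatives of r) for the sum-squared error E = r^2 / 2 with residual r = y - t. The function names `hessians` and `eig_sym2` and the chosen numerical values are illustrative assumptions.

```python
import math


def hessians(w, v, x, t):
    """Return (H, GN) for E = 0.5 * (v * tanh(w * x) - t)**2.

    Each symmetric 2x2 matrix over the parameters (w, v) is stored
    as a tuple (a, b, d) meaning [[a, b], [b, d]].
    """
    u = w * x
    th = math.tanh(u)
    sech2 = 1.0 - th * th            # derivative of tanh(u)
    r = v * th - t                   # residual
    # Jacobian of the residual with respect to (w, v).
    jw = v * x * sech2
    jv = th
    # Gauss-Newton term J^T J (positive semidefinite by construction).
    gn = (jw * jw, jw * jv, jv * jv)
    # Second derivatives of the residual with respect to (w, v).
    r_ww = -2.0 * v * x * x * th * sech2
    r_wv = x * sech2
    r_vv = 0.0
    # Full Hessian: H = J^T J + r * (second derivatives of r).
    h = (gn[0] + r * r_ww, gn[1] + r * r_wv, gn[2] + r * r_vv)
    return h, gn


def eig_sym2(a, b, d):
    """Eigenvalues (smallest, largest) of the symmetric matrix [[a, b], [b, d]]."""
    mean = 0.5 * (a + d)
    radius = math.sqrt((0.5 * (a - d)) ** 2 + b * b)
    return mean - radius, mean + radius


if __name__ == "__main__":
    # Small initial weights, mimicking an early stage of learning.
    h, gn = hessians(w=0.1, v=0.1, x=1.0, t=1.0)
    print("eigenvalues of H:    ", eig_sym2(*h))
    print("eigenvalues of J^T J:", eig_sym2(*gn))
```

In this sketch the residual is still large at small weights, so the term r * (second derivatives of r) dominates and H has one negative and one positive eigenvalue, whereas J^T J built from a single training pair has rank one and is therefore singular. This toy case only mirrors the qualitative claims of the abstract; it says nothing about the efficiency of the authors' procedure.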