Recurrent networks are an elegant way of extending feedforward networks to handle complex data in the form of sequences of vectors. They are well known for their power to model temporal dependencies and to process sequences for classification, recognition, and transduction. In this paper, we propose a nonmonotone conjugate gradient (CG) training algorithm for recurrent neural networks, equipped with an adaptive tuning strategy for the nonmonotone learning horizon. Simulation results on four applications using three different recurrent network architectures show that this modification is more effective than the original CG method.