Reservoir computing has become popular because it allows efficient design of recurrent neural networks by exploiting the computational properties of the reservoir structure. Various approaches have been proposed, ranging from appropriate reservoir initialization to reservoir optimization by training. In this paper, we extend our previous work and focus on short-term memory capacity, introduced by Jaeger for echo state networks (ESNs). Memory capacity has previously been shown to peak at criticality, where the network switches from a stable to an unstable dynamic regime. Using computational experiments with nonlinear ESNs, we systematically analyze how memory capacity depends on several parameters and their interplay, namely the scaling of input and reservoir weights, the reservoir size, and the reservoir sparsity. We also derive and test two gradient-descent-based orthogonalization procedures for the recurrent weight matrix. Both considerably increase the memory capacity, bringing it close to the upper bound, which equals the reservoir size, as proved for linear reservoirs. We discuss the orthogonalization procedures in the context of existing methods and assess their benefit.
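To make the central quantity concrete, the following is a minimal sketch of how short-term memory capacity is commonly estimated for a nonlinear ESN: a linear readout is fitted for each delay k to reconstruct the input from k steps earlier, and the squared correlation coefficients between reconstruction and target are summed over delays. The function name `memory_capacity`, all parameter values, the uniform input distribution, the Gaussian reservoir initialization and the train/test split are illustrative assumptions, not the settings or procedure used in this paper.

```python
import numpy as np

def memory_capacity(n_res=20, max_delay=40, n_steps=3000, washout=200,
                    input_scale=0.1, spectral_radius=0.9, seed=0):
    """Estimate short-term memory capacity of a random tanh reservoir.

    All defaults are illustrative assumptions, not values from the paper.
    Returns (total capacity, per-delay capacities for k = 1..max_delay).
    """
    rng = np.random.default_rng(seed)
    u = rng.uniform(-1.0, 1.0, n_steps)               # i.i.d. scalar input
    w_in = input_scale * rng.uniform(-1.0, 1.0, n_res)
    w = rng.normal(size=(n_res, n_res))
    # Rescale the recurrent weights to the requested spectral radius.
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))

    # Drive the reservoir and record its states.
    x = np.zeros(n_res)
    states = np.empty((n_steps, n_res))
    for t in range(n_steps):
        x = np.tanh(w @ x + w_in * u[t])
        states[t] = x

    # Discard the initial transient; append a constant column as bias.
    start = max(washout, max_delay)
    design = np.hstack([states[start:], np.ones((n_steps - start, 1))])
    split = design.shape[0] // 2                      # first half: fit, second half: evaluate

    mc_k = np.empty(max_delay)
    for k in range(1, max_delay + 1):
        target = u[start - k:n_steps - k]             # input delayed by k steps
        coef, *_ = np.linalg.lstsq(design[:split], target[:split], rcond=None)
        pred = design[split:] @ coef
        # Squared correlation between delayed input and its reconstruction.
        mc_k[k - 1] = np.corrcoef(target[split:], pred)[0, 1] ** 2
    return mc_k.sum(), mc_k

total, per_delay = memory_capacity()
print(f"estimated memory capacity: {total:.2f}")
```

Evaluating the readouts on held-out data keeps the estimate from being inflated by overfitting; varying `input_scale`, `spectral_radius` and `n_res` in this sketch is one simple way to explore the kind of parameter dependence described above.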