The convolutional neural network (CNN) is one of the most successful deep learning models and has proven effective in a variety of vision tasks. Its performance depends directly on its hyperparameters. However, designing a CNN architecture requires expert knowledge of its intrinsic structure or a great deal of trial and error. To overcome these issues, the architecture of a CNN should be designed automatically, without human intervention. We therefore remove the constraints that traditional architectures place on the number and type of convolutional and pooling layers. Biologically inspired approaches have not been extensively exploited for this task. This paper attempts to automatically optimize the hyperparameters of a CNN architecture for a speech recognition task using particle swarm optimization (PSO), a population-based stochastic optimization technique. The proposed method is evaluated by designing a CNN architecture for speech recognition on a Hindi dataset. The experimental results show that the proposed method designs a competitive CNN architecture whose performance is similar to that of other state-of-the-art methods.
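To make the idea of a PSO-driven hyperparameter search concrete, the following is a minimal sketch of the standard global-best PSO update applied to a small integer search space. It is not the paper's implementation. The search space in `BOUNDS`, the coefficients `w`, `c1`, and `c2`, and the surrogate objective `toy_error` are all assumptions chosen for illustration; in an actual experiment the objective would train a CNN with the decoded hyperparameters and return its validation error.

```python
import random

# Hypothetical search space (an assumption, not taken from the paper):
# inclusive (low, high) bounds for each hyperparameter.
BOUNDS = {
    "num_conv_layers": (1, 6),
    "num_filters": (8, 128),
    "kernel_size": (2, 7),
}
NAMES = list(BOUNDS)


def decode(position):
    """Round a continuous particle position to integer hyperparameters."""
    return {name: int(round(x)) for name, x in zip(NAMES, position)}


def toy_error(hparams):
    """Illustrative stand-in for validation error.

    A real run would build and train a CNN from `hparams` here.
    """
    return (
        (hparams["num_conv_layers"] - 4) ** 2
        + ((hparams["num_filters"] - 64) / 16) ** 2
        + (hparams["kernel_size"] - 3) ** 2
    )


def clip(x, low, high):
    """Keep a coordinate inside its bounds."""
    return max(low, min(high, x))


def pso(fitness, n_particles=20, n_iters=50, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `fitness` with global-best PSO; returns (hparams, error)."""
    rng = random.Random(seed)
    lows = [BOUNDS[n][0] for n in NAMES]
    highs = [BOUNDS[n][1] for n in NAMES]
    dim = len(NAMES)

    # Random initial positions, zero initial velocities.
    pos = [[rng.uniform(lows[d], highs[d]) for d in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]

    # Personal bests and the swarm-wide (global) best.
    pbest = [p[:] for p in pos]
    pbest_err = [fitness(decode(p)) for p in pos]
    g = min(range(n_particles), key=pbest_err.__getitem__)
    gbest, gbest_err = pbest[g][:], pbest_err[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = clip(pos[i][d] + vel[i][d], lows[d], highs[d])
            err = fitness(decode(pos[i]))
            if err < pbest_err[i]:
                pbest[i], pbest_err[i] = pos[i][:], err
                if err < gbest_err:
                    gbest, gbest_err = pos[i][:], err
    return decode(gbest), gbest_err


best_hparams, best_err = pso(toy_error)
```

Positions are kept continuous and rounded only when decoded, which is one common way to apply PSO to integer-valued hyperparameters; other encodings are possible, and this abstract does not say which one the authors use.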