Automatically generating images from text using generative adversarial networks (GANs) has been actively investigated. To the best of our knowledge, however, no existing method generates images that account for both the given text and its context; as a result, such methods cannot adequately represent a story describing a series of related actions, which limits applications such as image-sequence generation. In this paper, we propose a method for automatically tuning the noise parameter of GANs and a context-aware GAN model that generates images from a series of text-image pairs. Our method and model can be used to automatically generate visual stories.
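The abstract does not specify the architecture, but the two ideas it names (a tunable noise parameter and conditioning each generated image on the preceding one) can be illustrated with a minimal NumPy sketch. Everything here is hypothetical: the dimensions, the randomly initialized weights standing in for a trained generator, and the toy "context encoder" that simply reuses part of the previous output.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions -- not taken from the paper.
Z_DIM, TXT_DIM, CTX_DIM, IMG_DIM = 16, 32, 32, 64

# Random weights stand in for a trained generator network.
W1 = rng.normal(0, 0.1, (Z_DIM + TXT_DIM + CTX_DIM, 128))
W2 = rng.normal(0, 0.1, (128, IMG_DIM))

def generate(noise_scale, text_emb, ctx_emb):
    """One context-aware generation step: noise_scale is the tunable
    noise parameter, and ctx_emb carries the previous image's context."""
    z = noise_scale * rng.normal(size=Z_DIM)
    h = np.tanh(np.concatenate([z, text_emb, ctx_emb]) @ W1)
    return np.tanh(h @ W2)  # flattened "image" with values in [-1, 1]

# Generate a short image sequence: each step conditions on the last output.
sentences = [rng.normal(size=TXT_DIM) for _ in range(3)]
ctx = np.zeros(CTX_DIM)
frames = []
for t_emb in sentences:
    img = generate(noise_scale=0.5, text_emb=t_emb, ctx_emb=ctx)
    frames.append(img)
    ctx = img[:CTX_DIM]  # toy context encoder: reuse part of the output

print(len(frames), frames[0].shape)
```

Feeding each output back as context is what distinguishes this sequence-level setup from generating every image independently from its sentence alone.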