Because network malware propagates rapidly and poses a severe threat, it is crucial to detect it and automatically generate its signatures in the early stage of infection. Most existing approaches to automatic signature generation operate on the byte sequences in network flows, which usually incurs high computation and memory overhead and does not work well in the presence of noise. In this paper, we present a method for large-scale malware analysis with feature extraction based on a hashed matrix. Moreover, we propose automatic signature generation using Bayesian signature selection within clusters. Our evaluation shows that the proposed method speeds up typical malware signature generation while consuming less memory. In addition, it achieves comparatively higher accuracy than previous approaches and is more noise-tolerant.