Automatic summarization is developed to extract the representative contents or sentences from a large corpus of documents. This paper presents a new hierarchical representation of words, sentences and documents in a corpus, and infers the Dirichlet distributions for latent topics and latent themes in word level and sentence level, respectively. The sentence-based latent Dirichlet allocation (SLDA) is accordingly established for document summarization. Different from the vector space summarization, SLDA is built to fit the fine structure of text documents, and is specifically designed for sentence selection. SLDA acts as a sentence mixture model with a mixture of Dirichlet themes, which are used to generate the latent topics in observed words. The theme model is inherent to distinguish sentences in a summarization system. In the experiments, the proposed SLDA outperforms other methods for document summarization in terms of precision, recall and F-measure.