Because large-scale mail archives are both huge and highly diverse, extraction methods must not only extract signatures and roles accurately without any training data, but also be general enough to work well on archives with different characteristics. To address this problem, we first propose an unsupervised, language-model-based method to identify signatures in large numbers of emails, and then present an unsupervised two-stage method to extract signatures effectively. Experimental results show that our methods are general and effective for role extraction from large-scale mail archives.