Data deduplication, a data redundancy elimination technique, has been widely employed in many application environments to reduce data storage space. However, it is challenging to provide a fast and scalable key-value fingerprint index particularly for large datasets, while the index performance is critical to the overall deduplication performance. This paper proposes RMD, a resemblance and mergence based deduplication scheme, which aims to provide quick responses to fingerprint queries. The key idea of RMD is to leverage a bloom filter array and a data resemblance algorithm to dramatically reduce the query range. At data ingesting time, RMD uses a resemblance algorithm to detect resemble data segments and put resemblance segments in the same bin. As a result, at querying time, it only needs to search in the corresponding bin to detect duplicate content, which significantly speeds up the query process. Moreover, RMD uses a mergence strategy to accumulate resemblance segments to relevant bins, and exploits frequency-based fingerprint retention policy to cap the bin capacity to improve query throughput and data deduplication ratio. Extensive experimental results with real-world datasets have shown that RMD is able to achieve high query performance and outperforms several well-known deduplication schemes.