High-speed routers rely on well-designed packet buffers that support multiple queues, large capacity, and short response times. Researchers have proposed hierarchical buffer architectures that combine SRAM and DRAM to meet these challenges. However, these architectures suffer from either a large SRAM requirement or high time complexity in memory management. Our analysis indicates that their worst-case performance is identical. In this paper, we present a novel packet buffer architecture that reduces the SRAM size requirement by (k-1)/(2k), where k denotes the number of DRAMs working in parallel. The architecture uses a fast batch load scheme and a per-queue Random Round Robin memory management algorithm. Our mathematical analysis and simulation results indicate that, given a small speedup, the proposed architecture provides guaranteed performance in terms of low time complexity, short access delay, and an upper-bounded drop rate.
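
To give a feel for the stated reduction, the short calculation below evaluates it for a few values of k. It assumes the expression is read as the fraction (k-1)/(2k); the symbol f is introduced here only for illustration and is not the paper's notation, and the baseline against which the reduction is measured is not specified in this abstract.

```latex
\[
  f(k) = \frac{k-1}{2k} = \frac{1}{2} - \frac{1}{2k},
  \qquad
  f(2) = \frac{1}{4},\quad
  f(4) = \frac{3}{8},\quad
  f(8) = \frac{7}{16},\quad
  \lim_{k \to \infty} f(k) = \frac{1}{2}.
\]
```

Under this reading, the reduction grows with the number of parallel DRAMs but stays below one half of the SRAM requirement.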