The residual coding loop, composed of the direct transform, direct quantization, inverse quantization, and inverse transform, is executed many times in current video encoders. These operations demand high throughput and low latency, since their outputs are consumed by other steps of the encoder. This paper proposes a high-throughput, parallel, multiplierless hardware architecture for HEVC direct quantization, targeting real-time processing of Ultra-High Definition (UHD) 8K video. The proposed architecture supports frequency-dependent quantization steps. Generic binary multipliers were replaced by multiple constant multiplications (MCM) in order to improve throughput and to reduce area and power dissipation. The design processes 32 samples in parallel, which corresponds to one line of the largest HEVC transform block. ASIC synthesis results, obtained with the Nangate 45 nm standard-cell library, show that the proposed architecture is able to quantize about 8 billion coefficients per second when running at 186.6 MHz, with a gate count of 168,330. This throughput is enough to process UHD 8K video at 120 fps.
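As an illustration of the multiplierless idea, the following is a minimal software sketch, not the proposed hardware. It assumes the flat (frequency-independent) case, the six HEVC forward quantization scale factors indexed by QP mod 6, and the rounding offsets used by the HM reference software; the function names `shift_add_mul` and `quantize_line` are hypothetical. Each constant multiplication is expanded into shifts and additions, which in hardware reduce to wiring and adders. A real MCM block would additionally share partial sums across the constants, and per-frequency scaling lists and output clipping are omitted here.

```python
# HEVC forward quantization scale factors, indexed by QP % 6.
QUANT_SCALES = (26214, 23302, 20560, 18396, 16384, 14564)


def shift_add_mul(x, constant):
    """Multiply x by a fixed non-negative constant using only shifts and adds.

    Plain binary expansion: one shifted copy of x per set bit of the constant.
    """
    acc = 0
    shift = 0
    while constant:
        if constant & 1:
            acc += x << shift
        constant >>= 1
        shift += 1
    return acc


def quantize_line(coeffs, qp, bit_depth=8, log2_size=5, intra=True):
    """Quantize one line of transform coefficients (flat quantization).

    Assumptions (not taken from the paper): 8-bit video, a 32x32 block
    (log2_size=5), and HM-style rounding offsets of 171/512 (intra)
    or 85/512 (inter).
    """
    scale = QUANT_SCALES[qp % 6]
    transform_shift = 15 - bit_depth - log2_size
    q_bits = 14 + qp // 6 + transform_shift
    offset = (171 if intra else 85) << (q_bits - 9)

    levels = []
    for c in coeffs:
        # Work on the magnitude and restore the sign afterwards.
        mag = (shift_add_mul(abs(c), scale) + offset) >> q_bits
        levels.append(-mag if c < 0 else mag)
    return levels


if __name__ == "__main__":
    # One line of the largest transform block: 32 coefficients.
    line = [1000, -1000] + [0] * 30
    print(quantize_line(line, qp=22))
```

In the sketch the 32 coefficients are handled in a loop; the architecture described above would instead process all 32 in parallel within one clock cycle.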