We present a new divide-and-conquer algorithm to efficiently evaluate the Coulomb interaction in a large system, which is an essential part of self-consistent first-principle calculations. The total Coulomb potential ϕ(r)=1/|r| is divided into a short range part ϕS(r) and a smooth long range part ϕL(r). The system is divided into many cuboids, with a small box defined for each cuboid plus a buffer region. For the short range part, the interaction convolution integral is calculated directly using a Fast Fourier Transformation (FFT) in the small box. For the smooth long range part, the convolution integral is evaluated by a global FFT but on a coarse grid. The conversion between the dense grid and coarse grid values is done using small box FFTs with proper mask functions. Using this small box FFT method, the total Coulomb potentials of test quantum dot systems on 480 3 grid and 2400 3 grid are calculated. For the 2400 3 grid case, the calculation is carried out by tens of thousands of processors with a computational speed up close to 10 times when compared with direct global FFT calculations using the FFTW package with the maximumly allowed number of processors. The maximum relative error is 4×10 −5 while the absolute error is less than 0.1 meV.