Electrostatic halftoning is a high-quality method for stippling, dithering, and sampling, but it suffers from a high runtime. This made the technique difficult to use for most real-world applications. A recently proposed minimisation scheme based on the non-equispaced fast Fourier transform (NFFT) lowers the complexity in the particle number M from $$\mathcal{O}(M^2)$$ to $$\mathcal{O}(M \log M).$$ However, the NFFT is hard to parallelise, and the runtime on modern CPUs lies still in the orders of an hour for about 50,000 particles, to a day for 1 million particles. Our contributions to remedy this problem are threefold: we design the first GPU-based NFFT algorithm without special structural assumptions on the positions of nodes, we introduce a novel nearest-neighbour identification scheme for continuous point distributions, and we optimise the whole algorithm for n-body problems such as electrostatic halftoning. For 1 million particles, this new algorithm runs 50 times faster than the most efficient technique on the CPU, and even yields a speedup of 7,000 over the original algorithm.