A new 3D Proton Induced X-Ray Emission Tomography (PIXE-T) and Scanning Transmission Ion Microscopy Tomography (STIM-T) simulation software has been developed in Java and uses NVIDIA™ Common Unified Device Architecture (CUDA) to calculate the X-ray attenuation for large detector areas. A challenge with PIXE-T is to get sufficient counts while retaining a small beam spot size. Therefore a high geometric efficiency is required. However, as the detector solid angle increases the calculations required for accurate reconstruction of the data increase substantially. To overcome this limitation, the CUDA parallel computing platform was used which enables general purpose programming of NVIDIA graphics processing units (GPUs) to perform computations traditionally handled by the central processing unit (CPU). For simulation performance evaluation, the results of a CPU- and a CUDA-based simulation of a phantom are presented. Furthermore, a comparison with the simulation code in the PIXE-Tomography reconstruction software DISRA (A. Sakellariou, D.N. Jamieson, G.J.F. Legge, 2001) is also shown. Compared to a CPU implementation, the CUDA based simulation is approximately 30× faster.