This paper presents a debugging aid for parallel program developers. The tool presented enables programmers to use cyclic debugging techniques for debugging non-deterministic parallel programs running on multiprocessor systems with shared memory. The solution proposed consists of a combination of record/replay with automatic on-the-fly data race detection. This combination enables us to limit the record phase to the more efficient recording of the synchronization operations, and checking for data races (using intrusive methods) during a replayed execution. As the record phase is highly efficient, there is no need to switch it off, hereby eliminating the possibility of Heisenbugs because tracing can be left on all the time.