We describe the design and implementation of the Distributed Threads System (DTS), a programming environment for parallelizing irregular and highly data-dependent algorithms. DTS extends support for fork/join parallel programming from shared-memory threads to a distributed-memory environment. It is currently implemented on top of PVM, to which it adds an asynchronous RPC abstraction, and it turns the network into a pool of anonymous compute servers. Each DTS node is multithreaded using the C threads interface and is thus ready to run on a multiprocessor workstation. We give performance results for a parallel implementation of the RSA cryptosystem, parallel long integer multiplication, and parallel multivariate polynomial resultant computation.
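To make the fork/join style concrete, the following is a minimal sketch of long integer multiplication split into four independent partial products. It is not the DTS API: a local Python thread pool stands in for the pool of anonymous compute servers, `submit` plays the role of an asynchronous fork, and `result` plays the role of a join. The function name `parallel_multiply` and the `bits` parameter are assumptions introduced for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor
from operator import mul


def parallel_multiply(x, y, pool, bits=64):
    """Multiply non-negative integers x and y via four forked partial products.

    Hypothetical illustration: `pool` is a local executor standing in for a
    pool of compute servers; it is not part of DTS or PVM.
    """
    mask = (1 << bits) - 1
    # Split each operand so that x = x_hi * 2**bits + x_lo (likewise for y).
    x_hi, x_lo = x >> bits, x & mask
    y_hi, y_lo = y >> bits, y & mask

    # Fork: submit() returns a future immediately, without waiting for the result.
    hh = pool.submit(mul, x_hi, y_hi)
    hl = pool.submit(mul, x_hi, y_lo)
    lh = pool.submit(mul, x_lo, y_hi)
    ll = pool.submit(mul, x_lo, y_lo)

    # Join: result() blocks until the corresponding task has completed.
    middle = hl.result() + lh.result()
    return (hh.result() << (2 * bits)) + (middle << bits) + ll.result()
```

A caller would create the pool once and reuse it, for example `with ThreadPoolExecutor(max_workers=4) as pool: parallel_multiply(a, b, pool)`. The caller never names the worker that executes a task, which mirrors the anonymity of the compute servers described above. The sketch shows only the control structure; it makes no claim about performance.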