This paper describes a method for fitting a 3D skeleton to the human hand using depth images. The hand is represented by a 3D skeleton model with 21 parts. This model is used to generate synthetic depth images, which in turn are used to train Random Decision Forests (RDFs) that assign each pixel to a hand part. The mean-shift method is then applied to the per-pixel classification results to estimate the joint locations. The system can run in real time at 30 fps on Kinect depth images. Combining this method with Support Vector Machines for classification, we obtain a 99.9% recognition rate on the American Sign Language (ASL) digit recognition problem.
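
As a rough illustration of the joint estimation step, the following sketch applies a weighted mean-shift iteration to the 3D points of each hand part, using the per-pixel class probabilities as weights. The Gaussian kernel, the bandwidth value, the starting point, and the names `mean_shift_mode` and `estimate_joints` are assumptions made for this example only; they are not taken from the method described above.

```python
import numpy as np


def mean_shift_mode(points, weights, bandwidth, max_iter=50, tol=1e-6):
    """Find one density mode of weighted 3D points (hypothetical helper).

    points:    (N, 3) back-projected pixel coordinates
    weights:   (N,) per-pixel probability of belonging to one hand part
    bandwidth: Gaussian kernel width, in the same units as points (assumed)
    """
    points = np.asarray(points, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if weights.sum() <= 0:
        # No pixel supports this part: report the joint as unknown.
        return np.full(points.shape[1], np.nan)

    # Start from the weighted mean of all points (an assumed initialization).
    mode = np.average(points, axis=0, weights=weights)
    for _ in range(max_iter):
        sq_dist = np.sum((points - mode) ** 2, axis=1)
        kernel = weights * np.exp(-0.5 * sq_dist / bandwidth ** 2)
        total = kernel.sum()
        if total == 0:
            break  # every point is too far from the current estimate
        new_mode = (kernel[:, None] * points).sum(axis=0) / total
        shift = np.linalg.norm(new_mode - mode)
        mode = new_mode
        if shift < tol:
            break
    return mode


def estimate_joints(points, probs, bandwidth=0.02):
    """Return a (P, 3) array with one estimated location per hand part.

    probs: (N, P) per-pixel class probabilities, e.g. from a forest classifier
    """
    probs = np.asarray(probs, dtype=float)
    return np.stack([
        mean_shift_mode(points, probs[:, p], bandwidth)
        for p in range(probs.shape[1])
    ])
```

Because distant points receive a kernel weight close to zero, a small number of misclassified pixels far from the true part has little influence on the estimate, which is the usual motivation for preferring a mode over a plain centroid.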