Loop scheduling on parallel and distributed systems has long been a critical problem, and it becomes more difficult in emerging heterogeneous grid environments, where nodes differ in computing power. Several loop self-scheduling schemes have previously been proposed for heterogeneous grid environments. In this paper, we propose a performance-based approach that partitions loop iterations according to the performance weight of each node. To verify the proposed approach, we built a grid testbed consisting of four schools and implemented a matrix multiplication example to run on it. Experimental results show that the proposed approach performs better than previous schemes.
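
The idea of partitioning loop iterations by performance weight can be illustrated with a minimal sketch. It is not the scheme proposed in the paper: the function name `partition_iterations`, the example weights, and the one-shot static split are assumptions made for illustration, and the abstract does not say how the performance weights are measured.

```python
def partition_iterations(total_iterations, weights):
    """Split the range [0, total_iterations) into contiguous chunks whose
    sizes are proportional to each node's performance weight.

    Hypothetical helper for illustration only. Returns a list of
    (start, end) pairs, one per node, with `end` exclusive.
    """
    if total_iterations < 0:
        raise ValueError("total_iterations must be non-negative")
    if not weights or any(w <= 0 for w in weights):
        raise ValueError("weights must be a non-empty list of positive numbers")

    total_weight = sum(weights)

    # Ideal (fractional) share of iterations for each node.
    shares = [total_iterations * w / total_weight for w in weights]

    # Start from the floor of each share.
    sizes = [int(s) for s in shares]

    # Hand the iterations lost to rounding to the nodes with the
    # largest fractional remainders (largest-remainder method).
    leftover = total_iterations - sum(sizes)
    by_remainder = sorted(
        range(len(weights)),
        key=lambda i: shares[i] - sizes[i],
        reverse=True,
    )
    for i in by_remainder[:leftover]:
        sizes[i] += 1

    # Convert chunk sizes into contiguous (start, end) index ranges.
    chunks = []
    start = 0
    for size in sizes:
        chunks.append((start, start + size))
        start += size
    return chunks


if __name__ == "__main__":
    # Assumed weights: the first node is four times as fast as the last.
    print(partition_iterations(1000, [4, 3, 2, 1]))
```

With the assumed weights `[4, 3, 2, 1]` and 1000 iterations, the fastest node receives 400 iterations and the slowest 100. A self-scheduling scheme would typically hand out such chunks in several rounds rather than in a single static split.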