Collision avoidance is a fundamental problem in navigation. In this paper, we present a novel cooperative movement-planning method that examines how two vehicles can coordinate their movements to avoid a collision and subsequently return to their intended paths. Movement planning is formulated as a decision process. When the vehicles are at risk of collision, the system determines appropriate steering motions for both vehicles at each time step, so that they cooperatively change course to avoid the collision and return to their original courses once the risk is averted. Reinforcement learning is applied to solve this decision-making task. System states are described in terms of the vehicles' positions and orientations, and actions are defined subject to the vehicles' kinematic constraints. An approximate value function is developed iteratively to evaluate the state-action combinations of the system. The system selects motions by calculating the approximate values of the possible target states; the selected motions must also satisfy requirements on path smoothness, on the distance between the vehicles, and on their velocities. The method of least squares is applied within the iteration to update the approximate value function, using a scoring technique over a collection of state samples drawn from the continuous state and action spaces. This paper summarizes the concepts and methodologies used to implement an online cooperative collision avoidance system. Several scenarios are tested to assess the performance of the proposed algorithm.
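To make the least-squares value update more concrete, the following is a minimal sketch in the style of least-squares policy iteration, which may differ from the scheme actually used in the paper. Everything named here is an assumption for illustration: the unicycle motion model in `step`, the discretised steering set `STEER` (the paper's action space is continuous), the reference headings `REF_HEADING`, the feature map `features`, and the regularisation constant `reg`.

```python
import numpy as np

# Assumed discretisation of steering rates (rad/s per vehicle); the paper's
# action space is continuous, so this is a simplification for illustration.
STEER = (-0.2, 0.0, 0.2)
JOINT_ACTIONS = [(a, b) for a in STEER for b in STEER]  # one entry per vehicle
REF_HEADING = (0.0, np.pi)  # hypothetical intended headings (head-on encounter)


def step(pose, steer, speed=1.0, dt=1.0):
    """Advance one pose (x, y, heading) with a simple unicycle model (assumed)."""
    x, y, th = pose
    th = th + steer * dt
    return (x + speed * np.cos(th) * dt, y + speed * np.sin(th) * dt, th)


def features(state, action):
    """Hypothetical features of a joint state (pose_a, pose_b), one block per joint action."""
    (xa, ya, tha), (xb, yb, thb) = state
    dist = np.hypot(xa - xb, ya - yb)
    base = np.array([
        1.0,                               # bias term
        np.exp(-dist),                     # proximity of the two vehicles
        1.0 - np.cos(tha - REF_HEADING[0]),  # heading deviation of vehicle A
        1.0 - np.cos(thb - REF_HEADING[1]),  # heading deviation of vehicle B
    ])
    out = np.zeros(base.size * len(JOINT_ACTIONS))
    out[action * base.size:(action + 1) * base.size] = base
    return out


def lstdq(samples, phi, policy, n_features, gamma=0.95, reg=1e-6):
    """Least-squares fit of linear Q-weights for `policy` from (s, a, r, s_next) samples."""
    A = reg * np.eye(n_features)  # small ridge term keeps A invertible
    b = np.zeros(n_features)
    for s, a, r, s_next in samples:
        f = phi(s, a)
        f_next = phi(s_next, policy(s_next))
        A += np.outer(f, f - gamma * f_next)
        b += r * f
    return np.linalg.solve(A, b)


def greedy(w, phi, n_actions):
    """Policy that picks the action index with the highest approximate value."""
    return lambda s: max(range(n_actions), key=lambda a: float(phi(s, a) @ w))


def lspi(samples, phi, n_features, n_actions, gamma=0.95, iters=10, tol=1e-6):
    """Alternate least-squares evaluation and greedy improvement until weights settle."""
    w = np.zeros(n_features)
    for _ in range(iters):
        w_new = lstdq(samples, phi, greedy(w, phi, n_actions), n_features, gamma)
        if np.linalg.norm(w_new - w) < tol:
            return w_new
        w = w_new
    return w
```

Under these assumptions, each sample would pair a joint state with an index into `JOINT_ACTIONS`, a score (reward) chosen by the designer, and the successor state obtained by applying `step` to both vehicles. Online, the system would then apply the joint steering whose approximate value is highest at each time step.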