Reinforcement learning (RL) is the problem of learning to map states to actions so as to maximise a numeric reward signal. Fuzzy Q-learning (FQL) extends the RL technique Q-learning to large or continuous state spaces and has been applied to a wide range of applications, from data mining to robot control. Typically, FQL uses a uniform or pre-defined internal representation supplied by a human designer. A uniform representation usually generalises poorly for control applications, while a pre-defined representation requires the designer to have in-depth knowledge of the desired control policy. In this paper, the reliance on a human designer is reduced by adapting the internal representation during the learning process, improving generalisation over the control policy. A hierarchical fuzzy rule based system (HFRBS) is used to improve the generalisation of the control policy through iterative refinement of an initial coarse representation on the classical mountain car problem. Adapting the representation in this way is shown to significantly reduce the time taken to learn a suitable control policy.
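To make the starting point concrete, the sketch below illustrates fuzzy Q-learning with a uniform grid of triangular membership functions on the mountain car task, the kind of coarse, designer-supplied representation the paper sets out to refine. The dynamics, partition sizes, hyper-parameters, and discrete-action voting scheme are standard or assumed for illustration, not taken from the paper.

```python
import numpy as np

# Assumed standard mountain-car dynamics; action in {-1, 0, +1}, reward -1 per step.
def mc_step(pos, vel, action):
    vel = np.clip(vel + 0.001 * action - 0.0025 * np.cos(3 * pos), -0.07, 0.07)
    pos = np.clip(pos + vel, -1.2, 0.6)
    if pos == -1.2 and vel < 0:
        vel = 0.0
    return pos, vel, -1.0, pos >= 0.5

# Uniform partition: triangular membership functions over each state variable.
def tri_memberships(x, centres):
    width = centres[1] - centres[0]
    return np.maximum(0.0, 1.0 - np.abs(x - centres) / width)

P_CENTRES = np.linspace(-1.2, 0.6, 7)      # assumed coarse uniform grid
V_CENTRES = np.linspace(-0.07, 0.07, 7)
ACTIONS = np.array([-1, 0, 1])
n_rules = len(P_CENTRES) * len(V_CENTRES)
q = np.zeros((n_rules, len(ACTIONS)))      # one q-value per rule and action

def firing_strengths(pos, vel):
    # Rule strength = product of antecedent memberships, normalised over all rules.
    phi = np.outer(tri_memberships(pos, P_CENTRES),
                   tri_memberships(vel, V_CENTRES)).ravel()
    return phi / phi.sum()

alpha, gamma, eps = 0.1, 0.99, 0.1
rng = np.random.default_rng(0)

for episode in range(200):
    pos, vel = rng.uniform(-0.6, -0.4), 0.0
    phi = firing_strengths(pos, vel)
    for t in range(2000):
        # Each rule chooses an action epsilon-greedily on its local q-values.
        a_idx = np.where(rng.random(n_rules) < eps,
                         rng.integers(len(ACTIONS), size=n_rules),
                         q.argmax(axis=1))
        Q_sa = np.sum(phi * q[np.arange(n_rules), a_idx])     # global Q(s, a)
        # Discrete action: take the choice with the largest aggregate firing strength.
        action = ACTIONS[np.argmax(np.bincount(a_idx, weights=phi,
                                               minlength=len(ACTIONS)))]
        pos, vel, r, done = mc_step(pos, vel, action)
        phi_next = firing_strengths(pos, vel)
        Q_next = 0.0 if done else np.sum(phi_next * q.max(axis=1))
        td_error = r + gamma * Q_next - Q_sa
        # Credit each rule in proportion to how strongly it fired.
        q[np.arange(n_rules), a_idx] += alpha * td_error * phi
        phi = phi_next
        if done:
            break
```

With a uniform 7x7 partition as above, every region of the state space receives the same resolution regardless of how sharply the value function or policy varies there, which is the generalisation weakness that the hierarchical refinement described in the paper is intended to address.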