Previous research has treated web service composition as a planning problem. However, many factors continually affect the QoS and the invocation results of web services, so the environment in which they operate is dynamic. As a result, web service composition should be treated as an uncertain planning problem. This paper uses the Markov property to model this uncertain planning problem for service composition. Based on this uncertainty model, we propose a reinforcement learning method for composing web services. Without knowing the transition function or the reward function, our uncertain planning method uses an estimated value function to approximate the true value function and is able to obtain a composite service. Experimental results show that our method can effectively reduce the computing time of service composition.
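
To illustrate the idea of learning a composition without access to the transition or reward function, the following is a minimal sketch using tabular Q-learning, one common model-free reinforcement learning method. It is not necessarily the algorithm used in this paper. The workflow stages, the candidate services in `CANDIDATES`, their costs and success probabilities, and the function names `invoke`, `learn_composition`, and `greedy_policy` are all hypothetical and chosen only for illustration.

```python
import random

# Hypothetical candidate services for each workflow stage, given as
# (invocation cost, success probability). The numbers are made up.
CANDIDATES = [
    [(1.0, 0.90), (3.0, 0.95)],
    [(2.0, 0.50), (1.0, 0.90)],
    [(0.5, 0.80), (2.0, 0.99)],
]


def invoke(stage, action, rng):
    """Simulated uncertain environment; the learner never reads CANDIDATES."""
    cost, p_success = CANDIDATES[stage][action]
    # A failed invocation leaves the workflow at the same stage.
    next_stage = stage + 1 if rng.random() < p_success else stage
    return next_stage, -cost  # reward is the negative cost


def learn_composition(episodes=5000, alpha=0.1, gamma=1.0, epsilon=0.1, seed=0):
    """Estimate Q(stage, service) purely from sampled invocations."""
    rng = random.Random(seed)
    n_stages = len(CANDIDATES)
    q = [[0.0] * len(services) for services in CANDIDATES]
    for _ in range(episodes):
        stage = 0
        while stage < n_stages:
            # Epsilon-greedy selection among the candidate services.
            if rng.random() < epsilon:
                action = rng.randrange(len(q[stage]))
            else:
                action = max(range(len(q[stage])), key=lambda a: q[stage][a])
            next_stage, reward = invoke(stage, action, rng)
            # The value of the terminal state (workflow finished) is zero.
            future = 0.0 if next_stage == n_stages else max(q[next_stage])
            # Q-learning update toward the sampled one-step target.
            q[stage][action] += alpha * (reward + gamma * future - q[stage][action])
            stage = next_stage
    return q


def greedy_policy(q):
    """Pick the service with the highest estimated value at each stage."""
    return [max(range(len(row)), key=lambda a: row[a]) for row in q]
```

Calling `greedy_policy(learn_composition())` returns one service index per stage, which together form the composite service. In this sketch the learner only observes the next stage and the reward of each invocation, so the estimated values stand in for the unknown transition and reward functions.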