With the recent proliferation of Internet-connected devices, network function virtualization (NFV) has received attention as a core technology of next-generation networks because it can accommodate the requirements of increasingly diverse services. NFV virtualizes the service and network functions that previously ran on dedicated hardware, so that services can be built quickly and at low cost. As NFV has advanced, service function chaining (SFC) has emerged for delivering particular network services; it refers to the technology of linking abstracted service functions in sequence. This study investigates a dynamic service chaining scheme that creates service chains through reinforcement learning. To make service chaining efficient in an NFV environment, the scheme considers both the nodes on which service functions run and the resources those functions consume, such as CPU, memory, and network usage.
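The idea of learning a placement from resource consumption can be illustrated with a minimal tabular Q-learning sketch. It is not the scheme studied here: the node names, utilization figures, per-function demand weights, reward shape, and hyperparameters below are all hypothetical, and node load is held static for simplicity.

```python
import random

# Hypothetical node utilization (fraction of each resource already in use).
NODES = {
    "n0": {"cpu": 0.2, "mem": 0.5, "net": 0.9},
    "n1": {"cpu": 0.9, "mem": 0.5, "net": 0.2},
    "n2": {"cpu": 0.6, "mem": 0.2, "net": 0.6},
}

# Hypothetical relative resource demand of each service function.
DEMAND = {
    "firewall": {"cpu": 0.2, "mem": 0.1, "net": 0.7},
    "ids": {"cpu": 0.7, "mem": 0.2, "net": 0.1},
    "cache": {"cpu": 0.1, "mem": 0.8, "net": 0.1},
}

# The chain to build: service functions in traversal order.
CHAIN = ["firewall", "ids", "cache"]


def reward(function, node):
    """Assumed reward: penalize demand placed on already-busy resources."""
    return -sum(DEMAND[function][r] * NODES[node][r] for r in ("cpu", "mem", "net"))


def train(episodes=3000, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning. State: position in the chain. Action: hosting node."""
    rng = random.Random(seed)
    names = list(NODES)
    q = {(i, n): 0.0 for i in range(len(CHAIN)) for n in names}
    for _ in range(episodes):
        for i, function in enumerate(CHAIN):
            # Epsilon-greedy action selection.
            if rng.random() < epsilon:
                node = rng.choice(names)
            else:
                node = max(names, key=lambda n: q[(i, n)])
            r = reward(function, node)
            # Bootstrap from the best next-step value; the last step is terminal.
            if i + 1 < len(CHAIN):
                best_next = max(q[(i + 1, n)] for n in names)
            else:
                best_next = 0.0
            q[(i, node)] += alpha * (r + gamma * best_next - q[(i, node)])
    return q


def build_chain(q):
    """Greedy placement: pick the highest-valued node for each chain position."""
    names = list(NODES)
    return [max(names, key=lambda n: q[(i, n)]) for i in range(len(CHAIN))]


if __name__ == "__main__":
    print(build_chain(train()))
```

Because load is static in this toy setting, each placement is effectively independent of the others. A more realistic environment would update node utilization as functions are placed and would include link costs between consecutive nodes, which is where the sequential nature of reinforcement learning becomes useful.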