To improve performance in the many-core era, all cores on a chip must be utilized effectively. However, it is difficult to parallelize programs so that they use every core, and the remaining single-thread regions become bottlenecks. To address these bottlenecks, cooperative core architectures have been proposed. They accelerate single-thread execution by fusing several narrow-issue cores into one wide-issue core, and they can balance single-thread and multi-thread performance by fusing and splitting cores during execution. We have previously proposed Core Symphony, one such cooperative core architecture. In this paper, we design and implement an efficient and realistic Core Symphony and run it on an FPGA. We then clarify the performance and the hardware budget of Core Symphony.