In this paper, we explore optimizing the bandwidth utilization of networks-on-chip (NoCs). We propose a flit-level speedup scheme that improves NoC performance using self-reconfigurable bidirectional channels. For the intrarouter bandwidth, in addition to allowing flits from different packets to use the idle internal bandwidth of the crossbar, the proposed scheme also allows flits within the same packet to be transmitted simultaneously. For the interrouter channels, we develop a distributed channel configuration scheme that dynamically changes the link directions, so that the effective bandwidth between two routers adapts to the run-time network traffic. We present an implementation of the proposed flit-level speedup NoC on a 2-D mesh. We propose an input buffer architecture that supports reading and writing two flits from the same virtual channel at the same time, and we design the switch allocator to support flit-level parallel arbitration. Extensive simulations on both synthetic traffic and real applications show improvements in throughput and latency over existing architectures that use bidirectional channels.