This paper presents a block-based parallel convolutional decoding architecture in which several Viterbi decoders work concurrently to decode consecutive code blocks. Each code block carries a preamble and a postamble, which are data duplicated from the neighboring blocks. The preamble and postamble help preserve the continuity and correctness of the decoded output across block boundaries. Simulation results demonstrate that this architecture incurs a negligible coding-gain loss compared with the conventional Viterbi decoder. An FPGA implementation of this architecture achieves a throughput of up to 1.2 Gbps.
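
As an illustration only, the block segmentation described above can be sketched in Python. This is a minimal sketch under stated assumptions, not the architecture's actual implementation: the names `split_into_blocks`, `decode_parallel`, `payload_len`, `overlap`, and `decode_block` are hypothetical, the preamble and postamble are assumed to have equal length, and the per-block decoder is passed in as a placeholder rather than being a real Viterbi decoder. The sequence is expressed in trellis steps (one step per information bit).

```python
from typing import Callable, List, Sequence, Tuple


def split_into_blocks(
    steps: Sequence, payload_len: int, overlap: int
) -> List[Tuple[list, int, int]]:
    """Cut a received sequence into overlapping code blocks.

    Each entry is (block, lead, keep): `block` holds the payload plus a
    preamble copied from the previous block and a postamble copied from
    the next one; `lead` is the preamble length actually available and
    `keep` is the payload length. (Hypothetical helper, for illustration.)
    """
    blocks = []
    for start in range(0, len(steps), payload_len):
        end = min(start + payload_len, len(steps))
        lo = max(0, start - overlap)          # preamble start (clipped at stream start)
        hi = min(len(steps), end + overlap)   # postamble end (clipped at stream end)
        blocks.append((list(steps[lo:hi]), start - lo, end - start))
    return blocks


def decode_parallel(
    steps: Sequence,
    payload_len: int,
    overlap: int,
    decode_block: Callable[[list], list],
) -> list:
    """Decode every block independently and stitch the payloads together.

    `decode_block` is an assumed stand-in for one Viterbi decoder and must
    return one decoded item per trellis step. The loop here is sequential;
    in hardware each block would be handed to a separate decoder.
    """
    out = []
    for block, lead, keep in split_into_blocks(steps, payload_len, overlap):
        decoded = decode_block(block)
        out.extend(decoded[lead:lead + keep])  # discard preamble and postamble output
    return out
```

With an identity function as the placeholder decoder, `decode_parallel(list(range(10)), 4, 2, list)` reproduces the input sequence, which confirms that the overlapping regions are discarded without gaps or duplicates at the block boundaries.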