We propose optimal VLSI layouts for butterfly networks under the multilayer 2-D grid model, the Thompson model, and the extended grid model. We show that an N-node butterfly network can be laid out with area N2/lfloorL2/2rfloorlog2 2 N+ o(N2/L2log2N), volume LN2 /lfloorL2/2rfloorlog2 2N+ o(N 2/Llog2N), and maximum wire length N/radiclfloorL2/4rfloorlog2N+o(N/LlogN) under the multilayer 2-D grid model, where only one active layer (for network nodes) is required and L wiring layers (for network links) are available, 2 les L les o(nthrootN). We also show that the proposed multilayer butterfly layouts are optimal within a factor of 1+o(1) when adjacent wiring layers have orthogonal wires (to be referred to as X-Y layouts) and the area is calculated by a slanted encompassing rectangle. The proposed layouts are the first and only optimal butterfly layouts reported in the literature thus far for L gsim 3, and match the best previous layout for L=2. We propose to use AT2L2 or 2AT2 lfloorL2/2rfloor as a new parameter for characterizing the space-time complexity for multilayer VLSI, and show that AT2L2ap 2R2for RxR butterfly networks, where R=N/log2N+o(N/logN)