Power efficiency and variability, currently, are the main aspects of concern of nanometer-scale CMOS technology. Both issues have been widely studied and described in the literature, and various options for their independent management are available. Unfortunately, their exacerbation on sub-40 nm processes will require new design solutions for concurrent optimization. This paper moves towards this objective, and presents a new, fully-automated, design methodology, based on the Monitor and Control paradigm, able to improve the timing yield of a system making use of traditional power-gating (PG) as a knob for controlling power consumption and performance. In particular, the design and implementation of tunable-size sleep transistors is described, as well as a methodology for inserting them in a row-based layout. In order to keep under control both area and power overhead that come from the insertion of the sleep transistors, this paper also proposes a new strategy for clustering and power-gating only the timing critical cells. The experimental results are extremely promising. In fact, the proposed approach guarantees 100% of the timing yield with average leakage-power savings of about 29%.