The quality achievable in concatenative text-to-speech synthesis is primarily governed by the inventory of units used in unit selection. This has led to the collection of ever larger corpora in the quest for ever more natural synthetic speech. Because operational considerations limit the size of the unit inventory, however, pruning is critical for removing instances that prove either spurious or superfluous. At the last ICASSP, we introduced an alternative pruning strategy based on a data-driven feature extraction framework that is optimized separately for each unit type in the inventory. This paper presents further validation of this strategy, as well as a detailed analysis of its potential benefits for concatenative synthesis.