This paper investigates an effective test strategy for characterizing structural failure criteria. The goals are to identify potential failure modes and to better approximate the failure boundary, e.g., by mapping failure load with respect to geometry and load conditions. The same structural configuration is typically replicated and tested several times to deal with noisy observations. However, our study shows that replication is not necessarily needed, because surrogate models have a smoothing effect, and that exploring as many different configurations as possible is more important. We illustrate failure criterion characterization with two structural examples and several surrogate models: polynomial response surface (PRS), support vector regression (SVR), and Gaussian process regression (GPR). We also examine how replicated test data should be treated in surrogate fitting. While fitting to all replicated test data works well for GPR, fitting only to the mean values of the replicated data helps a poorly tuned surrogate (SVR in this paper) by compensating for its proneness to overfitting. When the noise level is significant compared to the error due to surrogate modeling, a denser test matrix might be prone to overfitting for GPR and SVR.
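
The trade-off between replication and exploration under a fixed test budget can be sketched numerically. The following is a minimal illustration, not the paper's experiment: the one-dimensional failure-load curve `true_failure_load`, the noise level, the budget of 24 tests, and the cubic PRS are all assumptions chosen for brevity, and all function names are hypothetical.

```python
import numpy as np


def true_failure_load(x):
    # Hypothetical smooth failure-load curve over a normalized
    # geometry parameter x in [0, 1] (an assumption, not from the paper).
    return 1.0 + 0.5 * np.sin(2.0 * np.pi * x) + 0.3 * x


def make_test_matrix(n_configs, n_replicates):
    # Evenly spaced configurations, each tested n_replicates times.
    configs = np.linspace(0.0, 1.0, n_configs)
    return np.repeat(configs, n_replicates)


def average_replicates(x, y):
    # Collapse replicated tests to one mean observation per configuration.
    configs, inverse = np.unique(x, return_inverse=True)
    means = np.bincount(inverse, weights=y) / np.bincount(inverse)
    return configs, means


def fit_prs(x, y, degree=3):
    # Polynomial response surface fitted by ordinary least squares.
    return np.polynomial.Polynomial.fit(x, y, degree)


def mean_rmse(n_configs, n_replicates, noise_std=0.1, n_trials=200, seed=0):
    # Average RMSE of the fitted surrogate against the noise-free curve,
    # taken over n_trials independent noise realizations.
    rng = np.random.default_rng(seed)
    x = make_test_matrix(n_configs, n_replicates)
    x_eval = np.linspace(0.0, 1.0, 201)
    y_true = true_failure_load(x_eval)
    errors = []
    for _ in range(n_trials):
        y = true_failure_load(x) + rng.normal(0.0, noise_std, size=x.shape)
        surrogate = fit_prs(x, y)
        errors.append(np.sqrt(np.mean((surrogate(x_eval) - y_true) ** 2)))
    return float(np.mean(errors))


if __name__ == "__main__":
    # Same budget of 24 tests, spent two different ways.
    print("replication (6 configs x 4 tests):", mean_rmse(6, 4))
    print("exploration (24 configs x 1 test):", mean_rmse(24, 1))
```

Running the script prints the average error for each strategy; which one is lower depends on the assumed curve, noise level, and surrogate. Note that for an ordinary least-squares polynomial with equal replicate counts, fitting to all replicates and fitting to the means from `average_replicates` yield the same coefficients, so this sketch cannot reproduce the difference between the two treatments reported for SVR and GPR.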