We explore the use of synthetic benchmarks for the training phase of machine-learning-based automatic performance tuning. We focus on the problem of predicting whether using local memory on a GPU to cache a single target array in a GPU kernel is beneficial. We show that using only 13 real benchmarks leads to poor prediction accuracy (about 58%) for the 13 leave-one-out models trained on these benchmarks, even when the model features are sufficiently comprehensive. We define a metric, the average vicinity density, to measure the quality of a training set. We then use it to demonstrate that the poor accuracy of the models built with the real benchmarks is indeed due to the limited size and coverage of the training set. In contrast, using a properly generated set of 90K synthetic benchmarks leads to significantly better accuracy, up to 87%. These results validate our approach of using synthetic benchmarks to train machine learning models. We describe a synthetic benchmark template for the local memory optimization. We then present two approaches that use this template, together with a seed set of real benchmarks, to generate a large number of synthetic benchmarks. We also explore the impact of the number of synthetic benchmarks used in training.