The combination of least-squares extrapolation of a harmonic model with autoregressive prediction (LS + AR) is currently considered one of the best prediction methods for polar motion parameters. In this method, the LS fitting residuals are treated as the data used to train an AR model. Using too much data, however, can produce a poorly relevant AR model, which increases the model bias. Conversely, using too little data lowers the estimation accuracy of the AR model, which increases the model variance. Selecting the training data is therefore a critical trade-off between bias and variance, and it is necessary for obtaining a model with optimized prediction performance. In this paper, an experimental study is conducted to examine the effect of different data volumes on the final prediction performance and hence to select an optimal data portion for the AR model. The Earth orientation parameter (EOP) products released by the International Earth Rotation and Reference Systems Service (IERS) were used as the primary data to predict changes in polar motion parameters over spans of 1–500 days in 800 experiments. The experimental results show that, although short-term prediction was not improved, computing the AR model parameters from an appropriate data volume can effectively improve the accuracy of long-term polar motion prediction.
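
As a rough illustration of the LS + AR procedure and of the role played by the AR training volume, the following Python sketch fits a harmonic model by least squares, trains an AR model on only the most recent `n_train` residuals, and sums the two extrapolations. It is a minimal sketch under stated assumptions, not the implementation used in the study: the function names are hypothetical, the harmonic model is assumed to contain an offset, a linear trend, and annual (365.25-day) and Chandler (about 433-day) terms, the AR coefficients are estimated by ordinary least squares, the sampling is assumed to be daily, and the demonstration uses synthetic data rather than IERS products.

```python
import numpy as np

def harmonic_design(t, periods):
    """Design matrix: offset, linear trend, and a cos/sin pair per period."""
    t = np.asarray(t, dtype=float)
    cols = [np.ones_like(t), t]
    for p in periods:
        w = 2.0 * np.pi / p
        cols += [np.cos(w * t), np.sin(w * t)]
    return np.column_stack(cols)

def fit_ar(x, order):
    """Least-squares estimate of a in x[k] = sum_i a[i-1] * x[k-i]."""
    x = np.asarray(x, dtype=float)
    # Column i-1 holds the series delayed by i samples.
    lags = np.column_stack([x[order - i:len(x) - i] for i in range(1, order + 1)])
    coef, *_ = np.linalg.lstsq(lags, x[order:], rcond=None)
    return coef

def forecast_ar(x, coef, steps):
    """Iterate the AR recursion forward from the end of x."""
    order = len(coef)
    hist = list(np.asarray(x, dtype=float)[-order:])  # oldest value first
    out = []
    for _ in range(steps):
        nxt = float(np.dot(coef, hist[::-1]))  # most recent value first
        out.append(nxt)
        hist = hist[1:] + [nxt]
    return np.array(out)

def ls_ar_predict(t, y, periods, order, n_train, steps):
    """LS harmonic extrapolation plus AR forecast of the LS residuals.

    Only the last n_train residuals are used to estimate the AR model;
    this is the data volume whose effect is being studied.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    A = harmonic_design(t, periods)
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    coef = fit_ar(resid[-n_train:], order)
    t_future = t[-1] + np.arange(1, steps + 1)  # assumes unit (daily) sampling
    return harmonic_design(t_future, periods) @ beta + forecast_ar(resid, coef, steps)

if __name__ == "__main__":
    # Synthetic stand-in for one polar motion component (assumed values).
    rng = np.random.default_rng(0)
    periods = (365.25, 433.0)
    t = np.arange(3000.0)
    y = (0.1 * np.cos(2 * np.pi * t / 365.25)
         + 0.15 * np.sin(2 * np.pi * t / 433.0)
         + 0.01 * rng.standard_normal(t.size))
    for n_train in (200, 1000, 3000):
        pred = ls_ar_predict(t, y, periods, order=10, n_train=n_train, steps=30)
        print(n_train, pred[:3])
```

Varying `n_train` while holding everything else fixed, and comparing the forecasts against withheld observations at each horizon, is one way to reproduce the kind of data-volume comparison the study describes.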