The ever-increasing complexity of numerical models and the associated computational demands have challenged classical reliability analysis methods. Surrogate model-based reliability analysis techniques, in particular those using kriging meta-models, have recently gained considerable attention for their ability to achieve high accuracy and computational efficiency. However, existing stopping criteria, which are used to terminate the training of surrogate models, do not directly relate to the error in the estimated failure probability. This limitation can lead to high computational demands because of unnecessary calls to costly performance functions (e.g., those involving finite element models), or to inaccurate estimates of the failure probability because the training process is terminated prematurely. Here, we propose the error-based stopping criterion (ESC) to address these limitations. First, it is shown that S, the total number of candidate design samples for which kriging estimates the wrong sign of the performance function, follows a Poisson binomial distribution. This finding is then used to estimate the lower and upper bounds of S at a given confidence level, separately for the sets of candidate design samples classified by kriging as safe and as unsafe. An upper bound on the error of the estimated failure probability is subsequently derived from the probabilistic properties of the Poisson binomial distribution. The proposed upper bound is implemented as the stopping criterion in the kriging-based reliability analysis method. The efficiency and robustness of ESC are investigated using five benchmark reliability analysis problems. The results indicate that the proposed method achieves the set accuracy target and substantially reduces the computational demand, in some cases by over 50%.
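The idea of bounding S through a Poisson binomial distribution can be illustrated with a minimal Python sketch. It rests on assumptions that are not stated in the abstract: the wrong-sign probability at each candidate sample is taken as Phi(-|mu|/sigma) from the kriging mean mu and standard deviation sigma, wrong-sign events are treated as independent across samples, failure is taken to mean a non-positive performance function, and the bounds for the safe and unsafe sets are combined into a relative error bound in one plausible way that is not necessarily the formula derived in the paper. All function names and the sample predictions are hypothetical.

```python
import math


def wrong_sign_prob(mu, sigma):
    """Assumed wrong-sign probability at one sample: Phi(-|mu| / sigma)."""
    if sigma == 0.0:
        return 0.0  # no predictive uncertainty, so no chance of a sign error
    z = -abs(mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def poisson_binomial_pmf(probs):
    """PMF of S = sum of independent Bernoulli(p_i), built by convolution."""
    pmf = [1.0]
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)  # this sample has the correct sign
            nxt[k + 1] += mass * p      # this sample has the wrong sign
        pmf = nxt
    return pmf


def confidence_bounds(pmf, alpha=0.05):
    """Lower and upper quantiles of S at two-sided confidence 1 - alpha."""
    cdf, acc = [], 0.0
    for mass in pmf:
        acc += mass
        cdf.append(acc)
    last = len(pmf) - 1
    lower = next((k for k, c in enumerate(cdf) if c >= alpha / 2), last)
    upper = next((k for k, c in enumerate(cdf) if c >= 1 - alpha / 2), last)
    return lower, upper


def relative_error_bound(n_fail_hat, safe_bounds, unsafe_bounds):
    """Illustrative bound on |N_f_hat - N_f| / N_f (an assumption, see lead-in).

    The true failure count is N_f = N_f_hat + S_safe - S_unsafe, so the
    relative error is |d| / (N_f_hat + d) with d = S_safe - S_unsafe.
    """
    d_max = safe_bounds[1] - unsafe_bounds[0]
    d_min = safe_bounds[0] - unsafe_bounds[1]
    if n_fail_hat + d_min <= 0:
        return math.inf  # bound is uninformative in this case
    return max(abs(d) / (n_fail_hat + d) for d in (d_min, d_max))


# Hypothetical kriging predictions (mean, standard deviation) at candidates.
predictions = [(2.1, 0.4), (0.3, 0.5), (1.5, 0.2),
               (-0.2, 0.6), (-1.8, 0.3), (-0.9, 0.4)]
safe = [wrong_sign_prob(m, s) for m, s in predictions if m > 0]
unsafe = [wrong_sign_prob(m, s) for m, s in predictions if m <= 0]
bound = relative_error_bound(
    len(unsafe),
    confidence_bounds(poisson_binomial_pmf(safe)),
    confidence_bounds(poisson_binomial_pmf(unsafe)),
)
print("illustrative error bound:", bound)
```

In an adaptive training loop, a quantity of this kind would be compared against a prescribed error threshold after each update of the surrogate, and training would stop once the bound falls below it. The convolution used here costs on the order of n squared operations for n candidate samples, so a large candidate pool may call for an approximation instead.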