Many software reliability growth models (SRGMs) have been developed over the past three decades to quantify reliability measures such as the expected number of remaining faults and software reliability. A common underlying assumption of many existing models is that the operating environment and the development environment are the same. In reality, this is often not the case, because the operating environments in the field are uncertain and therefore unknown during development. In this paper, we present two new software reliability models that incorporate a fault-detection rate based on a Loglog distribution and testing coverage, subject to the uncertainty of operating environments. Examples based on a set of failure data collected from software applications illustrate the goodness-of-fit of the proposed models and of several existing non-homogeneous Poisson process (NHPP) models. Three goodness-of-fit criteria, namely mean square error (MSE), predictive-ratio risk (PRR), and predictive power (PP), are used to illustrate the model comparisons. The results show that the proposed models fit significantly better than the other existing NHPP models with respect to the studied criteria. Different criteria, however, weigh aspects of software reliability differently, and no single software reliability model is optimal for all contributing criteria. In this paper, we therefore also discuss a method, called normalized criteria distance (NCD), for ranking and selecting the best model from among SRGMs based on a set of criteria taken together. Examples show that the proposed method offers a promising technique for selecting the best model with respect to a set of contributing criteria.
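To make the comparison criteria concrete, the following is a minimal sketch of how MSE, PRR, and PP could be computed for fitted model curves, together with one plausible reading of a normalized-criteria-distance ranking (each criterion scaled by its maximum across models, then combined as a Euclidean distance, smaller being better). The data, model curves, and the exact normalization are illustrative assumptions, not the paper's actual formulation.

```python
import math

def mse(pred, actual, n_params):
    # Mean square error, penalized by the number of model parameters.
    return sum((p - a) ** 2 for p, a in zip(pred, actual)) / (len(actual) - n_params)

def prr(pred, actual):
    # Predictive-ratio risk: squared deviation relative to the predicted value.
    return sum(((p - a) / p) ** 2 for p, a in zip(pred, actual))

def pp(pred, actual):
    # Predictive power: squared deviation relative to the observed value.
    return sum(((p - a) / a) ** 2 for p, a in zip(pred, actual))

def normalized_criteria_distance(criteria_by_model):
    # Assumed normalization: scale each criterion by its maximum across the
    # candidate models, then take the Euclidean distance from the origin.
    # The model with the smallest distance ranks best overall.
    names = list(criteria_by_model)
    n_crit = len(criteria_by_model[names[0]])
    maxima = [max(criteria_by_model[m][j] for m in names) for j in range(n_crit)]
    return {
        m: math.sqrt(sum((v / mx) ** 2 for v, mx in zip(criteria_by_model[m], maxima)))
        for m in names
    }

# Toy cumulative-failure counts and two hypothetical fitted mean-value curves.
actual  = [5.0, 9.0, 12.0, 14.0, 15.0]
model_a = [4.8, 9.3, 11.7, 14.2, 15.1]  # hypothetical fit
model_b = [6.0, 8.0, 13.0, 13.0, 16.0]  # hypothetical fit

crit = {
    "A": [mse(model_a, actual, 3), prr(model_a, actual), pp(model_a, actual)],
    "B": [mse(model_b, actual, 3), prr(model_b, actual), pp(model_b, actual)],
}
ncd = normalized_criteria_distance(crit)
best = min(ncd, key=ncd.get)
print(best)
```

The sketch illustrates the point made above: a model could win on one criterion and lose on another, so the distance aggregates all criteria into a single rank rather than judging by any one of them.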