Serious concerns have been raised that false positive findings are widespread in empirical research in business disciplines. This is largely because researchers almost exclusively adopt the ‘p‐value less than 0.05’ criterion for statistical significance, and because they are often not fully aware of large‐sample biases that can mislead their research outcomes. This paper proposes that empirical researchers use a statistical toolbox (rather than a single hammer) that offers a range of statistical instruments, including alternatives to the p‐value criterion such as Bayesian methods, optimal significance levels, sample size selection, equivalence testing and exploratory data analysis. When the toolbox is applied to three notable studies in finance, the positive results obtained under the p‐value criterion do not stand.
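
The large‐sample bias mentioned above can be illustrated with a minimal sketch. It is not taken from the paper: the correlation value of 0.01, the sample sizes and the helper name `two_sided_p_from_r` are hypothetical, and the p‐value uses a normal approximation to the t‐distribution, which is an assumption that is only reasonable for large samples. The sketch holds a negligible effect fixed and shows how the ‘p‐value less than 0.05’ criterion responds as the sample size grows.

```python
import math


def two_sided_p_from_r(r: float, n: int) -> float:
    """Approximate two-sided p-value for H0: correlation = 0.

    Uses the usual statistic t = r * sqrt(n - 2) / sqrt(1 - r^2) and a
    normal approximation to its null distribution (an assumption that is
    reasonable for large n, rough for small n).
    """
    t = r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)
    # For a standard normal Z, P(|Z| > |t|) = erfc(|t| / sqrt(2)).
    return math.erfc(abs(t) / math.sqrt(2.0))


r = 0.01  # hypothetical, practically negligible correlation
for n in (100, 10_000, 1_000_000):
    p = two_sided_p_from_r(r, n)
    print(f"n={n:>9,d}  p={p:.4g}  significant at 0.05: {p < 0.05}")
```

Under these assumptions the same negligible effect is not significant at the two smaller sample sizes but is significant at the largest one, which is the kind of outcome that a fixed 0.05 threshold cannot distinguish from a substantively important finding.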