In this commentary, we suggest that infancy researchers should carefully consider validity when taking steps to improve reliability. Focusing on looking‐time methods, we argue that limitations in validity are perhaps an even more fundamental issue than limitations in reliability. Moreover, focusing single‐mindedly on reliability comes with two possible pitfalls: maximizing reliability at the expense of construct validity, and overvaluing parental report measures compared to direct measures of infant behaviour. Finally, we articulate several promising avenues for improving validity in infant research: experimental and modelling efforts to characterize the functional relationship between measures such as looking time and infant cognition, the use of multiple measures to establish convergent validity, and a better understanding of how measures vary across a broader set of stimulus characteristics.