A key question in early word learning is how children cope with the uncertainty in natural naming events. One potential mechanism for uncertainty reduction is cross‐situational word learning – tracking word/object co‐occurrence statistics across naming events. But empirical and computational analyses of cross‐situational learning have made strong assumptions about the nature of naming event ambiguity, assumptions that have been challenged by recent analyses of natural naming events. This paper shows that learning from ambiguous natural naming events depends on the perspective from which they are viewed. Natural naming events from parent–child interactions were recorded from both a third‐person tripod‐mounted camera and a head‐mounted camera that produced a ‘child's‐eye’ view. Following the human simulation paradigm, adults were asked to learn artificial language labels by integrating across the most ambiguous of these naming events. Significant learning was found only from the child's perspective, pointing to the importance of considering statistical learning from an embodied perspective.