In this paper, we attempt to reconcile two views of spatial development using two mechanisms: statistical learning and sensory alignment. Results in developmental psychology conflict: some attribute a developmental period to spatial cognition (Piaget), whereas others show that infants demonstrate good coordination and coherence across modalities (Gibsonian), even from restricted prenatal experience [1], [2]. To study both views, we first present a simple model based on conditional learning that integrates the visual and auditory modalities, although it has some limitations regarding the number of degrees of freedom. Second, we propose a sensory alignment mechanism that allows the system to learn invariances in the world. In experiments with a robot head, we show the advantages of each strategy. We then discuss future possibilities for merging the two models and their implications.