One of the recurring challenges in humanoid robotics is the development of learning mechanisms to predict the effects of actions on objects. It is paramount that the robot be able to predict the functional properties of an object from afar (for example, on a table, in a rack, or on a shelf), so that it can automatically select in advance an appropriate action (or sequence of actions) to achieve a particular goal. Such sensory-to-motor schemas associated with objects, surfaces, or other entities in the environment are called affordances [1, 2]; more recently, they have been formalized computationally as object-action complexes (OACs) [3]. This paper describes an approach to the acquisition of affordances and tool use in a humanoid robot, combining vision, learning, and control. Learning is structured as a natural progression of episodes involving objects, then tools, and eventually knowledge of the complete task. Finally, we test the robot's behavior in an object-retrieval task in which it must choose among several elongated tools to reach an object of interest that would otherwise lie outside its workspace.