For robots to interact with humans at the language level, robots and humans must share a common language. This paper adopts a social language grounding paradigm to teach a robotic arm basic vocabulary about objects in its environment. A human user, acting as an instructor, teaches the names of the objects present in their shared field of view. The robotic agent grounds these words by associating them with visual category descriptions. A component-based object representation is presented, and an instance-based approach is used for category representation. Each instance is described by its components and the geometric relations between them, where each component is a color blob or an aggregation of neighboring color blobs. Categorization is based on graph matching. The learning and grounding capacity of the robot is assessed in a series of semi-automated experiments, and the results are reported.
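
To make the representation concrete, the following is a minimal illustrative sketch, not the implementation evaluated in this paper. It assumes that a component can be summarized by a color label and a centroid, that the geometric relation between two components is their scale-normalized distance, and that graph matching can be approximated by a brute-force search over component assignments, which is feasible only for objects with a handful of components. All names (`Component`, `Instance`, `relations`, `match_cost`, `teach`, `categorize`) and the unit penalties are hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from itertools import permutations
from math import dist
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Component:
    color: str                    # assumed: one dominant color label per component
    center: Tuple[float, float]   # assumed: centroid in image coordinates


@dataclass
class Instance:
    components: List[Component]


def relations(inst: Instance) -> List[List[float]]:
    """Pairwise component distances, divided by the largest one (scale invariance)."""
    pts = [c.center for c in inst.components]
    d = [[dist(p, q) for q in pts] for p in pts]
    scale = max((max(row) for row in d), default=0.0) or 1.0
    return [[v / scale for v in row] for row in d]


def match_cost(a: Instance, b: Instance, unmatched_penalty: float = 1.0) -> float:
    """Brute-force graph matching cost: color mismatches plus relation differences."""
    if len(a.components) > len(b.components):
        a, b = b, a  # always map the smaller graph into the larger one
    ra, rb = relations(a), relations(b)
    na, nb = len(a.components), len(b.components)
    best = float("inf")
    for perm in permutations(range(nb), na):
        node = sum(a.components[i].color != b.components[perm[i]].color
                   for i in range(na))
        edge = sum(abs(ra[i][j] - rb[perm[i]][perm[j]])
                   for i in range(na) for j in range(i + 1, na))
        best = min(best, node + edge)
    # Components of the larger graph left without a partner are penalized.
    return best + unmatched_penalty * (nb - na)


def teach(memory: Dict[str, List[Instance]], word: str, inst: Instance) -> None:
    """Instance-based learning: store the instance under the taught word."""
    memory.setdefault(word, []).append(inst)


def categorize(query: Instance, memory: Dict[str, List[Instance]]) -> Optional[str]:
    """Return the word whose stored instance matches the query at lowest cost."""
    scored = [(match_cost(query, ex), word)
              for word, examples in memory.items() for ex in examples]
    return min(scored)[1] if scored else None


memory: Dict[str, List[Instance]] = {}
teach(memory, "pen", Instance([Component("blue", (0, 0)),
                               Component("white", (10, 0))]))
teach(memory, "stapler", Instance([Component("black", (0, 0)),
                                   Component("black", (5, 5)),
                                   Component("red", (5, 0))]))
query = Instance([Component("blue", (2, 1)), Component("white", (9, 1))])
print(categorize(query, memory))
```

In this sketch, teaching a word only stores the instance, and categorization is a nearest-neighbor lookup under the matching cost. The exhaustive search grows factorially with the number of components, so it serves only to illustrate the idea of comparing instances as graphs.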