The visual object categorisation problem has attracted significant attention during the last ten years, and the two main hypotheses adopted by virtually all methods are i) detection of visual saliency and ii) bag-of-visual-words based categorisation. It is, however, difficult to verify these hypotheses with humans, for two reasons: many recordings, such as gaze fixation locations, represent processing that occurs after recognition, and the object classification task is too easy for humans, so it produces no information about uncertainties in the cognitive process. To the best of the authors' knowledge, this work is the first attempt to study the main hypotheses and state-of-the-art algorithms for visual object categorisation with abstract images. These images inhibit rapid recognition and cause the observers' opinions to differ substantially when assigning the images to “similar categories”. Our work reveals interesting findings: the performance of the state-of-the-art methods drops to almost pure chance, while human observers remain surprisingly consistent.