Voice Onset Time (VOT), introduced by Lisker and Abramson (1964), is a single production dimension defined as the time interval between the release of a stop occlusion and the onset of vocal cord vibration. Languages generally select two of three broad VOT categories, which show little cross-linguistic variation: voicing lead, short lag, and long lag. English and Polish exploit the VOT continuum differently: English contrasts short lag with long lag for its voiced and voiceless stops, whereas Polish contrasts voicing lead with short lag. This acoustic difference provides an interesting cross-linguistic scenario for perception studies using an identification paradigm. From a naturally produced nonword keef, the author generated 8 stimuli in which the VOT of the initial stop ranged from 0 ms to +70 ms in 10 ms steps. These values span the English VOT boundary, which separates the short lag (voiced) and long lag (voiceless) categories. In a forced-choice format, he asked two groups of subjects, native speakers of English and Polish beginner learners of English, to identify the initial segment of each stimulus. The results show that the two groups performed differently: native speakers categorised short lag stimuli as voiced /g/ and long lag stimuli as voiceless /k/. The Polish subjects, on the other hand, did not exhibit a categorical shift from the voiceless to the voiced category.
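
The stimulus continuum described above can be illustrated with a minimal sketch. It only enumerates the VOT steps and attaches one of the three broad category labels to each; the function names and, in particular, the 35 ms short lag / long lag threshold are illustrative assumptions and are not values reported in the study.

```python
def vot_continuum(start_ms=0, stop_ms=70, step_ms=10):
    """Return the VOT values (in ms) of the continuum, both endpoints included."""
    return list(range(start_ms, stop_ms + 1, step_ms))


def vot_category(vot_ms, long_lag_threshold_ms=35):
    """Label a VOT value with one of the three broad categories.

    The 35 ms threshold is a hypothetical placeholder, not a figure
    taken from the study.
    """
    if vot_ms < 0:
        return "voicing lead"   # voicing starts before the stop release
    if vot_ms < long_lag_threshold_ms:
        return "short lag"      # voicing starts shortly after the release
    return "long lag"           # voicing starts well after the release


if __name__ == "__main__":
    for vot in vot_continuum():
        print(f"{vot:+d} ms -> {vot_category(vot)}")
```

With the default arguments the continuum has 8 steps, matching the 8 stimuli; none of them falls in the voicing lead range, which is the category Polish uses for its voiced stops.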