Although many people have tried using genetic algorithms (GAs) for exam timetabling, far fewer have done systematic investigations to try to determine whether a GA is a good choice of method or not. We have extensively studied GAs that use one particular kind of direct encoding for exam timetabling. Perhaps not surprisingly, it emerges that this approach is not very good, but it is instructive to see why. In the course of this investigation we discovered a class of solvable problems with interesting properties: our GAs would sometimes fail to solve some of the moderately-constrained problems, but could solve all of the lightly-constrained ones and all of the highly-constrained ones. This is despite the fact that they form a hierarchy: those erratically-solved problems are subproblems of the easily-solved but highly-constrained ones. Moreover, some other non-evolutionary approaches also failed on precisely the same sets. This, together with some observations about much simpler graph-colouring methods based on the Brelaz algorithm, suggest some future directions for GA-based methods.