Many state-of-the-art semantic object detection methods locate category-level objects by finding optimal bounding boxes. However, localization accuracy is compromised when the shape of an object does not conform to a rectangular bounding box. As a remedy, some recent work locates objects by superpixel classification. However, the increased flexibility in shape modeling also means less control: methods that rely mostly on high-level semantic (category-level) classification cues have difficulty producing “regular” segments that align well with objects. To solve this problem, we propose a novel energy-minimization method that explicitly models the “objectness” of a segment by incorporating mid-level grouping cues. The high-level classification cue is integrated with mid-level grouping features in a principled ratio energy function whose globally optimal solution can be obtained efficiently. Our method compares favorably with state-of-the-art methods on public datasets.
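
To make the idea of minimizing a ratio energy concrete, the following is a minimal sketch only, and not the formulation used in this work. It assumes a hypothetical energy in which the numerator sums per-superpixel classification costs and boundary (cut) costs, and the denominator is the segment area. The ratio is minimized with Dinkelbach's parametric iteration, a standard technique for fractional objectives; the inner minimization is done here by brute-force enumeration, which is only feasible for a toy instance. All names, costs, and the toy data are illustrative assumptions.

```python
from itertools import product

# Hypothetical toy instance: 4 superpixels in a chain (all values are made up).
UNARY = [0.2, 0.1, 0.9, 0.8]        # classification cost of labelling each superpixel "object"
EDGES = {(0, 1): 0.6, (1, 2): 0.05, (2, 3): 0.6}  # cost of cutting between neighbours
AREA = [1.0, 1.0, 1.0, 1.0]         # superpixel sizes


def numerator(labels, unary, edges):
    """Assumed numerator: classification cost of the segment plus its boundary cost."""
    cost = sum(unary[i] for i, on in enumerate(labels) if on)
    cost += sum(w for (i, j), w in edges.items() if labels[i] != labels[j])
    return cost


def denominator(labels, area):
    """Assumed denominator: total area of the selected superpixels (positive if non-empty)."""
    return sum(area[i] for i, on in enumerate(labels) if on)


def minimize_ratio(unary, edges, area, tol=1e-9, max_iter=100):
    """Minimize numerator/denominator over non-empty segments with Dinkelbach's iteration."""
    n = len(unary)
    # Brute-force candidate set; a stand-in for an efficient inner solver.
    candidates = [c for c in product((0, 1), repeat=n) if any(c)]
    best = candidates[0]
    lam = numerator(best, unary, edges) / denominator(best, area)
    for _ in range(max_iter):
        # Parametric step: minimize N(x) - lam * D(x) for the current ratio estimate.
        def parametric(c):
            return numerator(c, unary, edges) - lam * denominator(c, area)
        nxt = min(candidates, key=parametric)
        if parametric(nxt) >= -tol:
            break  # no candidate improves the ratio, so lam is the minimum
        best = nxt
        lam = numerator(best, unary, edges) / denominator(best, area)
    return best, lam


segment, ratio = minimize_ratio(UNARY, EDGES, AREA)
print(segment, ratio)
```

In this toy instance the low cut cost between superpixels 1 and 2 plays the role of a grouping cue, and the low classification costs of superpixels 0 and 1 play the role of the category-level cue; the ratio form balances the two against segment size.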