Hierarchical classification has become a popular research topic, particularly for text categorization on the web. For a large web corpus, the hierarchy can contain hundreds of thousands of topics, so it is common to handle this task with a flat classification approach, inducing a binary classifier for each leaf-node class only. However, this approach often suffers from low prediction accuracy due to class imbalance in the training data. In this paper, we propose two novel strategies: (i) “Top-Level Pruning” to narrow down the candidate classes, and (ii) “Exclusive Top-Level Training Policy” to build more effective classifiers by utilizing the top-level data. Experiments on the Wikipedia dataset show that our system consistently outperforms the traditional flat approach on all hierarchical classification metrics.
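The pruning idea can be illustrated with a minimal sketch: classify a document at the top level first, then score only the leaf classes that fall under the most likely top-level categories. All names here (the classifier callables, the `children_of` hierarchy map, the parameter `k`) are illustrative assumptions, not the paper's actual implementation.

```python
def predict_with_pruning(doc, top_level_clf, leaf_clfs, children_of, k=3):
    """Score only leaves under the k most likely top-level categories.

    top_level_clf(doc) -> {top_class: score}
    leaf_clfs[leaf](doc) -> score for one binary leaf classifier
    children_of[top_class] -> list of leaf classes under that node
    """
    # Keep the k highest-scoring top-level categories.
    top_scores = top_level_clf(doc)
    top_k = sorted(top_scores, key=top_scores.get, reverse=True)[:k]

    # Candidate leaves are restricted to the surviving subtrees,
    # so most of the hierarchy's binary classifiers are never run.
    candidates = [leaf for t in top_k for leaf in children_of[t]]

    # Run only the pruned set of leaf classifiers and pick the best.
    leaf_scores = {leaf: leaf_clfs[leaf](doc) for leaf in candidates}
    return max(leaf_scores, key=leaf_scores.get)
```

With a two-level hierarchy and `k=1`, a high-scoring leaf in a pruned subtree is never considered, which is the intended trade-off: far fewer classifier evaluations per document at the cost of committing early to a top-level decision.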