We address the problem of performance- and power-efficient thread allocation in NoC-based chip multiprocessors (CMPs). The CMP includes a number of cores with a shared cache, interconnected by a network on chip (NoC). It executes multiple multi-threaded applications, and its cores perform coarse-grain multithreading. Based on an analytical model, we introduce a parameterized performance/power metric that can be adjusted according to a preferred tradeoff between performance and power. We then introduce a simple and efficient heuristic, the Incremental Threshold Algorithm (ITA), which allocates threads to cores so as to use the CMP resources in a way that maximizes the given performance/power metric. We compare the performance/power metric achieved by ITA with that achieved by several optimization methods. ITA outperforms the best of these methods by 9%, while consuming on average 0.01% and at most 2.5% of their computational effort.