In a chip multiprocessor (CMP) with a shared cache structure, the last-level cache is shared by multiple applications executing simultaneously. Competing accesses from different applications degrade system performance and make execution times unpredictable. Cache partitioning techniques address this by dividing the shared cache among the applications. Traditional cache partitioning mechanisms, such as Utility-based Cache Partitioning (UCP) and IPC-based Cache Partitioning (IPC-CP), optimize an objective that matters to an individual application, for example instructions per cycle (IPC) or miss rate. However, the performance of a multi-programmed system is usually characterized by the number of applications completed within a given interval. This paper investigates System Level Speedup oriented Cache Partitioning (SLS-CP), which aims to maximize the total speedup of the system. Like UCP and IPC-CP, SLS-CP takes as input the current performance status of each application and its misses under every possible partition, and it outputs the optimal cache partition for the multi-programmed workload. Our evaluation on a two-core CMP with 8 multi-programmed workloads shows that SLS-CP improves system-level speedup and fairness over UCP and IPC-CP.
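To make the input/output relationship concrete, the following is a minimal sketch of a speedup-oriented partition search for two cores. It is not the SLS-CP algorithm itself. It rests on several assumptions not stated above: each application's miss curve is given as misses per kilo-instruction (MPKI) for every way count, performance follows a hypothetical linear model (CPI = base CPI + miss penalty × misses per instruction), an application's speedup is measured relative to owning the whole cache, and the search is exhaustive. The names `speedup` and `best_partition` and the 200-cycle miss penalty are illustrative only.

```python
from typing import Sequence, Tuple


def speedup(mpki: Sequence[float], ways: int,
            base_cpi: float, miss_penalty: float) -> float:
    """Estimated speedup with `ways` cache ways, relative to owning all ways.

    Assumption: CPI grows linearly with misses per instruction.
    mpki[w - 1] is the miss count (per 1000 instructions) with w ways.
    """
    def cpi(w: int) -> float:
        return base_cpi + miss_penalty * mpki[w - 1] / 1000.0

    return cpi(len(mpki)) / cpi(ways)


def best_partition(mpki_a: Sequence[float], mpki_b: Sequence[float],
                   base_cpi: Tuple[float, float] = (1.0, 1.0),
                   miss_penalty: float = 200.0) -> Tuple[int, int, float]:
    """Exhaustively pick the way split that maximizes the sum of speedups.

    Returns (ways_for_a, ways_for_b, total_speedup). Each application
    receives at least one way.
    """
    total_ways = len(mpki_a)
    if len(mpki_b) != total_ways or total_ways < 2:
        raise ValueError("both miss curves must cover the same >= 2 ways")

    best = None
    for ways_a in range(1, total_ways):
        ways_b = total_ways - ways_a
        total = (speedup(mpki_a, ways_a, base_cpi[0], miss_penalty)
                 + speedup(mpki_b, ways_b, base_cpi[1], miss_penalty))
        if best is None or total > best[2]:
            best = (ways_a, ways_b, total)
    return best


# Hypothetical 4-way cache: A benefits from up to 3 ways, B is insensitive.
ways_a, ways_b, total = best_partition([20.0, 10.0, 5.0, 5.0],
                                       [2.0, 2.0, 2.0, 2.0])
```

In this made-up example the cache-sensitive application receives three of the four ways, because the insensitive application loses nothing by running with one. An objective based on total speedup, rather than on total misses or raw IPC, weights each application by how far it falls from its own best-case performance.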