Integrated CPU-GPU architectures provide excellent acceleration capabilities for data-parallel applications on embedded platforms while meeting size, weight, and power (SWaP) requirements. However, because CPU applications and GPU kernels share main memory, memory contention can severely degrade the execution of GPU kernels and diminish the performance gain provided by the GPU. On the NVIDIA Tegra TK1 platform, which has an integrated CPU-GPU architecture, we observed that in the worst case GPU kernels can suffer as much as a 4X slowdown in the presence of co-running memory-intensive CPU applications, compared to their solo execution. In this paper, we propose a kernel mechanism called BWLOCK++ that can be used to protect the performance of GPU kernels from co-running memory-intensive CPU applications. Our preliminary investigations show that, by using BWLOCK++, the performance slowdown of GPU kernels in the presence of memory contention can be reduced by up to 63%.