In this paper we present a hardware-accelerated system-on-chip implementation of an MPEG-4 Simple Profile video decoder with a novel methodology for interfacing hardware accelerators. The system consists of a general-purpose master processor and several slave hardware accelerators. The master processor and the accelerators communicate without interrupts by using piecewise-static run-time scheduling: once the data content of each macroblock is known, the master processor computes a short static schedule for the accelerators. The accelerators therefore do not need to interrupt the master processor when an assigned task finishes, which avoids context-save overhead in the master processor and improves energy efficiency. The accelerators execute block-level decoding operations (IDCT, inverse quantization, etc.), which have deterministic execution times and can therefore be scheduled statically. The task scheduling algorithm executed by the master processor takes into account the costs and restrictions of a shared memory with limited access capabilities and marks memory accesses as separate entries in the schedule. It also accounts for possible heterogeneity of the processing units. Tests show that the proposed scheme is feasible and can be used as an alternative to traditional synchronization methods.
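To make the idea of a per-macroblock static schedule concrete, the following is a minimal sketch and not the scheduling algorithm of this paper. It assumes a greedy earliest-finish assignment, two hypothetical accelerators (`acc0`, `acc1`) with invented cycle counts, and a single shared-memory port that serializes one operand transfer before each task. The names `COST`, `MEM_COST`, `Slot`, `schedule_macroblock` and `makespan` are illustrative only.

```python
from dataclasses import dataclass

# Hypothetical deterministic cycle counts per accelerator and operation.
COST = {
    "acc0": {"iq": 20, "idct": 60},  # slower unit that supports both operations
    "acc1": {"idct": 40},            # faster unit that supports IDCT only
}
MEM_COST = 8  # hypothetical cycles for one shared-memory transfer


@dataclass(frozen=True)
class Slot:
    resource: str  # "mem" or an accelerator name
    start: int     # first cycle of the slot
    end: int       # first cycle after the slot
    label: str


def schedule_macroblock(coded_blocks, ops=("iq", "idct")):
    """Greedily build a static schedule for the coded blocks of one macroblock.

    Each block runs the operations in `ops` in order. Every task is preceded
    by one transfer on the single shared-memory port (an assumption of this
    sketch), and that transfer is recorded as its own schedule entry.
    """
    acc_free = {name: 0 for name in COST}  # cycle at which each unit is idle
    mem_free = 0                           # cycle at which the memory port is idle
    slots = []
    for block in coded_blocks:
        ready = 0  # cycle at which the previous operation on this block ended
        for op in ops:
            best = None
            for acc, costs in COST.items():
                if op not in costs:
                    continue  # heterogeneous units: this one cannot run op
                load_start = max(ready, mem_free, acc_free[acc])
                load_end = load_start + MEM_COST
                end = load_end + costs[op]
                if best is None or end < best[0]:
                    best = (end, load_start, load_end, acc)
            if best is None:
                raise ValueError(f"no accelerator supports {op!r}")
            end, load_start, load_end, acc = best
            slots.append(Slot("mem", load_start, load_end, f"load b{block} for {op}"))
            slots.append(Slot(acc, load_end, end, f"{op} b{block}"))
            mem_free, acc_free[acc], ready = load_end, end, end
    return slots


def makespan(slots):
    """Return the cycle at which the last scheduled slot ends."""
    return max((s.end for s in slots), default=0)
```

Because every duration in the table is fixed, the master processor in such a scheme would know in advance when each accelerator finishes and could collect results at the scheduled times instead of waiting for an interrupt. With the invented costs above, `makespan(schedule_macroblock([0, 1]))` evaluates to 124 cycles for a macroblock with two coded blocks.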