The control and management of unmanned combat vehicles (UCVs) operating in an adversarial urban environment is a challenging task due, in part, to imperfect and incomplete information, the conflicting objectives of the opposing teams, uncertain stochastic dynamics, and limited computational capability. In this paper, a decision policy built upon Markov decision processes is proposed to provide optimal routing and munitions management despite the conflicting objectives of the adversaries and the stochastic dynamics. The main novelty of the proposed decision policy lies in its handling of multiple UCV formations of varying dimensions. This multiformations capability is explicitly accounted for in the proposed formulation of the optimization problem. The UCVs, which constitute the blue team, aim to reach prescribed tactical target locations from a common starting point, possibly along different paths across the adversarial urban environment, within prescribed time windows and with maximum lethality. On their way, the UCVs face an adversarial red team composed of ground units that can engage any nearby UCV. The rendezvous objective of the blue team can be interpreted as a constraint in an optimization problem that aims to minimize damage while maximizing the total number of munitions remaining when the multiformations reach the targets. The blue and red teams play the roles of cost-function minimizer and maximizer, respectively. The worst-case minimization objective of the blue team is formulated as a finite-time optimization, which is solved by means of a dynamic programming equation whose value function evolves over a graph of feasible UCV paths. The resulting decision policy takes the form of a lookup table, which is well suited to online implementation.
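As an illustration of the kind of worst-case dynamic programming recursion described above, the following minimal sketch computes a minimax value function and a lookup-table policy on a toy path graph. The graph, the stage costs, the two red-team actions, and the terminal penalty that stands in for the rendezvous constraint are all hypothetical choices made for this example; they are not the formulation used in the paper.

```python
from typing import Dict, List, Tuple

# Hypothetical toy path graph: node -> nodes reachable in one time step.
GRAPH: Dict[str, List[str]] = {
    "start": ["a", "b"],
    "a": ["target"],
    "b": ["target"],
    "target": ["target"],
}

# Hypothetical stage cost (damage) of entering a node, one entry per
# red-team action: index 0 = red covers "a", index 1 = red covers "b".
COST: Dict[str, Tuple[float, float]] = {
    "start": (0.0, 0.0),
    "a": (3.0, 1.0),
    "b": (1.0, 2.0),
    "target": (0.0, 0.0),
}

# Assumed terminal penalty for not being at the target when time runs out.
MISS_PENALTY = 100.0


def solve(horizon: int):
    """Backward minimax recursion over a finite horizon.

    Returns the value function at time 0 and a lookup table that maps
    (time step, node) to the next node chosen by the blue team.
    """
    value = {n: (0.0 if n == "target" else MISS_PENALTY) for n in GRAPH}
    policy: Dict[Tuple[int, str], str] = {}
    for t in reversed(range(horizon)):
        new_value = {}
        for node, successors in GRAPH.items():
            best_move, best_cost = None, float("inf")
            for nxt in successors:
                # Red picks the action that maximizes the cost of this move.
                worst = max(c + value[nxt] for c in COST[nxt])
                # Blue keeps the move with the smallest worst-case cost.
                if worst < best_cost:
                    best_move, best_cost = nxt, worst
            new_value[node] = best_cost
            policy[(t, node)] = best_move
        value = new_value
    return value, policy


value, policy = solve(horizon=2)
```

Once computed offline, the `policy` dictionary can be queried online in constant time, which is the property that makes a lookup-table policy attractive for real-time use.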
The practical case of imperfect information on the classification and the location of the adversarial ground units is addressed by means of a one-step lookahead rollout policy using estimates provided by a recursive Bayesian filter. Simulation results show that the concept of multiformations provides, on average, an improvement in performance when compared with single-formation routing.
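To make the estimation step concrete, the following minimal sketch shows one predict/update cycle of a discrete recursive Bayesian filter over the possible locations of a single red unit. The two-cell state space, the transition matrix, and the sensor likelihoods are hypothetical values chosen for illustration, and the sketch does not reproduce the paper's filter or its rollout policy.

```python
from typing import List


def predict(belief: List[float], transition: List[List[float]]) -> List[float]:
    """Propagate the belief one step; transition[i][j] = P(next = j | current = i)."""
    n = len(belief)
    return [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]


def update(belief: List[float], likelihood: List[float]) -> List[float]:
    """Apply Bayes' rule; likelihood[j] = P(observation | state = j)."""
    unnormalized = [b * l for b, l in zip(belief, likelihood)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under the current belief")
    return [u / total for u in unnormalized]


# Hypothetical example: the red unit occupies one of two cells.
prior = [0.5, 0.5]
stay_put = [[1.0, 0.0], [0.0, 1.0]]   # assumed static red unit
sensor_likelihood = [0.9, 0.1]        # assumed detection report favoring cell 0

posterior = update(predict(prior, stay_put), sensor_likelihood)
```

The resulting posterior could then be used to weight the stage costs of candidate moves in a one-step lookahead, although how the estimates enter the rollout policy in the paper is not specified here.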