Under probability-of-interference constraints, proper spectrum sensing is crucial in Cognitive Radios (CRs). However, the capability of a CR to sense the spectrum is limited, especially when multiple users try to access multiple channels. As a consequence, control and resource allocation schemes should optimize not only transmission resources but also sensing resources. In this paper, the cost of such sensing resources is incorporated into the optimization, with the aim of dynamically adapting the power (energy) devoted to sensing each channel. More precisely, the tradeoff among throughput, power devoted to sensing, power devoted to transmission, and probability of interference is optimized. A soft-decision Bayesian sequential sensing scheme is used to exploit the time correlation of the primary occupancy. The joint design leverages tools from dynamic programming to solve the sequential sensing problem and relies on reinforcement learning to develop a stochastic solution.
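As an illustration of how a soft-decision Bayesian scheme can exploit the time correlation of the primary occupancy, the following minimal sketch tracks the belief that a single channel is busy. It is not the scheme developed in this paper: the two-state Markov occupancy model, the Gaussian measurement model, and all names and parameter values (`p_stay_busy`, `p_idle_to_busy`, `mean_busy`, `noise_std`, and so on) are assumptions chosen only for the example. A measurement of `None` stands for a slot in which the channel is not sensed, so the belief is only propagated through the Markov model.

```python
import math


def predict(belief_busy, p_stay_busy, p_idle_to_busy):
    """Propagate P(busy) one slot ahead through an assumed two-state Markov chain."""
    return belief_busy * p_stay_busy + (1.0 - belief_busy) * p_idle_to_busy


def gaussian_pdf(x, mean, std):
    """Density of a Gaussian with the given mean and standard deviation."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))


def update(prior_busy, measurement, mean_idle=0.0, mean_busy=1.0, noise_std=1.0):
    """Bayes rule with a soft (real-valued) measurement instead of a hard busy/idle decision."""
    like_busy = gaussian_pdf(measurement, mean_busy, noise_std)
    like_idle = gaussian_pdf(measurement, mean_idle, noise_std)
    weighted_busy = prior_busy * like_busy
    return weighted_busy / (weighted_busy + (1.0 - prior_busy) * like_idle)


def track(measurements, belief0=0.5, p_stay_busy=0.9, p_idle_to_busy=0.1):
    """Return the belief P(busy) after each slot.

    A measurement of None means the channel was not sensed in that slot,
    so only the prediction step is applied.
    """
    belief = belief0
    history = []
    for z in measurements:
        prior = predict(belief, p_stay_busy, p_idle_to_busy)
        belief = prior if z is None else update(prior, z)
        history.append(belief)
    return history


if __name__ == "__main__":
    # Hypothetical measurements: two sensed slots, one skipped slot, one sensed slot.
    for slot, b in enumerate(track([1.2, 0.9, None, -0.3])):
        print(f"slot {slot}: P(busy) = {b:.3f}")
```

In this sketch, skipping a slot saves sensing power but lets the belief drift toward the stationary distribution of the assumed Markov chain, which is the kind of tradeoff between sensing cost and occupancy uncertainty that the optimization addresses.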