This paper proposes a coefficient-based policy search method, direct policy search (DPS), for learning and constructing policies that control the altitude of an aircraft. DPS is a new and efficient reinforcement learning (RL) strategy combined with genetic algorithms (GAs). Specifically, an optimal policy in DPS consists of a set of coefficients, which are learned using GA-based RL (GARL). The proposed method for learning an optimal policy is demonstrated on the complicated altitude control system of a Boeing 747 aircraft, whose solution space consists of 20 variables. Simulation results show that this new approach achieves performance competitive with traditional algorithms such as the classical state-feedback algorithm and the pure RL algorithm.
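
To make the idea of a policy defined by a set of coefficients concrete, the following minimal sketch searches for the gains of a linear feedback policy with a simple genetic algorithm. It is an illustration under stated assumptions, not the GARL procedure or the Boeing 747 model of this paper: the plant is a toy double integrator with two coefficients instead of 20, and the cost function, actuator limit, and GA operators (tournament selection, uniform crossover, Gaussian mutation, elitism) are all hypothetical choices made for the example.

```python
import random


def rollout_cost(coeffs, steps=200, dt=0.05):
    """Cumulative quadratic cost of the policy u = -(k1*pos + k2*vel).

    The plant is a toy double integrator, an assumed stand-in for the
    aircraft altitude dynamics; lower cost means a better policy.
    """
    pos, vel = 1.0, 0.0  # start one unit away from the target
    cost = 0.0
    for _ in range(steps):
        u = -(coeffs[0] * pos + coeffs[1] * vel)
        u = max(-10.0, min(10.0, u))  # assumed actuator saturation
        vel += u * dt
        pos += vel * dt
        cost += (pos * pos + 0.1 * vel * vel + 0.01 * u * u) * dt
    return cost


def ga_search(n_coeffs=2, pop_size=30, generations=40, seed=0):
    """Search the coefficient space with a small elitist GA."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-5.0, 5.0) for _ in range(n_coeffs)]
           for _ in range(pop_size)]
    best_history = []
    for _ in range(generations):
        pop.sort(key=rollout_cost)
        best_history.append(rollout_cost(pop[0]))
        elite = pop[:2]  # the two best individuals survive unchanged
        children = []
        while len(children) < pop_size - len(elite):
            # Tournament selection: best of three random individuals.
            p1 = min(rng.sample(pop, 3), key=rollout_cost)
            p2 = min(rng.sample(pop, 3), key=rollout_cost)
            # Uniform crossover: each coefficient comes from either parent.
            child = [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]
            # Gaussian mutation applied to each coefficient with probability 0.2.
            child = [c + rng.gauss(0.0, 0.3) if rng.random() < 0.2 else c
                     for c in child]
            children.append(child)
        pop = elite + children
    best = min(pop, key=rollout_cost)
    return best, rollout_cost(best), best_history


if __name__ == "__main__":
    best, cost, history = ga_search()
    print("best coefficients:", best)
    print("cost of first-generation best:", history[0])
    print("cost of final best:", cost)
```

Because the two best individuals are carried over unchanged, the best cost per generation never increases in this sketch. Scaling the same loop to a 20-coefficient policy would only change `n_coeffs` and the rollout function, although the population size and mutation scale would likely need retuning.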