A QoS-aware, Energy-efficient Trajectory Optimization for UAV Base Stations using Q-Learning