A Deep Reinforcement Learning Strategy for UAV Autonomous Landing on a Moving Platform

Published: 01 February 2019

Abstract

The use of multi-rotor UAVs in industrial and civil applications has been extensively encouraged by the rapid innovation in all the technologies involved. In particular, deep learning techniques for motion control have recently taken a major qualitative step, following the successful application of Deep Q-Learning to Atari-like games. Building on these ideas, the Deep Deterministic Policy Gradients (DDPG) algorithm provided outstanding results with continuous state and action domains, which are a requirement in most robotics-related tasks. In this context, the research community lacks an integration of realistic simulation systems with the reinforcement learning paradigm that would enable the application of deep reinforcement learning algorithms to the robotics field. In this paper, a versatile Gazebo-based reinforcement learning framework has been designed and validated with a continuous UAV landing task. The UAV landing maneuver on a moving platform has been solved by means of the novel DDPG algorithm, which has been integrated into our reinforcement learning framework. Several experiments have been performed in a wide variety of conditions for both simulated and real flights, demonstrating the generality of the approach. As an indirect result, a powerful workflow for robotics has been validated, in which robots learn in simulation and perform properly in real operation environments. To the best of the authors' knowledge, this is the first work that addresses the continuous UAV landing maneuver on a moving platform by means of a state-of-the-art deep reinforcement learning algorithm, trained in simulation and tested in real flights.

Published In

Journal of Intelligent and Robotic Systems, Volume 93, Issue 1-2
February 2019
401 pages

Publisher

Kluwer Academic Publishers

United States

Author Tags

  1. Autonomous landing
  2. Continuous control
  3. Deep reinforcement learning
  4. UAV

Qualifiers

  • Article
