DRL-PP
Deep reinforcement learning based path planning and collision avoidance for smart ships in complex environments
Introduction
Background
- Intelligence in shipbuilding and shipping is a key trend enabling high-quality development in the post-pandemic era
- Main components
- Autonomous navigation
- Automatic collision avoidance
- Energy management systems
Significance
Intelligent navigation is a vital smart ship technology realized through advanced automation and navigation systems
Safety, efficiency, and cost
- optimizing ship speeds and routes
- ensuring navigation safety
- reducing fuel consumption and emissions
Traditional methods
- have limitations for path planning in random, complex environments, which involve difficult-to-quantify factors such as environmental disturbances and contingent uncertainties
- Deep reinforcement learning handles such abstract, difficult-to-quantify influences better than traditional approaches
Work basis
Gao P, Zhou L, Zhao X, Shao B. Research on ship collision avoidance path planning based on modified potential field ant colony algorithm [J]. Ocean & Coastal Management, 2023, 235: 106482. https://doi.org/10.1016/j.ocecoaman.2023.106482
Problem Description
Ship collision avoidance problem description
- Ship
- own ship
- target ship / multiple ships
- obstacles (static, dynamic)
- Collision avoidance
- rules
- state
- action
- Path planning
- objective
- constraints
- method
- AIS data
- rules
Ship domain modeling based on AIS and rules
Grid Method
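A minimal sketch of the grid method, assuming the navigable water is rasterized into a binary occupancy grid; the cell size, circular obstacle representation, and function names are illustrative, not the paper's:

```python
import numpy as np

def build_occupancy_grid(obstacles, width_m, height_m, cell_m=10.0):
    """Rasterize static circular obstacles into a binary occupancy grid.

    `obstacles` is a list of (x, y, radius) tuples in metres; the
    paper's actual obstacle model may differ.
    """
    rows, cols = int(height_m // cell_m), int(width_m // cell_m)
    grid = np.zeros((rows, cols), dtype=np.uint8)
    ys, xs = np.mgrid[0:rows, 0:cols]
    cx, cy = (xs + 0.5) * cell_m, (ys + 0.5) * cell_m  # cell centres (m)
    for ox, oy, r in obstacles:
        grid[(cx - ox) ** 2 + (cy - oy) ** 2 <= r ** 2] = 1  # mark occupied cells
    return grid
```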
Encounter Situation Classification
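The three COLREGs encounter situations can be distinguished coarsely by the target ship's relative bearing. The sector thresholds below (about ±6° for head-on, abaft 112.5° for overtaking) follow common readings of the rules and are an assumption; a strict classification also checks the ships' relative courses:

```python
def classify_encounter(rel_bearing_deg):
    """Coarse encounter classification from the target ship's relative
    bearing (degrees clockwise from own bow, 0-360)."""
    b = rel_bearing_deg % 360.0
    if b <= 6.0 or b >= 354.0:
        return "head-on"       # target nearly dead ahead
    if 112.5 < b < 247.5:
        return "overtaking"    # target coming up from abaft the beam
    return "crossing"
```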
Level of ship collision risk
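Collision risk is conventionally graded from DCPA and TCPA, the distance and time to the closest point of approach. Below is a sketch of the standard relative-motion computation; the paper's exact risk index and thresholds are not reproduced here:

```python
import numpy as np

def dcpa_tcpa(own_pos, own_vel, tgt_pos, tgt_vel):
    """DCPA (m) and TCPA (s) between two ships, from 2-D positions (m)
    and velocities (m/s)."""
    rel_pos = np.asarray(tgt_pos, float) - np.asarray(own_pos, float)
    rel_vel = np.asarray(tgt_vel, float) - np.asarray(own_vel, float)
    speed2 = rel_vel @ rel_vel
    if speed2 < 1e-9:                       # same velocity: range never changes
        return float(np.linalg.norm(rel_pos)), float("inf")
    tcpa = -(rel_pos @ rel_vel) / speed2    # time of closest approach
    dcpa = float(np.linalg.norm(rel_pos + rel_vel * max(tcpa, 0.0)))
    return dcpa, float(tcpa)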
Model and algorithm
Markov Decision Process
State Space
Action Space
Reward function
Total reward
Return
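One plausible formulation of the two items above, assuming the total reward is a weighted sum of sub-rewards; the weights $w_i$ and the sub-terms are illustrative, the paper defines the actual shaping:

```latex
% Per-step reward as a weighted sum of goal/rule/safety sub-rewards
r_t = w_1\, r_t^{\mathrm{goal}} + w_2\, r_t^{\mathrm{COLREGs}} + w_3\, r_t^{\mathrm{collision}}

% Discounted return the agent maximizes, with discount factor \gamma = 0.99
G_t = \sum_{k=0}^{\infty} \gamma^{k}\, r_{t+k+1}
```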
D3QN algorithm
Dueling Double DQN algorithm pseudocode
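A minimal PyTorch sketch of the two ingredients the name D3QN combines: a dueling Q-network head and the double-DQN bootstrap target. Layer sizes and names are assumptions; the paper additionally uses prioritized experience replay, omitted here:

```python
import torch
import torch.nn as nn

class DuelingQNet(nn.Module):
    """Dueling head: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    def __init__(self, state_dim, n_actions, hidden=128):
        super().__init__()
        self.feature = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        self.value = nn.Linear(hidden, 1)               # state-value stream
        self.advantage = nn.Linear(hidden, n_actions)   # advantage stream

    def forward(self, s):
        h = self.feature(s)
        a = self.advantage(h)
        return self.value(h) + a - a.mean(dim=1, keepdim=True)

@torch.no_grad()
def double_dqn_target(online, target, r, s_next, done, gamma=0.99):
    """Double-DQN target: the online net selects the next action,
    the target net evaluates it (decouples selection from evaluation)."""
    a_star = online(s_next).argmax(dim=1, keepdim=True)
    q_next = target(s_next).gather(1, a_star).squeeze(1)
    return r + gamma * (1.0 - done.float()) * q_next
```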
Adaptive decaying ε-greedy strategy
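One plausible reading of the adaptive decay schedule, assuming an exponential interpolation from eps_max down to eps_min over training; the paper's exact schedule may differ:

```python
import numpy as np

def adaptive_epsilon(episode, total_episodes=3000, eps_max=1.0, eps_min=0.05):
    """Exponentially decaying exploration rate for epsilon-greedy."""
    decay = np.log(eps_max / eps_min) / total_episodes
    return eps_min + (eps_max - eps_min) * np.exp(-decay * episode)

def select_action(q_values, epsilon, rng=np.random.default_rng()):
    # Explore with probability epsilon, otherwise act greedily on Q-values.
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))
    return int(np.argmax(q_values))
```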
Simulation Experiment
Parameter setting
- Ship information
| Vessel | Name | Type | Size (m) | Tonnage (t) |
| --- | --- | --- | --- | --- |
| Own ship | Hang Xing 817 | Bulk cargo ship | 87 × 14.8 × 5.1 | 2114 |
| Target ship | Zhou Gong 6006 | Bulk cargo ship | 67.8 × 16.0 × 5.2 | 2138 |
- Hyperparameters

| Hyperparameter | Value |
| --- | --- |
| Episodes | 3000 |
| Learning rate | 1e-4 |
| Batch size | 256 |
| Target network update frequency | 3000 |
| Replay buffer size | 100000 |
| Skit ratio | 0.02 |
| PER α | 0.6 |
| PER β | 0.4 |
| Warm start | 50 |
| Discount factor | 0.99 |
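The same settings as a Python config for reference; the two PER entries are assumed to be the standard priority exponent α and importance-sampling exponent β of prioritized experience replay:

```python
# Training configuration mirroring the table above (names are illustrative).
CONFIG = dict(
    episodes=3000,
    learning_rate=1e-4,
    batch_size=256,
    target_update_freq=3000,      # steps between target-network syncs
    replay_buffer_size=100_000,
    per_alpha=0.6,                # PER priority exponent (assumed)
    per_beta=0.4,                 # PER importance-sampling exponent (assumed)
    warm_start=50,                # warm-start episodes before learning (assumed unit)
    gamma=0.99,                   # discount factor
)
```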
Collision avoidance experiments
Head-on
Crossing
Overtaking
Reward
Conclusion
- Ship collision avoidance path planning in dynamic environments faces large state spaces and complex action spaces arising from uncertainty.
- Based on AIS data and COLREGs, this paper designs reward functions that evaluate multiple rule-compliant behaviors, and introduces an adaptive decaying ε-greedy exploration strategy on top of the prioritized-experience-replay D3QN algorithm. Experiments across various collision avoidance scenarios show that the proposed method achieves superior results.
- Further considerations / extensions
- Multi-objective optimization / alternative reward designs
- Complexity: uncertainty, dynamic environments, multi-ship scenarios (communication)