Learning a guidance policy to navigate among dynamic agents in constrained environments with continual reinforcement learning