Proximal Policy Optimization - Custom Reacher Task 1