Why Reward Shaping Sucks and What You Can Do About It
In my previous article on reward shaping, I walked through four hard-learned lessons about balancing collision penalties in a navigation task. I eventually found the "Goldilocks solution" - a -0.1 penalty that let my agent learn to navigate obstacles...
Sep 18, 2025 · 6 min read