Q Learning
The Bellman equation lets us formulate a bootstrapping objective for estimating the value function, more specifically the Q-value (action-value) function. This objective amounts to minimizing the temporal difference (TD) error. We discuss strategies for doing this, distinguishing between on-policy and off-policy approaches. Finally, when the state is high dimensional, such as an image input, we show how a neural network can be used to approximate the Q-value function directly from that input, a method called Deep Q-Learning, whose network is known as a Deep Q-Network (DQN).
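As a rough illustration of the temporal difference error and of the on-policy versus off-policy distinction, here is a minimal tabular sketch. It assumes a small discrete problem where Q is a NumPy array indexed by `[state, action]`. The function names, the learning rate `alpha`, and the discount factor `gamma` are illustrative choices, not taken from any Duckietown code.

```python
import numpy as np


def td_target_q_learning(Q, reward, next_state, gamma, done):
    """Off-policy target: bootstrap from the greedy action in next_state."""
    if done:
        return reward
    return reward + gamma * np.max(Q[next_state])


def td_target_sarsa(Q, reward, next_state, next_action, gamma, done):
    """On-policy target: bootstrap from the action actually taken next."""
    if done:
        return reward
    return reward + gamma * Q[next_state, next_action]


def td_update(Q, state, action, target, alpha):
    """Move Q[state, action] toward the target and return the TD error."""
    td_error = target - Q[state, action]
    Q[state, action] += alpha * td_error
    return td_error


# Hypothetical toy table: 2 states, 2 actions.
Q = np.zeros((2, 2))
Q[1] = [1.0, 3.0]

# Same transition, two different bootstrapped targets.
q_target = td_target_q_learning(Q, reward=1.0, next_state=1, gamma=0.5, done=False)
s_target = td_target_sarsa(Q, reward=1.0, next_state=1, next_action=0, gamma=0.5, done=False)

# Apply the Q-learning update to the visited state-action pair.
error = td_update(Q, state=0, action=0, target=q_target, alpha=0.5)
```

The two targets differ only in which next action they bootstrap from: the maximizing one (Q-learning, off-policy) or the one the behavior policy actually chose (SARSA, on-policy). In a deep variant, the table lookup would be replaced by a network's output and the update by a gradient step on the squared TD error.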