Reinforcement learning is the fourth machine learning model. In supervised learning, the machine is given the answer key and learns by finding correlations among all the correct outcomes. The reinforcement learning model does not include an answer key but, rather, inputs a set of allowable actions, rules, and potential end states. When the desired goal of the algorithm is fixed or binary, machines can learn by example. But in cases where the desired outcome is mutable, the system must learn by experience and reward. In reinforcement learning models, the “reward” is numerical and is programmed into the algorithm as something the system seeks to collect.
In many ways, this model is analogous to teaching someone how to play chess. Certainly, it would be impossible to try to show them every potential move. Instead, you explain the rules and they build up their skill through practice. Rewards come in the form of not only winning the game, but also acquiring the opponent’s pieces. Applications of reinforcement learning include automated price bidding for buyers of online advertising, computer game development, and high-stakes stock market trading.