Rewards and Reinforcement Learning

Written by Ayebamiebi Yousuo

Sep 24, 2024

How would you train your pet? Would you give it a pat when it poos in the living room? What about when it welcomes you after a long day's work? You would negatively reward the actions you don't want and positively reward the actions you do want, right?

Well!

There you go. That is the summary of reinforcement learning.

Reinforcement Learning is a type of machine learning where the model learns by exploration (experimenting with new actions) and exploitation (repeating actions that have paid off), based on the feedback (reward) it gets for the actions it carries out in its environment.

Reward as a Component of Reinforcement Learning

There are four core components of Reinforcement Learning:

1. Agent: This is the model you seek to train, analogous to your pet in the opening example.
2. Environment: Everything in the agent's surroundings that it interacts with. In the case of your pet, that would be the house, its furniture, and even you, its owner.
3. Action: Any step the agent takes to interact with its environment.
4. Reward: The feedback the agent gets, whether negative or positive, for each action it carries out in its environment.

The reward is determined by a piece of code called the reward function. A deviation from the desired action is negatively rewarded, either by decreasing the reward or by applying some form of punishment, as determined by the reward function. Through repeated exploration and exploitation, the agent learns to carry out the actions that bring the greatest positive rewards.
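To make the four components concrete, here is a minimal sketch of the pet example in Python. Everything in it is an assumption made for illustration: the action names, the reward values of +1 and -1, and the simple "explore 10% of the time" rule are not taken from any particular library or from the reference below. It is one simple way to show an agent learning from a reward function, not the definitive way to do it.

```python
import random

# Hypothetical actions for the pet example (illustrative names only).
ACTIONS = ["poo_in_living_room", "greet_owner"]


def reward_function(action: str) -> float:
    """Assumed reward: +1 for the desired action, -1 for anything else."""
    return 1.0 if action == "greet_owner" else -1.0


def train(episodes: int = 500, epsilon: float = 0.1, seed: int = 0) -> dict[str, float]:
    """Estimate the average reward of each action by trial and error."""
    rng = random.Random(seed)
    values = {a: 0.0 for a in ACTIONS}  # the agent's current estimate per action
    counts = {a: 0 for a in ACTIONS}    # how many times each action was tried

    for _ in range(episodes):
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)          # explore: try a random action
        else:
            action = max(values, key=values.get)  # exploit: best-known action
        reward = reward_function(action)          # feedback from the environment
        counts[action] += 1
        # Running average of the rewards seen so far for this action.
        values[action] += (reward - values[action]) / counts[action]
    return values


if __name__ == "__main__":
    learned = train()
    print(learned)
    print("preferred action:", max(learned, key=learned.get))
```

Here the agent is the `train` loop, the environment is reduced to the reward it hands back, the actions are the two strings, and the reward comes from `reward_function`. Once the agent has been "punished" for the undesired action, it mostly repeats the rewarded one and only occasionally experiments.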

Reinforcement Learning is a great learning model, not just for machines but also for humans, with a reward system that steers the learner toward earning more reward with each iteration. In my opinion, humans have a thing or two to learn from how machines learn. What do you think?

References

What is reinforcement learning? - reinforcement learning explained - AWS. (n.d.). https://aws.amazon.com/what-is/reinforcement-learning/

Written by Ayebamiebi Yousuo from MEDILOQUY
