Approaches to Implement Reinforcement Learning
There are mainly three ways to implement reinforcement-learning in ML, which are:
Value-based
The value-based approach is about to find the optimal value function, which is the maximum value at a state under any policy. Therefore, the agent expects the long-term return at any state(s) under policy π.
Policy-based
Policy-based approach is to find the optimal policy for the maximum future rewards without using the value function. In this approach, the agent tries to apply such a policy that the action performed in each step helps to maximize the future reward.
The policy-based approach has mainly two types of policy:
Deterministic
The same action is produced by the policy (π) at any state.
Stochastic
In this policy, probability determines the produced action.
Model-based
In the model-based approach, a virtual model is created for the environment, and the agent explores that environment to learn it. There is no particular solution or algorithm for this approach because the model representation is different for each environment.