This is a generic function that simulates an RL agent's action under a specified decision-making policy.
rl_action_simulate(policy, values, ...)

policy: The policy under which the decision is made. Currently supported are softmax, greedy, and epsilon-greedy.
values: A numeric vector containing the current value estimates of each action.
...: Additional arguments passed to or from specific methods, such as tau when policy = "softmax" and epsilon when policy = "epsilonGreedy".

Returns a number representing which action will be taken given the chosen policy.
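
To make the three policies concrete, here is a minimal sketch of how such a choice could be computed. It is an illustration only, not the package's implementation: simulate_action_sketch is a hypothetical stand-in for rl_action_simulate, the defaults for tau and epsilon are invented, tau is assumed to act as a temperature that divides the values (the package may define it differently), and ties under the greedy rule are assumed to be broken at random.

```r
# Hypothetical stand-in for illustration; NOT the package's rl_action_simulate().
# Assumptions: tau is a temperature (values are divided by it), the default
# values of tau and epsilon are made up, and greedy ties are broken at random.
simulate_action_sketch <- function(policy, values, tau = 1, epsilon = 0.1) {
  n <- length(values)

  # Pick the index of the highest-valued action.
  pick_greedy <- function() {
    best <- which(values == max(values))
    # Index into `best` rather than calling sample(best, 1), which would
    # treat a single number k as the range 1:k.
    best[sample.int(length(best), 1)]
  }

  switch(policy,
    softmax = {
      # Subtract the maximum before exponentiating for numerical stability.
      z <- (values - max(values)) / tau
      probs <- exp(z) / sum(exp(z))
      sample.int(n, 1, prob = probs)
    },
    greedy = pick_greedy(),
    epsilonGreedy = {
      # Explore uniformly with probability epsilon, otherwise exploit.
      if (runif(1) < epsilon) sample.int(n, 1) else pick_greedy()
    },
    stop("Unsupported policy: ", policy)
  )
}

values <- c(0.2, 0.5, 0.1)
simulate_action_sketch("greedy", values)                        # 2
simulate_action_sketch("softmax", values, tau = 0.5)            # random draw from 1:3
simulate_action_sketch("epsilonGreedy", values, epsilon = 0.2)  # usually 2
```

Under these assumptions, a smaller tau concentrates the softmax choice on the highest-valued action, while a larger epsilon makes the epsilon-greedy policy explore more often. Call set.seed() beforehand if the stochastic policies need to be reproducible.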