This function defines the reinforcement delivery for an individual arm and is
used internally by RL Bandit Agents. With probability prob, an arm yields a
reinforcement of magnitude; with probability 1 - prob, it yields a
reinforcement of alternative (default of zero).
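The delivery rule described here amounts to a single Bernoulli draw. The following is a minimal sketch of that rule, not the package's own implementation; the helper name draw_reinforcement is hypothetical and is introduced only for illustration.

```r
# Hypothetical helper (not part of the package): one draw of the delivery rule.
draw_reinforcement <- function(prob, magnitude, alternative = 0) {
  stopifnot(prob >= 0, prob <= 1)
  # runif(1) < prob is TRUE with probability prob
  if (stats::runif(1) < prob) magnitude else alternative
}

set.seed(1)
draws <- replicate(10000, draw_reinforcement(prob = 0.7, magnitude = 1))
mean(draws)  # long-run average should be close to prob * magnitude
```

Because alternative defaults to zero, the long-run average of the draws approaches prob * magnitude.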
rl_arms_get_outcome(arm_definitions, action, trial)
arm_definitions: A list of arm definitions where each element contains
a data frame with columns 'probability', 'magnitude', 'alternative', and
'trial' describing, respectively, the probability of receiving the reward,
the reward magnitude, the reinforcement delivered otherwise, and the trial
to which these values apply.
action: A numeric scalar representing which action was selected on a given trial.
trial: The trial in which an action was selected.
A numeric reinforcement defined by magnitude (with probability
prob) or alternative (with probability 1 - prob).
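To make the expected input concrete, the sketch below builds a two-arm arm_definitions list with the four documented columns and looks up the row for a given action and trial before drawing. The helpers make_arm and sketch_get_outcome are hypothetical stand-ins; how the package itself indexes arms and trials is an assumption here.

```r
# Hypothetical constructor: one data frame per arm, one row per trial.
make_arm <- function(probability, magnitude = 1, alternative = 0, n_trials = 3) {
  data.frame(
    probability = probability,
    magnitude   = magnitude,
    alternative = alternative,
    trial       = seq_len(n_trials)
  )
}

# Two arms: arm 1 pays 1 with probability 0.2, arm 2 pays 2 with probability 0.8.
arm_definitions <- list(make_arm(0.2), make_arm(0.8, magnitude = 2))

# Hypothetical stand-in for the lookup-then-draw step (assumed, not package source).
sketch_get_outcome <- function(arm_definitions, action, trial) {
  arm <- arm_definitions[[action]]        # assumes action indexes the list
  row <- arm[arm$trial == trial, ]        # assumes one row per trial
  stopifnot(nrow(row) == 1)
  if (stats::runif(1) < row$probability) row$magnitude else row$alternative
}

set.seed(42)
outcome <- sketch_get_outcome(arm_definitions, action = 2, trial = 1)
```

With the package installed, the same list would presumably be passed as rl_arms_get_outcome(arm_definitions, action = 2, trial = 1).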