This function defines the reinforcement delivery for an individual arm and is used internally by RL Bandit Agents. With probability prob, an arm yields a reinforcement of magnitude; with probability 1 - prob, it yields a reinforcement of alternative (default of zero).

rl_arms_get_outcome(arm_definitions, action, trial)

Arguments

arm_definitions

A list of arm definitions, one element per arm. Each element contains a data frame with columns 'probability', 'magnitude', 'alternative', and 'trial' giving, for each trial, the probability of receiving the reinforcement, the magnitude of that reinforcement, and the alternative reinforcement delivered otherwise.

action

A numeric scalar representing which action was selected on a given trial.

trial

The trial in which an action was selected.
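
To illustrate the input shape these arguments describe, the sketch below builds an arm_definitions list for a two-armed bandit over three trials. The column names are taken from the argument description; the helper make_arm, the specific values, and the reading of action as a position in the list are assumptions made for illustration only.

```r
# Hypothetical helper (not part of the package): build one arm's definition
# as a data frame with one row per trial.
make_arm <- function(probability, magnitude, alternative = 0, n_trials = 3) {
  data.frame(
    probability = rep(probability, n_trials),
    magnitude   = rep(magnitude, n_trials),
    alternative = rep(alternative, n_trials),
    trial       = seq_len(n_trials)
  )
}

# Two arms: one pays out on 80% of trials, the other on 20%.
arm_definitions <- list(
  make_arm(probability = 0.8, magnitude = 1),
  make_arm(probability = 0.2, magnitude = 1)
)

action <- 2  # assumed to be the position of the chosen arm in the list
trial  <- 1  # the trial on which that arm was chosen
```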

Value

A numeric reinforcement defined by magnitude (with probability prob) or alternative (with probability 1 - prob).
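
The rule described above can be sketched as a single Bernoulli draw. The function sketch_get_outcome below is a hypothetical stand-in written only to illustrate that rule; it is not the package's implementation, and it assumes that action is the position of the arm in arm_definitions and that each arm has exactly one row per trial.

```r
# Hypothetical illustration of the outcome rule; the package's own code may differ.
sketch_get_outcome <- function(arm_definitions, action, trial) {
  arm <- arm_definitions[[action]]   # assumed: action indexes the list of arms
  row <- arm[arm$trial == trial, ]   # assumed: exactly one row per trial
  # Return magnitude with probability `probability`, otherwise alternative.
  if (stats::runif(1) < row$probability) row$magnitude else row$alternative
}

# Deterministic arms make the behavior easy to check by eye.
arms <- list(
  data.frame(probability = c(1, 1), magnitude = c(5, 5),
             alternative = c(0, 0), trial = 1:2),
  data.frame(probability = c(0, 0), magnitude = c(5, 5),
             alternative = c(-1, -1), trial = 1:2)
)

always_paid  <- sketch_get_outcome(arms, action = 1, trial = 1)  # probability 1
never_paid   <- sketch_get_outcome(arms, action = 2, trial = 2)  # probability 0
```

With a probability of 1 the draw always returns magnitude, and with a probability of 0 it always returns alternative, so the two calls above return 5 and -1 respectively. For intermediate probabilities the result varies from call to call unless a seed is set with set.seed().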