Get Arm's Outcome based on its Probability and Reward Structure

This function defines the reinforcement delivery for an individual arm and is used internally by RL Bandit Agents. With probability prob (the arm's 'probability' value for the trial), the arm yields a reinforcement of magnitude; with probability 1 - prob, it yields a reinforcement of alternative (zero by default).

rl_arms_get_outcome(arm_definitions, action, trial)

Arguments

arm_definitions: A list of arm definitions where each element contains a data frame with columns 'probability', 'magnitude', 'alternative', and 'trial'. For each trial, these give the probability of receiving the reward, the reward magnitude, and the alternative reinforcement delivered when the reward is not received.
action: A numeric scalar representing which action was selected on a given trial.
trial: The trial in which an action was selected.
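
As a rough illustration of the arm_definitions structure described above, the following sketch builds a two-arm definition list with a reward-probability reversal halfway through. It assumes each list element is itself a data frame with one row per trial; the variable names n_trials and arm_definitions are chosen here for illustration only.

```r
# Illustrative only: assumes one data frame per arm, one row per trial.
n_trials <- 10

arm_definitions <- list(
  # Arm 1: rewarded with p = 0.8 for the first half, p = 0.2 afterwards.
  data.frame(
    probability = rep(c(0.8, 0.2), each = n_trials / 2),
    magnitude   = 1,
    alternative = 0,
    trial       = seq_len(n_trials)
  ),
  # Arm 2: the mirror image of arm 1.
  data.frame(
    probability = rep(c(0.2, 0.8), each = n_trials / 2),
    magnitude   = 1,
    alternative = 0,
    trial       = seq_len(n_trials)
  )
)

str(arm_definitions[[1]])
```
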

Value

A numeric reinforcement defined by magnitude (with probability prob) or alternative (with probability 1 - prob).
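
The sampling rule described above can be sketched as follows. This is a hypothetical re-implementation for illustration, not the package's source: sketch_get_outcome is an invented name, and the sketch assumes that action indexes into arm_definitions and that each arm's data frame has exactly one row per trial.

```r
# Hypothetical sketch of the documented behavior; not the package's own code.
sketch_get_outcome <- function(arm_definitions, action, trial) {
  arm <- arm_definitions[[action]]        # assumption: action is a list index
  row <- arm[arm$trial == trial, ]        # assumption: one row per trial
  # Deliver magnitude with probability prob, otherwise alternative.
  if (stats::runif(1) < row$probability) row$magnitude else row$alternative
}

# Deterministic arms make the two branches easy to see.
arms <- list(
  data.frame(probability = 1, magnitude = 1, alternative = 0,  trial = 1:3),
  data.frame(probability = 0, magnitude = 1, alternative = -1, trial = 1:3)
)

sketch_get_outcome(arms, action = 1, trial = 2)  # probability 1: yields magnitude
sketch_get_outcome(arms, action = 2, trial = 2)  # probability 0: yields alternative
```

With intermediate probabilities, repeated calls for the same arm and trial return a mix of magnitude and alternative, with the long-run share of magnitude approaching prob.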