rl_arms_get_outcome.Rd
This function defines the reinforcement delivery for an individual arm, and
is used internally by RL Bandit Agents. With probability prob, an arm will
yield a reinforcement of magnitude; with probability 1 - prob, it will yield
a reinforcement of alternative (default of zero).
rl_arms_get_outcome(arm_definitions, action, trial)
A list of arm definitions where each element contains a data frame with
columns 'probability', 'magnitude', 'alternative', and 'trial' describing,
respectively, the probability of receiving a reward, the magnitude of that
reward, the alternative reinforcement delivered otherwise, and the trial to
which these values apply.
A numeric scalar representing which action was selected on a given trial.
The trial in which an action was selected.
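As an illustration of the arm_definitions structure described above, the following sketch builds a hypothetical two-armed bandit over 10 trials. The probabilities, magnitudes, and the one-row-per-trial layout are assumptions chosen for the example, not values taken from the package.

```r
# Hypothetical two-armed bandit; all numbers are illustrative only.
# Assumption: each data frame holds one row per trial.
n_trials <- 10

arm_definitions <- list(
  # Arm 1: rewarded on 80% of trials
  data.frame(
    probability = rep(0.8, n_trials),
    magnitude   = rep(1, n_trials),
    alternative = rep(0, n_trials),
    trial       = seq_len(n_trials)
  ),
  # Arm 2: rewarded on 20% of trials
  data.frame(
    probability = rep(0.2, n_trials),
    magnitude   = rep(1, n_trials),
    alternative = rep(0, n_trials),
    trial       = seq_len(n_trials)
  )
)

# Inspect the columns of the first arm
str(arm_definitions[[1]])
```

With this layout, the action argument would index into the list (1 or 2 here) and trial would select a row within the chosen arm's data frame.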
A numeric reinforcement defined by magnitude (with probability prob) or
alternative (with probability 1 - prob).
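The reinforcement rule described here can be sketched in a few lines of R. The function sketch_get_outcome below is a hypothetical stand-in written for illustration, not the package's implementation; it assumes each arm's data frame has exactly one row per trial.

```r
# Hypothetical stand-in for the reinforcement rule; NOT the package's code.
sketch_get_outcome <- function(arm_definitions, action, trial) {
  arm <- arm_definitions[[action]]      # data frame for the selected arm
  row <- arm[arm$trial == trial, ]      # definition for the current trial
  stopifnot(nrow(row) == 1)             # assumption: one row per trial
  # Deliver magnitude with probability `probability`, else alternative
  if (stats::runif(1) < row$probability) row$magnitude else row$alternative
}

# Deterministic arms make the behaviour easy to check by eye
arms <- list(
  data.frame(probability = 1, magnitude = 5, alternative = 0,  trial = 1),
  data.frame(probability = 0, magnitude = 5, alternative = -1, trial = 1)
)

sketch_get_outcome(arms, action = 1, trial = 1)  # probability 1: yields magnitude (5)
sketch_get_outcome(arms, action = 2, trial = 1)  # probability 0: yields alternative (-1)
```

Using probabilities of 1 and 0 removes the randomness, so the two calls show each branch of the rule in isolation.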