Package 'bandit' reference manual

Title:	Functions for Simple a/B Split Test and Multi-Armed Bandit Analysis
Description:	A set of functions for doing analysis of A/B split test data and web metrics in general.
Authors:	Thomas Lotze and Markus Loecher
Maintainer:	Markus Loecher <markus.loecher@gmail.com>
License:	GPL-3
Version:	0.5.1
Built:	2025-03-12 02:58:00 UTC
Source:	https://github.com/cran/bandit

Functions for simple A/B split test and multi-armed bandit analysis

Description

A set of functions for doing analysis of A/B split test data and web metrics in general.

Details

Package:	bandit
Type:	Package
Title:	Functions for simple A/B split test and multi-armed bandit analysis
Version:	0.5.0
Date:	2014-05-03
Imports:	boot, gam (>= 1.09)
Author:	Thomas Lotze and Markus Loecher
Maintainer:	Thomas Lotze <thomaslotze@thomaslotze.com>
License:	GPL-3

Author(s)

Thomas Lotze and Markus Loecher

best_binomial_bandit

Description

Compute the Bayesian probabilities for each arm being the best binomial bandit.

Usage

best_binomial_bandit(x, n, alpha=1, beta=1)
best_binomial_bandit(x, n, alpha=1, beta=1)

Arguments

`x`	as in prop.test, a vector of the number of successes
`n`	as in prop.test, a vector of the number of trials
`alpha`	shape parameter alpha for the prior beta distribution.
`beta`	shape parameter beta for the prior beta distribution.

Value

a vector of probabilities for each arm being the best binomial bandit; this can be used for future randomized allocation

Author(s)

Thomas Lotze <thomaslotze@thomaslotze.com> and Markus Loecher

References

Steven L. Scott, A modern Bayesian look at the multi-armed bandit, Appl. Stochastic Models Bus. Ind. 2010; 26:639-658. (http://www.economics.uci.edu/~ivan/asmb.874.pdf)

Examples

x=c(10,20,30,50)
n=c(100,102,120,130)
arm_probabilities = best_binomial_bandit(x,n)
print(arm_probabilities)
paste("The best arm is likely ", which.max(arm_probabilities), ", with ",
	round(100*max(arm_probabilities), 2), " percent probability of being the best.", sep="")

best_binomial_bandit(c(2,20),c(100,1000))

best_binomial_bandit(c(2,20),c(100,1000), alpha = 2, beta = 5)

#quick look at the various shapes of the beta distribution as we change the shape params:
AlphaBeta = cbind(alpha=c(0.5,5,1,2,2),beta=c(0.5,1,3,2,5))
M = nrow(AlphaBeta)
y= matrix(0,100,ncol=M)
x = seq(0,1,length=100)
for (i in 1:M) y[,i] = dbeta(x,AlphaBeta[i,1],AlphaBeta[i,2])
matplot(x,y,type="l", ylim = c(0,3.5), lty=1, lwd=2)
param_strings = paste("a=", AlphaBeta[,"alpha"], ", b=", AlphaBeta[,"beta"], sep="")
legend("top", legend = param_strings, col=1:M, lty=1)
x=c(10,20,30,50)
n=c(100,102,120,130)
arm_probabilities = best_binomial_bandit(x,n)
print(arm_probabilities)
paste("The best arm is likely ", which.max(arm_probabilities), ", with ",
	round(100*max(arm_probabilities), 2), " percent probability of being the best.", sep="")

best_binomial_bandit(c(2,20),c(100,1000))

best_binomial_bandit(c(2,20),c(100,1000), alpha = 2, beta = 5)

#quick look at the various shapes of the beta distribution as we change the shape params:
AlphaBeta = cbind(alpha=c(0.5,5,1,2,2),beta=c(0.5,1,3,2,5))
M = nrow(AlphaBeta)
y= matrix(0,100,ncol=M)
x = seq(0,1,length=100)
for (i in 1:M) y[,i] = dbeta(x,AlphaBeta[i,1],AlphaBeta[i,2])
matplot(x,y,type="l", ylim = c(0,3.5), lty=1, lwd=2)
param_strings = paste("a=", AlphaBeta[,"alpha"], ", b=", AlphaBeta[,"beta"], sep="")
legend("top", legend = param_strings, col=1:M, lty=1)

best_binomial_bandit_sim

Description

Compute the Bayesian probabilities for each arm being the best binomial bandit, using simulation.

Usage

best_binomial_bandit_sim(x, n, alpha = 1, beta = 1, ndraws = 5000)best_binomial_bandit_sim(x, n, alpha = 1, beta = 1, ndraws = 5000)

Arguments

`x`	as in prop.test, a vector of the number of successes
`n`	as in prop.test, a vector of the number of trials
`alpha`	shape parameter alpha for the prior beta distribution.
`beta`	shape parameter beta for the prior beta distribution.
`ndraws`	number of random draws from the posterior

Value

a vector of probabilities for each arm being the best binomial bandit; this can be used for future randomized allocation

Author(s)

Thomas Lotze and Markus Loecher

References

Steven L. Scott, A modern Bayesian look at the multi-armed bandit, Appl. Stochastic Models Bus. Ind. 2010; 26:639-658.

(http://www.economics.uci.edu/~ivan/asmb.874.pdf)

Examples


x=c(10,20,30,33)
n=c(100,102,120,130)
best_binomial_bandit_sim(x,n, ndraws=1000)
round(best_binomial_bandit(x,n),3)

best_binomial_bandit_sim(c(2,20),c(100,1000))

best_binomial_bandit_sim(c(2,20),c(100,1000), alpha = 2, beta = 5)

#quick look at the various shapes of the beta distribution as we change the shape params:
AlphaBeta = cbind(alpha=c(0.5,5,1,2,2),beta=c(0.5,1,3,2,5))
M = nrow(AlphaBeta)
y= matrix(0,100,ncol=M)
x = seq(0,1,length=100)
for (i in 1:M) y[,i] = dbeta(x,AlphaBeta[i,1],AlphaBeta[i,2])
matplot(x,y,type="l", ylim = c(0,3.5), lty=1, lwd=2)
param_strings = paste("a=", AlphaBeta[,"alpha"], ", b=", AlphaBeta[,"beta"], sep="")
legend("top", legend = param_strings, col=1:M, lty=1)
x=c(10,20,30,33)
n=c(100,102,120,130)
best_binomial_bandit_sim(x,n, ndraws=1000)
round(best_binomial_bandit(x,n),3)

best_binomial_bandit_sim(c(2,20),c(100,1000))

best_binomial_bandit_sim(c(2,20),c(100,1000), alpha = 2, beta = 5)

#quick look at the various shapes of the beta distribution as we change the shape params:
AlphaBeta = cbind(alpha=c(0.5,5,1,2,2),beta=c(0.5,1,3,2,5))
M = nrow(AlphaBeta)
y= matrix(0,100,ncol=M)
x = seq(0,1,length=100)
for (i in 1:M) y[,i] = dbeta(x,AlphaBeta[i,1],AlphaBeta[i,2])
matplot(x,y,type="l", ylim = c(0,3.5), lty=1, lwd=2)
param_strings = paste("a=", AlphaBeta[,"alpha"], ", b=", AlphaBeta[,"beta"], sep="")
legend("top", legend = param_strings, col=1:M, lty=1)

best_poisson_bandit

Description

Compute the Bayesian probabilities for each arm being the best poisson bandit.

Usage

best_poisson_bandit(x, n = NULL)
best_poisson_bandit(x, n = NULL)

Arguments

`x`	as in prop.test, a vector of the number of successes; it may alternatively be a list of vectors of the results of each trial, if n is not provided
`n`	as in prop.test, a vector of the number of trials; if it is not provided, x must be a list of vectors of the results of each trial

Value

a vector of probabilities for each arm being the best poisson bandit; this can be used for future randomized allocation

Author(s)

Thomas Lotze <thomaslotze@thomaslotze.com>

References

Steven L. Scott, A modern Bayesian look at the multi-armed bandit, Appl. Stochastic Models Bus. Ind. 2010; 26:639-658. (http://www.economics.uci.edu/~ivan/asmb.874.pdf)

Examples

p1 = rpois(100, lambda=10)
p2 = rpois(100, lambda=9)
x = sapply(list(p1, p2), sum)
n = sapply(list(p1, p2), length)
best_poisson_bandit(x,n)
p1 = rpois(100, lambda=10)
p2 = rpois(100, lambda=9)
x = sapply(list(p1, p2), sum)
n = sapply(list(p1, p2), length)
best_poisson_bandit(x,n)

A convenience function to analyze a timeseries and return an estimate (via gam, using day of week factors and smoothed timestamp) of whether, after accounting for day-of-week, there is a significant time-based influence and what that influence is.

Usage

deseasonalized_trend(df, w=NULL)
deseasonalized_trend(df, w=NULL)

Arguments

`df`	a data frame containing timestamp and value entries
`w`	number of attempts (n for binomial data)

Value

a list with the following items:

`pval`	pval given by anova on gam, to indicate whether s(timestamp) is significant
`smoothed_prediction`	a smoothed prediction over time (on Wednesdays), to give a human-understandable idea of what the change over time has been

Author(s)

Thomas Lotze <thomaslotze@thomaslotze.com>

Examples

timestamps = as.numeric(as.POSIXct(seq(as.Date("2012-01-01"),as.Date("2012-05-03"),by=1)))
df=data.frame(timestamp = timestamps, value = rnorm(length(timestamps)))
dt = deseasonalized_trend(df)
if (dt$pval < 0.01) {
  print("Significant time-based factor")
  plot(df$timestamp, dt$smoothed_prediction)
} else {
  print("No significant time-based factor")
}

df=data.frame(timestamp = timestamps,
              value = sapply(timestamps, function(t) {rpois(1, lambda=t-min(timestamps))}))
dt = deseasonalized_trend(df)
if (dt$pval < 0.01) {
  print("Significant time-based factor")
  plot(df$timestamp, dt$smoothed_prediction)
} else {
  print("No significant time-based factor")
}

timestamps = as.numeric(as.POSIXct(seq(as.Date("2012-01-01"),as.Date("2012-05-03"),by=1)))
df=data.frame(timestamp = timestamps, value = rnorm(length(timestamps)))
dt = deseasonalized_trend(df)
if (dt$pval < 0.01) {
  print("Significant time-based factor")
  plot(df$timestamp, dt$smoothed_prediction)
} else {
  print("No significant time-based factor")
}

df=data.frame(timestamp = timestamps,
              value = sapply(timestamps, function(t) {rpois(1, lambda=t-min(timestamps))}))
dt = deseasonalized_trend(df)
if (dt$pval < 0.01) {
  print("Significant time-based factor")
  plot(df$timestamp, dt$smoothed_prediction)
} else {
  print("No significant time-based factor")
}

summarize_metrics

Description

A convenience function to perform overall metric analysis: mean, median, CI.

Usage

distribution_estimate(v, successes=NULL, num_quantiles=101, observed=FALSE)
distribution_estimate(v, successes=NULL, num_quantiles=101, observed=FALSE)

Arguments

`v`	a vector of values to be analyzed (for nonbinary data), or number of trials (for binary data)
`successes`	number of successes (for binary data)
`num_quantiles`	number of quantiles to split into
`observed`	whether to generate the observed distribution (rather than the estimated distribution of the mean); default FALSE

Value

a data frame with the following columns:

`quantiles`	the estimated quantiles (0,0.01,0.02,...,1) for the mean, using a Beta-binomial estimate of p for binomial data, a bootstrapped quantile distribution for real-valued numbers
`x`	x values for plotting a lineplot of the estimated distribution
`y`	y values for plotting a lineplot of the estimated distribution
`mids`	mid values for plotting a barplot of the estimated distribution
`lefts`	left values for plotting a barplot of the estimated distribution
`rights`	right values for plotting a barplot of the estimated distribution
`widths`	width values for plotting a barplot of the estimated distribution
`heights`	height values for plotting a barplot of the estimated distribution
`probabilities`	probabilities indicating how much probability is contained in each barplot

Author(s)

Thomas Lotze <thomaslotze@thomaslotze.com>

Examples

metric_list = list(rbinom(n=100,size=1,prob=0.5),
                   rbinom(n=100,size=1,prob=0.7),
                   rpois(n=100, lambda=5))
distribution_estimate(length(metric_list[[1]]), sum(metric_list[[1]]))
distribution_estimate(length(metric_list[[2]]), sum(metric_list[[2]]))
de = distribution_estimate(metric_list[[3]])
plot(de$x, de$y, type="l")
barplot(de$heights, de$widths)
distribution_estimate(metric_list[[3]], observed=TRUE)
metric_list = list(rbinom(n=100,size=1,prob=0.5),
                   rbinom(n=100,size=1,prob=0.7),
                   rpois(n=100, lambda=5))
distribution_estimate(length(metric_list[[1]]), sum(metric_list[[1]]))
distribution_estimate(length(metric_list[[2]]), sum(metric_list[[2]]))
de = distribution_estimate(metric_list[[3]])
plot(de$x, de$y, type="l")
barplot(de$heights, de$widths)
distribution_estimate(metric_list[[3]], observed=TRUE)

prob_winner

Description

Function to compute probability that each arm is the winner, given simulated posterior results

Usage

prob_winner(post)prob_winner(post)

Arguments

post

the simulated results from the posterior, provided by sim_post

Author(s)

Thomas Lotze and Markus Loecher

Examples


x=c(10,20,30,50)
n=c(100,102,120,130)
betaPost = sim_post(x,n)
prob_winner(betaPost)

x=c(10,20,30,50)
n=c(100,102,120,130)
betaPost = sim_post(x,n)
prob_winner(betaPost)

significance_analysis

Description

A convenience function to perform overall proportion comparison using prop.test, before doing pairwise comparisons, to see what outcomes seem to be better than others.

Usage

significance_analysis(x, n)
significance_analysis(x, n)

Arguments

`x`	as in prop.test, a vector of the number of successes
`n`	as in prop.test, a vector of the number of trials

Value

a data frame with the following columns:

`successes`	x
`totals`	n
`estimated_proportion`	x/n
`lower`	0.95 confidence interval on the estimated amount by which this alternative outperforms the next-lower alternative
`upper`	0.95 confidence interval on the estimated amount by which this alternative outperforms the next-lower alternative
`significance`	p-value for the test that this alternative outperforms the next-lower alternative
`order`	order, by highest success proportion
`best`	1 if it is part of the 'highest performing group' – those groups which were not significantly different from the best group
`p_best`	Bayesian posterior probability that this alternative is the best binomial bandit

Note

This is intended for use in A/B split testing – so sizes of n should be roughly equal. Also, note that alternatives which have the same rank are grouped together for analysis with the 'next-lower' alternative, so you may want to check to see if ranks are equal.

Author(s)

Thomas Lotze <thomaslotze@thomaslotze.com>

Examples

x = c(10,20,30,50)
n = c(100,102,120,130)
sa = significance_analysis(x,n)
sa[rev(order(sa$estimated_proportion)), ]

x = c(37,41,30,43,39,30,31,35,50,30)
n = rep(50, length(x))
sa = significance_analysis(x,n)
sa[rev(order(sa$estimated_proportion)), ]

x = c(37,41,30,43,39,30,31,37,50,30)
n = rep(50, length(x))
sa = significance_analysis(x,n)
sa[rev(order(sa$estimated_proportion)), ]
x = c(10,20,30,50)
n = c(100,102,120,130)
sa = significance_analysis(x,n)
sa[rev(order(sa$estimated_proportion)), ]

x = c(37,41,30,43,39,30,31,35,50,30)
n = rep(50, length(x))
sa = significance_analysis(x,n)
sa[rev(order(sa$estimated_proportion)), ]

x = c(37,41,30,43,39,30,31,37,50,30)
n = rep(50, length(x))
sa = significance_analysis(x,n)
sa[rev(order(sa$estimated_proportion)), ]

sim_post

Description

Simulate the posterior distribution the Bayesian probabilities for each arm being the best binomial bandit

Usage

sim_post(x, n, alpha = 1, beta = 1, ndraws = 5000)sim_post(x, n, alpha = 1, beta = 1, ndraws = 5000)

Arguments

`x`	as in prop.test, a vector of the number of successes
`n`	as in prop.test, a vector of the number of trials
`alpha`	shape parameter alpha for the prior beta distribution.
`beta`	shape parameter beta for the prior beta distribution.
`ndraws`	number of random draws from the posterior

Author(s)

Thomas Lotze and Markus Loecher

Examples


x=c(10,20,30,50)
n=c(100,102,120,130)
sim_post(x,n)

x=c(10,20,30,50)
n=c(100,102,120,130)
sim_post(x,n)

summarize_metrics

Description

A convenience function to perform overall metric analysis: mean, median, CI.

Usage

summarize_metrics(v, successes=NULL)
summarize_metrics(v, successes=NULL)

Arguments

`v`	a vector of values to be analyzed (for nonbinary data), or number of trials (for binary data)
`successes`	number of successes (for binary data)

Value

a list with the following items:

`mean`	mean
`median`	median
`lower`	0.95 confidence interval on the mean
`upper`	0.95 confidence interval on the mean
`num_obs`	number of observations of this metric
`total`	the sum of all values of this metric (mean*num_obs)

Author(s)

Thomas Lotze <thomaslotze@thomaslotze.com>

Examples

metric_list = list(rbinom(n=100,size=1,prob=0.5),
                   rbinom(n=100,size=1,prob=0.7),
                   rpois(n=100, lambda=5))
summarize_metrics(length(metric_list[[1]]), sum(metric_list[[1]]))
summarize_metrics(length(metric_list[[2]]), sum(metric_list[[2]]))
summarize_metrics(metric_list[[3]])
metric_list = list(rbinom(n=100,size=1,prob=0.5),
                   rbinom(n=100,size=1,prob=0.7),
                   rpois(n=100, lambda=5))
summarize_metrics(length(metric_list[[1]]), sum(metric_list[[1]]))
summarize_metrics(length(metric_list[[2]]), sum(metric_list[[2]]))
summarize_metrics(metric_list[[3]])

value_remaining

Description

Compute the "value_remaining" in the binomial bandits

Usage

value_remaining(x, n, alpha = 1, beta = 1, ndraws = 10000)value_remaining(x, n, alpha = 1, beta = 1, ndraws = 10000)

Arguments

`x`	as in prop.test, a vector of the number of successes
`n`	as in prop.test, a vector of the number of trials
`alpha`	shape parameter alpha for the prior beta distribution.
`beta`	shape parameter beta for the prior beta distribution.
`ndraws`	number of random draws from the posterior

Value

value_remaining distribution; the distribution of improvement amounts that another arm might have over the current best arm

Author(s)

Thomas Lotze and Markus Loecher

Examples


x=c(10,20,30,80)
n=c(100,102,120,240)
vr = value_remaining(x, n)
hist(vr)
best_arm = which.max(best_binomial_bandit(x, n))
# "potential value" remaining in the experiment
potential_value = quantile(vr, 0.95)
paste("Were still unsure about the CvR for the best arm (arm ", best_arm,
	"), but whatever it is, one of the other arms might beat it by as much as ",
	round(potential_value*100, 4), " percent.", sep="")
x=c(10,20,30,80)
n=c(100,102,120,240)
vr = value_remaining(x, n)
hist(vr)
best_arm = which.max(best_binomial_bandit(x, n))
# "potential value" remaining in the experiment
potential_value = quantile(vr, 0.95)
paste("Were still unsure about the CvR for the best arm (arm ", best_arm,
	"), but whatever it is, one of the other arms might beat it by as much as ",
	round(potential_value*100, 4), " percent.", sep="")

Package 'bandit'

Help Index

Functions for simple A/B split test and multi-armed bandit analysis

Description

Details

Author(s)

best_binomial_bandit

Description

Usage

Arguments

Value

Author(s)

References

See Also

Examples

best_binomial_bandit_sim

Description

Usage

Arguments

Value

Author(s)

References

See Also

Examples

best_poisson_bandit

Description

Usage

Arguments

Value

Author(s)

References

See Also

Examples

deseasonalized_trend

Description

Usage

Arguments

Value

Author(s)

Examples

summarize_metrics

Description

Usage

Arguments

Value

Author(s)

Examples

prob_winner

Description

Usage

Arguments

Author(s)

Examples

significance_analysis

Description

Usage

Arguments

Value

Note

Author(s)

See Also

Examples

sim_post

Description

Usage

Arguments

Author(s)

Examples

summarize_metrics

Description

Usage

Arguments

Value

Author(s)

Examples

value_remaining

Description

Usage

Arguments

Value