Contextual Bandits
This is the documentation page for the Python package contextualbandits. For more details, see the project’s GitHub page: https://github.com/david-cortes/contextualbandits
Installation
The package is available on PyPI and can be installed with:
pip install contextualbandits
If installation fails because the C code cannot be compiled, an earlier pure-Python version can be installed with:
pip install contextualbandits==0.1.8.5
Getting started
You can find user guides with detailed examples at the following links:
Serializing (pickling) objects
Don’t use pickle to serialize objects from this package, as it is likely to fail. Use cloudpickle or dill instead, which have the same syntax and are able to serialize a wider range of objects.
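For example, a policy object might be saved and restored with cloudpickle roughly as follows. This is only a sketch: the file name and the BootstrappedUCB arguments are placeholders, and any other object from the package could take its place.

    import cloudpickle
    from sklearn.linear_model import LogisticRegression
    from contextualbandits.online import BootstrappedUCB

    # Any object from this package; BootstrappedUCB is used here only as a placeholder
    model = BootstrappedUCB(LogisticRegression(), nchoices=5)

    # cloudpickle exposes the same dump/load interface as the standard pickle module
    with open("policy.pkl", "wb") as f:
        cloudpickle.dump(model, f)

    with open("policy.pkl", "rb") as f:
        model = cloudpickle.load(f)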
Online Contextual Bandits
Hint: if in doubt about where to start or which method to choose, the safest bet is BootstrappedUCB.
Policy classes, grouped by strategy - the first one from each group is the recommended one to use (a minimal usage sketch follows the list):
Randomized:
AdaptiveGreedy
EpsilonGreedy
SoftmaxExplorer
Active choices:
AdaptiveGreedy (with active_choice != None)
ExploreFirst (with prob_active_choice > 0)
ActiveExplorer
Thompson sampling:
BootstrappedTS
LogisticTS
LinTS
ParametricTS
PartitionedTS
Upper confidence bound:
BootstrappedUCB
LogisticUCB
LinUCB
PartitionedUCB
Naive:
SeparateClassifiers
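As a minimal sketch of how the online policies are typically used, assuming logged data of contexts, chosen arms, and observed rewards (the data below is random, and the constructor arguments should be checked against each class’s documentation):

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from contextualbandits.online import BootstrappedUCB

    nchoices = 5
    rng = np.random.default_rng(123)

    # Simulated logged data: contexts, arms that were played, rewards that were observed
    X = rng.standard_normal((1000, 10))
    a = rng.integers(nchoices, size=1000)
    r = rng.integers(2, size=1000)

    # Policies take a binary classifier as base algorithm, fitted separately per arm
    policy = BootstrappedUCB(LogisticRegression(), nchoices=nchoices)
    policy.fit(X, a, r)

    # Choose arms for new observations
    X_new = rng.standard_normal((10, 10))
    chosen_arms = policy.predict(X_new)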
Off-policy learning
Hint: if in doubt, use OffsetTree or SeparateClassifiers (the latter is from the online module).
DoublyRobustEstimator
OffsetTree
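As a rough sketch of how OffsetTree might be fit to logged data, assuming its fit method takes the contexts, the chosen arms, the observed rewards, and the probability with which the logging policy chose each observed arm (the argument order and names are assumptions; check the class documentation):

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from contextualbandits.offpolicy import OffsetTree

    nchoices = 5
    rng = np.random.default_rng(456)

    # Data logged by some previous policy
    X = rng.standard_normal((1000, 10))    # contexts
    a = rng.integers(nchoices, size=1000)  # arms chosen by the logging policy
    r = rng.integers(2, size=1000)         # rewards that were observed
    p = np.full(1000, 1.0 / nchoices)      # probability of the chosen arm under the logging policy

    # Learn a new policy from the logged data, then pick arms for fresh contexts
    new_policy = OffsetTree(LogisticRegression(), nchoices=nchoices)
    new_policy.fit(X, a, r, p)
    chosen_arms = new_policy.predict(rng.standard_normal((10, 10)))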
Policy Evaluation
evaluateRejectionSampling
evaluateDoublyRobust
evaluateFullyLabeled
evaluateNCIS
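For instance, a fitted policy could be evaluated through rejection sampling roughly as follows, assuming logged data in which arms were chosen uniformly at random (the exact arguments and return format of evaluateRejectionSampling should be checked against the function’s documentation):

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from contextualbandits.online import BootstrappedUCB
    from contextualbandits.evaluation import evaluateRejectionSampling

    nchoices = 5
    rng = np.random.default_rng(789)

    # Logged data in which arms were chosen uniformly at random,
    # which is what makes rejection-sampling evaluation unbiased
    X = rng.standard_normal((2000, 10))
    a = rng.integers(nchoices, size=2000)
    r = rng.integers(2, size=2000)

    # Fit a policy on one part of the data, evaluate it on a held-out part
    policy = BootstrappedUCB(LogisticRegression(), nchoices=nchoices)
    policy.fit(X[:1000], a[:1000], r[:1000])

    result = evaluateRejectionSampling(policy, X[1000:], a[1000:], r[1000:])
    print(result)  # estimated mean reward for the policy (see the function docs for the exact output)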
Linear Regression
The package offers non-stochastic linear regression procedures with exact “partial_fit” solutions, which are recommended for use alongside the online policies for better incremental updates.
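A minimal sketch of incremental fitting, assuming the class is exposed as contextualbandits.linreg.LinearRegression with a scikit-learn-like interface (the data here is random and only for illustration):

    import numpy as np
    from contextualbandits.linreg import LinearRegression

    rng = np.random.default_rng(0)
    X = rng.standard_normal((500, 10))
    y = X @ rng.standard_normal(10) + 0.1 * rng.standard_normal(500)

    # Each partial_fit call incorporates a new batch exactly (no stochastic gradient steps)
    lr = LinearRegression()
    for start in range(0, 500, 100):
        lr.partial_fit(X[start:start + 100], y[start:start + 100])

    preds = lr.predict(X[:10])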