Stitch Fix recently began replacing A/B tests with multi-armed bandit experiments, which reduce opportunity cost by learning to divert traffic away from low-performing experimental arms and towards high-performing ones. The company uses deterministic Thompson Sampling to ensure that the bandit algorithm converges and self-corrects instantaneously. To enable this transition, Stitch Fix had to extend its centralized experimentation platform. The platform has two parts: configuration parameters and randomization units, which users can mix and match to define any type of experiment, and an allocation engine, which deterministically serves randomized configuration parameter values to an application. The extended platform provides two standardized bandit policies and lets data scientists implement and update a reward function, which is connected to the allocation engine through a dedicated microservice for each bandit experiment.
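
To make the interplay between a Thompson Sampling policy and a deterministic allocation engine more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not Stitch Fix's implementation: it assumes binary rewards with Beta(1, 1) priors, estimates each arm's probability of being best by Monte Carlo sampling, and hashes the randomization unit's ID to pick an arm. The function names `best_arm_probabilities` and `allocate`, the experiment name, and the example counts are all hypothetical.

```python
import hashlib
import random


def best_arm_probabilities(successes, failures, n_draws=10_000, seed=0):
    """Estimate, for each arm, the probability that it is the best arm.

    Assumption: binary rewards with a Beta(1, 1) prior per arm. The counts
    stand in for whatever a reward function would report.
    """
    rng = random.Random(seed)  # fixed seed keeps the estimate reproducible
    arms = list(successes)
    wins = {arm: 0 for arm in arms}
    for _ in range(n_draws):
        # Draw one plausible reward rate per arm from its posterior.
        draws = {
            arm: rng.betavariate(1 + successes[arm], 1 + failures[arm])
            for arm in arms
        }
        wins[max(draws, key=draws.get)] += 1
    return {arm: wins[arm] / n_draws for arm in arms}


def allocate(unit_id, experiment, probabilities):
    """Deterministically map a randomization unit to an arm.

    The same (experiment, unit_id) pair always hashes to the same point in
    [0, 1), so the unit sees a stable arm while the probabilities are fixed.
    """
    digest = hashlib.sha256(f"{experiment}:{unit_id}".encode()).hexdigest()
    point = int(digest[:15], 16) / 16**15  # uniform-looking value in [0, 1)
    arms = sorted(probabilities)
    cumulative = 0.0
    for arm in arms:
        cumulative += probabilities[arm]
        if point < cumulative:
            return arm
    return arms[-1]  # guard against floating-point rounding


# Hypothetical reward counts for two arms.
successes = {"control": 10, "variant": 40}
failures = {"control": 90, "variant": 60}

probabilities = best_arm_probabilities(successes, failures)
arm = allocate("client-123", "homepage-banner", probabilities)
print(probabilities, arm)
```

In this sketch the traffic share of each arm tracks its estimated probability of being best, so a poorly performing arm loses traffic as evidence accumulates, and recomputing the probabilities from updated counts shifts the allocation without changing the hashing logic.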