Author(s): Alejandro Francetich and David M. Kreps
The problem of choosing an optimal toolkit day after day, when there is uncertainty concerning the value of different tools that can only be resolved by carrying the tools, is a multi-armed bandit problem with nonindependent arms. Accordingly, except for very simple specifications, this optimization problem cannot (practically) be solved. Decision takers facing this problem presumably resort to decision heuristics, "sensible" rules for deciding which tools to carry, based on past experience. In this paper, we examine and compare the performance of a variety of heuristics, some very simple and others inspired by the computer-science literature on these problems. Some asymptotic results are obtained, especially concerning the long-run outcomes of using the heuristics; these results therefore indicate which heuristics do well when the discount factor is close to one. But our focus is on the relative performance of these heuristics for discount factors bounded away from one, which we study through simulation of the heuristics on a collection of test problems.