Towards a theory of balancing exploration and exploitation in probabilistic environments

Book:

In Proceedings of the Sixth International Conference on Cognitive Modeling

Authors:

Nellen, S., & Lovett, M. C.

Type:

Conference Paper

Pages:

214-219

Year:

2004

Publisher:

Pittsburgh, PA

Abstract:

Learning to make good choices in a probabilistic environment requires that the Decision Maker resolves the tension between exploration (learning about all available options) and exploitation (consistently choosing the best option in order to maximize rewards). We present a mathematical learning model that makes selections in a repeated-choice probabilistic task based on the expected payoff associated with each option and the information gain that will result from choosing that option. This model can be used to analyze the relative impact of exploration and exploitation over time and under different conditions. It predicts the aggregated and individual learning trajectories of participants in various versions of the task sufficiently well to support our basic argument: Information gain is a valid and rational criterion underlying human decision making. Future modeling work will be addressing the exact nature of the interaction between exploration and exploitation.
