yiyibooks Feature Construction for Reinforcement Learning in Hearts

Title
Feature Construction for Reinforcement Learning in Hearts
Home page
http://usyiyi.cn/yiyibooks/Feature_Construction_for_Reinforcement_Learning_in_Hearts/index.html
Original source
http://pdfs.semanticscholar.org/533f/a670bd278bd03833e29eb769f58ee31b8400.pdf
Description
Temporal difference (TD) learning has been used to learn strong evaluation functions in a variety of two-player games. TD-Gammon illustrated how the combination of game-tree search and learning methods can achieve grandmaster-level play in backgammon. In this work, we develop a player for Hearts, a four-player game, based on stochastic linear regression and TD learning. Starting from a small set of basic game features, we exhaustively combined them into a more expressive representation of the game state. We report initial results on learning with various combinations of features, training both under self-play and against search-based players. Our simple learner was able to beat one of the best search-based Hearts programs.
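
The two ideas named in the description, exhaustive feature combination and TD learning with a linear evaluation function, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the basic features are binary and are combined by logical AND, and it uses a plain TD(0) update. The names `combine_features`, `td0_update`, `alpha`, and `gamma` are hypothetical and chosen only for this example.

```python
from itertools import combinations


def combine_features(atomic, max_order=2):
    """Return the atomic features followed by the AND of every subset
    of size 2..max_order (assumption: features are 0/1 valued)."""
    combined = list(atomic)
    for k in range(2, max_order + 1):
        for idx in combinations(range(len(atomic)), k):
            combined.append(int(all(atomic[i] for i in idx)))
    return combined


def value(weights, features):
    """Linear evaluation function: dot product of weights and features."""
    return sum(w * f for w, f in zip(weights, features))


def td0_update(weights, features, reward, next_features, alpha=0.1, gamma=1.0):
    """One TD(0) step for a linear value function.

    Pass next_features=None when the next state is terminal.
    Returns a new weight list; the input list is not modified.
    """
    next_value = 0.0 if next_features is None else value(weights, next_features)
    td_error = reward + gamma * next_value - value(weights, features)
    # For a linear function the gradient w.r.t. each weight is its feature.
    return [w + alpha * td_error * f for w, f in zip(weights, features)]


# Three illustrative binary features expand to three atoms plus three pairs.
feats = combine_features([1, 0, 1])
weights = td0_update([0.0] * len(feats), feats, reward=1.0, next_features=None)
```

With `max_order=2`, n basic features grow to n + n(n-1)/2 combined features, which hints at why the linear learner gains expressiveness from a small starting set and why the combination order has to be capped in practice.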