How to Guide a Non-Cooperative Learner to Cooperate: Exploiting No-Regret Algorithms in System Design
Published in Proceedings of the 20th International Conference on Autonomous Agents and MultiAgent Systems, 2021
We investigate a repeated two-player game in which the column player is also the designer of the system and has full control over the payoff matrices. We assume that the row player uses a no-regret algorithm to efficiently learn how to adapt their strategy to the column player’s behaviour over time. The goal of the column player is to guide her opponent into playing a mixed strategy that the system designer prefers. To achieve this, she needs to: (i) design appropriate payoffs for both players; and (ii) interact strategically with the row player over a sequence of plays so that the row player converges to the desired mixed strategy.
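To make the row player's side of this setting concrete, the following is a minimal sketch of one standard no-regret algorithm, Hedge (multiplicative weights), playing a repeated matrix game. It is an illustration only and not the method from the paper: the abstract does not say which no-regret algorithm the row player uses, and the function name `hedge_play`, the 2x2 payoff matrix, the fixed column strategy, and the learning rate are all assumptions chosen for the example.

```python
import math


def hedge_play(payoff, column_strategies, eta):
    """Run Hedge for the row player against a sequence of column strategies.

    payoff[i][j] is the row player's payoff (assumed to lie in [0, 1]) when
    the row player picks row i and the column player picks column j.
    Returns the row player's mixed strategy in each round and the average
    regret against the best fixed row in hindsight.
    """
    n_rows = len(payoff)
    cumulative = [0.0] * n_rows  # cumulative expected payoff of each pure row
    earned = 0.0                 # cumulative expected payoff actually obtained
    history = []
    for y in column_strategies:
        # Weights are exponential in cumulative payoff; subtracting the
        # maximum keeps the exponentials numerically stable.
        top = max(cumulative)
        weights = [math.exp(eta * (c - top)) for c in cumulative]
        total = sum(weights)
        x = [w / total for w in weights]
        history.append(x)
        # Expected payoff of every pure row against this round's column mix.
        u = [sum(payoff[i][j] * y[j] for j in range(len(y)))
             for i in range(n_rows)]
        earned += sum(x[i] * u[i] for i in range(n_rows))
        cumulative = [cumulative[i] + u[i] for i in range(n_rows)]
    rounds = len(column_strategies)
    average_regret = (max(cumulative) - earned) / rounds
    return history, average_regret


# Hypothetical example: the row player is rewarded for matching the column.
payoff = [[1.0, 0.0],
          [0.0, 1.0]]
rounds = 2000
column_strategies = [[0.8, 0.2]] * rounds          # assumed fixed column mix
eta = math.sqrt(8 * math.log(len(payoff)) / rounds)  # a common Hedge step size
history, average_regret = hedge_play(payoff, column_strategies, eta)
print(history[-1], average_regret)
```

In this toy run the column player simply repeats one mixed strategy, so the learner's play drifts toward the best response to it. The setting in the abstract is richer: the column player also chooses the payoff matrices and adapts her play so that the learner converges to a particular target mixed strategy.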
Recommended citation: Nicholas Bishop, Le Cong Dinh, Long Tran-Thanh. "How to Guide a Non-Cooperative Learner to Cooperate: Exploiting No-Regret Algorithms in System Design". In: Proceedings of the 20th International Conference on Autonomous Agents and MultiAgent Systems, 2021.