UCTs - Infinite Lexicon

UCTs

UCTs, or Upper Confidence bounds applied to Trees, are a family of algorithms used for decision making in sequential, stochastic environments, implemented as a selection strategy within Monte Carlo Tree Search (MCTS). The core idea is to treat each node as having an estimated value plus an uncertainty bound, and to select among a node’s children by balancing exploration and exploitation. In the common formulation, the child with the highest value of Q_i + c sqrt(ln N / n_i) is chosen, where Q_i is the average reward of child i, n_i is the number of times that child has been visited, N is the total visits to the parent, and c is an exploration constant. After a simulation (playout) from a leaf, the result is propagated back up the tree, updating visit counts and value estimates along the path.
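The selection rule and the backpropagation step can be illustrated with a minimal Python sketch. The `Node` class, its field names, and the helper functions below are hypothetical and chosen only for illustration. The sketch assumes a single-agent setting with scalar rewards, treats unvisited children as having an infinite score so that each is tried at least once, and uses c = sqrt(2) as a default; these are common conventions, not requirements of the formula above.

```python
import math


class Node:
    """Hypothetical search-tree node holding the statistics used by UCT."""

    def __init__(self, parent=None):
        self.parent = parent
        self.children = []
        self.visits = 0          # n_i: number of times this node was visited
        self.total_reward = 0.0  # sum of simulation rewards seen through this node
        if parent is not None:
            parent.children.append(self)

    @property
    def mean_reward(self):
        # Q_i: average reward; defined as 0.0 before the first visit
        return self.total_reward / self.visits if self.visits else 0.0


def uct_score(child, parent_visits, c):
    """Q_i + c * sqrt(ln N / n_i), with unvisited children ranked first."""
    if child.visits == 0:
        return math.inf  # assumption: try every child once before comparing
    exploration = c * math.sqrt(math.log(parent_visits) / child.visits)
    return child.mean_reward + exploration


def select_child(node, c=math.sqrt(2)):
    """Return the child of `node` with the highest UCT score."""
    return max(node.children, key=lambda ch: uct_score(ch, node.visits, c))


def backpropagate(leaf, reward):
    """Propagate a simulation result from `leaf` up to the root."""
    node = leaf
    while node is not None:
        node.visits += 1
        node.total_reward += reward
        node = node.parent


if __name__ == "__main__":
    root = Node()
    a, b = Node(root), Node(root)
    for reward in (1.0, 0.0, 1.0):
        backpropagate(a, reward)
    backpropagate(b, 0.0)
    chosen = select_child(root)
    print("selected:", "a" if chosen is a else "b")
```

With c set to 0 the rule becomes purely greedy and always picks the child with the best average reward, while larger values of c favor rarely visited children. In two-player games the reward is usually negated or otherwise re-expressed from each player's perspective during backpropagation, which this sketch omits.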

UCTs were introduced by Levente Kocsis and Csaba Szepesvári in 2006 as a principled way to apply

Variants and extensions of UCTs address practical challenges, such as large branching factors (progressive widening), non-stationary

Limitations include reliance on rollout quality, computational demands, and sensitivity to parameter choices. While powerful, UCTs
