Value Functions
Value functions are a fundamental concept in reinforcement learning and decision theory. They represent the expected cumulative future reward an agent can achieve from a given state or state-action pair. In essence, a value function quantifies how desirable a particular situation is from the perspective of an agent aiming to maximize its rewards.
There are two primary types of value functions: state-value functions and action-value functions. A state-value function, V^π(s), gives the expected return when the agent starts in state s and follows policy π thereafter; an action-value function, Q^π(s, a), gives the expected return when the agent takes action a in state s and then follows π.
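The distinction between V^π and Q^π can be sketched on a toy problem. The following is a minimal, illustrative example on a hypothetical deterministic 3-state chain (the MDP, policy, and discount factor are all assumptions made for this sketch, not taken from the text):

```python
# Hypothetical 3-state chain MDP: states 0 and 1 are non-terminal, state 2 is
# terminal. The fixed policy "always move right" earns reward 1.0 on entering
# the terminal state. All numbers here are illustrative assumptions.

GAMMA = 0.9  # discount factor (assumption)

# policy_step[s] = (next_state, reward) under the fixed policy
policy_step = {0: (1, 0.0), 1: (2, 1.0)}

def state_values(tol=1e-8):
    """Iterative policy evaluation: V(s) = expected discounted return from s."""
    V = {0: 0.0, 1: 0.0, 2: 0.0}  # terminal state keeps value 0
    while True:
        delta = 0.0
        for s, (s_next, r) in policy_step.items():
            v_new = r + GAMMA * V[s_next]  # one-step lookahead
            delta = max(delta, abs(v_new - V[s]))
            V[s] = v_new
        if delta < tol:
            return V

def action_value(V, s_next, r):
    """Q(s, a) for an action yielding reward r and landing in s_next."""
    return r + GAMMA * V[s_next]

V = state_values()           # V = {0: 0.9, 1: 1.0, 2: 0.0}
q = action_value(V, 1, 0.0)  # Q(0, move right) = 0.9
```

Note that Q^π is recovered from V^π by a single one-step lookahead, which is exactly why the two functions carry equivalent information when the transition model is known.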
The core idea behind value functions is their recursive definition, often expressed through Bellman equations. These equations express the value of a state in terms of the immediate reward and the discounted value of successor states, which makes iterative solution methods such as dynamic programming and temporal-difference learning possible.
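The recursive structure can be turned directly into an algorithm: value iteration repeatedly applies the Bellman optimality backup V(s) ← max_a [r(s, a) + γ · V(s')] until the values stop changing. A minimal sketch, on a hypothetical 2-action MDP chosen purely for illustration:

```python
# Value iteration via the Bellman optimality backup. The MDP below is a
# hypothetical example: two non-terminal states, a terminal state 2, and
# two actions per state ("stay" collects a small reward, "go" moves right).

GAMMA = 0.9  # discount factor (assumption)

# mdp[s][a] = (next_state, reward); state 2 is terminal
mdp = {
    0: {"stay": (0, 0.05), "go": (1, 0.0)},
    1: {"stay": (1, 0.05), "go": (2, 1.0)},
}

def value_iteration(tol=1e-8):
    V = {0: 0.0, 1: 0.0, 2: 0.0}
    while True:
        delta = 0.0
        for s, actions in mdp.items():
            # Bellman optimality backup: best one-step lookahead over actions
            v_new = max(r + GAMMA * V[s2] for (s2, r) in actions.values())
            delta = max(delta, abs(v_new - V[s]))
            V[s] = v_new
        if delta < tol:
            return V

V = value_iteration()  # converges to V = {0: 0.9, 1: 1.0, 2: 0.0}
```

Because each backup is a contraction in the discount factor γ, the iteration is guaranteed to converge to the unique optimal value function.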