Belohnungslernen
Belohnungslernen, also known as Reinforcement Learning (RL) in English, is a machine learning paradigm where an agent learns to make decisions by taking actions in an environment to maximize a cumulative reward. Unlike supervised learning, which relies on labeled data, or unsupervised learning, which seeks patterns in unlabeled data, RL operates on a trial-and-error basis. The agent interacts with its environment, receives feedback in the form of rewards or penalties, and adjusts its behavior over time to achieve its goals.
The core components of a reinforcement learning system include an agent, an environment, a state, an action,
Key concepts in RL include value functions, which estimate the expected future reward from a given state