Surprisal-based

Surprisal-based is a term used to describe approaches and analyses that rely on word surprisal as a central metric for understanding language processing. Surprisal, drawn from information theory, measures how much information is conveyed by an event and is defined as the negative logarithm of its probability. In language processing, the surprisal of a word given its preceding context is S(w_i | w_1, ..., w_{i-1}) = -log2 P(w_i | w_1, ..., w_{i-1}). A higher surprisal value indicates greater expected processing difficulty.
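The definition above translates directly into code. The following minimal sketch computes surprisal in bits from a conditional probability; the probabilities passed in are illustrative, not drawn from any particular model:

```python
import math

def surprisal(p: float) -> float:
    """Surprisal in bits of an event with probability p (0 < p <= 1)."""
    return -math.log2(p)

# A word the model assigns probability 0.25 carries exactly 2 bits of surprisal;
# a much less expected word (p = 0.01) carries far more.
print(surprisal(0.25))  # 2.0
print(surprisal(0.01))  # ~6.64
```

Because the logarithm is taken base 2, halving a word's probability always adds exactly one bit of surprisal.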

Surprisal-based theories, including surprisal theory, posit that real-time linguistic processing is shaped by incremental expectations: when a word is less predictable, processing effort increases, which is often reflected in longer reading times or stronger neural responses. These ideas have been used to explain a range of phenomena in psycholinguistics, from eye-tracking measures during reading to event-related potentials in listening and reading tasks. The approach provides a quantitative link between probabilistic expectations and observable cognitive effort.

In practice, surprisal values are estimated from language models trained on large corpora, using either traditional n-gram models or modern neural models. The resulting surprisal estimates are then used to predict processing difficulty in experiments or to analyze corpora. While powerful, surprisal-based methods depend on the quality and scope of the underlying language model and may be sensitive to context length, domain differences, and model biases. Nonetheless, they remain a central tool for linking probabilistic expectations to real-time language processing.
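As a concrete illustration of the traditional n-gram route, the sketch below estimates bigram surprisal from a tiny hypothetical corpus with add-one smoothing. The corpus and function names are invented for the example; real estimates come from large corpora and more sophisticated smoothing, or from neural models:

```python
import math
from collections import Counter

# Toy corpus standing in for the large training corpora used in practice.
corpus = "the cat sat on the mat the cat saw the dog".split()

bigrams = Counter(zip(corpus, corpus[1:]))
unigrams = Counter(corpus)

def bigram_surprisal(prev: str, word: str) -> float:
    """Surprisal in bits of `word` given the preceding word, add-one smoothed."""
    vocab_size = len(unigrams)
    p = (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)
    return -math.log2(p)

# "cat" follows "the" twice in the corpus, so its surprisal after "the" is
# lower than that of "dog", which follows "the" only once.
print(bigram_surprisal("the", "cat"))
print(bigram_surprisal("the", "dog"))
```

Per-word surprisal values computed this way can then be entered as predictors of reading times or neural responses, which is the quantitative link the section describes.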