PDTB
The Penn Discourse Treebank (PDTB) is a linguistically annotated corpus designed to capture discourse relations in English text. Developed by researchers at the University of Pennsylvania and collaborating institutions, the resource annotates discourse connectives, their arguments, and their senses over the Wall Street Journal portion of the Penn Treebank.
The annotation covers discourse relations signaled by explicit connectives (such as because, however) as well as implicit relations that readers infer between adjacent sentences when no connective is present.
The top-level senses in the PDTB taxonomy are Expansion, Contingency, Temporal, and Comparison, with finer-grained subtypes beneath each.
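As an illustration only, an annotated relation of this kind could be modeled as below. This is a minimal sketch, not the PDTB file format or any official API: the names TopLevelSense and DiscourseRelation, the field layout, and the example sentence are all hypothetical, and the sketch assumes sense labels are written as dot-separated paths whose first component is the top-level sense.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class TopLevelSense(Enum):
    """The four top-level senses named in the taxonomy."""
    EXPANSION = "Expansion"
    CONTINGENCY = "Contingency"
    TEMPORAL = "Temporal"
    COMPARISON = "Comparison"


@dataclass
class DiscourseRelation:
    """Hypothetical container for one annotated relation (not the PDTB format)."""
    arg1: str                          # text span of the first argument
    arg2: str                          # text span of the second argument
    sense: str                         # assumed dotted label, e.g. "Contingency.Cause"
    connective: Optional[str] = None   # None when no explicit connective is present

    @property
    def is_explicit(self) -> bool:
        # A relation counts as explicit here if a connective string was recorded.
        return self.connective is not None

    @property
    def top_level(self) -> TopLevelSense:
        # The first component of the dotted label is taken as the top-level sense.
        return TopLevelSense(self.sense.split(".")[0])


# Invented example sentence, not taken from the corpus.
example = DiscourseRelation(
    arg1="The flight was cancelled",
    arg2="a storm closed the airport",
    sense="Contingency.Cause",
    connective="because",
)
```

Here example.top_level evaluates to TopLevelSense.CONTINGENCY and example.is_explicit is True. An implicit relation would be represented in this sketch by leaving connective as None.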
PDTB serves as a foundational resource for automatic discourse parsing and related natural language processing tasks.
Release history includes major versions such as PDTB 2.0 and PDTB 3.0, with later releases expanding coverage and revising the annotation scheme.