Word breaking

Word breaking is the process of determining where a line of text may be broken so that content wraps onto a new line. It identifies permissible break points between words, between characters, and in some cases within words. In languages that separate words with spaces, line breaks typically occur at whitespace or after certain punctuation, such as hyphens. In languages written without spaces between words, such as Chinese or Japanese, a break is generally permitted between most adjacent characters, subject to typographic rules that keep certain characters, such as closing punctuation, from starting a line. Hyphenation is related but distinct: it breaks a word at a syllable or morpheme boundary, usually inserting a hyphen at the end of the line. Invisible control characters can also influence break points: a soft hyphen (U+00AD) marks an optional hyphenation point and is displayed only when a break occurs there, a zero-width space (U+200B) permits a break without any visible mark, and a zero-width joiner (U+200D) or word joiner (U+2060) prevents a break.
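The kinds of break points described above can be illustrated with a minimal Python sketch. It is a deliberately simplified toy, not the Unicode line breaking algorithm: the function name break_opportunities, the helper is_ideograph, and the choice to treat only the range U+4E00 to U+9FFF as ideographs are assumptions made for this example, and punctuation rules are ignored entirely.

```python
SOFT_HYPHEN = "\u00ad"        # optional hyphenation point
ZERO_WIDTH_SPACE = "\u200b"   # invisible break opportunity
INVISIBLE_BREAKS = (SOFT_HYPHEN, ZERO_WIDTH_SPACE)


def is_ideograph(ch):
    """Rough check (assumption): CJK Unified Ideographs block only."""
    return "\u4e00" <= ch <= "\u9fff"


def break_opportunities(text):
    """Return indices i such that a new line may start at text[i].

    Simplified rules (hypothetical, for illustration only):
      * after a run of whitespace,
      * after a soft hyphen or zero-width space,
      * between two adjacent CJK ideographs.
    """
    points = []
    for i in range(1, len(text)):
        prev, cur = text[i - 1], text[i]
        if cur.isspace():
            # Never start a line with whitespace in this sketch.
            continue
        if (
            prev.isspace()
            or prev in INVISIBLE_BREAKS
            or (is_ideograph(prev) and is_ideograph(cur))
        ):
            points.append(i)
    return points


if __name__ == "__main__":
    # Space-separated text with a soft hyphen inside "breaking".
    print(break_opportunities("line break\u00ading demo"))  # [5, 11, 15]
    # Text without spaces: a break is allowed between each pair of characters.
    print(break_opportunities("日本語"))  # [1, 2]
```

A real implementation would instead assign every character a line breaking class and apply the full rule set, which also covers cases this sketch ignores, such as characters that must not begin or end a line.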

Break rules are implemented by line-breaking algorithms, such as the Unicode line breaking algorithm (UAX #14).

In practice, word breaking affects text layout in word processors, web browsers, and typesetting systems.
