YeoJohnsonTransformation
The Yeo–Johnson transformation (YeoJohnsonTransformation) is a family of power transformations that extends the Box–Cox approach, which requires strictly positive data, to variables that can take zero or negative values. A single shape parameter, λ, controls how a real-valued variable y is transformed. A well-chosen λ can stabilize variance and bring the distribution closer to normal, which can help statistical models that assume normality or homoscedasticity.
The transformation is defined piecewise for y ≥ 0 and y < 0:
- If y ≥ 0: y(λ) = [(y+1)^λ − 1]/λ for λ ≠ 0; and y(0) = ln(y+1).
- If y < 0: y(λ) = − [(-y+1)^(2−λ) − 1]/(2−λ) for λ ≠ 2; and y(2) = −ln(-y+1).
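The piecewise definition above can be sketched in Python as follows. This is a minimal illustration for a single scalar value, not a library implementation; the function name `yeo_johnson` and the exact-equality checks for λ = 0 and λ = 2 are assumptions made for the example.

```python
import math


def yeo_johnson(y: float, lam: float) -> float:
    """Apply the Yeo-Johnson transformation to one real value y.

    `lam` is the shape parameter (lambda). The name and signature are
    hypothetical, chosen only for this sketch.
    """
    if y >= 0:
        if lam == 0:
            return math.log1p(y)  # ln(y + 1)
        return ((y + 1) ** lam - 1) / lam
    # y < 0 branch
    if lam == 2:
        return -math.log1p(-y)  # -ln(-y + 1)
    return -((-y + 1) ** (2 - lam) - 1) / (2 - lam)


print(yeo_johnson(3.0, 2.0))   # 7.5
print(yeo_johnson(-1.0, 0.0))  # -1.5
print(yeo_johnson(-3.0, 1.0))  # -3.0 (lambda = 1 leaves y unchanged)
```

Note that λ = 1 gives the identity on both branches, and that y = 0 maps to 0 for every λ.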
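For every fixed λ the Yeo–Johnson transformation is strictly increasing in y and preserves sign, so it can be inverted on its range. Writing z = y(λ) for the transformed value, the inverse formulas are:
- For y ≥ 0 (equivalently z ≥ 0): if λ ≠ 0, y = (1 + λz)^(1/λ) − 1; if λ = 0, y = exp(z) − 1.
- For y < 0 (equivalently z < 0): if λ ≠ 2, y = 1 − [1 − (2−λ)z]^(1/(2−λ)); if λ = 2, y = 1 − exp(−z).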
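The inverse formulas can be sketched in the same way. Here z denotes a transformed value, and the branch is selected by the sign of z, which relies on the transformation mapping non-negative y to non-negative z and negative y to negative z. The function name `yeo_johnson_inverse` is hypothetical, and the sketch assumes z lies in the range of the forward transformation for the given λ; it does not validate that.

```python
import math


def yeo_johnson_inverse(z: float, lam: float) -> float:
    """Recover y from a Yeo-Johnson transformed value z.

    Assumes z is attainable for the given lambda; for example, with
    lam < 0 the non-negative branch requires 1 + lam * z > 0.
    """
    if z >= 0:
        if lam == 0:
            return math.expm1(z)  # exp(z) - 1
        return (1 + lam * z) ** (1 / lam) - 1
    # z < 0 branch
    if lam == 2:
        return 1 - math.exp(-z)
    return 1 - (1 - (2 - lam) * z) ** (1 / (2 - lam))


print(yeo_johnson_inverse(7.5, 2.0))   # 3.0
print(yeo_johnson_inverse(-1.5, 0.0))  # -1.0
```

These two calls undo the forward values (3 with λ = 2 gives 7.5, and −1 with λ = 0 gives −1.5), which is a quick consistency check on the formulas.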
Advantages include applicability to the full real line and the ability to approximate normality without data