Tetrachoric
Tetrachoric correlation is a statistic that estimates the correlation between two latent continuous variables from a pair of observed binary variables. The model assumes that each binary variable results from thresholding an underlying normally distributed variable, and that the two latent variables jointly follow a bivariate normal distribution. Under this model, each observed 0/1 outcome records whether the corresponding latent variable exceeds its threshold.
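The latent-variable model above can be turned into a small estimator. The following is a minimal sketch, not a reference implementation. It assumes the counts are supplied as n00, n01, n10, n11 (first index for the first variable, second index for the second), takes each threshold from the marginal proportion of zeros, and then searches by bisection for the correlation whose bivariate normal probability of the (0, 0) cell matches the observed proportion. The function names, the Simpson's-rule integration, and the search bounds of ±0.999 are illustrative choices made for this example.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def bvn_density(h, k, r):
    """Standard bivariate normal density at (h, k) for correlation r."""
    s = 1.0 - r * r
    return math.exp(-(h * h - 2.0 * r * h * k + k * k) / (2.0 * s)) / (
        2.0 * math.pi * math.sqrt(s)
    )


def prob_both_below(h, k, rho, steps=200):
    """P(X <= h, Y <= k) for a standard bivariate normal with correlation rho.

    Computed as Phi(h) * Phi(k) plus the integral of the density over
    r in [0, rho], approximated with Simpson's rule (steps must be even).
    """
    base = STD_NORMAL.cdf(h) * STD_NORMAL.cdf(k)
    if rho == 0.0:
        return base
    width = rho / steps  # negative when rho < 0, which gives a signed integral
    total = bvn_density(h, k, 0.0) + bvn_density(h, k, rho)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * bvn_density(h, k, i * width)
    return base + total * width / 3.0


def tetrachoric(n00, n01, n10, n11, tol=1e-8):
    """Estimate the tetrachoric correlation from 2x2 counts (illustrative).

    Assumes every marginal proportion lies strictly between 0 and 1;
    otherwise inv_cdf raises an error.
    """
    n = n00 + n01 + n10 + n11
    # Thresholds: the latent variable falls below the threshold when the
    # observed value is 0.
    h = STD_NORMAL.inv_cdf((n00 + n01) / n)
    k = STD_NORMAL.inv_cdf((n00 + n10) / n)
    target = n00 / n  # observed proportion in the (0, 0) cell

    # The (0, 0) probability increases with rho, so bisection applies.
    lo, hi = -0.999, 0.999
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if prob_both_below(h, k, mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0
```

For example, tetrachoric(200, 100, 100, 200) returns a value close to 0.5: both thresholds are 0, and with zero thresholds the (0, 0) probability is 1/4 + arcsin(rho) / (2π), which equals the observed 1/3 at rho = 0.5. The phi coefficient for the same table is 1/3. A table containing an empty cell pushes this search to one of its bounds, so the result should not be trusted in that case.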
Calculation involves a 2x2 contingency table of the binary data. The thresholds for each variable are derived
Uses and interpretation: The tetrachoric correlation is preferred over the phi coefficient when the binary measures
Limitations: The method relies on the normality and threshold assumptions. Estimates can be unstable with small