Polychoric correlation

In statistics, polychoric correlation[1] is a technique for estimating the correlation between two hypothesised normally distributed continuous latent variables, from two observed ordinal variables. Tetrachoric correlation is a special case of the polychoric correlation applicable when both observed variables are dichotomous. These names derive from the polychoric and tetrachoric series which are used for estimation of these correlations.
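
The estimation idea described above can be sketched in a few lines of Python. This is a minimal illustration, not the method used by any particular package: it assumes a two-step maximum-likelihood approach (thresholds taken from the marginal proportions, then a one-dimensional search over the correlation), and the function names `thresholds`, `cell_probs` and `polychoric`, as well as the example table, are hypothetical. It also assumes that no row or column of the contingency table is empty.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import multivariate_normal, norm


def thresholds(margin_counts):
    """Step 1: cut points on the latent scale from cumulative marginal proportions."""
    cum = np.cumsum(margin_counts)[:-1] / np.sum(margin_counts)
    # Large finite values stand in for -infinity and +infinity.
    return np.concatenate(([-8.0], norm.ppf(cum), [8.0]))


def cell_probs(rho, a, b):
    """Bivariate normal probability of each cell for thresholds a (rows), b (columns)."""
    cov = [[1.0, rho], [rho, 1.0]]
    points = np.array([[x, y] for x in a for y in b])
    cdf = multivariate_normal.cdf(points, mean=[0.0, 0.0], cov=cov)
    cdf = np.asarray(cdf).reshape(len(a), len(b))
    # Rectangle probabilities by inclusion-exclusion on the joint CDF.
    return cdf[1:, 1:] - cdf[:-1, 1:] - cdf[1:, :-1] + cdf[:-1, :-1]


def polychoric(table):
    """Step 2: maximise the multinomial likelihood over rho with thresholds fixed."""
    table = np.asarray(table, dtype=float)
    a = thresholds(table.sum(axis=1))
    b = thresholds(table.sum(axis=0))

    def neg_log_lik(rho):
        p = np.clip(cell_probs(rho, a, b), 1e-12, None)
        return -np.sum(table * np.log(p))

    result = minimize_scalar(neg_log_lik, bounds=(-0.99, 0.99), method="bounded")
    return result.x


# Made-up 3x3 table of counts for two ordinal items (illustration only).
example_table = [[40, 15, 5], [20, 30, 20], [5, 15, 50]]
print(round(polychoric(example_table), 3))
```

With a 2×2 table the same function gives a tetrachoric estimate, since each variable then has a single threshold.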

Applications and examples

This technique is frequently applied when analysing items on self-report instruments such as personality tests and surveys, which often use rating scales with a small number of response options (e.g., strongly disagree to strongly agree). The smaller the number of response categories, the more the correlation computed directly from the observed scores will tend to be attenuated relative to the correlation between the latent continuous variables. Lee, Poon & Bentler (1995) have recommended a two-step approach to factor analysis for assessing the factor structure of tests involving ordinally measured items. This approach aims to reduce the effect of statistical artifacts, such as the number of response categories or the skewness of variables, that can lead to items grouping together in factors. Kiwanuka and colleagues (2022) have also illustrated the application of polychoric correlations and polychoric confirmatory factor analysis in nursing science. In some disciplines the technique is rarely applied; however, some scholars [1] have demonstrated how it can be used as an alternative to the Pearson correlation.
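
The attenuation effect mentioned above can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the latent correlation of 0.6, the sample size, the equal-probability cut points and the helper names `discretize` and `pearson_after_discretizing` are all choices made here, not taken from the cited sources.

```python
import numpy as np

rng = np.random.default_rng(0)
rho = 0.6          # assumed latent correlation
n = 200_000        # assumed sample size
cov = [[1.0, rho], [rho, 1.0]]
latent = rng.multivariate_normal([0.0, 0.0], cov, size=n)


def discretize(x, k):
    """Cut x into k ordered categories at equally spaced sample quantiles."""
    cuts = np.quantile(x, np.linspace(0, 1, k + 1)[1:-1])
    return np.digitize(x, cuts)


def pearson_after_discretizing(k):
    """Pearson correlation of the two variables after cutting each into k categories."""
    x = discretize(latent[:, 0], k)
    y = discretize(latent[:, 1], k)
    return np.corrcoef(x, y)[0, 1]


results = {k: pearson_after_discretizing(k) for k in (2, 3, 5, 7)}
for k, r in results.items():
    print(f"{k} categories: Pearson r = {r:.3f} (latent rho = {rho})")
```

For a median split into two categories, the attenuated Pearson correlation is known analytically to be (2/π)·arcsin(ρ), which the simulated value for two categories should approximate.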

Software

  • Mplus by Muthén and Muthén [2]
  • polycor package in R by John Fox [3]
  • psych package in R by William Revelle [4]
  • lavaan package in R by Yves Rosseel [5]
  • semopy package in Python by Georgy Meshcheryakov [6]
  • PRELIS
  • POLYCORR program
  • PROC CORR in SAS (with POLYCHORIC or OUTPLC= options) [7]
  • An extensive list of software for computing the polychoric correlation, by John Uebersax [8]
  • package polychoric in Stata by Stas Kolenikov [9]

References

  1. "Base SAS(R) 9.3 Procedures Guide: Statistical Procedures, Second Edition". support.sas.com. Retrieved 2018-01-10.
  • Lee, S.-Y., Poon, W. Y., & Bentler, P. M. (1995). "A two-stage estimation of structural equation models with continuous and polytomous variables". British Journal of Mathematical and Statistical Psychology, 48, 339–358.
  • Bonett, D. G., & Price, R. M. (2005). "Inferential Methods for the Tetrachoric Correlation Coefficient". Journal of Educational and Behavioral Statistics, 30, 213.
  • Drasgow, F. (1986). Polychoric and polyserial correlations. In Kotz, Samuel, Narayanaswamy Balakrishnan, Campbell B. Read, Brani Vidakovic & Norman L. Johnson (Eds), Encyclopedia of Statistical Sciences, Vol. 7. New York, NY: John Wiley, pp. 68–74.
  • Kiwanuka, F., Kopra, J., Sak-Dankosky, N., Nanyonga, R. C., & Kvist, T. (2022). "Polychoric Correlation with Ordinal Data in Nursing Research". Nursing Research. Advance online publication. https://doi.org/10.1097/NNR.0000000000000614.