A comparison of methods for calculating a stratified kappa

Stat Med. 1991 Sep;10(9):1465-72. doi: 10.1002/sim.4780100913.

Abstract

Investigators use the kappa coefficient to measure chance-corrected agreement among observers in the classification of subjects into nominal categories. The marginal probability of classification may depend, however, on one or more confounding variables. We consider assessment of interrater agreement with subjects grouped into strata on the basis of these confounders. We assume overall agreement across strata is constant and consider a stratified index of agreement, or 'stratified kappa', based on weighted summations of the individual kappas. We use three weighting schemes: (1) equal weighting; (2) weighting by the size of the table; and (3) weighting by the inverse of the variance. In a simulation study we compare these methods under differing probability structures and differing sample sizes for the tables. We find weighting by sample size moderately efficient under most conditions. We illustrate the techniques by assessing agreement between surgeons and graders of fundus photographs with respect to retinal characteristics, with stratification by initial severity of the disease.
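As a rough illustration of the index described above, the sketch below computes Cohen's kappa for each stratum's agreement table and then combines the per-stratum estimates by a weighted sum, kappa_strat = sum_s w_s * kappa_s, under the three weighting schemes named in the abstract. This is a minimal sketch under stated assumptions: the function names are hypothetical, and the simple large-sample variance approximation used for the inverse-variance weights is illustrative rather than the estimator used in the paper.

```python
import numpy as np

def kappa(table):
    """Cohen's kappa for one square agreement table of counts.

    Returns (observed agreement, chance agreement, kappa, sample size).
    """
    table = np.asarray(table, dtype=float)
    n = table.sum()
    p = table / n
    po = np.trace(p)                         # observed proportion of agreement
    pe = p.sum(axis=1) @ p.sum(axis=0)       # chance agreement from the marginals
    return po, pe, (po - pe) / (1 - pe), n

def stratified_kappa(tables, scheme="size"):
    """Weighted combination of per-stratum kappas, assuming a common kappa.

    scheme: 'equal'  - equal weight for every stratum
            'size'   - weight by the stratum sample size
            'invvar' - weight by the inverse of an approximate variance
    """
    stats = [kappa(t) for t in tables]
    kappas = np.array([k for _, _, k, _ in stats])
    sizes = np.array([n for _, _, _, n in stats])
    if scheme == "equal":
        w = np.ones(len(tables))
    elif scheme == "size":
        w = sizes
    elif scheme == "invvar":
        # Crude large-sample variance approximation (an assumption of this
        # sketch, not necessarily the variance estimator used in the paper).
        var = np.array([po * (1 - po) / (n * (1 - pe) ** 2)
                        for po, pe, _, n in stats])
        w = 1.0 / var
    else:
        raise ValueError(f"unknown weighting scheme: {scheme}")
    w = w / w.sum()                          # normalize the weights
    return float(w @ kappas)

# Hypothetical usage with two strata, each a 2x2 agreement table of counts.
strata = [
    [[20, 5], [4, 21]],
    [[30, 10], [8, 12]],
]
for scheme in ("equal", "size", "invvar"):
    print(scheme, round(stratified_kappa(strata, scheme), 3))
```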

Publication types

  • Comparative Study
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Humans
  • Models, Statistical*
  • Observer Variation*
  • Retinal Diseases / therapy
  • Statistics as Topic / methods