Journal of Public Health Advance Access originally published online on August 11, 2006
Journal of Public Health 2006 28(3):278-282; doi:10.1093/pubmed/fdl038
Comparing the part with the whole: should overlap be ignored in public health measures?
Lillian J. Hayes, Faculty of Nursing and Midwifery¹
Geoffrey Berry, Emeritus Professor Biostatistics and Epidemiology²
¹ Faculty of Nursing and Midwifery, University of Sydney, Sydney, New South Wales 2006, Australia
² School of Public Health, University of Sydney, New South Wales 2006, Australia
Address correspondence to Lillian J. Hayes, E-mail: lhayes{at}nursing.usyd.edu.au
Background In public health, health outcomes such as cancer incidence or mortality of subgroups are often compared with health outcomes of the whole population. Our objective was to explore the effect of overlap that occurs in such comparisons and to develop a correction factor to adjust the test statistics and confidence intervals to allow for the effect in situations where the full data are not available.
Method The standard error of a difference between a statistic calculated for a subgroup and for the whole population was derived theoretically both ignoring and allowing for overlap. The ratio of these standard errors was defined as the correction factor. Cancer incidence and death data (1997–2001) for the Australian state of New South Wales (NSW) were examined to demonstrate the utility of the correction factor.
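The abstract does not state the correction factor explicitly. The following is a sketch of how such a ratio could arise in the simplest case, under assumptions that are not taken from the paper: the statistic is a mean, the subgroup and the rest of the population share a common variance \sigma^2, the whole population has size N, and the subgroup is a proportion p of it. The symbols p, N, \sigma^2, \bar{x}_1 (subgroup mean), \bar{x}_2 (mean of the remainder) and \bar{x} (whole-population mean) are introduced here for illustration only.

```latex
\begin{align*}
\bar{x} &= p\,\bar{x}_1 + (1-p)\,\bar{x}_2
  \quad\Longrightarrow\quad
  \bar{x}_1 - \bar{x} = (1-p)\,(\bar{x}_1 - \bar{x}_2) \\[4pt]
\text{Allowing for overlap:}\quad
\operatorname{Var}(\bar{x}_1 - \bar{x})
  &= (1-p)^2\left(\frac{\sigma^2}{pN} + \frac{\sigma^2}{(1-p)N}\right)
   = \frac{(1-p)\,\sigma^2}{pN} \\[4pt]
\text{Ignoring overlap:}\quad
\operatorname{Var}(\bar{x}_1) + \operatorname{Var}(\bar{x})
  &= \frac{\sigma^2}{pN} + \frac{\sigma^2}{N}
   = \frac{(1+p)\,\sigma^2}{pN} \\[4pt]
\text{Correction factor}
  &= \frac{\mathrm{SE}_{\text{ignoring}}}{\mathrm{SE}_{\text{allowing}}}
   = \sqrt{\frac{1+p}{1-p}}
\end{align*}
```

Under these assumptions the factor depends only on the amount of overlap p, and it exceeds 1 whenever p > 0, which is consistent with tests that ignore overlap being conservative.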
Results If the overlap is ignored, significance tests are conservative and confidence intervals are too wide. In an example with an overlap of 12%, the correction factor was 1.13, and the significance level of 0.08 was corrected to 0.05 by taking the overlap into account.
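The figures quoted in the Results can be checked numerically. The sketch below assumes, since the abstract does not give the formula, that the correction factor takes the equal-variance form sqrt((1 + p) / (1 - p)) for an overlap proportion p, and that the test statistic is approximately normal and two-sided. The function names are hypothetical and chosen for illustration.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def correction_factor(overlap: float) -> float:
    """Ratio of the SE ignoring overlap to the SE allowing for it.

    ASSUMPTION: equal-variance form sqrt((1 + p) / (1 - p)); this formula
    is not stated in the abstract.
    """
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be a proportion in [0, 1)")
    return math.sqrt((1.0 + overlap) / (1.0 - overlap))


def corrected_p_value(naive_p: float, overlap: float) -> float:
    """Adjust a two-sided p-value that was computed ignoring overlap."""
    # Recover the z statistic behind the naive two-sided p-value.
    z_naive = _STD_NORMAL.inv_cdf(1.0 - naive_p / 2.0)
    # The naive SE is too large, so the corrected z is larger by the factor.
    z_corrected = z_naive * correction_factor(overlap)
    return 2.0 * (1.0 - _STD_NORMAL.cdf(z_corrected))


def corrected_half_width(naive_half_width: float, overlap: float) -> float:
    """Shrink a confidence-interval half-width that ignored overlap."""
    return naive_half_width / correction_factor(overlap)


if __name__ == "__main__":
    cf = correction_factor(0.12)
    p_adj = corrected_p_value(0.08, 0.12)
    print(f"correction factor: {cf:.2f}")
    print(f"corrected p-value: {p_adj:.2f}")
```

With an overlap of 12% and a naive significance level of 0.08, this sketch gives a factor of about 1.13 and a corrected level of about 0.05, matching the values reported above. The same factor divides the width of a confidence interval that was computed ignoring overlap.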
Conclusions The overlap may not be of concern if the result is significant or if the subgroup is <10% of the whole population, but if the overlap is greater than 10% it should not be ignored. The easiest way of allowing for overlap is to use a correction factor, calculated from the amount of overlap, to adjust analyses that ignore overlap.
Keywords: confidence intervals, health status indicators, part compared with whole, statistical methods