Testing for spatial clustering of disease at multiple ranges within a single dataset is a common practice in spatial epidemiology, but whether this approach affects the type I error rate had not been documented. Mr. Matthew S. Loop, a graduate student trainee in the department of biostatistics at the University of Alabama at Birmingham (UAB), in collaboration with Dr. Leslie A. McClure, professor in UAB’s department of biostatistics, estimated the family-wise error rate (FWE) of the difference in Ripley’s K functions test when testing for spatial clustering of disease at an increasing number of ranges. The FWE is the probability of finding at least one false positive among multiple statistical tests.
Case and control locations were generated on an area the size of the continental United States (approximately 3,000,000 square miles). Two thousand Monte Carlo replicates were used to estimate the FWE, with 95 percent confidence intervals, when testing for clustering at 10, 50, and 100 equidistant ranges. The estimated FWE when testing at 10, 50, and 100 ranges was 0.22, 0.34, and 0.36, respectively.
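The kind of simulation described above can be illustrated with a small sketch. It is not the researchers' code: it assumes a unit square rather than a U.S.-sized region, uses a Ripley's K estimate without edge correction, tests each range with a one-sided random-labelling test, and uses a simple normal-approximation confidence interval. The function names, sample sizes, and default arguments are all hypothetical choices made to keep the example fast.

```python
import math
import random
from bisect import bisect_right


def k_hat(dist, idx, ranges):
    """Ripley's K on the unit square (area = 1), no edge correction (a simplification)."""
    n = len(idx)
    pair_d = sorted(dist[i][j] for a, i in enumerate(idx) for j in idx[a + 1:])
    # Each unordered pair appears twice in the usual double sum over i != j.
    return [2.0 * bisect_right(pair_d, r) / (n * (n - 1)) for r in ranges]


def k_diff(dist, case_idx, ctrl_idx, ranges):
    """Difference K_cases(r) - K_controls(r) at every range r."""
    k_cases = k_hat(dist, case_idx, ranges)
    k_ctrls = k_hat(dist, ctrl_idx, ranges)
    return [a - b for a, b in zip(k_cases, k_ctrls)]


def estimate_fwe(n_ranges, n_reps=100, n_perm=99, n_cases=20, n_ctrls=20,
                 max_range=0.25, alpha=0.05, seed=0):
    """Monte Carlo estimate of the FWE when each range is tested separately."""
    rng = random.Random(seed)
    ranges = [max_range * (k + 1) / n_ranges for k in range(n_ranges)]
    n = n_cases + n_ctrls
    rejections = 0
    for _ in range(n_reps):
        # Null scenario: cases and controls share one uniform distribution.
        pts = [(rng.random(), rng.random()) for _ in range(n)]
        dist = [[math.dist(p, q) for q in pts] for p in pts]
        idx = list(range(n))
        obs = k_diff(dist, idx[:n_cases], idx[n_cases:], ranges)
        exceed = [0] * n_ranges
        for _ in range(n_perm):
            rng.shuffle(idx)  # random relabelling of cases and controls
            sim = k_diff(dist, idx[:n_cases], idx[n_cases:], ranges)
            for k in range(n_ranges):
                exceed[k] += sim[k] >= obs[k]
        # Pointwise p-value at each range; reject if ANY range is significant.
        p_values = [(1 + e) / (n_perm + 1) for e in exceed]
        rejections += any(p <= alpha for p in p_values)
    fwe = rejections / n_reps
    half = 1.96 * math.sqrt(fwe * (1 - fwe) / n_reps)  # normal-approximation CI
    return fwe, (max(0.0, fwe - half), min(1.0, fwe + half))


if __name__ == "__main__":
    for m in (1, 10, 50):
        fwe, ci = estimate_fwe(m)
        print(f"{m:>3} ranges: FWE ~ {fwe:.2f}, 95% CI ({ci[0]:.2f}, {ci[1]:.2f})")
```

Because no data contain true clustering in this setup, every rejection is a false positive, so the share of replicates with at least one rejection estimates the FWE. The numbers produced by this toy version should not be expected to match the published estimates.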
The researchers concluded that testing for clustering at multiple ranges within a single dataset inflated the FWE above the nominal level of 0.05. They advise investigators to construct simultaneous critical envelopes (available in the spatstat package in R) or to use an overall test statistic that integrates the test statistics from all ranges.
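One way to picture the recommended alternative is a single test built from the largest deviation across all ranges. The sketch below illustrates that general idea only. It does not reproduce the spatstat implementation or the researchers' method, and the function name and inputs are hypothetical. It assumes the observed and simulated test statistics have already been computed at the same set of ranges.

```python
import random


def max_deviation_p_value(observed, simulated):
    """One Monte Carlo p-value from the largest upward deviation over all ranges.

    observed:  list of test-statistic values, one per range
    simulated: list of curves of the same length, generated under the null
    """
    n_ranges = len(observed)
    pooled = simulated + [observed]
    # Centre every curve on the pointwise mean of all curves (observed included),
    # so that observed and simulated curves are treated identically.
    mean = [sum(curve[k] for curve in pooled) / len(pooled) for k in range(n_ranges)]
    obs_max = max(observed[k] - mean[k] for k in range(n_ranges))
    sim_max = [max(curve[k] - mean[k] for k in range(n_ranges)) for curve in simulated]
    exceed = sum(m >= obs_max for m in sim_max)
    return (1 + exceed) / (len(simulated) + 1)


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy null curves: 99 simulated curves over 10 ranges (placeholder values only).
    null_curves = [[rng.random() for _ in range(10)] for _ in range(99)]
    observed_curve = [rng.random() for _ in range(10)]
    print(max_deviation_p_value(observed_curve, null_curves))
```

Because each dataset yields one p-value rather than one per range, the number of ranges examined no longer multiplies the chances of a false positive.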
“Testing for Clustering at Many Ranges Inflates Family-wise Error Rate (FWE)” was published in January 2015 in the International Journal of Health Geographics.
Journal article: http://www.ij-healthgeographics.com/content/14/1/4