Researchers at the National Taiwan University College of Public Health found high variability in false discovery rate (FDR) control in typical genomic studies. They urge researchers to present bootstrapped standard errors alongside the FDR indices. The study was published in BMC Genetics 2015;16:97.
Conducted by Mr. Yi-Ting Lin and his advisor Dr. Wen-Chung Lee, a professor at the Institute of Epidemiology and Preventive Medicine, the study analyzed two datasets to illustrate the point: a colon cancer gene-expression dataset and a genome-wide association dataset on visual refractive errors.
The problem of multiple hypothesis testing arises naturally when one compares a large number of genes between different groups. For such problems, the paradigm has shifted from Bonferroni corrections to FDR control. From a practicing epidemiologist’s viewpoint, the FDR procedure is simple: feed the p-values for the genes into FDR software, obtain the corresponding q-values, and declare a gene significant if its q-value is less than or equal to 0.05. This supposedly controls the FDR at the 5 percent level, so that among the genes declared significant, the proportion of false positives would be at most 5 percent.
“Interpreting the results this way can be perilous,” warned Dr. Lee. In fact, three levels of variation are attached to any FDR control, and none of them is negligible. To avoid over-interpretation, researchers are advised to report the variability of the FDR control.
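For readers unfamiliar with the routine workflow described above (p-values in, q-values out, threshold at 0.05), the following is a minimal sketch of it. It assumes the Benjamini-Hochberg adjustment, which is one common way to obtain q-values; the article does not say which FDR software or method the researchers had in mind, and the function names `bh_qvalues` and `declare_significant` are hypothetical, chosen only for illustration.

```python
def bh_qvalues(pvalues):
    """Return Benjamini-Hochberg adjusted p-values (q-values), in input order.

    Assumption: BH adjustment stands in for "FDR software"; other
    q-value estimators exist and may give different numbers.
    """
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down to the smallest, so each q-value
    # is the minimum of p * m / rank over its own rank and all larger ranks.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


def declare_significant(pvalues, threshold=0.05):
    """Flag each gene whose q-value is at or below the threshold."""
    return [q <= threshold for q in bh_qvalues(pvalues)]


if __name__ == "__main__":
    # Made-up p-values for four hypothetical genes, for illustration only.
    example_pvalues = [0.001, 0.2, 0.03, 0.5]
    print(bh_qvalues(example_pvalues))
    print(declare_significant(example_pvalues))
```

Note that this sketch returns only point values for the q-values and a yes/no call per gene. It says nothing about how much those values would change from sample to sample, which is the kind of variability the study asks researchers to report.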