Brain Innovation

support portal

False Discovery Rate (FDR)

Introduction


The Multiple Comparison Problem
Most current fMRI research aims at finding statistically significant activation patterns, i.e. patterns for which the null hypothesis can be rejected with a well-defined error probability alpha, usually on the order of 5 or 1 per cent. Programs such as BrainVoyager must therefore offer the user a solution for the following problem: the commonly used GLM approach calculates a univariate t-statistic for each and every voxel, which is often referred to as mass-univariate testing. While this is not a problem per se, performing, say, 50,000 tests in parallel (e.g. the number of voxels in a brain mask at a still commonly used resolution) means that at an alpha level of 5 per cent roughly 2,500 voxels will show up as activated even if there is no true effect in the data. This downside of mass-univariate statistical testing is called the Multiple Comparison Problem.
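
The figure of roughly 2,500 voxels can be checked with a few lines of code. The following is a minimal sketch, assuming that p-values are uniformly distributed when no effect is present and that the tests are independent; the function name and the random seed are chosen for illustration only.

```python
import random

def count_false_positives(n_tests, alpha, seed=0):
    """Count tests with p < alpha when every null hypothesis is true."""
    rng = random.Random(seed)
    # Assumption: under the null hypothesis, p-values are uniform on [0, 1).
    return sum(rng.random() < alpha for _ in range(n_tests))

n_voxels = 50_000  # number of voxels, as in the text
alpha = 0.05       # uncorrected alpha level

expected = n_voxels * alpha
observed = count_false_positives(n_voxels, alpha)

print(f"expected false positives:  {expected:.0f}")
print(f"simulated false positives: {observed}")
```

The simulated count varies with the seed but stays close to the expected value of N times alpha.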

Bonferroni's Approach
The strictest and most straightforward solution to this problem is to simply divide the alpha level by the number of tests. The probability that at least one of the (mutually independent) voxels is wrongly declared significantly activated is then at most alpha. This is also known as correcting for the Family-Wise Error (FWE). It is clear, though, that this in turn greatly increases the beta error (false negatives): many voxels that should show an effect fall below the corrected threshold.
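
As a rough illustration of what the division achieves, the sketch below computes the corrected per-voxel threshold and the resulting family-wise error for the 50,000-voxel example. It assumes independent tests with all null hypotheses true; the function names are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test threshold that keeps the family-wise error at or below alpha."""
    return alpha / n_tests

def familywise_error(per_test_alpha, n_tests):
    """P(at least one false positive), assuming independent tests, all nulls true."""
    return 1.0 - (1.0 - per_test_alpha) ** n_tests

n_voxels = 50_000
alpha = 0.05

corrected = bonferroni_threshold(alpha, n_voxels)
fwe_uncorrected = familywise_error(alpha, n_voxels)
fwe_corrected = familywise_error(corrected, n_voxels)

print(f"corrected per-voxel threshold: {corrected:.2e}")
print(f"FWE without correction:        {fwe_uncorrected:.4f}")
print(f"FWE with correction:           {fwe_corrected:.4f}")
```

Without correction a false positive somewhere in the volume is practically certain; with correction the family-wise error stays just below alpha, at the price of a per-voxel threshold that is 50,000 times stricter.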

False Discovery Rate (FDR)


Difference to FWE
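Basically, the idea of FDR is the following: FWE correction (Bonferroni's approach) does not honor the fact that p-values are not uniformly distributed when real effects are present. All "good" voxels, where rejecting H0 is the better choice, have in fact a higher t-statistic than one that merely reaches the nominal alpha level. FDR does take this into account: instead of limiting the probability of even a single false positive, it limits the expected proportion of false positives among the voxels that are declared active.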

Procedure
In theory, the most significant voxel is thus tested at the Bonferroni level (i.e. the chosen level q divided by the number N of voxels that have been processed). The next most significant voxel is tested at twice this threshold, and so on: the voxel at rank i in the list of voxels sorted by ascending p-value is compared against (i/N) * q. As soon as one voxel V in this list does not meet the criterion (p(V) > (i/N) * q), this voxel and all subsequent voxels are no longer reported as active.
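
The procedure can be sketched as follows. This is a minimal illustration, not BrainVoyager's implementation: the function name, the `step_up` flag and the example p-values are assumptions made for this example. By default the sketch follows the stop-at-first-failure reading given above. The Benjamini-Hochberg procedure as usually stated instead takes the largest rank that still meets the criterion, which yields the same or a more lenient cutoff; setting `step_up=True` selects that variant.

```python
def fdr_cutoff(p_values, q, step_up=False):
    """Return the p-value cutoff; voxels with p <= cutoff are reported as active.

    Returns 0.0 if no voxel passes.
    """
    n = len(p_values)
    cutoff = 0.0
    # Walk through the voxels in order of ascending p-value (rank i = 1..n).
    for i, p in enumerate(sorted(p_values), start=1):
        if p <= (i / n) * q:
            cutoff = p   # the voxel at rank i meets the criterion
        elif not step_up:
            break        # stop at the first failure, as described in the text
        # with step_up=True the search continues to find the largest passing rank
    return cutoff

# Hypothetical p-values for five voxels, FDR level q = 0.05
p_values = [0.004, 0.025, 0.028, 0.035, 0.6]

print(fdr_cutoff(p_values, 0.05))                # 0.004
print(fdr_cutoff(p_values, 0.05, step_up=True))  # 0.035
```

In this example the second voxel fails its criterion (0.025 > 0.02), so the first reading reports only one voxel, while the largest-rank variant still accepts the first four.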

Remaining Problems
While this method is very well suited for many datasets (i.e. data where an effect can be found without too much difficulty), one problem persists: if even the most significant voxel's t-statistic does not exceed the critical threshold (corresponding to the chosen level divided by the number of voxels), this method will show no active voxels at all. Bonferroni correction would not help in this case either, since it corrects all voxels at this same stringent level (in other words, in this case both methods do exactly the same!). This is why many people currently tend to report results at an uncorrected alpha level of 0.001.

Random Gaussian Fields
There is presently one restriction remaining in BV's FDR implementation: one cannot choose (as, for example, in SPM) to apply a smoothness estimator prior to the correction in order to reduce the number of independent tests from voxels to resels (resolution elements). While having this option is a clear advantage, it implicitly suggests that heavily smoothing the data somehow improves the statistics, as the smoothness will usually increase...

Alternatives
Our suggestion is rather to arrive at a useful threshold in another way. After all, finding a more suitable threshold is ALL that FDR does, and why would an algorithm that was shown to work well on simulated data perform well with all kinds of real data? Here are just some examples of how to do so:

  • using a stringent cortex mask (reducing the number of multiple comparisons, thus making FWE *and* FDR less conservative)
  • performing a Region-of-Interest analysis, which doesn't need any of those corrections (just one test is done)
  • using cluster threshold levels with a fixed p of, say, 0.01 and a minimum voxel extent (cluster size)

Cluster Thresholding
To find the correct cluster thresholding parameters (voxel extent), Fabrizio Esposito implemented a plugin that will, for a specific dataset, find good thresholds that allow you to report your activation at a given alpha level (p-value) within BrainVoyager QX.
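
Once a voxel-wise p threshold and a voxel extent have been chosen, applying them is straightforward. The sketch below is a toy illustration on a small 2D map and does not reproduce the plugin, which estimates the extent itself. The function name, the 4-neighbour connectivity and the example values are assumptions for this example; real statistical maps are 3D and the neighbourhood definition may differ.

```python
def cluster_threshold(p_map, p_thresh, min_extent):
    """Keep voxels with p < p_thresh only if their cluster has >= min_extent voxels.

    p_map is a 2D list of p-values; neighbours share an edge (4-connectivity).
    Returns a 2D list of booleans marking the surviving voxels.
    """
    rows, cols = len(p_map), len(p_map[0])
    active = [[p_map[r][c] < p_thresh for c in range(cols)] for r in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    keep = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not active[r][c] or seen[r][c]:
                continue
            # Flood-fill one cluster of connected suprathreshold voxels.
            stack, cluster = [(r, c)], []
            seen[r][c] = True
            while stack:
                y, x = stack.pop()
                cluster.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and active[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            # Only clusters that reach the minimum extent survive.
            if len(cluster) >= min_extent:
                for y, x in cluster:
                    keep[y][x] = True
    return keep

# Hypothetical p-map: one cluster of three voxels and one isolated voxel
p_map = [
    [0.001, 0.002, 0.5, 0.5],
    [0.003, 0.5,   0.5, 0.004],
    [0.5,   0.5,   0.5, 0.5],
]
result = cluster_threshold(p_map, p_thresh=0.01, min_extent=3)
for row in result:
    print(row)
```

With a voxel extent of 3, the three connected voxels in the upper left survive, while the isolated voxel at the right edge is removed even though its p-value is below 0.01.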