
False Discovery Rate Calculator (Benjamini-Hochberg / Benjamini-Yekutieli)

Paste p-values (comma, space, or newline separated), choose a target FDR level, and see which tests remain significant while controlling the expected false discovery rate.


What this FDR calculator does

This tool helps you control the false discovery rate (FDR) when you run many hypothesis tests at once. Instead of treating every p-value in isolation, FDR procedures evaluate the full set of tests together and estimate how many “discoveries” may be false positives on average.

In practical terms, if you set FDR to 0.05, you are targeting a situation where roughly 5% of the results you call significant could be expected to be false discoveries, on average, over repeated studies.

False discovery rate in plain English

Multiple testing is tricky. If you run 1 test at a 5% significance level, false positives are relatively controlled. If you run 1,000 tests, you can get many “significant” findings just by chance. FDR methods are designed for exactly this setting.

  • Family-wise error rate (FWER) methods (like Bonferroni) focus on preventing even one false positive.
  • FDR methods allow more power and focus on controlling the proportion of false positives among discoveries.
  • This balance is especially useful in genomics, A/B testing at scale, neuroimaging, and high-dimensional research.
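A quick simulation makes the multiple-testing problem concrete. Under a true null hypothesis, p-values are uniformly distributed on [0, 1], so with 1,000 tests at a 5% level you expect around 50 "significant" results by chance alone:

```python
import random

random.seed(0)
m = 1000
alpha = 0.05

# Simulate m tests where every null hypothesis is true:
# under the null, p-values are uniform on [0, 1].
p_values = [random.random() for _ in range(m)]

false_positives = sum(p < alpha for p in p_values)
print(f"'Significant' at alpha={alpha}: {false_positives} of {m}")
# On average, about m * alpha = 50 of these are false positives.
```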

How the Benjamini-Hochberg method works

Step 1: Sort p-values from smallest to largest

Suppose you have m tests. Rank them so that p(1) ≤ p(2) ≤ ... ≤ p(m).

Step 2: Build the critical line

For each rank i, compute a critical value: (i / m) × q, where q is your chosen FDR level. For Benjamini-Yekutieli, q is additionally divided by the harmonic sum c(m) = 1 + 1/2 + ... + 1/m, making the threshold stricter.
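The two critical lines can be sketched in a few lines of Python (an illustration of the formulas above, not the calculator's actual code):

```python
def bh_critical_values(m, q):
    """Benjamini-Hochberg critical line: (i / m) * q for ranks i = 1..m."""
    return [(i / m) * q for i in range(1, m + 1)]

def by_critical_values(m, q):
    """Benjamini-Yekutieli: same line, with q divided by the
    harmonic sum c(m) = 1 + 1/2 + ... + 1/m."""
    c_m = sum(1.0 / i for i in range(1, m + 1))
    return [(i / (m * c_m)) * q for i in range(1, m + 1)]
```

For example, with m = 4 and q = 0.05, BH gives thresholds 0.0125, 0.025, 0.0375, 0.05; BY divides each by c(4) = 25/12 ≈ 2.083, so its largest threshold is 0.024 rather than 0.05.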

Step 3: Find the largest passing rank

Identify the largest rank k such that p(k) ≤ critical(k). Then declare ranks 1 through k significant. The calculator also reports adjusted p-values (often called q-values in software outputs).
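Putting the three steps together, a minimal Python sketch of the BH step-up procedure might look like the following (an illustration of the method, not this calculator's implementation):

```python
def benjamini_hochberg(p_values, q=0.05):
    """Step-up BH procedure. Returns (significant flags, adjusted p-values),
    both in the original input order."""
    m = len(p_values)

    # Step 1: sort p-values, remembering original positions.
    order = sorted(range(m), key=lambda i: p_values[i])
    sorted_p = [p_values[i] for i in order]

    # Steps 2-3: find the largest rank k with p(k) <= (k / m) * q.
    k = 0
    for rank in range(1, m + 1):
        if sorted_p[rank - 1] <= (rank / m) * q:
            k = rank

    # Adjusted p-values: p(i) * m / i, made monotone by taking
    # cumulative minima from the largest rank down, capped at 1.
    adjusted = [sorted_p[i] * m / (i + 1) for i in range(m)]
    for i in range(m - 2, -1, -1):
        adjusted[i] = min(adjusted[i], adjusted[i + 1])
    adjusted = [min(a, 1.0) for a in adjusted]

    # Map results back to the original input order.
    significant = [False] * m
    adj_out = [0.0] * m
    for rank_idx, orig_idx in enumerate(order):
        significant[orig_idx] = rank_idx < k
        adj_out[orig_idx] = adjusted[rank_idx]
    return significant, adj_out
```

With `benjamini_hochberg([0.01, 0.04, 0.03, 0.20], q=0.05)`, only 0.01 clears its critical value (0.0125), so one test is declared significant; its adjusted p-value is 0.04.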

Benjamini-Hochberg vs Benjamini-Yekutieli

You can choose between two popular procedures:

  • Benjamini-Hochberg (BH): more powerful, widely used, assumes independent or positively dependent tests.
  • Benjamini-Yekutieli (BY): robust under arbitrary dependence, but more conservative.

If your tests are strongly dependent and you want a safer upper bound, BY may be appropriate. If not, BH is often the practical default.

How to use this calculator effectively

  • Enter all p-values from a coherent family of hypotheses.
  • Set your FDR target (common choices are 0.10, 0.05, or 0.01).
  • Choose BH or BY based on your dependence assumptions.
  • Review the ranked table and the “Significant?” column.
Tip: Avoid feeding p-values from unrelated research questions into one correction set. FDR is meaningful only for a well-defined test family.

Worked example

Imagine 40 simultaneous tests in an experiment. At q = 0.05, BH might mark 9 tests significant. That does not mean exactly 0.45 false findings in your current dataset; it means the expected proportion of false discoveries among selected results is controlled in repeated sampling under method assumptions.

If you switch to BY, you may see fewer significant results because the threshold is stricter. This tradeoff is normal: stronger error protection usually costs statistical power.
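The tradeoff is easy to see on a small hypothetical set of p-values (illustrative numbers only, unrelated to the 40-test example above). The sketch below counts rejections under each step-up rule:

```python
def count_discoveries(p_values, q, method="bh"):
    """Count rejections under the step-up procedure.
    method='bh' uses the (i/m)*q line; method='by' further divides q
    by the harmonic sum c(m) = 1 + 1/2 + ... + 1/m."""
    m = len(p_values)
    c = sum(1.0 / i for i in range(1, m + 1)) if method == "by" else 1.0
    sorted_p = sorted(p_values)
    k = 0
    for rank in range(1, m + 1):
        if sorted_p[rank - 1] <= (rank * q) / (m * c):
            k = rank
    return k

# Hypothetical p-values for illustration:
p = [0.001, 0.008, 0.012, 0.041, 0.045, 0.32, 0.54, 0.76]
print(count_discoveries(p, 0.05, "bh"))  # BH rejects 3 tests
print(count_discoveries(p, 0.05, "by"))  # BY, being stricter, rejects only 1
```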

Choosing a good q-value

Common defaults

  • q = 0.05: balanced in many scientific settings.
  • q = 0.10: exploratory studies where missing true effects is costly.
  • q = 0.01: high-stakes confirmatory analyses.

Decision context matters

There is no universally correct q. Choose it based on domain risk: clinical decisions and safety systems often require stricter thresholds, while discovery-phase screening may use a more permissive q with follow-up validation.

Common mistakes to avoid

  • Interpreting FDR as the probability that one specific significant result is false.
  • Applying correction separately to arbitrary subgroups just to increase significance count.
  • Mixing one-sided and two-sided p-values inconsistently within the same family.
  • Ignoring study design problems and relying on correction as a fix-all.

Quick FAQ

Do adjusted p-values replace raw p-values?

They complement raw p-values. Raw p-values reflect single-test evidence; adjusted p-values reflect that evidence under multiplicity control.

Can I use this for thousands of tests?

Yes. The method scales well. For very large lists, consider exporting from your analysis pipeline and pasting directly into the calculator.

Is this a substitute for preregistration or robust design?

No. FDR correction helps with multiplicity but does not solve bias, poor measurement, or model misspecification.

Bottom line

A good FDR calculator makes multiple-testing decisions transparent and reproducible. Use it as part of a full analytical workflow: clear hypotheses, quality data, appropriate models, and explicit reporting of correction method and q threshold.
