Biallelic Two-Locus LD Calculator
Enter haplotype counts (or estimated frequencies scaled to any total) for two loci: A/a and B/b. The calculator returns allele frequencies, D, D′, and r².
What is linkage disequilibrium?
Linkage disequilibrium (LD) describes the non-random association of alleles at different loci. If allele A at one locus and allele B at another locus are observed together more often (or less often) than expected from their individual frequencies, the loci are in LD.
LD is central to population genetics, genome-wide association studies (GWAS), fine mapping, haplotype block analysis, and marker selection. In practical terms, LD helps determine whether one variant can act as a proxy for another nearby variant.
How this calculator works
This tool assumes two biallelic loci: A/a and B/b. You provide counts for the four haplotypes:
- AB
- Ab
- aB
- ab
The calculator converts these counts to frequencies, estimates allele frequencies at each locus, and then computes common LD metrics used in genetics workflows.
Formulas used
pAB = AB / N, where N = AB + Ab + aB + ab (pAb, paB, and pab are defined the same way)
pA = pAB + pAb, pB = pAB + paB
D = pAB − pApB
D′ = D / Dmax, where Dmax = min(pA(1−pB), (1−pA)pB) for D ≥ 0, and Dmax = min(pApB, (1−pA)(1−pB)) for D < 0.
r² = D² / [pA(1−pA)pB(1−pB)]
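The formulas above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual source: the function name `ld_stats`, its argument names, and the choice to return `None` for D′ and r² when a locus is monomorphic (where both are undefined) are assumptions made for this example.

```python
def ld_stats(n_AB, n_Ab, n_aB, n_ab):
    """Return allele frequencies, D, D', and r^2 from four haplotype counts."""
    n = n_AB + n_Ab + n_aB + n_ab
    if n <= 0:
        raise ValueError("total haplotype count must be positive")

    # Haplotype and allele frequencies
    p_AB = n_AB / n
    p_A = (n_AB + n_Ab) / n
    p_B = (n_AB + n_aB) / n

    # Raw disequilibrium
    d = p_AB - p_A * p_B

    # Dmax depends on the sign of D
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))

    # Assumption: report None when a locus is monomorphic,
    # because D' and r^2 are undefined there.
    d_prime = d / d_max if d_max > 0 else None
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r2 = d * d / denom if denom > 0 else None

    return {"pA": p_A, "pB": p_B, "D": d, "D_prime": d_prime, "r2": r2}


if __name__ == "__main__":
    for name, value in ld_stats(40, 10, 20, 30).items():
        print(f"{name}: {value:.4f}")
```

Because every quantity is a ratio of counts, the inputs can be raw counts or frequencies scaled to any total. D′ keeps the sign of D here, so it falls between −1 and 1.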
Interpreting the results
D (raw disequilibrium)
D indicates direction and magnitude of association but depends on allele frequencies, making cross-locus comparisons difficult.
D′ (normalized disequilibrium)
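D′ scales D by the largest value it could take given the observed allele frequencies, so it ranges from −1 to 1. Values close to 1 (or −1) suggest little historical recombination between the loci; |D′| = 1 means at least one of the four haplotypes was not observed. D′ can be high even when one allele is rare, and it tends to be inflated in small samples.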
r² (correlation between loci)
r² is often preferred in association studies because it reflects predictive strength between markers. In tag SNP selection, higher r² means one marker better predicts another.
- r² ~ 0.8–1.0: very strong correlation
- r² ~ 0.5–0.8: moderate to strong
- r² ~ 0.2–0.5: weak to moderate
- r² < 0.2: weak LD
Worked example
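Suppose your phased data gives haplotype counts AB=40, Ab=10, aB=20, ab=30 (the calculator default). Then N=100, pAB=0.40, pA=0.50, and pB=0.60, so D = 0.40 − 0.30 = 0.10. Because D is positive, Dmax = min(0.50×0.40, 0.50×0.60) = 0.20 and D′ = 0.50. r² = 0.10² / (0.50×0.50×0.60×0.40) = 0.01 / 0.06 ≈ 0.17. A and B therefore co-occur more often than expected under independence, but by the guide above r² falls in the weak LD range, so neither marker predicts the other well.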
If you adjust counts so AB and ab dominate while Ab and aB shrink, LD usually increases. If all four haplotypes approach proportions expected from independent allele frequencies, LD decreases.
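To see this trend numerically, the sketch below holds the allele counts fixed (A=50, B=60 out of 100 haplotypes) and shifts weight toward AB and ab. The helper name `r_squared` and the four scenarios are illustrative choices, not calculator presets other than the default row, and the helper assumes both loci are polymorphic.

```python
def r_squared(n_AB, n_Ab, n_aB, n_ab):
    """r^2 from four haplotype counts (assumes both loci are polymorphic)."""
    n = n_AB + n_Ab + n_aB + n_ab
    p_A = (n_AB + n_Ab) / n
    p_B = (n_AB + n_aB) / n
    d = n_AB / n - p_A * p_B
    return d * d / (p_A * (1 - p_A) * p_B * (1 - p_B))


# Hypothetical scenarios with the same allele frequencies (pA=0.5, pB=0.6)
scenarios = [
    ("independent", (30, 20, 30, 20)),
    ("calculator default", (40, 10, 20, 30)),
    ("AB and ab dominate", (48, 2, 12, 38)),
    ("Ab absent", (50, 0, 10, 40)),
]

for label, counts in scenarios:
    print(f"{label:<20} r^2 = {r_squared(*counts):.3f}")
```

With unequal allele frequencies, r² cannot reach 1 even when one haplotype is missing entirely, which is why the last scenario stops short of perfect correlation.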
Common pitfalls
- Using unphased genotype counts as if they were haplotypes without haplotype inference.
- Comparing D values across loci with very different minor allele frequencies.
- Interpreting very high D′ from sparse data as strong predictive power (check r² too).
- Ignoring population structure, admixture, and sample size effects.
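The D′ pitfall is easy to reproduce. In the sketch below, allele A is rare and always appears with B, so the Ab haplotype is never observed. The counts are made up for illustration, and the helper name `d_prime_and_r2` is an assumption of this example; it expects both loci to be polymorphic.

```python
def d_prime_and_r2(n_AB, n_Ab, n_aB, n_ab):
    """Return (D', r^2) from haplotype counts; assumes polymorphic loci."""
    n = n_AB + n_Ab + n_aB + n_ab
    p_A = (n_AB + n_Ab) / n
    p_B = (n_AB + n_aB) / n
    d = n_AB / n - p_A * p_B
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    r2 = d * d / (p_A * (1 - p_A) * p_B * (1 - p_B))
    return d / d_max, r2


# Made-up counts: A is rare (5%) and is only ever seen together with B
d_prime, r2 = d_prime_and_r2(n_AB=5, n_Ab=0, n_aB=45, n_ab=50)
print(f"D' = {d_prime:.2f}, r^2 = {r2:.3f}")
```

Here D′ reaches its maximum while r² stays near 0.05: knowing the allele at B says very little about the allele at A, because most B haplotypes carry a rather than A.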
Best practices for reliable LD analysis
- Apply quality control filters (call rate, HWE checks, MAF thresholds).
- Use sufficiently large and ancestry-matched cohorts.
- Report both D′ and r² for a complete picture.
- When possible, validate findings in an external dataset.
Conclusion
A linkage disequilibrium calculator is a quick and useful way to quantify haplotype structure between two loci. For teaching, exploratory analysis, and quick validation, this page gives immediate LD estimates from simple inputs. For large-scale studies, use this as a conceptual companion to full genetics pipelines and dedicated bioinformatics tools.