BLOG ON MOLECULAR BREEDING

Linkage Disequilibrium in Breeding: From Alleles to GWAS Signals

Discover how linkage disequilibrium drives GWAS and why population structure can create misleading associations if not properly controlled.

If you want to truly understand how GWAS works, there is one concept you cannot skip:

👉 Linkage Disequilibrium (LD)
What does it mean?
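In simple terms, LD is the non-random association of alleles at different loci: certain allele combinations appear together in a population more often (or less often) than expected by chance. If the alleles at two loci were combined independently, the frequency of each combination would simply be the product of the individual allele frequencies. LD tells us it is not.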

👉 A simple example
Imagine two SNPs:
SNP1: A / a
SNP2: B / b
If the combination A–B appears much more frequently in our population than the individual frequencies of A and B would predict, these loci are said to be in LD. Measuring LD allows us to quantify how strongly alleles at different loci co-occur, which is the foundation for detecting marker–trait associations in GWAS.
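To make "more frequently than expected" concrete, here is a minimal sketch of the three most common LD measures (D, D' and r²) for the two SNPs above. The haplotype counts are invented for illustration, and the function name `ld_stats` is just a placeholder, not part of any library.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab: frequency of the A-B haplotype
    p_a:  frequency of allele A at SNP1
    p_b:  frequency of allele B at SNP2
    """
    # D: observed A-B frequency minus the frequency expected under independence
    d = p_ab - p_a * p_b

    # D' rescales D by the largest magnitude it could take at these allele frequencies
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0

    # r^2: squared correlation between the alleles at the two loci
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Hypothetical haplotype counts in a sample of 100 chromosomes
counts = {"AB": 60, "Ab": 10, "aB": 10, "ab": 20}
n = sum(counts.values())
p_ab = counts["AB"] / n
p_a = (counts["AB"] + counts["Ab"]) / n
p_b = (counts["AB"] + counts["aB"]) / n

d, d_prime, r2 = ld_stats(p_ab, p_a, p_b)
print(f"expected A-B under independence: {p_a * p_b:.2f}, observed: {p_ab:.2f}")
print(f"D = {d:.3f}, D' = {d_prime:.3f}, r^2 = {r2:.3f}")
```

In this made-up sample the A–B haplotype is observed at 0.60 while independence would predict 0.49, so D is positive. In practice r² is the measure most often reported in GWAS because it relates directly to how well one marker predicts another.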

👉 Why does LD happen?
The most intuitive reason is physical proximity. Loci that are close together on a chromosome are less likely to be separated by recombination, so they tend to be inherited together.
However, LD is influenced by more than just distance. It is also shaped by:
1. Recombination rate
2. Population bottlenecks and founder effects
3. Genetic drift
4. Population admixture
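The link between distance, recombination and LD can be illustrated with the textbook decay model, in which D shrinks by a factor of (1 - c) each generation, where c is the recombination fraction between the two loci. The sketch below assumes random mating in a large population with no selection or drift; the starting value and the three scenarios are hypothetical.

```python
def ld_after(d0, c, generations):
    """LD coefficient D after a number of generations of random mating.

    d0: starting value of D
    c:  recombination fraction between the loci (0 = never separated, 0.5 = unlinked)
    """
    return d0 * (1 - c) ** generations


d0 = 0.20  # hypothetical starting LD
scenarios = [("tightly linked", 0.001), ("loosely linked", 0.10), ("unlinked", 0.50)]

for label, c in scenarios:
    remaining = ld_after(d0, c, generations=10)
    print(f"{label:15s} c={c:<6} D after 10 generations = {remaining:.4f}")
```

Tightly linked loci keep almost all of their LD after ten generations, while LD between unlinked loci is nearly gone. Bottlenecks, drift and admixture are not part of this simple model; they act by creating or regenerating LD that recombination then has to break down again.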

In the context of a breeding program, 𝐩𝐨𝐩𝐮𝐥𝐚𝐭𝐢𝐨𝐧 𝐚𝐝𝐦𝐢𝐱𝐭𝐮𝐫𝐞 is often the most important factor to consider before running GWAS.

Here’s a practical example:
You mix two genetically distinct populations:
Population A: high fruit firmness
Population B: low firmness but high sugar content

Each population carries its own allele combinations. After mixing them, you create an admixed population where alleles from both sources coexist.

Now, some allele combinations (e.g., the firmness allele from A together with SNPs elsewhere in the genome that are also common in A) appear together not because of physical linkage, but because they came from the same ancestral population. This can create long-range LD, even between loci on different chromosomes.
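A small calculation shows how mixing alone creates LD. In the sketch below, each source population is in perfect linkage equilibrium (D = 0) at two loci, yet the pooled population has a clearly non-zero D. The allele frequencies and the 50:50 mixing proportion are assumptions chosen for illustration.

```python
def haplotype_freqs(p_a, p_b):
    """Haplotype frequencies for two loci in linkage equilibrium (D = 0)."""
    return {
        "AB": p_a * p_b,
        "Ab": p_a * (1 - p_b),
        "aB": (1 - p_a) * p_b,
        "ab": (1 - p_a) * (1 - p_b),
    }


def mix(h1, h2, m):
    """Pool two populations; m is the proportion contributed by the first."""
    return {k: m * h1[k] + (1 - m) * h2[k] for k in h1}


def d_from_haplotypes(h):
    """LD coefficient D computed from haplotype frequencies."""
    p_a = h["AB"] + h["Ab"]
    p_b = h["AB"] + h["aB"]
    return h["AB"] - p_a * p_b


# Hypothetical allele frequencies: A and B are both common in population A, rare in B
pop_a = haplotype_freqs(p_a=0.9, p_b=0.8)
pop_b = haplotype_freqs(p_a=0.1, p_b=0.2)
admixed = mix(pop_a, pop_b, m=0.5)

d_mix = d_from_haplotypes(admixed)
# Closed form for the same quantity: m * (1 - m) * (difference at locus 1) * (difference at locus 2)
d_formula = 0.5 * 0.5 * (0.9 - 0.1) * (0.8 - 0.2)

print(f"D within population A: {d_from_haplotypes(pop_a):.3f}")
print(f"D within population B: {d_from_haplotypes(pop_b):.3f}")
print(f"D right after mixing:  {d_mix:.3f} (closed form: {d_formula:.3f})")
```

Nothing in this calculation depends on where the two loci sit in the genome, which is why admixture LD can appear between different chromosomes. The value shown is the LD immediately after mixing; it decays in later generations at a rate set by recombination.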

❌ This type of LD matters for GWAS because it can generate false positives: a marker appears associated with a trait simply because it tracks population origin, not because it is linked to a causal variant.

👉 How to avoid false positives due to population admixture
1️⃣ Run GWAS within a single, relatively uniform population.
2️⃣ Or, if using an admixed population, focus on traits not related to the phenotypic differences between the parental populations (e.g., disease resistance rather than fruit quality).
3️⃣ Use statistical models that account for population structure and relatedness, for example a mixed linear model with principal components (or ancestry proportions) as covariates and a kinship matrix as a random effect.
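The toy example below illustrates the third option in its simplest form. A marker with no effect on firmness looks strongly associated when the two populations are pooled, and the signal disappears once population origin is accounted for. The data are invented, and centering each variable on its population mean is only a simplified stand-in for fitting population as a covariate; real GWAS software typically uses mixed models rather than this shortcut.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def center_by_group(values, groups):
    """Subtract each group's mean from its members (removes group-level differences)."""
    means = {}
    for g in set(groups):
        members = [v for v, gg in zip(values, groups) if gg == g]
        means[g] = sum(members) / len(members)
    return [v - means[g] for v, g in zip(values, groups)]


# Hypothetical data: marker allele count (0/1/2) and fruit firmness score.
# Within each population, the marker is unrelated to firmness by construction.
population = ["A", "A", "A", "A", "B", "B", "B", "B"]
marker = [2, 2, 1, 1, 1, 1, 0, 0]
firmness = [8, 9, 8, 9, 3, 4, 3, 4]

naive_r = pearson(marker, firmness)
adjusted_r = pearson(
    center_by_group(marker, population),
    center_by_group(firmness, population),
)

print(f"pooled, no correction:     r = {naive_r:.2f}")
print(f"adjusted for population:   r = {adjusted_r:.2f}")
```

The pooled correlation is driven entirely by the fact that population A has both more copies of the marker allele and firmer fruit. Once each individual is compared only against its own population, no association remains.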

👉 If you’d like to be informed about the upcoming workshops organized by AgroSynapsis, and receive early access and discounts, 𝗳𝗶𝗹𝗹 𝗼𝘂𝘁 𝗼𝘂𝗿 𝘀𝗵𝗼𝗿𝘁 𝘁𝗿𝗮𝗶𝗻𝗶𝗻𝗴 𝗶𝗻𝘁𝗲𝗿𝗲𝘀𝘁 𝗳𝗼𝗿𝗺 here:

https://lnkd.in/g3tApqPz

By Rachil Koumproglou