Polygenic Risk Scores Explained

How thousands of tiny genetic effects combine into a single number that can reshape how you think about disease prevention.

Updated March 29, 2026 · 9 min read

What Is a Polygenic Risk Score?

A polygenic risk score (PRS) is a single number that summarizes the combined effect of many genetic variants on your risk for a particular disease or trait. Instead of looking at one gene in isolation, a PRS considers thousands — sometimes millions — of positions across your entire genome, each contributing a tiny nudge toward higher or lower risk.

Think of it like a credit score for a specific health condition. No single factor determines your score. Instead, hundreds of small data points are weighted and added together to produce one number that places you on a spectrum relative to the general population.

The concept behind polygenic risk scores emerged from genome-wide association studies (GWAS), which scan the DNA of hundreds of thousands of people to find which genetic variants are statistically associated with specific conditions. Most individual variants have negligible effects on their own. A single SNP might shift your risk for coronary artery disease by 0.02%. But when you aggregate the effects of 100,000 such variants, the combined signal becomes medically meaningful.

How PRS Differs from Single-Gene Testing

Most people are familiar with genetic testing through high-profile single-gene examples. A BRCA1 or BRCA2 mutation, for instance, can increase a woman’s lifetime breast cancer risk to 60–80%. These are rare, high-impact variants: a single broken gene with dramatic consequences.

Polygenic risk scores work differently. They capture the common end of the genetic risk spectrum: thousands of variants that each shift risk by a fraction of a percent. Individually, none of them would concern a geneticist. Collectively, they can rival or even exceed the predictive power of monogenic mutations for some conditions.

                       | Single-Gene Test          | Polygenic Risk Score
-----------------------|---------------------------|------------------------------
Variants examined      | 1 to a handful            | Thousands to millions
Individual effect size | Large (e.g. 5-80x risk)   | Tiny (e.g. 0.01-0.05x each)
Population frequency   | Rare (< 1% carry it)      | Common (everyone has a score)
Example                | BRCA2 for breast cancer   | PRS for breast cancer
Who benefits           | Rare mutation carriers    | The entire population

This distinction matters because most disease risk is polygenic, not monogenic. Only a small fraction of the population carries a BRCA mutation, but everyone has a polygenic risk score for breast cancer. Research published in Nature Genetics has shown that women in the top 1% of polygenic risk for breast cancer face a comparable lifetime risk to BRCA2 carriers, yet they would never be flagged by traditional single-gene testing.

How a Polygenic Risk Score Is Calculated

The mathematics behind a PRS is conceptually straightforward, even when the scale is massive. Here is what happens:

1. Identify associated variants

Researchers run a GWAS on a large cohort (often 500,000+ people) and identify which SNPs are associated with the trait. Each SNP gets an effect weight representing how much it shifts risk.

2. Genotype the individual

Your DNA is read at each of those variant positions. At every SNP, you carry 0, 1, or 2 copies of the effect allele.

3. Multiply and sum

For each variant, multiply your allele count (0, 1, or 2) by the published effect weight. Sum all the products. The result is your raw polygenic risk score.

4. Compare to a reference population

Your raw score is compared against a distribution of scores from a reference population (typically thousands of genomes). This converts your raw score into a percentile.

A well-powered PRS for coronary artery disease, for example, might incorporate effect weights from over 6 million SNPs derived from GWAS of 1.1 million participants. Each individual weight is negligible. The aggregate tells a story that no single variant could.
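The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not any particular pipeline: the variant IDs, weights, genotype, and reference scores are all made-up values, and the function names raw_prs and percentile are hypothetical.

```python
from bisect import bisect_left

# Hypothetical effect weights keyed by variant ID (step 1).
weights = {"rs0001": 0.021, "rs0002": -0.015, "rs0003": 0.008}

# Hypothetical genotype: copies of the effect allele, 0, 1, or 2 (step 2).
genotype = {"rs0001": 2, "rs0002": 1, "rs0003": 0}


def raw_prs(weights, genotype):
    """Sum of allele count times effect weight over shared variants (step 3)."""
    return sum(w * genotype[v] for v, w in weights.items() if v in genotype)


def percentile(score, reference_scores):
    """Percent of reference scores strictly below the given score (step 4)."""
    ranked = sorted(reference_scores)
    return 100.0 * bisect_left(ranked, score) / len(ranked)


score = raw_prs(weights, genotype)

# Hypothetical raw scores from a small reference population.
reference = [-0.03, -0.01, 0.0, 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07]
pct = percentile(score, reference)

print(f"raw score: {score:.3f}, percentile: {pct:.0f}")
```

A real model applies the same multiply-and-sum over thousands to millions of variants, and the reference distribution comes from thousands of genomes rather than ten numbers.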

What Your Percentile Actually Means

This is where most people get confused, so it is worth being precise.

If you are at the 80th percentile for a condition, it means your genetic risk score is higher than 80% of the reference population. It does not mean you have an 80% chance of developing the disease.

Percentiles are about relative position, not absolute probability. Someone at the 95th percentile for type 2 diabetes has a higher genetic predisposition than 95% of people, but their actual lifetime risk depends on many additional factors: diet, exercise, body composition, age, and environment. Genetics loads the gun; lifestyle often pulls the trigger.

That said, percentiles at the extremes carry real clinical weight. Research in Nature Reviews Genetics (2025) showed that individuals in the top 5% of polygenic risk for coronary heart disease face a 2–3 fold increase in risk, comparable to carrying a monogenic familial hypercholesterolemia mutation. And unlike monogenic conditions, this applies to roughly 5% of the entire population, not a fraction of a percent.

1st-20th percentile: Lower risk
20th-80th percentile: Average
80th-99th percentile: Elevated risk
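As a small illustration, the three bands above could be expressed as a lookup. The function name risk_band is hypothetical, and the handling of the shared boundaries is an assumption: the bands as listed do not say which side the 20th and 80th percentiles fall on, so this sketch places them in the higher band.

```python
def risk_band(percentile):
    """Map a percentile (0-100) to one of the three bands.

    Assumption: exactly 20 counts as "Average" and exactly 80 counts as
    "Elevated risk"; the listed bands leave these boundaries ambiguous.
    """
    if not 0 <= percentile <= 100:
        raise ValueError("percentile must be between 0 and 100")
    if percentile < 20:
        return "Lower risk"
    if percentile < 80:
        return "Average"
    return "Elevated risk"


for p in (10, 50, 95):
    print(p, risk_band(p))
```

The band is only a label for relative position. It says nothing by itself about absolute probability of disease.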

The PGS Catalog: 3,550+ Peer-Reviewed Models

Polygenic risk scores are not invented by individual companies. They are developed by research hospitals and universities worldwide, then published in peer-reviewed journals. The PGS Catalog, maintained by the European Bioinformatics Institute (EMBL-EBI) and the University of Cambridge, serves as the central repository for these validated models.

As of 2026, the PGS Catalog contains over 3,550 published scoring models covering hundreds of conditions and traits. Each model includes the list of genetic variants, their effect weights, the study population, and validation metrics. This is open science: anyone can download the models, verify the methodology, and reproduce the results.
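To make "the list of genetic variants and their effect weights" concrete, here is a sketch of reading a scoring file. It assumes a layout in the style of PGS Catalog scoring files: metadata lines that start with '#', followed by a tab-separated table. The sample content, the PGS_EXAMPLE identifier, and the parse_scoring_file helper are all hypothetical, and real files may carry additional or different columns.

```python
import csv
import io

# Hypothetical file content: '#' metadata lines, then a tab-separated table.
SAMPLE = (
    "#pgs_id=PGS_EXAMPLE\n"
    "#trait_reported=Example trait\n"
    "rsID\teffect_allele\tother_allele\teffect_weight\n"
    "rs0001\tA\tG\t0.021\n"
    "rs0002\tT\tC\t-0.015\n"
)


def parse_scoring_file(text):
    """Return (metadata dict, list of variant dicts) from scoring-file text."""
    metadata = {}
    data_lines = []
    for line in text.splitlines():
        if line.startswith("#"):
            # Keep only 'key=value' metadata lines; skip decorative comments.
            if "=" in line:
                key, _, value = line[1:].partition("=")
                metadata[key.strip()] = value.strip()
        elif line.strip():
            data_lines.append(line)
    reader = csv.DictReader(io.StringIO("\n".join(data_lines)), delimiter="\t")
    variants = [
        {
            "rsID": row["rsID"],
            "effect_allele": row["effect_allele"],
            "effect_weight": float(row["effect_weight"]),
        }
        for row in reader
    ]
    return metadata, variants


metadata, variants = parse_scoring_file(SAMPLE)
print(metadata["pgs_id"], len(variants), "variants")
```

The effect allele matters as much as the weight: the allele count used in scoring has to count copies of that specific allele, not just any non-reference allele.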

The catalog spans models from institutions including the Broad Institute, UK Biobank, Mass General Brigham, the Finnish Institute for Health and Welfare (THL), and dozens of others. These are the same scoring models being integrated into clinical practice at hospitals worldwide.

What PRS Can Tell You

Polygenic risk scores now exist for a wide spectrum of conditions. Here are the areas where the evidence is strongest:

Cardiovascular Disease

Coronary artery disease, atrial fibrillation, stroke, hypertension. The CAD PRS is among the most clinically validated, with the MI-GENES trial demonstrating improved patient outcomes over a 10-year follow-up.

Type 2 Diabetes

PRS models identify individuals at elevated risk years before blood glucose levels become abnormal, opening a window for dietary and lifestyle intervention.

Cancer

Breast, prostate, colorectal, lung, and others. Polygenic scores complement traditional screening guidelines and can help determine when to begin mammography or PSA testing.

Neurodegenerative Disease

Alzheimer's disease PRS models incorporate effects beyond the well-known APOE gene, capturing polygenic risk that APOE-only testing misses entirely.

Mental Health

Schizophrenia, bipolar disorder, major depressive disorder, and anxiety. The schizophrenia PRS was one of the earliest to demonstrate strong predictive power.

Autoimmune Conditions

Type 1 diabetes, rheumatoid arthritis, inflammatory bowel disease, celiac disease, and lupus. Useful for understanding family patterns and early warning.

Limitations and What PRS Cannot Do

Polygenic risk scores are a powerful tool, but they are not a crystal ball. Understanding their limitations is just as important as understanding their strengths.

Ancestry Bias

The largest limitation today is that most PRS models were trained on cohorts of European ancestry. In the UK Biobank, which underlies many of the highest-quality models, approximately 94% of participants report a white ethnic background. This means PRS accuracy is highest for individuals of European descent and can be substantially less predictive for people of African, East Asian, South Asian, or Indigenous ancestry.

This is not a flaw in the math — it is a data problem. Linkage disequilibrium patterns (the way variants are inherited together) differ between populations, so effect weights derived from one ancestry group do not transfer perfectly to another. Major initiatives like the All of Us Research Program and the H3Africa Consortium are working to build more diverse training cohorts, but the gap remains significant in 2026.

Not Deterministic

A high polygenic risk score does not mean you will develop the condition. A low score does not guarantee you will not. PRS captures the genetic component of risk, but most common diseases are influenced by a complex interplay of genetics, environment, and behavior. Someone at the 99th percentile for type 2 diabetes who maintains a healthy weight and exercises regularly may never develop the condition. Someone at the 20th percentile who is sedentary and overweight might.

Score Quality Varies

Not all 3,550+ models in the PGS Catalog are equally well-powered. Some are derived from GWAS of over a million participants and have been validated across multiple independent cohorts. Others are based on smaller studies or less heritable traits. A PRS for height (which is roughly 80% heritable and has been studied in enormous cohorts) will be substantially more informative than a PRS for a condition studied in only a few thousand people.

How Imputation Makes Consumer DNA Data More Useful

If you have taken a DNA test through 23andMe, AncestryDNA, or MyHeritage, your raw data file contains genotypes at roughly 600,000 to 700,000 SNP positions. That sounds like a lot, but the human genome contains over 600 million known variant positions. Many PRS models require variants that your consumer chip never directly measured.

This is where genotype imputation comes in. Imputation uses statistical methods to infer the genotypes at positions your chip did not measure, based on patterns of linkage disequilibrium observed in large reference panels. Variants that tend to be inherited together can be predicted with high confidence from neighboring variants that were directly measured.
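The core intuition can be shown with a deliberately tiny toy: find reference haplotypes that agree with the measured alleles, then take the most common allele at the unmeasured position. This is not how production tools work (they use far more sophisticated statistical models over phased haplotypes and very large panels). The panel, the '?' placeholder convention, and the impute_allele function are hypothetical.

```python
from collections import Counter

# Hypothetical reference panel: alleles at three neighboring positions.
# Position 1 (the middle one) is the position the chip did not measure.
REFERENCE_HAPLOTYPES = [
    "ACG", "ACG", "ACG", "ATG",
    "GTT", "GTT", "GTT", "GCT",
]


def impute_allele(observed, missing_index, panel):
    """Guess the allele at missing_index from matching reference haplotypes.

    observed: allele string with '?' at the unmeasured position.
    Returns (most common allele, share of matching haplotypes carrying it),
    or (None, 0.0) when no reference haplotype matches the measured alleles.
    """
    matches = [
        hap for hap in panel
        if all(o == h for i, (o, h) in enumerate(zip(observed, hap))
               if i != missing_index)
    ]
    if not matches:
        return None, 0.0
    counts = Counter(hap[missing_index] for hap in matches)
    allele, n = counts.most_common(1)[0]
    return allele, n / len(matches)


allele, confidence = impute_allele("A?G", 1, REFERENCE_HAPLOTYPES)
print(allele, confidence)
```

In this toy panel, three of the four haplotypes that carry A and G at the flanking positions carry C in the middle, so C is the guess, with the matching share serving as a rough confidence.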

Modern imputation tools like Beagle 5.5 can expand your 700,000 directly genotyped variants to over 28 million imputed variants with high accuracy. This dramatically increases the number of PRS model weights that can be applied to your data, improving the accuracy and coverage of your risk scores.

Without imputation, a consumer DNA file might match only 15–40% of the variants in a given PRS model. With imputation, coverage typically jumps to 85–95%. That difference can change a score from uninformative noise to a clinically meaningful signal.
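Coverage here is simply the share of a model's variants that appear in your data. A minimal sketch, using made-up variant IDs and a hypothetical model_coverage helper:

```python
def model_coverage(model_variants, available_variants):
    """Fraction of a model's variants that are present in the available data."""
    model = set(model_variants)
    if not model:
        raise ValueError("model has no variants")
    return len(model & set(available_variants)) / len(model)


# Hypothetical variant IDs, for illustration only.
model = ["rs1", "rs2", "rs3", "rs4", "rs5"]
chip_only = ["rs1", "rs4", "rs9"]
chip_plus_imputed = ["rs1", "rs2", "rs3", "rs4", "rs9"]

before = model_coverage(model, chip_only)          # 2 of 5 model variants
after = model_coverage(model, chip_plus_imputed)   # 4 of 5 model variants

print(f"before imputation: {before:.0%}, after: {after:.0%}")
```

Variants in your file that the model does not use (rs9 in this toy data) do not affect coverage in either direction.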

A Real-World Example

Consider someone who uploads their consumer DNA data and receives a polygenic risk score at the 90th percentile for coronary artery disease (CAD). What does this actually mean in practice?

Their genetic predisposition to CAD is higher than 90% of the reference population. Published research suggests this corresponds to roughly a 1.5–2x increase in relative risk compared to the population average. For a 45-year-old man, this might shift the conversation with his physician toward earlier and more frequent lipid panels, a coronary artery calcium (CAC) scan, or more aggressive management of modifiable risk factors like LDL cholesterol.
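To see why a 1.5–2x relative risk is not the same thing as a high absolute risk, it helps to apply it to a baseline. The sketch below is illustrative arithmetic only: the 10% baseline is a made-up number, not an epidemiological estimate, and multiplying a baseline by a relative risk is a simplification that real risk calculators refine with age, sex, and clinical factors. The absolute_risk function is hypothetical.

```python
def absolute_risk(baseline_risk, relative_risk):
    """Scale a baseline absolute risk by a relative risk, capped at 1.0."""
    if not 0 <= baseline_risk <= 1:
        raise ValueError("baseline_risk must be a probability")
    return min(1.0, baseline_risk * relative_risk)


# Hypothetical baseline: a 10% risk over some time horizon.
BASELINE = 0.10

low = absolute_risk(BASELINE, 1.5)   # lower end of the 1.5-2x range
high = absolute_risk(BASELINE, 2.0)  # upper end of the 1.5-2x range

print(f"baseline {BASELINE:.0%} -> {low:.0%} to {high:.0%}")
```

Under these assumed numbers, most people at the 90th percentile would still not develop the disease, which is why the score informs screening decisions rather than predicting outcomes.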

The MI-GENES randomized controlled trial, published in Circulation, demonstrated exactly this scenario in practice. Patients who received PRS-integrated risk assessments had lower LDL cholesterol levels and fewer cardiovascular events over a 10-year follow-up compared to those who received conventional risk assessment alone. The genetic information did not change their DNA — it changed their behavior and their physician’s clinical decisions.

This is the core promise of polygenic risk scores: not predicting the future with certainty, but identifying people who would benefit from earlier screening, more proactive monitoring, or lifestyle modifications — before symptoms ever appear.

See Your Polygenic Risk Scores

Helix Sequencing scores your DNA against all 3,550+ PGS Catalog models. Deep imputation expands your consumer chip data to 28M+ variants. Ancestry-matched reference populations. Full report with percentiles, clinical context, and a personalized longevity protocol.

No account required. Zero data retention. Your file is deleted after analysis.

Key Takeaways

A polygenic risk score aggregates the effects of thousands of genetic variants into one number that represents your predisposition to a specific condition.

PRS captures common genetic risk that single-gene tests like BRCA miss entirely, and it applies to the entire population, not just rare mutation carriers.

Your percentile is a ranking, not a probability. The 80th percentile means higher genetic risk than 80% of people, not an 80% chance of disease.

The PGS Catalog contains 3,550+ peer-reviewed scoring models from leading research institutions worldwide.

Imputation expands consumer DNA chip data from ~700K to 28M+ variants, dramatically improving PRS accuracy.

Ancestry bias is a real limitation. Most models are trained on European cohorts, and accuracy decreases for other ancestries.

PRS is not destiny. It identifies who should be screened earlier and more carefully, not who will or will not get sick.

Further Reading

  • PGS Catalog — Browse all 3,550+ published polygenic scoring models
  • Kullo, I.J. et al. “Clinical use of polygenic risk scores.” Nature Reviews Genetics (2025)
  • Khera, A.V. et al. “Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations.” Nature Genetics (2018)
  • Helix Sequencing Engineering Journal — How we built and validated our PRS pipeline

Get Your Full Genetic Analysis

Upload your existing DNA file from 23andMe, AncestryDNA, or MyHeritage. Get 3,550+ polygenic risk scores, pharmacogenomics for 34 genes, and an AI-generated longevity protocol. Connect your genome to Claude or ChatGPT.
