What DNA files do you accept?

We accept raw data exports from 23andMe (all versions), AncestryDNA, MyHeritage, LivingDNA, and FamilyTreeDNA. Files can be .txt, .csv, .tsv, .zip, or .gz format. Most consumer DNA chips contain 600K–900K variants.
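
For readers curious what such an export looks like, here is a minimal sketch of reading one. It assumes the tab-separated 23andMe-style layout (rsid, chromosome, position, genotype, with `#` comment lines); other vendors use different column layouts. The function names `parse_raw_genotypes` and `open_text` are hypothetical and this is not Helix's actual parser.

```python
import gzip
from typing import Dict, Iterable


def parse_raw_genotypes(lines: Iterable[str]) -> Dict[str, str]:
    """Parse 23andMe-style rows: rsid, chromosome, position, genotype."""
    genotypes: Dict[str, str] = {}
    for line in lines:
        line = line.strip()
        # Skip blank lines and the '#'-prefixed comment header.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 4:
            continue  # malformed row; a real parser might log this
        rsid, _chrom, _pos, genotype = fields[:4]
        # "--" marks a no-call in this layout, so there is nothing to store.
        if genotype == "--":
            continue
        genotypes[rsid] = genotype
    return genotypes


def open_text(path: str):
    """Open a plain or gzip-compressed text export for reading."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt", encoding="utf-8")
    return open(path, "r", encoding="utf-8")


# Illustrative rows only; a real export has hundreds of thousands of lines.
sample = [
    "# rsid\tchromosome\tposition\tgenotype",
    "rs4477212\t1\t82154\tAA",
    "rs3094315\t1\t752566\tAG",
    "rs3131972\t1\t752721\t--",
]
calls = parse_raw_genotypes(sample)
print(calls)
```

To read a file from disk, pass the handle from `open_text(path)` to `parse_raw_genotypes`, since the parser accepts any iterable of lines.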

How accurate are polygenic risk scores?

Every score uses peer-reviewed PGS Catalog models selected for report-grade coverage and relevance. Each score shows percentile direction and coverage. With imputation enabled, variant coverage typically rises from roughly 30% to 85–95%, making scores significantly more useful. PRS show relative genetic tendency compared to the population.

What happens to my DNA data after analysis?

Your DNA file is deleted immediately after analysis completes. All intermediate files are purged within 2 hours. You receive a SHA-256 Data Deletion Certificate proving destruction. We do not store, sell, or share your genetic data.

What’s included in the report?

300+ carefully curated polygenic risk scores across cardiovascular, cancer, metabolic, neurological, immune, and trait categories, surfaced after coverage and quality filtering. ClinVar variant scanning, pharmacogenomics, protein pathway context, training and diet guidance, and downloadable evidence packs are delivered as an interactive HTML report.

What is imputation?

Consumer DNA chips test only about 700K positions out of the roughly 3 billion base pairs in your genome. Imputation uses a reference panel to infer additional positions that were not directly measured. This improves polygenic risk score coverage, especially when key model variants are not present on the original chip.

How long does analysis take?

Without imputation: 2–5 minutes. With imputation enabled: 15–45 minutes depending on server load. You’ll receive your report by email when it’s ready.

Is this a medical diagnostic tool?

No. Helix Sequencing is for research and educational purposes. Results should be discussed with a healthcare provider before making medical decisions. Polygenic risk scores show relative genetic predisposition, not diagnoses.

Can I compare my DNA with a family member?

Yes. Our DNA Compare feature lets you trace variant inheritance between two related individuals — see which alleles came from which parent, identify shared risk factors, and explore inherited traits side by side.

Polygenic Risk Scores Explained

How thousands of tiny genetic effects combine into a single number that can reshape how you think about disease prevention.

Updated March 29, 2026 · 9 min read

What Is a Polygenic Risk Score?

A polygenic risk score (PRS) is a single number that summarizes the combined effect of many genetic variants on your risk for a particular disease or trait. Instead of looking at one gene in isolation, a PRS considers thousands — sometimes millions — of positions across your entire genome, each contributing a tiny nudge toward higher or lower risk.

Think of it like a credit score for a specific health condition. No single factor determines your score. Instead, hundreds of small data points are weighted and added together to produce one number that places you on a spectrum relative to the general population.

The concept behind polygenic risk scores emerged from genome-wide association studies (GWAS), which scan the DNA of hundreds of thousands of people to find which genetic variants are statistically associated with specific conditions. Most individual variants have negligible effects on their own. A single SNP might shift your risk for coronary artery disease by 0.02%. But when you aggregate the effects of 100,000 such variants, the combined signal becomes medically meaningful.

How PRS Differs from Single-Gene Testing

Most people are familiar with genetic testing through high-profile single-gene examples. A BRCA1 or BRCA2 mutation, for instance, can increase a woman’s lifetime breast cancer risk to 60–80%. These are rare, high-impact variants: a single broken gene with dramatic consequences.

Polygenic risk scores work differently. They capture the common end of the genetic risk spectrum: thousands of variants that each shift risk by a fraction of a percent. Individually, none of them would concern a geneticist. Collectively, they can rival or even exceed the predictive power of monogenic mutations for some conditions.

	Single-Gene Test	Polygenic Risk Score
Variants examined	1 to a handful	Thousands to millions
Individual effect size	Large (e.g. 5-80x risk)	Tiny (e.g. 1.01-1.05x each)
Population frequency	Rare (< 1% carry it)	Common (everyone has a score)
Example	BRCA2 for breast cancer	PRS for breast cancer
Who benefits	Rare mutation carriers	The entire population

This distinction matters because most disease risk is polygenic, not monogenic. Only a small fraction of the population carries a BRCA mutation, but everyone has a polygenic risk score for breast cancer. Research published in Nature Genetics has shown that women in the top 1% of polygenic risk for breast cancer face a lifetime risk comparable to that of BRCA2 carriers, yet they would never be flagged by traditional single-gene testing.

How a Polygenic Risk Score Is Calculated

The mathematics behind a PRS is conceptually straightforward, even when the scale is massive. Here is what happens:

Identify associated variants

Researchers run a GWAS on a large cohort (often 500,000+ people) and identify which SNPs are associated with the trait. Each SNP gets an effect weight representing how much it shifts risk.

Genotype the individual

Your DNA is read at each of those variant positions. At every SNP, you carry 0, 1, or 2 copies of the effect allele.

Multiply and sum

For each variant, multiply your allele count (0, 1, or 2) by the published effect weight. Sum all the products. The result is your raw polygenic risk score.

Compare to a reference population

Your raw score is compared against a distribution of scores from a reference population (typically thousands of genomes). This converts your raw score into a percentile.

A well-powered PRS for coronary artery disease, for example, might incorporate effect weights from over 6 million SNPs derived from GWAS of 1.1 million participants. Each individual weight is negligible. The aggregate tells a story that no single variant could.
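
The multiply-and-sum and percentile steps above can be sketched in a few lines. This is a toy illustration under stated assumptions: the three-variant `MODEL`, its weights, the rsIDs, and the reference scores are all made up, and the sketch ignores practical concerns such as strand alignment and multi-base alleles. It is not the scoring pipeline used for the report.

```python
from bisect import bisect_left
from typing import Dict, List, Tuple

# Hypothetical model: rsid -> (effect allele, effect weight).
MODEL: Dict[str, Tuple[str, float]] = {
    "rs0001": ("A", 0.12),
    "rs0002": ("G", -0.05),
    "rs0003": ("T", 0.30),
}


def raw_score(genotypes: Dict[str, str],
              model: Dict[str, Tuple[str, float]]) -> float:
    """Sum of (effect-allele count x effect weight) over model variants."""
    total = 0.0
    for rsid, (effect_allele, weight) in model.items():
        genotype = genotypes.get(rsid)
        if genotype is None:
            continue  # variant not measured; it contributes nothing
        dosage = genotype.count(effect_allele)  # 0, 1, or 2 copies
        total += dosage * weight
    return total


def percentile(score: float, reference_scores: List[float]) -> float:
    """Percentage of reference scores strictly below the given score."""
    ordered = sorted(reference_scores)
    return 100.0 * bisect_left(ordered, score) / len(ordered)


person = {"rs0001": "AA", "rs0002": "AG", "rs0003": "CC"}
score = raw_score(person, MODEL)  # 2*0.12 + 1*(-0.05) + 0*0.30
reference = [-0.10, 0.00, 0.07, 0.15, 0.25, 0.31, 0.42, 0.55, 0.60, 0.74]
print(round(score, 2), percentile(score, reference))
```

A real reference distribution would contain thousands of scores rather than ten, but the ranking logic is the same.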

What Your Percentile Actually Means

This is where most people get confused, so it is worth being precise.

If you are at the 80th percentile for a condition, it means your genetic risk score is higher than 80% of the reference population. It does not mean you have an 80% chance of developing the disease.

Percentiles are about relative position, not absolute probability. Someone at the 95th percentile for type 2 diabetes has a higher genetic predisposition than 95% of people, but their actual lifetime risk depends on many additional factors: diet, exercise, body composition, age, and environment. Genetics loads the gun; lifestyle often pulls the trigger.

That said, percentiles at the extremes carry real clinical weight. Research in Nature Reviews Genetics (2025) showed that individuals in the top 5% of polygenic risk for coronary heart disease face a 2–3 fold increase in risk, comparable to carrying a monogenic familial hypercholesterolemia mutation. And unlike monogenic conditions, this applies to roughly 5% of the entire population, not a fraction of a percent.

1st-20th percentile: Lower risk

20th-80th percentile: Average

80th-99th percentile: Elevated risk
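
The three bands above amount to a simple lookup. The sketch below uses a hypothetical `risk_band` function. Because the published ranges share their endpoints (20th and 80th), assigning those boundary values to the upper band is an assumption made here for illustration.

```python
def risk_band(percentile: float) -> str:
    """Map a percentile (0-100) to one of the three report bands."""
    if not 0 <= percentile <= 100:
        raise ValueError("percentile must be between 0 and 100")
    if percentile < 20:
        return "Lower risk"
    if percentile < 80:  # assumption: exactly 20 counts as Average
        return "Average"
    return "Elevated risk"  # assumption: exactly 80 counts as Elevated


for p in (10, 50, 95):
    print(p, risk_band(p))
```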

The PGS Catalog and Curated Report PRS

Polygenic risk scores are not invented by individual companies. They are developed by research hospitals and universities worldwide, then published in peer-reviewed journals. The PGS Catalog, maintained by the European Bioinformatics Institute (EMBL-EBI) and the University of Cambridge, serves as the central repository for these validated models.

Helix surfaces 300+ carefully curated report PRS covering high-value conditions and traits after coverage and quality filtering. Each underlying model includes genetic variants, effect weights, study population details, and validation metrics. This is open science — anyone can download the source models, verify the methodology, and reproduce the results.

The catalog spans models from institutions including the Broad Institute, UK Biobank, Mass General Brigham, the Finnish Institute for Health and Welfare (THL), and dozens of others. These are the same scoring models being integrated into clinical practice at hospitals worldwide.

What PRS Can Tell You

Polygenic risk scores now exist for a wide spectrum of conditions. Here are the areas where the evidence is strongest:

Cardiovascular Disease

Coronary artery disease, atrial fibrillation, stroke, hypertension. The CAD PRS is among the most clinically validated, with the MI-GENES trial demonstrating improved patient outcomes over a 10-year follow-up.

Type 2 Diabetes

PRS models identify individuals at elevated risk years before blood glucose levels become abnormal, opening a window for dietary and lifestyle intervention.

Cancer

Breast, prostate, colorectal, lung, and others. Polygenic scores complement traditional screening guidelines and can help determine when to begin mammography or PSA testing.

Neurodegenerative Disease

Alzheimer's disease PRS models incorporate effects beyond the well-known APOE gene, capturing polygenic risk that APOE-only testing misses entirely.

Mental Health

Schizophrenia, bipolar disorder, major depressive disorder, and anxiety. The schizophrenia PRS was one of the earliest to demonstrate strong predictive power.

Autoimmune Conditions

Type 1 diabetes, rheumatoid arthritis, inflammatory bowel disease, celiac disease, and lupus. Useful for understanding family patterns and early warning.

Limitations and What PRS Cannot Do

Polygenic risk scores are a powerful tool, but they are not a crystal ball. Understanding their limitations is just as important as understanding their strengths.

Ancestry Bias

The largest limitation today is that most PRS models were trained on cohorts of European ancestry. The UK Biobank, which underlies many of the highest-quality models, is approximately 94% white British. This means PRS accuracy is highest for individuals of European descent and can be substantially less predictive for people of African, East Asian, South Asian, or Indigenous ancestry.

This is not a flaw in the math — it is a data problem. Linkage disequilibrium patterns (the way variants are inherited together) differ between populations, so effect weights derived from one ancestry group do not transfer perfectly to another. Major initiatives like the All of Us Research Program and the H3Africa Consortium are working to build more diverse training cohorts, but the gap remains significant in 2026.

Not Deterministic

A high polygenic risk score does not mean you will develop the condition. A low score does not guarantee you will not. PRS captures the genetic component of risk, but most common diseases are influenced by a complex interplay of genetics, environment, and behavior. Someone at the 99th percentile for type 2 diabetes who maintains a healthy weight and exercises regularly may never develop the condition. Someone at the 20th percentile who is sedentary and overweight might.

Score Quality Varies

Not all PRS models in the PGS Catalog, including those behind the 300+ curated report scores, are equally well-powered. Some are derived from GWAS of over a million participants and have been validated across multiple independent cohorts. Others are based on smaller studies or less heritable traits. A PRS for height (which is roughly 80% heritable and has been studied in enormous cohorts) will be substantially more informative than a PRS for a condition studied in only a few thousand people.

How Imputation Makes Consumer DNA Data More Useful

If you have taken a DNA test through 23andMe, AncestryDNA, or MyHeritage, your raw data file contains genotypes at roughly 600,000 to 700,000 SNP positions. That sounds like a lot, but the human genome contains over 600 million known variant positions. Many PRS models require variants that your consumer chip never directly measured.

This is where genotype imputation comes in. Imputation uses statistical methods to infer the genotypes at positions your chip did not measure, based on patterns of linkage disequilibrium observed in large reference panels. Variants that tend to be inherited together can be predicted with high confidence from neighboring variants that were directly measured.

Modern imputation tools like Beagle 5.5 can turn your directly genotyped chip data into coverage-expanded imputed data with high accuracy. This dramatically increases the number of PRS model weights that can be applied to your data, improving the accuracy and coverage of your risk scores.

Without imputation, a consumer DNA file might match only 15–40% of the variants in a given PRS model. With imputation, coverage typically jumps to 85–95%. That difference can change a score from uninformative noise to a clinically meaningful signal.
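
Coverage here is simply the share of a model's variants that can be found in your data. A minimal sketch, assuming a hypothetical 20-variant model and made-up variant IDs chosen so the numbers fall in the ranges quoted above:

```python
from typing import Iterable


def model_coverage(model_variants: Iterable[str],
                   available_variants: Iterable[str]) -> float:
    """Percentage of model variants present in the available genotypes."""
    model = set(model_variants)
    if not model:
        raise ValueError("model has no variants")
    matched = model & set(available_variants)
    return 100.0 * len(matched) / len(model)


# Hypothetical IDs: 20 model variants, 6 on the chip, 18 after imputation.
model_ids = [f"rs{i}" for i in range(1, 21)]
chip_ids = model_ids[:6]
imputed_ids = model_ids[:18]

print(model_coverage(model_ids, chip_ids))     # chip only
print(model_coverage(model_ids, imputed_ids))  # chip plus imputed
```

With these toy inputs the chip alone matches 30% of the model and the imputed data matches 90%.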

A Real-World Example

Consider someone who uploads their consumer DNA data and receives a polygenic risk score at the 90th percentile for coronary artery disease (CAD). What does this actually mean in practice?

Their genetic predisposition to CAD is higher than 90% of the reference population. Published research suggests this corresponds to roughly a 1.5–2x increase in relative risk compared to the population average. For a 45-year-old man, this might shift the conversation with his physician toward earlier and more frequent lipid panels, a coronary artery calcium (CAC) scan, or more aggressive management of modifiable risk factors like LDL cholesterol.
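
To see why a relative risk is not the same as a probability of disease, it helps to multiply it against a baseline. The sketch below is arithmetic only: the 10% baseline is a made-up figure for illustration, not a published estimate, and `absolute_risk` is a hypothetical helper.

```python
def absolute_risk(baseline_risk: float, relative_risk: float) -> float:
    """Scale a population-average risk by a relative risk, capped at 100%."""
    if not 0.0 <= baseline_risk <= 1.0:
        raise ValueError("baseline_risk must be a probability between 0 and 1")
    if relative_risk < 0:
        raise ValueError("relative_risk cannot be negative")
    return min(1.0, baseline_risk * relative_risk)


baseline = 0.10  # hypothetical population-average risk, for illustration only
for rr in (1.5, 2.0):
    print(f"relative risk {rr}x -> absolute risk {absolute_risk(baseline, rr):.0%}")
```

Under that assumed baseline, a 1.5–2x relative risk corresponds to an absolute risk of about 15–20%, which is elevated but far from a certainty.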

The MI-GENES randomized controlled trial, published in Circulation, demonstrated exactly this scenario in practice. Patients who received PRS-integrated risk assessments had lower LDL cholesterol levels and fewer cardiovascular events over a 10-year follow-up compared to those who received conventional risk assessment alone. The genetic information did not change their DNA — it changed their behavior and their physician’s clinical decisions.

This is the core promise of polygenic risk scores: not predicting the future with certainty, but identifying people who would benefit from earlier screening, more proactive monitoring, or lifestyle modifications — before symptoms ever appear.

See Your Polygenic Risk Scores

Helix surfaces 300+ curated report scores. Imputation improves PRS model coverage when enabled. Ancestry-matched reference populations. Full report with percentiles, clinical context, and prevention-focused report guidance.

Upload Your DNA File

No account required. Zero data retention. Your file is deleted after analysis.

Key Takeaways

A polygenic risk score aggregates the effects of thousands of genetic variants into one number that represents your predisposition to a specific condition.

PRS captures common genetic risk that single-gene tests like BRCA miss entirely, and it applies to the entire population, not just rare mutation carriers.

Your percentile is a ranking, not a probability. The 80th percentile means higher genetic risk than 80% of people, not an 80% chance of disease.

Helix surfaces 300+ carefully curated report scores from peer-reviewed polygenic models.

Imputation expands the variant coverage of consumer DNA chip data, which can substantially improve PRS accuracy.

Ancestry bias is a real limitation. Most models are trained on European cohorts, and accuracy decreases for other ancestries.

PRS is not destiny. It identifies who should be screened earlier and more carefully, not who will or will not get sick.