Disclosure: Some links on this page are affiliate links. If you purchase through these links, we may earn a commission at no extra cost to you.


GLP-1 Clinical Trial Endpoints Explained


By The RX Index Editorial Team · Last verified: May 14, 2026 · Educational content. Not medical advice.

Figure: How to spot a real GLP-1 clinical trial: understanding endpoints, placebo arms, denominators, and what the headlines actually mean

GLP-1 clinical trial endpoints explained in one breath: an endpoint is the specific outcome a trial was designed to measure, and the endpoint that matters depends on what the drug was being tested for. Weight-loss trials track percent body-weight change. Diabetes trials track A1C. Heart trials track MACE (major adverse cardiovascular events). Kidney trials track a kidney composite. Liver trials track biopsy findings. When a press release says a drug “met its primary endpoint,” those four words tell you almost nothing until you know what the endpoint actually was.

By the end of this page, you'll read GLP-1 trial news with the same eye a cautious clinician does. No stats degree required. Skip to the cross-trial table if you came for the numbers.


The GLP-1 trial headline cheat sheet

Bookmark this section. Every GLP-1 press release maps to one of these patterns.

If the headline says… | It usually means… | Before you trust it, check…
"Met primary endpoint" | The main pre-specified analysis was positive. | Was the endpoint patient-important or just a surrogate? Was it composite?
"Average weight loss was 22.5%" | Mean percent change in body weight from baseline. | Population. Timepoint. Comparator. Which estimand (efficacy vs. treatment-regimen).
"63% achieved ≥20% weight loss" | A responder threshold: the share who crossed a cutoff. | Also look at the mean. Thresholds can flatter the result.
"Reduced MACE by 20%" | A relative reduction in a cardiovascular event composite. | The absolute numbers. Baseline risk. Follow-up duration. Population studied.
"Slowed kidney disease" | A composite kidney outcome, usually kidney failure, sustained eGFR decline, or kidney/CV death. | Did the patients already have kidney disease? Diabetes? Both?
"Resolved MASH" / "improved fibrosis" | Liver biopsy findings at a fixed timepoint. | Whether long-term clinical outcomes (cirrhosis, death) have been confirmed yet.
"Improved sleep apnea" | Usually a change in AHI (breathing events per hour of sleep). | PAP-machine status. Baseline AHI severity.
"Versus placebo, p<0.001" | Statistical significance: unlikely to be chance. | Statistical ≠ clinically meaningful. Always check the absolute difference.

The fastest safe read: endpoint + population + comparator + timepoint + estimand. If those five things aren't on the table, you can't compare a headline to anything else.

GLP-1 clinical trial endpoints explained: what counts as success?

A clinical trial endpoint is the pre-specified outcome a study analyzes to decide whether a drug worked or was safe. The endpoint is locked into the protocol before the trial starts — researchers cannot move the goalposts after seeing the data. The U.S. National Center for Advancing Translational Sciences (NCATS) defines an endpoint as a targeted outcome that is statistically analyzed to help determine the efficacy and safety of an intervention.

Primary endpoint

The one that determines whether the trial succeeded or failed. The trial is statistically powered to answer this question. If the primary endpoint misses, positive secondary results are generally treated as hypothesis-generating, not as proof the drug worked.

Secondary endpoint

Supporting questions the trial also measured. Useful additional information, but the trial usually wasn't designed to live or die by them. Don't lead with secondaries.

Exploratory endpoint

Hypothesis-generating signals. Useful for steering the next trial. Not strong enough on their own to base a decision on.

Co-primary endpoints

Two primary endpoints, both usually required to pass. SURMOUNT-1: mean % weight change AND proportion losing ≥5%. ESSENCE: MASH resolution AND fibrosis improvement.

Composite endpoint

A single endpoint built from several events counted together. SELECT's MACE = CV death + non-fatal MI + non-fatal stroke. Always check that the components moved in the same direction.
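To make the "counted together" idea concrete, here is a minimal sketch of how a time-to-first-event composite can be tallied. The participant records and field names are hypothetical, invented for illustration; they are not SELECT data, and real trials use adjudicated events and survival models rather than a simple count.

```python
from typing import Optional

# Hypothetical follow-up records (assumption, not trial data): months from
# randomization to each component event, or None if it never occurred.
participants = [
    {"cv_death": None, "nonfatal_mi": 14.0, "nonfatal_stroke": 30.0},
    {"cv_death": None, "nonfatal_mi": None, "nonfatal_stroke": None},
    {"cv_death": 22.5, "nonfatal_mi": 9.0, "nonfatal_stroke": None},
]


def first_composite_event(
    record: dict[str, Optional[float]],
) -> Optional[tuple[str, float]]:
    """Return (component, time) for the earliest component event, or None."""
    occurred = [(name, t) for name, t in record.items() if t is not None]
    if not occurred:
        return None
    return min(occurred, key=lambda pair: pair[1])


first_events = [first_composite_event(p) for p in participants]

# Each participant contributes at most one composite event: their first.
composite_count = sum(event is not None for event in first_events)

# Tally which component was the first event, to see what drives the composite.
component_counts: dict[str, int] = {}
for event in first_events:
    if event is not None:
        component_counts[event[0]] = component_counts.get(event[0], 0) + 1

print(composite_count)
print(component_counts)
```

In this toy data, both composite events are heart attacks. A composite can look strong while one component does all the work, which is why the per-component breakdown is worth reading.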

Don't stop at “met primary endpoint.” Ask: the primary endpoint of what, in whom, compared with what, at what timepoint, and analyzed how?

What does “met its primary endpoint” mean in a GLP-1 trial?

It means the trial's main pre-specified analysis was positive at the statistical threshold the protocol set — usually p<0.05, or stricter when the trial used multiplicity adjustment. That's it. That's the claim.

What it does mean

  • ✓ The main pre-specified analysis was positive
  • ✓ The result was statistically significant at the protocol threshold
  • ✓ The trial was designed to answer this specific question

What it does NOT mean

  • ✕ The result is clinically meaningful for any individual
  • ✕ The drug is approved (or will be)
  • ✕ Safety endpoints came in clean
  • ✕ Other doses or populations see the same result
  • ✕ The result holds at longer follow-up

The next question is always the same: primary endpoint of what, in whom, compared with what, at what timepoint, and analyzed how?
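The "stricter when the trial used multiplicity adjustment" point can be illustrated with the simplest adjustment there is, a Bonferroni split of the significance level. This is a sketch under an assumption: real GLP-1 trials often use other schemes (for example, hierarchical testing orders), so treat the function names and the choice of Bonferroni as illustrative only.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold when alpha is split evenly."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def passes(p_value: float, alpha: float = 0.05, n_tests: int = 1) -> bool:
    """True if the p-value clears the Bonferroni-adjusted threshold."""
    return p_value < bonferroni_threshold(alpha, n_tests)


# The same hypothetical p-value can pass alone but fail once the
# significance level is shared across two endpoints.
single_test = passes(0.03, n_tests=1)
two_tests = passes(0.03, n_tests=2)
print(single_test, two_tests)
```

The takeaway is only that "p<0.05" is not a universal bar: the threshold that counts is the one the protocol set.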


The endpoint families you'll see across GLP-1 trials

Knowing which bucket a trial is in tells you which endpoint to look for first. GLP-1 trials sit in a handful of buckets, sometimes overlapping.

1. Weight endpoints (obesity trials)

The standard primary endpoint is mean percent change in body weight from baseline, measured at a fixed week (usually 68, 72, or 88). The FDA's January 2025 draft guidance on obesity drug development points to mean percent body-weight change as the primary efficacy assessment. Responder thresholds (≥5%, ≥10%, ≥15%, ≥20%, ≥25%) are used as co-primary or key secondary endpoints.
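A mean and a responder threshold are two summaries of the same underlying data. The sketch below uses a small list of made-up percent changes (hypothetical, not from any trial) to show how both are computed.

```python
from statistics import mean

# Hypothetical percent body-weight changes from baseline (negative = loss).
changes = [-25.0, -22.0, -21.0, -8.0, -4.0, -2.0, 1.0, -20.0]


def responder_share(pct_changes: list[float], threshold_pct: float) -> float:
    """Share of participants who lost at least threshold_pct of body weight."""
    responders = sum(c <= -threshold_pct for c in pct_changes)
    return responders / len(pct_changes)


mean_change = mean(changes)
share_5 = responder_share(changes, 5)
share_20 = responder_share(changes, 20)

print(f"Mean change: {mean_change:.1f}%")
print(f"Share losing >=5%: {share_5:.0%}")
print(f"Share losing >=20%: {share_20:.0%}")
```

In this invented sample, half the group clears the 20% threshold while the mean loss is under 13%, which is why the two figures should be read together.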

2. Glycemic endpoints (A1C) (type 2 diabetes trials)

The standard primary endpoint is mean change in A1C -- the hemoglobin A1c blood test reflecting average blood-sugar level over the previous 2-3 months. Secondary glycemic endpoints include fasting plasma glucose, the share reaching A1C targets (<7.0% or ≤6.5%), and hypoglycemia rates.

3. Cardiovascular endpoints (MACE) (cardiovascular outcome trials, CVOTs)

The standard primary endpoint is 3-point MACE: a composite of cardiovascular death, non-fatal myocardial infarction, and non-fatal stroke. Some trials use 4-point MACE, which adds hospitalization for unstable angina or heart failure. CVOTs drove much of the GLP-1 story after 2008, when FDA guidance began asking sponsors of new diabetes drugs to demonstrate cardiovascular safety.

4. Kidney endpoints (renal outcome trials)

The standard primary endpoint is a composite -- typically time to kidney failure (eGFR <15 or dialysis/transplant), persistent ≥50% reduction in eGFR, or death from kidney or cardiovascular causes. Confirmatory secondaries include eGFR slope (rate of filtering decline per year) and changes in albuminuria.
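For intuition on eGFR slope, here is a sketch that fits a straight line to yearly eGFR readings with ordinary least squares. The readings are hypothetical and the per-arm "average" series is a simplification; trials estimate slopes with mixed models across thousands of participants.

```python
def slope_per_year(years: list[float], egfr: list[float]) -> float:
    """Ordinary least-squares slope of eGFR (mL/min/1.73 m²) per year."""
    if len(years) != len(egfr) or len(years) < 2:
        raise ValueError("need at least two paired measurements")
    x_mean = sum(years) / len(years)
    y_mean = sum(egfr) / len(egfr)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(years, egfr))
    sxx = sum((x - x_mean) ** 2 for x in years)
    return sxy / sxx


# Hypothetical average eGFR readings at years 0-3 (not trial data).
years = [0.0, 1.0, 2.0, 3.0]
placebo_egfr = [60.0, 57.0, 54.0, 51.0]
treated_egfr = [60.0, 58.0, 56.0, 54.0]

placebo_slope = slope_per_year(years, placebo_egfr)
treated_slope = slope_per_year(years, treated_egfr)
slope_difference = treated_slope - placebo_slope

print(placebo_slope, treated_slope, slope_difference)
```

Both slopes are negative here: the treated group still loses kidney function, just more slowly. A positive between-arm difference means "slower decline", not "recovery".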

5. Liver histology endpoints (MASH trials)

MASH is the newer name for progressive fatty liver disease (formerly NASH). Primary endpoints are biopsy-based: resolution of steatohepatitis without worsening fibrosis, and improvement in fibrosis without worsening steatohepatitis. FDA accepts these histologic endpoints under accelerated approval but requires longer follow-up for hard clinical outcomes.

6. Heart-failure endpoints (HFpEF trials: STEP-HFpEF, SUMMIT)

The KCCQ-CSS (Kansas City Cardiomyopathy Questionnaire Clinical Summary Score) measures how much heart failure limits daily life -- higher is better. SUMMIT also tested a composite of cardiovascular death or worsening heart-failure events, which is not the same as a standalone mortality endpoint.

7. Sleep apnea endpoints (SURMOUNT-OSA)

The primary endpoint is change in AHI -- the apnea-hypopnea index, or the number of breathing pauses and shallow-breathing events per hour of sleep. Lower is better. An AHI improvement is not the same as eliminating need for PAP therapy.
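AHI is a simple rate, so a change in AHI is easy to reproduce by hand. The sketch below uses hypothetical sleep-study counts (not SURMOUNT-OSA data) and an illustrative helper name.

```python
def ahi(apneas: int, hypopneas: int, hours_slept: float) -> float:
    """Apnea-hypopnea index: breathing events per hour of sleep."""
    if hours_slept <= 0:
        raise ValueError("hours_slept must be positive")
    return (apneas + hypopneas) / hours_slept


# Hypothetical sleep studies for one person, before and after treatment.
baseline_ahi = ahi(apneas=240, hypopneas=160, hours_slept=8.0)
followup_ahi = ahi(apneas=100, hypopneas=60, hours_slept=8.0)
ahi_change = followup_ahi - baseline_ahi

print(baseline_ahi, followup_ahi, ahi_change)
```

In this example the index drops by 30 events per hour, yet 20 events per hour remain. A large improvement and a continued need for PAP therapy can coexist.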

8. Safety endpoints (all trials)

Adverse events, serious adverse events, discontinuations due to side effects, and pre-specified adverse events of special interest (gallbladder problems, pancreatitis, thyroid markers). Don't judge efficacy without the tolerability table.

Surrogate vs. hard endpoint. A1C is a surrogate for long-term diabetes complications. Percent weight loss is a surrogate for long-term obesity-related disease. MACE is a hard outcome — actual heart attacks, strokes, and deaths. Both types matter, but hard outcomes carry more weight when available.
Figure: Placebo-adjusted weight loss explained: how to read a GLP-1 clinical trial result by subtracting the placebo arm to find the drug's true effect

For a deeper look at how clinical trial data translates to real patient outcomes, see our GLP-1 adverse event rates guide or the GLP-1 FDA indications chart.


Major GLP-1 / incretin Phase 3 endpoint examples, side-by-side

Inclusion methodology

We included trials that either changed GLP-1 labeling, introduced a major endpoint family, produced a major outcomes result, or are frequently cited in GLP-1 headline comparisons. This is a selected set, not a complete registry of every GLP-1 Phase 3 trial. Every row points to a primary publication or a clearly labeled manufacturer release.

Figure: FDA-approved GLP-1 weight loss options comparison table: Wegovy, Zepbound, Saxenda, and Foundayo by indication, dose, and trial program

Cardiovascular outcome trials

Trial | Drug | Population | Primary endpoint | Headline result | Source | What this proves
SELECT (2023) | Semaglutide 2.4 mg s.c. weekly | Adults ≥45, BMI ≥27, established CVD, no diabetes | Time to first 3-point MACE (CV death, non-fatal MI, non-fatal stroke) | HR 0.80 (95% CI 0.72–0.90); 6.5% vs. 8.0% over up to 5 years; 1,270 first MACE events; 20% relative risk reduction | NEJM 2023; ACC summary | In this high-risk population, semaglutide cut MACE rate by about one-fifth. Doesn't automatically apply to lower-risk users.
SURPASS-CVOT (Dec 2025) | Tirzepatide vs. dulaglutide | T2D with established ASCVD | 3-point MACE | Tirzepatide noninferior to dulaglutide (12.2% vs. 13.1%; HR 0.92); did NOT meet superiority | ACC summary 2026; primary publication Dec 2025 | Tirzepatide is at least as safe as dulaglutide for MACE in T2D + ASCVD. Did not prove superior on primary endpoint.
GLP-1 RA class meta-analysis | LEADER, SUSTAIN-6, REWIND, PIONEER 6, EXSCEL, ELIXA, HARMONY, AMPLITUDE-O (pooled) | T2D ± CVD | 3-point MACE | HR 0.86 (95% CI 0.80–0.93), p<0.001; ~14% relative risk reduction | Peer-reviewed GLP-1 RA CVOT meta-analysis | The class as a whole reduces major CV events in T2D patients at risk. Individual trials differ on which components moved most.

Kidney outcome trials

Trial | Drug | Population | Primary endpoint | Headline result | Source | What this proves
FLOW (2024) | Semaglutide 1.0 mg s.c. weekly | T2D + CKD (3,533 participants) | Composite: kidney failure, persistent ≥50% eGFR reduction, or death from kidney/CV causes | HR 0.76; 24% relative risk reduction; trial stopped early under pre-specified stopping rules. Secondaries: 3-point MACE down 18%, all-cause death down 20%, eGFR slope +1.16 mL/min/1.73 m²/yr | NEJM 2024 (Perkovic et al.) | In T2D patients with existing kidney disease, semaglutide slowed progression to dialysis and reduced kidney/CV death. Does not automatically extend to people without kidney disease.

Obesity / weight-management trials

Trial | Drug | Population | Primary endpoint(s) | Headline result | Source | What this proves
STEP 1 (2021) | Semaglutide 2.4 mg s.c. weekly | Adults with obesity, no diabetes | Co-primary: % body-weight change at week 68; proportion ≥5% loss | ~14.9% mean weight loss vs. 2.4% placebo; ~86% achieving ≥5% | NEJM 2021 (Wilding et al.) | Semaglutide produced clinically meaningful weight loss with lifestyle support.
SURMOUNT-1 (2022) | Tirzepatide 5/10/15 mg s.c. weekly | Adults with obesity ± comorbidity, no T2D | Co-primary at week 72: % body-weight change; ≥5% loss | Treatment-regimen: 15.0% / 19.5% / 20.9% vs. 3.1%. Efficacy: 16.0% / 21.4% / 22.5%. Key secondary: 63% of 15 mg group achieved ≥20% weight loss. | NEJM 2022 (Jastreboff et al.) | Tirzepatide produced larger average weight loss than any GLP-1 mono-agonist seen up to that point. Results differ by estimand.
SURMOUNT-3 (2023) | Tirzepatide after intensive lifestyle lead-in | Adults who first lost ≥5% with lifestyle | Co-primary at week 72: additional % weight change; ≥5% additional loss | ~21 percentage-point gap vs. placebo; 87.5% vs. 16.5% achieving ≥5% additional | Nature Medicine 2023 | Adding tirzepatide after lifestyle success produced substantial additional weight loss.
SURMOUNT-4 (2023) | Tirzepatide maintenance vs. placebo | Adults completing 36-wk lead-in on max dose | % weight change from week 36 to week 88; ≥80% maintenance | Continued tirzepatide maintained loss; placebo regained weight substantially | JAMA 2024 | Stopping the drug leads to weight regain. The drug is doing its job, not delivering a permanent cure.
SURMOUNT-5 (2025) | Tirzepatide vs. semaglutide 2.4 mg | Adults with obesity ± comorbidity | % body-weight change at week 72 | −20.2% with tirzepatide vs. −13.7% with semaglutide. Tirzepatide superior. | NEJM 2025 (Aronne et al.) | First direct comparison. Differences favor tirzepatide on average; individual response varies.
STEP UP / Wegovy 7.2 mg (FDA approved March 2026) | Semaglutide 7.2 mg s.c. weekly | Adults with obesity | Co-primary: % weight change at week 72; ≥5% loss | Mean body-weight reduction: 20.7%; 89% achieved ≥5% loss vs. 38% placebo. Dysesthesia higher at 7.2 mg (22%) vs. 2.4 mg (6%) or placebo (0.3%). | FDA approval announcement + STEP UP publication | Higher dose produced more weight loss with a different side-effect profile. Higher dose is not a free upgrade.
ATTAIN-1 (2025) | Orforglipron 6/12/36 mg oral | Adults with obesity, no T2D (n=3,127) | % body-weight change at week 72 (treatment-regimen estimand primary) | Efficacy estimand up to 12.4% (27.3 lb) vs. 0.9% (2.2 lb) placebo. 59.6% achieved ≥10%, 39.6% achieved ≥15%. | NEJM 2025; Lilly topline release | First oral small-molecule GLP-1 with clinically meaningful weight loss. Smaller average effect than top injectables.
ATTAIN-2 (2026) | Orforglipron oral | Adults with obesity + T2D | % body-weight change at week 72 | Up to 10.5% (22.9 lb) vs. 2.2% (5.1 lb). A1C reduction 1.3–1.8% from 8.1% baseline; 75% reached A1C ≤6.5%. | Lancet 2025/2026; Lilly topline release | In adults with obesity and T2D, orforglipron reduced body weight and A1C. Do not read as a separate diabetes indication unless the current FDA label says so.

Diabetes trials

Trial | Drug | Population | Primary endpoint | Headline result | Source | What this proves
ACHIEVE-1 (2025) | Orforglipron oral | Adults with T2D on diet/exercise only | A1C reduction at week 40 | Efficacy estimand: A1C down 1.3–1.6% from 8.0% baseline; 65%+ of highest dose reached A1C ≤6.5%; 7.9% weight loss at highest dose (secondary) | Lilly topline release | First oral small-molecule GLP-1 to clear Phase 3 in T2D.
ACHIEVE-3 (2026) | Orforglipron vs. oral semaglutide | T2D, head-to-head | A1C reduction at week 52 | Orforglipron 12/36 mg: A1C down 1.9%/2.2%. Oral semaglutide 7/14 mg: A1C down 1.1%/1.4%. Weight loss 6.7%/9.2% vs. 3.7%/5.3%; 73.6% greater relative weight loss at highest comparison. | The Lancet 2026; Lilly topline release | Orforglipron beat oral semaglutide head-to-head on A1C and weight, at these doses, in this population.
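The "73.6% greater relative weight loss" figure in the ACHIEVE-3 row can be reproduced from the two highest-dose weight-loss numbers in the same row. A minimal sketch follows; the function name is illustrative.

```python
def compare_arms(new_pct: float, reference_pct: float) -> tuple[float, float]:
    """Return (absolute difference in points, relative difference in %)."""
    absolute = new_pct - reference_pct
    relative = absolute / reference_pct * 100
    return absolute, relative


# Highest-dose weight loss from the ACHIEVE-3 row above:
# orforglipron 36 mg (9.2%) vs. oral semaglutide 14 mg (5.3%).
abs_diff, rel_diff = compare_arms(9.2, 5.3)

print(f"Absolute difference: {abs_diff:.1f} percentage points")
print(f"Relative difference: {rel_diff:.1f}%")
```

The same gap reads as 3.9 percentage points or as roughly 74% "greater" depending on which framing the headline picks.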

For the full indication picture, see the GLP-1 FDA indication vs. off-label use guide.

Liver (MASH) trials

Figure: GLP-1 for fatty liver: Wegovy vs. tirzepatide comparison in MASH and MASLD -- what the trial data shows about liver biopsy improvement

Trial | Drug | Population | Primary endpoint(s) | Headline result | Source | What this proves
ESSENCE Part 1 (interim, 2024–2025) | Semaglutide 2.4 mg s.c. weekly | Biopsy-confirmed MASH, fibrosis F2 or F3 (800 of 1,197 randomized) | Co-primary at week 72: (1) MASH resolution with no worsening of fibrosis; (2) fibrosis improvement with no worsening of MASH | (1) 62.9% vs. 34.3% placebo (est. difference 28.7%, p<0.001); (2) 36.8% vs. 22.4% (est. difference 14.4%, p<0.001) | NEJM 2025 (Sanyal et al.); FDA Wegovy MASH approval Aug 2025 | Semaglutide produced significantly higher rates of liver-tissue improvement at 72 weeks. Long-term clinical outcomes come from Part 2 at week 240 (ongoing).

Figure: GLP-1 for fatty liver disease: MASH vs MASLD staging decision guide -- when to consider Wegovy, fibrosis stage, and what the biopsy endpoints mean

Disease-specific trials worth knowing

Trial | Drug | Population | Primary endpoint | Source | What it adds
STEP-HFpEF / STEP-HFpEF DM | Semaglutide 2.4 mg | Obesity-related HFpEF (preserved ejection fraction) | Dual primary at week 52: KCCQ-CSS change and % body-weight change | NEJM 2023 (Kosiborod et al.) | Symptom and function improvements plus weight. Trial was not powered to prove mortality reduction.
SURMOUNT-OSA | Tirzepatide | Adults with obesity and moderate-to-severe OSA | Change in AHI | NEJM 2024 (Malhotra et al.) | AHI and several secondary endpoints improved vs. placebo. Any weight-independent claim should be tied to a specific post-hoc or mediation analysis.
SUMMIT | Tirzepatide | HFpEF with obesity | Composite of CV death or worsening HF events; KCCQ-CSS | NEJM 2024/2025 (Packer et al.) | Reduced the composite of CV death or worsening HF events and improved KCCQ-CSS. Not a standalone mortality endpoint.

Ongoing trials worth tracking

Trial | Drug | Population | Primary endpoint | Status
SURMOUNT-MMO (NCT05556512) | Tirzepatide vs. placebo | ~15,000 adults with obesity, no T2D | Composite: all-cause mortality, non-fatal MI, non-fatal stroke, coronary revascularization, HF events | Ongoing. The pivotal CVOT for tirzepatide in non-diabetic obesity.
ESSENCE Part 2 | Semaglutide 2.4 mg | Biopsy-confirmed MASH | Long-term clinical liver outcomes at week 240 | Ongoing. Will test whether week-72 histology gains translate into fewer cirrhosis events.
TRIUMPH-3 (NCT05882045) | Retatrutide (GIP/GLP-1/glucagon receptor agonist) | Severe obesity + established CVD | Efficacy and safety vs. placebo | Active, not recruiting; ~113 weeks.
TRIUMPH-Outcomes (NCT06383390) | Retatrutide | BMI ≥27 with ASCVD and/or CKD | Cardiovascular and kidney outcomes | Ongoing; longer-term outcomes trial.

What you should take from this table. Every trial answered a different question, in a different population, against a different comparator, at a different timepoint, using a different statistical approach. Numbers do not transfer across rows without translation.

Estimands, hazard ratios, and the words that change everything

Figure: How to read a GLP-1 trial result with placebo adjustment: understanding estimands, absolute vs relative risk reduction, and what each number actually means

Estimand: efficacy vs. treatment-regimen

Two trials can both report “average weight loss” and answer different questions because they used different estimands. If two trials use different estimands, their headline weight-loss numbers are not clean comparisons.

Treatment-regimen estimand

Answers: what was the average effect across everyone who was assigned to the drug, regardless of whether they stayed on it or added other treatments? This is closer to “what happens in the real world.”

Efficacy estimand

Answers: what would the average effect have been if everyone had stayed on the drug for the full trial without adding other treatments? This is closer to “best case if adherence is perfect.”

Same trial, two estimands: SURMOUNT-1

Endpoint | Treatment-regimen estimand | Efficacy estimand
Mean weight loss, tirzepatide 5 / 10 / 15 mg | 15.0% / 19.5% / 20.9% | 16.0% / 21.4% / 22.5%
Mean weight loss, placebo | 3.1% | 2.4%
% achieving ≥5% loss, tirzepatide 5 / 10 / 15 mg | 85.1% / 88.9% / 90.9% | 89% / 96% / 96%
% achieving ≥5% loss, placebo | 34.5% | 28%

Same trial. Same patients. Two different statistical questions. Both numbers are honest. They answer different versions of “did the drug work.”
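Subtracting the placebo arm within each estimand shows how much the choice of estimand moves the placebo-adjusted number. The sketch below uses the SURMOUNT-1 15 mg figures from the table above; the dictionary layout is just an illustrative way to hold them.

```python
# SURMOUNT-1 mean weight loss (%), tirzepatide 15 mg vs. placebo, per estimand.
surmount_1 = {
    "treatment-regimen": {"tirzepatide_15mg": 20.9, "placebo": 3.1},
    "efficacy": {"tirzepatide_15mg": 22.5, "placebo": 2.4},
}


def placebo_adjusted(drug_pct: float, placebo_pct: float) -> float:
    """Drug-arm mean loss minus placebo-arm mean loss, in percentage points."""
    return drug_pct - placebo_pct


adjusted = {
    estimand: round(placebo_adjusted(arm["tirzepatide_15mg"], arm["placebo"]), 1)
    for estimand, arm in surmount_1.items()
}

for estimand, points in adjusted.items():
    print(f"{estimand}: {points} percentage points vs. placebo")
```

The subtraction only makes sense within one estimand. Mixing the efficacy drug arm with the treatment-regimen placebo arm would produce a number that answers neither question.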

Hazard ratio

A hazard ratio (HR) compares the rate of an event in two groups over time. HR of 1.0 means no difference. HR below 1.0 means fewer events in the treatment group; above 1.0 means more.

  • SELECT: HR 0.80. ~20% fewer MACE events vs. placebo over up to 5 years.
  • FLOW: HR 0.76. 24% lower rate of kidney composite vs. placebo.
  • SURPASS-CVOT: HR 0.92. Directionally favorable vs. dulaglutide; did not reach superiority.

A hazard ratio comes with a confidence interval (CI), usually 95%: the range of values consistent with the data. If the 95% CI does not cross 1.0, the result is statistically significant at the conventional 0.05 level. SELECT's CI was 0.72–0.90, comfortably below 1.0.
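Both readings of a hazard ratio, the "percent lower rate" and the "does the CI cross 1.0" check, reduce to a line of arithmetic each. In the sketch below the SELECT and FLOW values come from this page; the second interval is hypothetical, included only to show a CI that crosses 1.0.

```python
def relative_rate_reduction(hazard_ratio: float) -> float:
    """Percent lower event rate implied by a hazard ratio below 1.0."""
    return (1 - hazard_ratio) * 100


def ci_excludes_one(lower: float, upper: float) -> bool:
    """True if the whole confidence interval sits on one side of 1.0."""
    return upper < 1.0 or lower > 1.0


select_reduction = round(relative_rate_reduction(0.80), 1)
flow_reduction = round(relative_rate_reduction(0.76), 1)

select_significant = ci_excludes_one(0.72, 0.90)

# Hypothetical interval (not from any trial on this page) that crosses 1.0.
hypothetical_significant = ci_excludes_one(0.85, 1.05)

print(select_reduction, flow_reduction)
print(select_significant, hypothetical_significant)
```

An interval that crosses 1.0 is compatible with benefit, no difference, or harm, so a favorable point estimate alone does not settle the question.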

Relative vs. absolute risk reduction

A 20% relative risk reduction is not the same as a 20-percentage-point absolute risk reduction.

SELECT: 6.5% of the semaglutide group had a MACE event vs. 8.0% of placebo.

  • Absolute risk reduction: 1.5 percentage points (8.0% − 6.5%).
  • Relative risk reduction: ~20%. From the raw proportions, 1.5 / 8.0 ≈ 19%; the 20% headline number comes from the hazard ratio (1 − 0.80).
  • Number needed to treat (NNT): ~67. 1 / 0.015 ≈ 67 patients treated over the trial's follow-up (up to 5 years) to prevent 1 MACE event.

The relative number sounds more impressive — that's why press releases lead with it. Both are honest. The NNT doesn't make the result less real; it frames it accurately at the individual level.
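The SELECT arithmetic above can be checked in a few lines. This sketch works from the two raw event proportions quoted on this page; rounding the NNT up to the next whole patient is a common convention, and the function name is illustrative.

```python
import math


def risk_summary(control_rate: float, treated_rate: float) -> tuple[float, float, int]:
    """Return (absolute risk reduction, relative risk reduction, NNT)."""
    arr = control_rate - treated_rate
    if arr <= 0:
        raise ValueError("no risk reduction, so NNT is undefined")
    rrr = arr / control_rate
    nnt = math.ceil(1 / arr)  # round up to a whole patient
    return arr, rrr, nnt


# SELECT: 8.0% of placebo vs. 6.5% of semaglutide participants had a MACE event.
arr, rrr, nnt = risk_summary(control_rate=0.080, treated_rate=0.065)

print(f"ARR: {arr * 100:.1f} percentage points")
print(f"RRR: {rrr * 100:.0f}%")
print(f"NNT: {nnt}")
```

Running the same function with a lower baseline risk shows why population matters: the relative reduction can stay the same while the absolute reduction shrinks and the NNT climbs.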


How to read a GLP-1 trial press release without getting played

GLP-1 trial press releases follow a template. Once you know the template, every announcement reads the same way.

The translation template you can run against any announcement:

“[Drug] met [primary endpoint] in [population] compared with [comparator] at [timepoint] using [estimand]. The result means [plain-English meaning]. It does not prove [specific limitation].”
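One way to apply the template consistently is to treat each bracket as a required field and refuse to trust a headline until none is blank. The sketch below is a hypothetical helper, not part of any published toolkit; the class and method names are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class TrialClaim:
    drug: str
    primary_endpoint: str
    population: str
    comparator: str
    timepoint: str
    estimand: str
    meaning: str
    limitation: str

    def missing_fields(self) -> list[str]:
        """Names of template slots the announcement left blank."""
        return [name for name, value in vars(self).items() if not value.strip()]

    def render(self) -> str:
        """Fill the translation template with this claim's details."""
        return (
            f"{self.drug} met {self.primary_endpoint} in {self.population} "
            f"compared with {self.comparator} at {self.timepoint} "
            f"using {self.estimand}. The result means {self.meaning}. "
            f"It does not prove {self.limitation}."
        )


# A deliberately incomplete, made-up announcement: no estimand was named.
vague_claim = TrialClaim(
    drug="Drug X",
    primary_endpoint="its primary endpoint (% body-weight change)",
    population="adults with obesity",
    comparator="placebo",
    timepoint="week 72",
    estimand="",
    meaning="greater average weight loss than placebo",
    limitation="any benefit on cardiovascular events",
)

gaps = vague_claim.missing_fields()
print(gaps)
```

An empty slot is itself a finding: it marks exactly which question to ask before comparing the headline with another trial.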

SELECT, decoded

NEJM 2023 — peer-reviewed

Semaglutide 2.4 mg met its primary endpoint (3-point MACE) in adults aged ≥45 with overweight/obesity (BMI ≥27) and established CVD but without diabetes, compared with placebo over up to 5 years using a time-to-first-event analysis. The result (HR 0.80; 95% CI 0.72–0.90) means semaglutide reduced the rate of major CV events in this high-risk group by about 20%. It does not prove that GLP-1s reduce CV risk in lower-risk users, people with diabetes, or people without established CVD.

ATTAIN-1, decoded

NEJM 2025 + Lilly topline release

Orforglipron (highest dose, 36 mg, once-daily oral) met its primary endpoint (% body-weight change at week 72, treatment-regimen estimand) in 3,127 adults with obesity, no diabetes, compared with placebo. The widely quoted figure, up to 12.4% (27.3 lb) weight loss, comes from the efficacy estimand, not from the treatment-regimen estimand used for the primary analysis. It does not prove orforglipron matches injectable tirzepatide head-to-head -- that's a different trial.

SURPASS-CVOT, decoded

ACC summary 2026; primary publication December 2025 — peer-reviewed

Tirzepatide met its primary endpoint of noninferiority to dulaglutide for 3-point MACE in T2D patients with established ASCVD (12.2% vs. 13.1%; HR 0.92). It did NOT meet superiority. The result means tirzepatide is at least as safe as dulaglutide for MACE in this population. It does not prove tirzepatide is better than dulaglutide on cardiovascular outcomes.

ESSENCE Part 1, decoded

NEJM 2025 — peer-reviewed

Semaglutide 2.4 mg met both co-primary endpoints (MASH resolution without worsening fibrosis; fibrosis improvement without worsening MASH) in biopsy-confirmed MASH patients with F2/F3 fibrosis at week 72. It does not yet prove reduction in cirrhosis, liver failure, or death -- that data comes from Part 2 at week 240.

Press release red flags

  • Only relative numbers, no absolutes: "reduced events by 20%" with no baseline rate given.
  • Secondary endpoint headlined as if it were primary: if a primary endpoint missed, sponsors sometimes lead with positive secondaries.
  • No estimand named: if you can't tell which question the analysis answered, the headline number is harder to compare.
  • No comparator specified: a result vs. "standard of care" when standard of care differs by country.
  • Topline/investor release before peer review: results can shift when the full dataset is published.

Glossary of GLP-1 trial terms

A1C (HbA1c)

Blood test reflecting average blood-glucose level over the past 2–3 months. ADA defines diabetes as A1C ≥6.5%. Used in: SUSTAIN, ACHIEVE, SURPASS as primary; many trials as secondary. Common misread: assuming a 1.0% A1C drop equals the same clinical benefit as a 1.5% drop -- magnitude matters.

AHI (apnea-hypopnea index)

Breathing pauses and shallow-breathing events per hour of sleep. Used in: SURMOUNT-OSA. Common misread: equating an AHI improvement with cure or elimination of PAP-therapy need.

Composite endpoint

A single endpoint built from multiple event types counted together (e.g., MACE; FLOW's kidney composite). Common misread: assuming all components moved equally -- they often didn't.

Confidence interval (CI)

Range of values consistent with the trial data, usually 95%. If the CI for a hazard ratio doesn't cross 1.0, the result is statistically significant at the conventional threshold.

Co-primary endpoints

Two primary endpoints, both usually required to pass. Used in: SURMOUNT-1, STEP-HFpEF, ESSENCE.

Efficacy estimand

Statistical analysis asking "what would the effect have been if everyone stayed on treatment for the full trial without rescue therapy?" Common misread: treating this as the real-world result.

eGFR (estimated glomerular filtration rate)

Blood-test estimate of kidney filtering function, in mL/min/1.73 m². Lower is worse.

eGFR slope

Annual rate of change in eGFR. Flatter slope = function declining more slowly.

Hazard ratio (HR)

Ratio of event rates between two groups over time. HR < 1.0 = fewer events with treatment. Common misread: treating HR as absolute risk reduction.

Intention-to-treat (ITT)

Analyzing every randomized participant in the group they were assigned to, regardless of what happened next. Standard for primary efficacy analysis in registration trials.

KCCQ-CSS

Kansas City Cardiomyopathy Questionnaire Clinical Summary Score. Patient-reported measure of heart-failure symptoms and physical limitations. Higher is better. Used in: STEP-HFpEF, SUMMIT. Common misread: assuming a symptom benefit equals a mortality benefit.

MASH

Metabolic dysfunction-associated steatohepatitis. Progressive, inflammatory form of fatty liver disease, formerly called NASH.

NAS (NAFLD activity score)

Histologic scoring system for fatty liver. Components: steatosis, inflammation, ballooning. Used in MASH trial biopsy endpoints.

NASH CRN fibrosis stage

F0–F4 staging of liver scarring used in MASH trials. F2 = moderate; F3 = advanced (bridging); F4 = cirrhosis.

Number needed to treat (NNT)

Approximate number of patients who must be treated for one event to be prevented. Useful for putting absolute risk reduction in patient terms. SELECT MACE NNT ≈ 67 over up to 5 years.

Persistent ≥50% eGFR decline

Kidney endpoint defined as a sustained drop in eGFR of at least 50%, confirmed on a follow-up measurement. Used in: FLOW.

p-value

Probability of seeing the observed result (or more extreme) if there were truly no difference. p<0.05 is the conventional threshold; trials with multiplicity adjustment use stricter cutoffs.

Primary endpoint

The main pre-specified outcome the trial is statistically powered to test. Common misread: "met primary endpoint" = clinically meaningful for you. It does not.

Responder analysis

Reporting the share of participants who crossed a fixed threshold (≥5%, ≥10%, ≥15%, ≥20% weight loss). FDA cautions that responder analyses can exaggerate effects when used as the primary metric.

Secondary endpoint

Additional outcomes the trial measured. Supports or qualifies the primary result. Not proof on its own.

Surrogate endpoint

Measurable marker (A1C, percent body weight, biopsy findings) believed to correlate with long-term clinical outcomes. Surrogates speed approvals but require confirmation.

Treatment-regimen estimand (treatment-policy estimand)

Statistical analysis asking "what was the effect across everyone assigned to treatment, regardless of adherence or added therapies?" Closer to real-world experience.

UACR (urine albumin-to-creatinine ratio)

Urine test for protein leakage; elevated UACR signals kidney damage. Used in: FLOW, SELECT kidney sub-analysis.


What we actually verified for this page

Source order we used:

  1. FDA guidance documents, FDA labels, and FDA approval announcements (regulatory benchmarks and indication scope)
  2. Peer-reviewed primary publications (NEJM, Lancet, Nature Medicine, JACC, Diabetes Care, JAMA, Nephrology Dialysis Transplantation)
  3. ClinicalTrials.gov records and posted protocols
  4. Professional society summaries (ACC, ADA, AASLD)
  5. Manufacturer investor-relations releases, clearly labeled as manufacturer source material
  6. Patient and forum language for understanding how readers ask questions, never as medical evidence

Specific sources verified:

  • SELECT primary results: NEJM 2023, ACC SELECT summary — HR 0.80 for 3-point MACE, 6.5% vs. 8.0%, up to 5-year follow-up
  • FLOW primary results: NEJM 2024 (Perkovic et al.), ADA press release, NephJC summary — HR 0.76 for kidney composite
  • SURPASS-CVOT: ACC summary 2026; primary publication December 17, 2025 — HR 0.92, noninferior, did not meet superiority
  • ESSENCE Part 1 interim: NEJM 2025 (Sanyal et al.), AASLD 2024 presentation — 62.9% vs. 34.3% MASH resolution; 36.8% vs. 22.4% fibrosis improvement
  • SURMOUNT-1: NEJM 2022 (Jastreboff et al.), ACC summary, Lilly investor release — treatment-regimen and efficacy estimands reported separately
  • SURMOUNT-3: Nature Medicine 2023 | SURMOUNT-4: JAMA 2024
  • SURMOUNT-5: NEJM 2025 (Aronne et al.) — tirzepatide −20.2% vs. semaglutide −13.7% at week 72
  • STEP UP / Wegovy 7.2 mg: FDA approval announcement (March 2026), STEP UP primary publication, Wegovy prescribing information
  • STEP-HFpEF and SUMMIT: NEJM publications (Kosiborod et al.; Packer et al.)
  • SURMOUNT-OSA: NEJM 2024 (Malhotra et al.); ADA summary
  • ATTAIN-1, ATTAIN-2: NEJM 2025 + Lancet 2025/2026; Lilly topline releases
  • ACHIEVE-1, ACHIEVE-3: Lilly topline releases; Lancet 2026 for ACHIEVE-3 (peer-reviewed)
  • FDA obesity drug guidance (January 2025 draft): Federal Register notice
  • FDA E9(R1) estimand guidance: FDA regulatory information page
  • Earlier CVOT meta-analysis: peer-reviewed pooled analysis, HR 0.86 (95% CI 0.80–0.93)
  • TRIUMPH-3 and TRIUMPH-Outcomes: ClinicalTrials.gov NCT05882045 and NCT06383390
Version history. May 14, 2026 — Initial publication. Endpoint table, regulatory-status claims, and trial citations verified against the sources listed above. Refresh cadence: quarterly; monthly during ADA Scientific Sessions, AASLD Liver Meeting, ESC, AHA, EASD, and major regulatory action windows.

Frequently asked questions about GLP-1 clinical trial endpoints

What is a primary endpoint in a GLP-1 clinical trial?

The primary endpoint is the main outcome a trial was designed and statistically powered to test (some trials pre-specify two co-primary endpoints). In obesity trials, it's usually percent body-weight change at a fixed week. In diabetes trials, it's usually A1C reduction. In cardiovascular trials, it's usually 3-point MACE. The primary endpoint answers "did it work?" -- everything else supports or qualifies that answer.

What is a secondary endpoint in a GLP-1 trial?

A secondary endpoint is anything else the trial measured to support, qualify, or extend the primary result. Common examples: waist circumference, blood pressure, lipids, fasting glucose, A1C target attainment, responder thresholds. Secondary results inform interpretation but are not strong enough on their own to claim the drug works.

What does "met its primary endpoint" mean?

The trial's main pre-specified analysis was positive at the statistical threshold the protocol set -- usually p<0.05, or stricter when the trial used multiplicity adjustment. It does not mean the result is clinically meaningful for any individual patient, that the drug is approved, or that safety endpoints came in clean.

What is MACE in GLP-1 trials?

MACE stands for major adverse cardiovascular events. 3-point MACE is a composite counting cardiovascular death, non-fatal myocardial infarction, and non-fatal stroke as a single event. 4-point MACE adds hospitalization for unstable angina or heart failure. SELECT's primary endpoint was 3-point MACE.

What is a hazard ratio?

A hazard ratio (HR) compares the rate at which events occur in two groups over the follow-up period. An HR of 1.0 means no difference. An HR below 1.0 means events occurred at a lower rate in the treatment group. SELECT's HR for MACE was 0.80, meaning the rate of MACE was about 20% lower with semaglutide than with placebo over up to 5 years of follow-up.
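
The arithmetic behind "HR 0.80 means about 20% lower" can be sketched in a few lines of Python. The function name is hypothetical; the two hazard ratios are the SELECT and FLOW figures quoted on this page.

```python
def relative_reduction_pct(hazard_ratio: float) -> float:
    """Percent relative reduction in hazard implied by a hazard ratio."""
    if hazard_ratio <= 0:
        raise ValueError("hazard ratio must be positive")
    return (1.0 - hazard_ratio) * 100.0


# Hazard ratios as quoted on this page.
for trial, hr in [("SELECT (MACE)", 0.80), ("FLOW (kidney composite)", 0.76)]:
    print(f"{trial}: HR {hr:.2f} -> about {relative_reduction_pct(hr):.0f}% lower hazard")
```

This is a relative figure only. It says nothing about how many events were avoided in absolute terms, which depends on baseline risk and follow-up time.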

What is the difference between efficacy estimand and treatment-regimen estimand?

The efficacy estimand answers what the effect would have been if everyone had stayed on the drug for the full trial without rescue therapy -- closer to best-case adherence. The treatment-regimen estimand answers what the effect was across everyone assigned to the drug, whether they stayed on it or not -- closer to real-world experience. Both are reported in modern GLP-1 trials.

Is percent weight loss the same as pounds lost?

No. Percent body-weight change is a relative measure. 15% of a 200-lb person is 30 lb; 15% of a 300-lb person is 45 lb. Trials standardize on percent to compare across body sizes. Always check starting weight or average body weight in the trial population when translating percent to pounds.
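
The same conversion as a small Python sketch, using the two starting weights from the example above. The function name is hypothetical.

```python
def pounds_lost(starting_weight_lb: float, percent_loss: float) -> float:
    """Convert a percent body-weight loss into pounds for a given starting weight."""
    return starting_weight_lb * percent_loss / 100.0


for start in (200, 300):
    print(f"15% of {start} lb = {pounds_lost(start, 15):.0f} lb")
```

Swap in a trial's reported average baseline weight to translate its headline percentage into approximate pounds.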

What does ≥20% weight loss mean in a trial?

It's a responder threshold -- the share of participants who lost at least 20% of their starting body weight by the endpoint week. In SURMOUNT-1, 63% of the tirzepatide 15 mg group hit ≥20% at week 72 under the efficacy estimand; the treatment-regimen figure is lower. Responder rates can complement mean results, but FDA cautions that responder analyses can exaggerate effects if used as the primary metric.
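
A short Python sketch shows why a responder rate and a mean can point in different directions. The two arms below are invented numbers chosen to make the point; they are not data from any trial, and the function name is hypothetical.

```python
from statistics import mean


def responder_rate(pct_losses, threshold):
    """Percent of participants whose weight loss meets or exceeds the threshold."""
    hits = sum(1 for loss in pct_losses if loss >= threshold)
    return 100.0 * hits / len(pct_losses)


# Invented per-participant percent weight loss (illustration only, not trial data).
arm_a = [21, 21, 21, 21, 21, 21, 2, 2, 2, 2]             # six just over 20%, four barely moved
arm_b = [19, 19, 19, 19, 19, 19, 19, 19, 19, 19]         # everyone just under 20%

for name, arm in [("Arm A", arm_a), ("Arm B", arm_b)]:
    print(f"{name}: mean loss {mean(arm):.1f}%, share at >=20%: {responder_rate(arm, 20):.0f}%")
```

In this made-up case Arm A reports 60% responders against 0% for Arm B, yet Arm B has the larger average loss (19.0% vs 13.4%). That is why the threshold and the mean should be read together.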

What is a co-primary endpoint?

A co-primary endpoint is one of two primary endpoints where both usually have to pass for the trial to claim success. SURMOUNT-1 used two: mean percent body-weight change and the proportion losing at least 5%. ESSENCE used two: MASH resolution without worsening fibrosis, and fibrosis improvement without worsening MASH.

What is a composite endpoint?

A composite endpoint counts multiple events as one. SELECT's 3-point MACE combines cardiovascular death, non-fatal MI, and non-fatal stroke. FLOW's kidney composite combines kidney failure, persistent ≥50% eGFR decline, and kidney or cardiovascular death. Always check that components moved in the same direction -- they sometimes don't.
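
As a rough sketch of how a composite is tallied, the Python below counts each participant at most once, at their first qualifying event, and then breaks the total down by component. The participant records, event labels, and function name are all hypothetical; real trials use adjudicated events and time-to-event models.

```python
from collections import Counter

# Components of 3-point MACE as described above (labels are illustrative).
COMPONENTS = {"cv_death", "nonfatal_mi", "nonfatal_stroke"}

# Invented records: participant id -> list of (study day, event type). Not trial data.
events = {
    "p1": [(120, "nonfatal_mi"), (400, "cv_death")],  # two events, counted once
    "p2": [(300, "nonfatal_stroke")],
    "p3": [],
    "p4": [(90, "hospitalization_hf")],                # not a 3-point MACE component
}


def first_composite_event(records):
    """Return the earliest (day, type) that belongs to the composite, or None."""
    qualifying = sorted(r for r in records if r[1] in COMPONENTS)
    return qualifying[0] if qualifying else None


firsts = {pid: first_composite_event(recs) for pid, recs in events.items()}
composite_count = sum(1 for e in firsts.values() if e is not None)
by_component = Counter(e[1] for e in firsts.values() if e is not None)

print("Participants with a composite event:", composite_count)
print("First events by component:", dict(by_component))
```

The per-component breakdown is the part to look for in a publication: a composite can be driven mostly by one component while another barely moves.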

What endpoint did SELECT use?

SELECT used 3-point MACE (cardiovascular death, non-fatal myocardial infarction, non-fatal stroke) as the primary endpoint, in adults aged ≥45 with BMI ≥27, established CVD, and no diabetes. Secondary endpoints included a heart-failure composite, kidney composite, and all-cause death.

What did the FLOW trial measure?

FLOW measured a composite kidney endpoint: time to the first occurrence of kidney failure (persistent eGFR <15 mL/min/1.73 m² or chronic kidney replacement therapy), a persistent ≥50% reduction in eGFR, or death from kidney or cardiovascular causes -- in adults with type 2 diabetes and chronic kidney disease. The HR was 0.76, a 24% relative risk reduction. The trial was stopped early under pre-specified stopping rules.

What primary endpoints did the SURMOUNT trials use?

Most SURMOUNT obesity trials used co-primary endpoints centered on percent body-weight change and the ≥5% responder rate at the trial's main timepoint (typically week 72). Several trials in the program differ. SURMOUNT-4 measured weight change during a maintenance phase. SURMOUNT-5 was a head-to-head comparison against semaglutide (-20.2% vs -13.7%). SURMOUNT-MMO is using a CV-event composite. SURMOUNT-OSA used change in AHI (apnea-hypopnea index).

Why are GLP-1 trial weight-loss numbers sometimes different in the press release vs. the journal article?

Different estimands. Manufacturers often report both the efficacy estimand (idealized adherence) and the treatment-regimen estimand (real-world assigned-treatment). Press release headlines sometimes lead with the larger efficacy estimand number; the journal usually shows both. This is not dishonest -- they answer different questions -- but you need to know which number you're reading.

Is "statistically significant" the same as "clinically meaningful"?

No. Statistical significance tells you that a difference this large would be unlikely to appear by chance alone if the drug had no real effect. Clinical meaning depends on the absolute size of the effect, its durability, the safety profile, and how relevant the result is to a specific patient. A small but statistically significant difference can be clinically trivial; a large effect in an underpowered trial can be clinically important without quite reaching significance.
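
To see why absolute size matters, the Python sketch below applies the same 20% relative reduction to two invented baseline risks. The baseline figures and function names are assumptions for illustration, and treating a relative reduction as a simple risk ratio is a simplification of how hazard ratios work.

```python
def absolute_risk_reduction(control_risk: float, relative_reduction: float) -> float:
    """Absolute drop in risk, assuming the relative reduction applies to the control risk."""
    return control_risk * relative_reduction


def number_needed_to_treat(arr: float) -> float:
    """People who would need treatment, over the same period, to prevent one event."""
    if arr <= 0:
        raise ValueError("absolute risk reduction must be positive")
    return 1.0 / arr


# Invented baseline risks (illustration only): a higher-risk and a lower-risk group.
for label, control_risk in [("higher-risk group", 0.10), ("lower-risk group", 0.01)]:
    arr = absolute_risk_reduction(control_risk, 0.20)
    print(f"{label}: risk falls by {arr * 100:.1f} points, "
          f"about {number_needed_to_treat(arr):.0f} treated per event avoided")
```

With these assumed numbers, the identical relative reduction works out to roughly 50 people treated per event avoided in the higher-risk group and roughly 500 in the lower-risk group.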

Is a surrogate endpoint as good as a hard endpoint?

No. A surrogate (A1C, percent body weight, biopsy findings) is a marker believed to correlate with long-term clinical outcomes. A hard endpoint is the outcome itself (heart attack, stroke, death, transplant). Surrogates allow trials to read out faster; hard outcomes carry more weight when available.

Does the FDA require the same endpoint for every GLP-1 trial?

No. The expected primary endpoint depends on the indication. Obesity drugs: percent body-weight change per FDA's January 2025 draft guidance. Diabetes drugs: A1C, plus a cardiovascular safety evaluation. MASH drugs: histologic endpoints for accelerated approval, with longer-term clinical confirmation required. Cardiovascular outcome trials: MACE or composite outcomes.

Why did the FLOW trial stop early?

FLOW was stopped at a planned interim analysis after the primary outcome reached statistical significance for efficacy under the trial's pre-specified stopping rules. Early stopping is not a red flag when pre-specified -- it means the benefit was clear enough before the planned end date that continuing would have been ethically difficult.


About this page: The RX Index is a pricing intelligence and comparison resource for GLP-1 telehealth providers. This page is independently produced for educational purposes. We have no relationship with any drug manufacturer mentioned. Some pages on therxindex.com reference telehealth providers with whom we have affiliate relationships. This page does not.

Correction policy: If a source changes or a new trial publication supersedes a press release, we update the table and note the change in the version history above.

