GLP-1 Clinical Trial Endpoints Explained
By The RX Index Editorial Team · Last verified: May 14, 2026 · Educational content. Not medical advice.

GLP-1 clinical trial endpoints explained in one breath: an endpoint is the specific outcome a trial was designed to measure, and the endpoint that matters depends on what the drug was being tested for. Weight-loss trials track percent body-weight change. Diabetes trials track A1C. Heart trials track MACE (major adverse cardiovascular events). Kidney trials track a kidney composite. Liver trials track biopsy findings. When a press release says a drug “met its primary endpoint,” that phrase tells you very little until you know what the endpoint actually was.
By the end of this page, you'll read GLP-1 trial news with the same eye a cautious clinician does. No stats degree required. Skip to the cross-trial table if you came for the numbers.
The GLP-1 trial headline cheat sheet
Bookmark this section. Every GLP-1 press release maps to one of these patterns.
| If the headline says… | It usually means… | Before you trust it, check… |
|---|---|---|
| "Met primary endpoint" | The main pre-specified analysis was positive. | Was the endpoint patient-important or just a surrogate? Was it composite? |
| "Average weight loss was 22.5%" | Mean percent change in body weight from baseline. | Population. Timepoint. Comparator. Which estimand (efficacy vs. treatment-regimen). |
| "63% achieved ≥20% weight loss" | A responder threshold — the share who crossed a cutoff. | Also look at the mean. Thresholds can flatter the result. |
| "Reduced MACE by 20%" | A relative reduction in a cardiovascular event composite. | The absolute numbers. Baseline risk. Follow-up duration. Population studied. |
| "Slowed kidney disease" | A composite kidney outcome -- usually kidney failure, sustained eGFR decline, or kidney/CV death. | Did the patients already have kidney disease? Diabetes? Both? |
| "Resolved MASH" / "improved fibrosis" | Liver biopsy findings at a fixed timepoint. | Whether long-term clinical outcomes (cirrhosis, death) have been confirmed yet. |
| "Improved sleep apnea" | Usually a change in AHI (breathing events per hour of sleep). | PAP-machine status. Baseline AHI severity. |
| "Versus placebo, p<0.001" | Statistical significance — unlikely to be chance. | Statistical ≠ clinically meaningful. Always check the absolute difference. |
GLP-1 clinical trial endpoints explained: what counts as success?
A clinical trial endpoint is the pre-specified outcome a study analyzes to decide whether a drug worked or was safe. The endpoint is locked into the protocol before the trial starts — researchers cannot move the goalposts after seeing the data. The U.S. National Center for Advancing Translational Sciences (NCATS) defines an endpoint as a targeted outcome that is statistically analyzed to help determine the efficacy and safety of an intervention.
Primary endpoint
The one that determines whether the trial succeeded or failed. The trial is statistically powered to answer this question. If the primary endpoint misses, positive secondary results are hypothesis-generating at best.
Secondary endpoint
Supporting questions the trial also measured. Useful additional information, but the trial usually wasn't designed to live or die by them. Don't lead with secondaries.
Exploratory endpoint
Hypothesis-generating signals. Useful for steering the next trial. Not strong enough on their own to base a decision on.
Co-primary endpoints
Two primary endpoints, both usually required to pass. SURMOUNT-1: mean % weight change AND proportion losing ≥5%. ESSENCE: MASH resolution AND fibrosis improvement.
Composite endpoint
A single endpoint built from several events counted together. SELECT's MACE = CV death + non-fatal MI + non-fatal stroke. Always check that components moved the same direction.
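To make the counting rule concrete, here is a minimal Python sketch of a time-to-first-event composite. The patient records, event labels, and the `first_composite_events` helper are all hypothetical and exist only for illustration; real trials adjudicate events and analyze them with survival models rather than simple counts.

```python
from collections import Counter

# The three components of 3-point MACE, as described above.
MACE_COMPONENTS = {"cv_death", "nonfatal_mi", "nonfatal_stroke"}


def first_composite_events(patient_events):
    """Return each patient's earliest qualifying event as (day, event_type).

    patient_events maps a patient id to a list of (day, event_type) tuples.
    Patients with no qualifying event are left out of the result.
    """
    first = {}
    for patient_id, events in patient_events.items():
        qualifying = sorted(
            (day, kind) for day, kind in events if kind in MACE_COMPONENTS
        )
        if qualifying:
            first[patient_id] = qualifying[0]
    return first


# Hypothetical records, invented for this example only.
hypothetical = {
    "p1": [(400, "nonfatal_stroke"), (120, "nonfatal_mi")],
    "p2": [(300, "hf_hospitalization")],  # not a 3-point MACE component
    "p3": [(90, "nonfatal_mi"), (700, "cv_death")],
    "p4": [],
}

first = first_composite_events(hypothetical)
component_counts = Counter(kind for _, kind in first.values())
print(len(first), "patients with a first composite event")
print(dict(component_counts))
```

Each patient contributes at most one event to the composite even when several occurred, and the component tally shows which event types are driving the result.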
What does “met its primary endpoint” mean in a GLP-1 trial?
It means the trial's main pre-specified analysis was positive at the statistical threshold the protocol set — usually p<0.05, or stricter when the trial used multiplicity adjustment. That's it. That's the claim.
What it does mean
- ✓ The main pre-specified analysis was positive
- ✓ The result was statistically significant at the protocol threshold
- ✓ The trial was designed to answer this specific question
What it does NOT mean
- ✕ The result is clinically meaningful for any individual
- ✕ The drug is approved (or will be)
- ✕ Safety endpoints came in clean
- ✕ Other doses or populations see the same result
- ✕ The result holds at longer follow-up
The next question is always the same: primary endpoint of what, in whom, compared with what, at what timepoint, and analyzed how?
The endpoint families you'll see across GLP-1 trials
Knowing which bucket a trial is in tells you which endpoint to look for first. GLP-1 trials sit in a handful of buckets, sometimes overlapping.
Weight endpoints
Obesity trials: The standard primary endpoint is mean percent change in body weight from baseline, measured at a fixed week (usually 68, 72, or 88). The FDA's January 2025 draft guidance on obesity drug development points to mean percent body-weight change as the primary efficacy assessment. Responder thresholds (≥5%, ≥10%, ≥15%, ≥20%, ≥25%) are used as co-primary or key secondary endpoints.
Glycemic endpoints (A1C)
Type 2 diabetes trials: The standard primary endpoint is mean change in A1C -- the hemoglobin A1c blood test reflecting average blood-sugar level over the previous 2-3 months. Secondary glycemic endpoints include fasting plasma glucose, the share reaching A1C targets (<7.0% or ≤6.5%), and hypoglycemia rates.
Cardiovascular endpoints (MACE)
Cardiovascular outcome trials (CVOTs): The standard primary endpoint is 3-point MACE: a composite of cardiovascular death, non-fatal myocardial infarction, and non-fatal stroke. Some trials use 4-point MACE, which adds hospitalization for unstable angina or heart failure. CVOTs drove much of the GLP-1 story after 2008.
Kidney endpoints
Renal outcome trials: The standard primary endpoint is a composite -- typically time to kidney failure (eGFR <15 or dialysis/transplant), persistent ≥50% reduction in eGFR, or death from kidney or cardiovascular causes. Confirmatory secondaries include eGFR slope (rate of filtering decline per year) and changes in albuminuria.
Liver histology endpoints
MASH trials: MASH is the newer name for progressive fatty liver disease (formerly NASH). Primary endpoints are biopsy-based: resolution of steatohepatitis without worsening fibrosis, and improvement in fibrosis without worsening steatohepatitis. FDA accepts these histologic endpoints under accelerated approval but requires longer follow-up for hard clinical outcomes.
Heart-failure endpoints
HFpEF trials (STEP-HFpEF, SUMMIT): The KCCQ-CSS (Kansas City Cardiomyopathy Questionnaire Clinical Summary Score) measures how much heart failure limits daily life -- higher is better. SUMMIT also tested a composite of cardiovascular death or worsening heart-failure events, which is not the same as a standalone mortality endpoint.
Sleep apnea endpoints
SURMOUNT-OSA: The primary endpoint is change in AHI -- the apnea-hypopnea index, or the number of breathing pauses and shallow-breathing events per hour of sleep. Lower is better. An AHI improvement is not the same as eliminating need for PAP therapy.
Safety endpoints
All trials: Adverse events, serious adverse events, discontinuations due to side effects, and pre-specified adverse events of special interest (gallbladder problems, pancreatitis, thyroid markers). Don't judge efficacy without the tolerability table.

For a deeper look at how clinical trial data translates to real patient outcomes, see our GLP-1 adverse event rates guide or the GLP-1 FDA indications chart.
Major GLP-1 / incretin Phase 3 endpoint examples, side-by-side
Inclusion methodology
We included trials that either changed GLP-1 labeling, introduced a major endpoint family, produced a major outcomes result, or are frequently cited in GLP-1 headline comparisons. This is a selected set, not a complete registry of every GLP-1 Phase 3 trial. Every row points to a primary publication or a clearly labeled manufacturer release.

Cardiovascular outcome trials
| Trial | Drug | Population | Primary endpoint | Headline result | Source | What this proves |
|---|---|---|---|---|---|---|
| SELECT (2023) | Semaglutide 2.4 mg s.c. weekly | Adults ≥45, BMI ≥27, established CVD, no diabetes | Time to first 3-point MACE (CV death, non-fatal MI, non-fatal stroke) | HR 0.80 (95% CI 0.72–0.90); 6.5% vs. 8.0% over up to 5 years; 1,270 first MACE events; 20% relative risk reduction | NEJM 2023; ACC summary | In this high-risk population, semaglutide cut MACE rate by about one-fifth. Doesn't automatically apply to lower-risk users. |
| SURPASS-CVOT (Dec 2025) | Tirzepatide vs. dulaglutide | T2D with established ASCVD | 3-point MACE | Tirzepatide noninferior to dulaglutide (12.2% vs. 13.1%; HR 0.92); did NOT meet superiority | ACC summary 2026; primary publication Dec 2025 | Tirzepatide is at least as safe as dulaglutide for MACE in T2D + ASCVD. Did not prove superior on primary endpoint. |
| GLP-1 RA class meta-analysis | LEADER, SUSTAIN-6, REWIND, PIONEER 6, EXSCEL, ELIXA, HARMONY, AMPLITUDE-O (pooled) | T2D ± CVD | 3-point MACE | HR 0.86 (95% CI 0.80–0.93), p<0.001; ~14% relative risk reduction | Peer-reviewed GLP-1 RA CVOT meta-analysis | The class as a whole reduces major CV events in T2D patients at risk. Individual trials differ on which components moved most. |
Kidney outcome trials
| Trial | Drug | Population | Primary endpoint | Headline result | Source | What this proves |
|---|---|---|---|---|---|---|
| FLOW (2024) | Semaglutide 1.0 mg s.c. weekly | T2D + CKD (3,533 participants) | Composite: kidney failure, persistent ≥50% eGFR reduction, or death from kidney/CV causes | HR 0.76; 24% relative risk reduction; trial stopped early under pre-specified stopping rules. Secondaries: 3-point MACE down 18%, all-cause death down 20%, eGFR slope +1.16 mL/min/1.73 m²/yr | NEJM 2024 (Perkovic et al.) | In T2D patients with existing kidney disease, semaglutide slowed progression to dialysis and reduced kidney/CV death. Does not automatically extend to people without kidney disease. |
Obesity / weight-management trials
| Trial | Drug | Population | Primary endpoint(s) | Headline result | Source | What this proves |
|---|---|---|---|---|---|---|
| STEP 1 (2021) | Semaglutide 2.4 mg s.c. weekly | Adults with obesity, no diabetes | Co-primary: % body-weight change at week 68; proportion ≥5% loss | ~14.9% mean weight loss vs. 2.4% placebo; ~86% achieving ≥5% | NEJM 2021 (Wilding et al.) | Semaglutide produced clinically meaningful weight loss with lifestyle support. |
| SURMOUNT-1 (2022) | Tirzepatide 5/10/15 mg s.c. weekly | Adults with obesity ± comorbidity, no T2D | Co-primary at week 72: % body-weight change; ≥5% loss | Treatment-regimen: 15.0% / 19.5% / 20.9% vs. 3.1%. Efficacy: 16.0% / 21.4% / 22.5%. Key secondary: 63% of 15 mg group achieved ≥20% weight loss. | NEJM 2022 (Jastreboff et al.) | Tirzepatide produced larger average weight loss than any GLP-1 mono-agonist seen up to that point. Results differ by estimand. |
| SURMOUNT-3 (2023) | Tirzepatide after intensive lifestyle lead-in | Adults who first lost ≥5% with lifestyle | Co-primary at week 72: additional % weight change; ≥5% additional loss | ~21 percentage-point gap vs. placebo; 87.5% vs. 16.5% achieving ≥5% additional | Nature Medicine 2023 | Adding tirzepatide after lifestyle success produced substantial additional weight loss. |
| SURMOUNT-4 (2023) | Tirzepatide maintenance vs. placebo | Adults completing 36-wk lead-in on max dose | % weight change from week 36 to week 88; ≥80% maintenance | Continued tirzepatide maintained loss; placebo regained weight substantially | JAMA 2024 | Stopping the drug leads to weight regain. The drug is doing its job, not delivering a permanent cure. |
| SURMOUNT-5 (2025) | Tirzepatide vs. semaglutide 2.4 mg | Adults with obesity ± comorbidity | % body-weight change at week 72 | −20.2% with tirzepatide vs. −13.7% with semaglutide. Tirzepatide superior. | NEJM 2025 (Aronne et al.) | First direct comparison. Differences favor tirzepatide on average; individual response varies. |
| STEP UP / Wegovy 7.2 mg (FDA approved March 2026) | Semaglutide 7.2 mg s.c. weekly | Adults with obesity | Co-primary: % weight change at week 72; ≥5% loss | Mean body-weight reduction: 20.7%; 89% achieved ≥5% loss vs. 38% placebo. Dysesthesia higher at 7.2 mg (22%) vs. 2.4 mg (6%) or placebo (0.3%). | FDA approval announcement + STEP UP publication | Higher dose produced more weight loss with a different side-effect profile. Higher dose is not a free upgrade. |
| ATTAIN-1 (2025) | Orforglipron 6/12/36 mg oral | Adults with obesity, no T2D (n=3,127) | % body-weight change at week 72 (treatment-regimen estimand primary) | Efficacy estimand up to 12.4% (27.3 lb) vs. 0.9% (2.2 lb) placebo. 59.6% achieved ≥10%, 39.6% achieved ≥15%. | NEJM 2025; Lilly topline release | First oral small-molecule GLP-1 with clinically meaningful weight loss. Smaller average effect than top injectables. |
| ATTAIN-2 (2026) | Orforglipron oral | Adults with obesity + T2D | % body-weight change at week 72 | Up to 10.5% (22.9 lb) vs. 2.2% (5.1 lb). A1C reduction 1.3–1.8% from 8.1% baseline; 75% reached A1C ≤6.5%. | Lancet 2025/2026; Lilly topline release | In adults with obesity and T2D, orforglipron reduced body weight and A1C. Do not read as a separate diabetes indication unless the current FDA label says so. |
Diabetes trials
| Trial | Drug | Population | Primary endpoint | Headline result | Source | What this proves |
|---|---|---|---|---|---|---|
| ACHIEVE-1 (2025) | Orforglipron oral | Adults with T2D on diet/exercise only | A1C reduction at week 40 | Efficacy estimand: A1C down 1.3–1.6% from 8.0% baseline; 65%+ of highest dose reached A1C ≤6.5%; 7.9% weight loss at highest dose (secondary) | Lilly topline release | First oral small-molecule GLP-1 to clear Phase 3 in T2D. |
| ACHIEVE-3 (2026) | Orforglipron vs. oral semaglutide | T2D, head-to-head | A1C reduction at week 52 | Orforglipron 12/36 mg: A1C down 1.9%/2.2%. Oral semaglutide 7/14 mg: A1C down 1.1%/1.4%. Weight loss 6.7%/9.2% vs. 3.7%/5.3% — 73.6% greater relative weight loss at highest comparison. | The Lancet 2026; Lilly topline release | Orforglipron beat oral semaglutide head-to-head on A1C and weight, at these doses, in this population. |
For the full indication picture, see the GLP-1 FDA indication vs. off-label use guide.
Liver (MASH) trials

| Trial | Drug | Population | Primary endpoint(s) | Headline result | Source | What this proves |
|---|---|---|---|---|---|---|
| ESSENCE Part 1 (interim, 2024–2025) | Semaglutide 2.4 mg s.c. weekly | Biopsy-confirmed MASH, fibrosis F2 or F3 (800 of 1,197 randomized) | Co-primary at week 72: (1) MASH resolution with no worsening of fibrosis; (2) fibrosis improvement with no worsening of MASH | (1) 62.9% vs. 34.3% placebo (est. difference 28.7 percentage points, p<0.001); (2) 36.8% vs. 22.4% (est. difference 14.4 percentage points, p<0.001) | NEJM 2025 (Sanyal et al.); FDA Wegovy MASH approval Aug 2025 | Semaglutide produced significantly higher rates of liver-tissue improvement at 72 weeks. Long-term clinical outcomes come from Part 2 at week 240 (ongoing). |

Disease-specific trials worth knowing
| Trial | Drug | Population | Primary endpoint | Source | What it adds |
|---|---|---|---|---|---|
| STEP-HFpEF / STEP-HFpEF DM | Semaglutide 2.4 mg | Obesity-related HFpEF (preserved ejection fraction) | Dual primary at week 52: KCCQ-CSS change and % body-weight change | NEJM 2023 (Kosiborod et al.) | Symptom and function improvements plus weight. Trial was not powered to prove mortality reduction. |
| SURMOUNT-OSA | Tirzepatide | Adults with obesity and moderate-to-severe OSA | Change in AHI | NEJM 2024 (Malhotra et al.) | AHI and several secondary endpoints improved vs. placebo. Any weight-independent claim should be tied to a specific post-hoc or mediation analysis. |
| SUMMIT | Tirzepatide | HFpEF with obesity | Composite of CV death or worsening HF events; KCCQ-CSS | NEJM 2024/2025 (Packer et al.) | Reduced the composite of CV death or worsening HF events and improved KCCQ-CSS. Not a standalone mortality endpoint. |
Ongoing trials worth tracking
| Trial | Drug | Population | Primary endpoint | Status |
|---|---|---|---|---|
| SURMOUNT-MMO (NCT05556512) | Tirzepatide vs. placebo | ~15,000 adults with obesity, no T2D | Composite: all-cause mortality, non-fatal MI, non-fatal stroke, coronary revascularization, HF events | Ongoing. The pivotal CVOT for tirzepatide in non-diabetic obesity. |
| ESSENCE Part 2 | Semaglutide 2.4 mg | Biopsy-confirmed MASH | Long-term clinical liver outcomes at week 240 | Ongoing. Will test whether week-72 histology gains translate into fewer cirrhosis events. |
| TRIUMPH-3 (NCT05882045) | Retatrutide (GIP/GLP-1/glucagon receptor agonist) | Severe obesity + established CVD | Efficacy and safety vs. placebo | Active, not recruiting; ~113 weeks. |
| TRIUMPH-Outcomes (NCT06383390) | Retatrutide | BMI ≥27 with ASCVD and/or CKD | Cardiovascular and kidney outcomes | Ongoing; longer-term outcomes trial. |
Estimands, hazard ratios, and the words that change everything

Estimand: efficacy vs. treatment-regimen
Two trials can both report “average weight loss” and answer different questions because they used different estimands. If two trials use different estimands, their headline weight-loss numbers are not clean comparisons.
Treatment-regimen estimand
Answers: what was the average effect across everyone who was assigned to the drug, regardless of whether they stayed on it or added other treatments? This is closer to “what happens in the real world.”
Efficacy estimand
Answers: what would the average effect have been if everyone had stayed on the drug for the full trial without adding other treatments? This is closer to “best case if adherence is perfect.”
Same trial, two estimands: SURMOUNT-1
| Endpoint | Treatment-regimen estimand | Efficacy estimand |
|---|---|---|
| Mean weight loss, tirzepatide 5 / 10 / 15 mg | 15.0% / 19.5% / 20.9% | 16.0% / 21.4% / 22.5% |
| Mean weight loss, placebo | 3.1% | 2.4% |
| % achieving ≥5% loss, tirzepatide 5 / 10 / 15 mg | 85.1% / 88.9% / 90.9% | 89% / 96% / 96% |
| % achieving ≥5% loss, placebo | 34.5% | 28% |
Same trial. Same patients. Two different statistical questions. Both numbers are honest. They answer different versions of “did the drug work.”
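As a rough illustration, the sketch below subtracts the placebo mean from the tirzepatide 15 mg mean under each estimand, using the SURMOUNT-1 figures in the table above. Simple subtraction of reported means is an assumption made here for illustration; the published treatment differences come from the trial's statistical model and may differ slightly. The `placebo_adjusted_loss` helper is hypothetical.

```python
# Mean percent weight loss from the SURMOUNT-1 table above.
SURMOUNT_1 = {
    "treatment_regimen": {"tirzepatide_15mg": 20.9, "placebo": 3.1},
    "efficacy": {"tirzepatide_15mg": 22.5, "placebo": 2.4},
}


def placebo_adjusted_loss(estimand: str) -> float:
    """Drug mean minus placebo mean, in percentage points (illustrative)."""
    arms = SURMOUNT_1[estimand]
    return round(arms["tirzepatide_15mg"] - arms["placebo"], 1)


for name in SURMOUNT_1:
    gap = placebo_adjusted_loss(name)
    print(f"{name}: {gap} percentage points vs. placebo")
```

The drug-versus-placebo gap is roughly 2 percentage points wider under the efficacy estimand, which is why the estimand has to be named before two trials are compared.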
Hazard ratio
A hazard ratio (HR) compares the rate of an event in two groups over time. HR of 1.0 means no difference. HR below 1.0 means fewer events in the treatment group; above 1.0 means more.
| Trial | Hazard ratio | What it means |
|---|---|---|
| SELECT | 0.80 | ~20% fewer MACE events vs. placebo over up to 5 years |
| FLOW | 0.76 | 24% lower rate of kidney composite vs. placebo |
| SURPASS-CVOT | 0.92 | Directionally favorable vs. dulaglutide; did not reach superiority |
A hazard ratio comes with a confidence interval (CI) — usually 95% — the range of values consistent with the data. If the 95% CI does not cross 1.0, the result is statistically significant. SELECT's CI was 0.72–0.90, comfortably below 1.0.
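The reading rules above can be sketched in a few lines of Python. The `summarize_hazard_ratio` helper is hypothetical; it uses the common shorthand of reporting 1 − HR as the relative reduction and the "CI excludes 1.0" check for conventional significance. The inputs are the SELECT and pooled meta-analysis figures quoted on this page.

```python
def summarize_hazard_ratio(hr: float, ci_low: float, ci_high: float) -> dict:
    """Turn a hazard ratio and its 95% CI into plain-language pieces."""
    if not ci_low <= hr <= ci_high:
        raise ValueError("point estimate should sit inside its confidence interval")
    return {
        # 1 - HR as a percent; a negative value would mean a higher event rate.
        "relative_rate_reduction_pct": round((1 - hr) * 100, 1),
        # Significant at the conventional level only if the CI excludes 1.0.
        "ci_excludes_one": ci_high < 1.0 or ci_low > 1.0,
    }


select = summarize_hazard_ratio(0.80, 0.72, 0.90)   # SELECT, 3-point MACE
pooled = summarize_hazard_ratio(0.86, 0.80, 0.93)   # GLP-1 RA CVOT meta-analysis
print(select)
print(pooled)
```

Neither output says anything about absolute risk; that needs the event rates in each arm.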
Relative vs. absolute risk reduction
A 20% relative risk reduction is not the same as a 20-percentage-point absolute risk reduction.
SELECT: 6.5% of the semaglutide group had a MACE event vs. 8.0% of placebo.
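| Measure | Value | How it's calculated |
|---|---|---|
| Absolute risk reduction | 1.5 pts | 8.0% − 6.5% = 1.5 percentage points |
| Relative risk reduction | ~20% | 1.5 / 8.0 ≈ 19% from the raw percentages; the headline 20% comes from the hazard ratio (HR 0.80) |
| Number needed to treat (NNT) | ~67 | 1 / 0.015 ≈ 67 patients treated over up to 5 years to prevent 1 MACE event |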
The relative number sounds more impressive — that's why press releases lead with it. Both are honest. The NNT doesn't make the result less real; it frames it accurately at the individual level.
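The SELECT arithmetic above can be reproduced with a short Python sketch. The `risk_metrics` helper is hypothetical, and it works from the raw event percentages, so its relative reduction lands near 19% rather than the 20% derived from the hazard ratio. Rounding the NNT up to the next whole patient is a common convention, adopted here as an assumption.

```python
import math


def risk_metrics(control_rate: float, treated_rate: float) -> dict:
    """Compute ARR, RRR, and NNT from two cumulative event proportions.

    Rates are fractions, so 0.080 means 8.0% of the group had an event.
    """
    arr = control_rate - treated_rate
    if arr <= 0:
        raise ValueError("no risk reduction, so NNT is undefined")
    return {
        "arr": arr,                   # absolute risk reduction (fraction)
        "rrr": arr / control_rate,    # relative risk reduction (fraction)
        "nnt": math.ceil(1 / arr),    # rounded up to a whole patient
    }


# SELECT: 8.0% of the placebo group vs. 6.5% of the semaglutide group.
select_risk = risk_metrics(control_rate=0.080, treated_rate=0.065)
print(f"ARR: {select_risk['arr'] * 100:.2f} percentage points")
print(f"RRR: {select_risk['rrr'] * 100:.2f}%")
print(f"NNT: {select_risk['nnt']}")
```

Running the same function on a lower-risk population with the same relative reduction gives a smaller absolute reduction and a larger NNT, which is why baseline risk belongs next to any headline percentage.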
How to read a GLP-1 trial press release without getting played
GLP-1 trial press releases follow a template. Once you know the template, every announcement reads the same way.
The translation template you can run against any announcement:
“[Drug] met [primary endpoint] in [population] compared with [comparator] at [timepoint] using [estimand]. The result means [plain-English meaning]. It does not prove [specific limitation].”
SELECT, decoded
Source: NEJM 2023 (peer-reviewed)
Semaglutide 2.4 mg met its primary endpoint (3-point MACE) in adults aged ≥45 with overweight/obesity (BMI ≥27) and established CVD but without diabetes, compared with placebo over up to 5 years using a time-to-first-event analysis. The result (HR 0.80; 95% CI 0.72–0.90) means semaglutide reduced the rate of major CV events in this high-risk group by about 20%. It does not prove that GLP-1s reduce CV risk in lower-risk users, people with diabetes, or people without established CVD.
ATTAIN-1, decoded
Source: NEJM 2025 + Lilly topline release
Orforglipron (highest dose, 36 mg, once-daily oral) met its primary endpoint (% body-weight change at week 72, treatment-regimen estimand) in 3,127 adults with obesity and no diabetes, compared with placebo. The headline figure of up to 12.4% (27.3 lb) weight loss comes from the efficacy estimand, not the primary treatment-regimen analysis. It does not prove orforglipron matches injectable tirzepatide head-to-head -- that's a different trial.
SURPASS-CVOT, decoded
Source: ACC summary 2026; primary publication December 2025 (peer-reviewed)
Tirzepatide met its primary endpoint of noninferiority to dulaglutide for 3-point MACE in T2D patients with established ASCVD (12.2% vs. 13.1%; HR 0.92). It did NOT meet superiority. The result means tirzepatide is at least as safe as dulaglutide for MACE in this population. It does not prove tirzepatide is better than dulaglutide on cardiovascular outcomes.
ESSENCE Part 1, decoded
Source: NEJM 2025 (peer-reviewed)
Semaglutide 2.4 mg met both co-primary endpoints (MASH resolution without worsening fibrosis; fibrosis improvement without worsening MASH) in biopsy-confirmed MASH patients with F2/F3 fibrosis at week 72. It does not yet prove reduction in cirrhosis, liver failure, or death -- that data comes from Part 2 at week 240.
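For readers who like checklists, the translation template can be sketched as a small Python data class. Everything here (the `TrialClaim` class, its field names, and the `missing_fields` check) is a hypothetical illustration, not a tool used by this page; the point is that an empty slot shows what an announcement left out.

```python
from dataclasses import dataclass, fields


@dataclass
class TrialClaim:
    drug: str
    primary_endpoint: str
    population: str
    comparator: str
    timepoint: str
    estimand: str
    meaning: str
    limitation: str

    def missing_fields(self) -> list[str]:
        """Names of template slots the announcement left blank."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]

    def translate(self) -> str:
        """Fill the translation template with this claim's details."""
        return (
            f"{self.drug} met {self.primary_endpoint} in {self.population} "
            f"compared with {self.comparator} at {self.timepoint} "
            f"using {self.estimand}. The result means {self.meaning}. "
            f"It does not prove {self.limitation}."
        )


# A hypothetical press release that never names its estimand or timepoint.
vague_claim = TrialClaim(
    drug="Drug X",
    primary_endpoint="its primary endpoint (percent body-weight change)",
    population="adults with obesity",
    comparator="placebo",
    timepoint="",
    estimand="",
    meaning="average weight loss was larger than with placebo",
    limitation="anything about long-term outcomes",
)
print("Ask about:", vague_claim.missing_fields())
```

When `missing_fields` returns an empty list, `translate()` produces the full sentence pair from the template.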
Press release red flags
- Only relative numbers, no absolutes: "reduced events by 20%" with no baseline rate given.
- Secondary endpoint headlined as if it were primary: if a primary endpoint missed, sponsors sometimes lead with positive secondaries.
- No estimand named: if you can't tell which question the analysis answered, the headline number is harder to compare.
- No comparator specified: a result vs. "standard of care" when standard of care differs by country.
- Topline/investor release before peer review: results can shift when the full dataset is published.
Glossary of GLP-1 trial terms
A1C (HbA1c)
Blood test reflecting average blood-glucose level over the past 2–3 months. ADA defines diabetes as A1C ≥6.5%. Used in: SUSTAIN, ACHIEVE, SURPASS as primary; many trials as secondary. Common misread: assuming a 1.0% A1C drop equals the same clinical benefit as a 1.5% drop -- magnitude matters.
AHI (apnea-hypopnea index)
Breathing pauses and shallow-breathing events per hour of sleep. Used in: SURMOUNT-OSA. Common misread: equating an AHI improvement with cure or elimination of PAP-therapy need.
Composite endpoint
A single endpoint built from multiple event types counted together (e.g., MACE; FLOW's kidney composite). Common misread: assuming all components moved equally -- they often didn't.
Confidence interval (CI)
Range of values consistent with the trial data, usually 95%. If the CI for a hazard ratio doesn't cross 1.0, the result is statistically significant at the conventional threshold.
Co-primary endpoints
Two primary endpoints, both usually required to pass. Used in: SURMOUNT-1, STEP-HFpEF, ESSENCE.
Efficacy estimand
Statistical analysis asking "what would the effect have been if everyone stayed on treatment for the full trial without rescue therapy?" Common misread: treating this as the real-world result.
eGFR (estimated glomerular filtration rate)
Blood-test estimate of kidney filtering function, in mL/min/1.73 m². Lower is worse.
eGFR slope
Annual rate of change in eGFR. Flatter slope = function declining more slowly.
Hazard ratio (HR)
Ratio of event rates between two groups over time. HR < 1.0 = fewer events with treatment. Common misread: treating HR as absolute risk reduction.
Intention-to-treat (ITT)
Analyzing every randomized participant in the group they were assigned to, regardless of what happened next. Standard for primary efficacy analysis in registration trials.
KCCQ-CSS
Kansas City Cardiomyopathy Questionnaire Clinical Summary Score. Patient-reported measure of heart-failure symptoms and physical limitations. Higher is better. Used in: STEP-HFpEF, SUMMIT. Common misread: assuming a symptom benefit equals a mortality benefit.
MASH
Metabolic dysfunction-associated steatohepatitis. Progressive, inflammatory form of fatty liver disease, formerly called NASH.
NAS (NAFLD activity score)
Histologic scoring system for fatty liver. Components: steatosis, inflammation, ballooning. Used in MASH trial biopsy endpoints.
NASH CRN fibrosis stage
F0–F4 staging of liver scarring used in MASH trials. F2 = moderate; F3 = advanced (bridging); F4 = cirrhosis.
Number needed to treat (NNT)
Approximate number of patients who must be treated for one event to be prevented. Useful for putting absolute risk reduction in patient terms. SELECT MACE NNT ≈ 67 over up to 5 years.
Persistent ≥50% eGFR decline
Kidney endpoint defined as a sustained drop in eGFR of at least 50%, confirmed on a follow-up measurement. Used in: FLOW.
p-value
Probability of seeing the observed result (or more extreme) if there were truly no difference. p<0.05 is the conventional threshold; trials with multiplicity adjustment use stricter cutoffs.
Primary endpoint
The main pre-specified outcome the trial is statistically powered to test. Common misread: "met primary endpoint" = clinically meaningful for you. It does not.
Responder analysis
Reporting the share of participants who crossed a fixed threshold (≥5%, ≥10%, ≥15%, ≥20% weight loss). FDA cautions that responder analyses can exaggerate effects when used as the primary metric.
Secondary endpoint
Additional outcomes the trial measured. Supports or qualifies the primary result. Not proof on its own.
Surrogate endpoint
Measurable marker (A1C, percent body weight, biopsy findings) believed to correlate with long-term clinical outcomes. Surrogates speed approvals but require confirmation.
Treatment-regimen estimand (treatment-policy estimand)
Statistical analysis asking "what was the effect across everyone assigned to treatment, regardless of adherence or added therapies?" Closer to real-world experience.
UACR (urine albumin-to-creatinine ratio)
Urine test for protein leakage; elevated UACR signals kidney damage. Used in: FLOW, SELECT kidney sub-analysis.
What we actually verified for this page
Source order we used:
1. FDA guidance documents, FDA labels, and FDA approval announcements (regulatory benchmarks and indication scope)
2. Peer-reviewed primary publications (NEJM, Lancet, Nature Medicine, JACC, Diabetes Care, JAMA, Nephrology Dialysis Transplantation)
3. ClinicalTrials.gov records and posted protocols
4. Professional society summaries (ACC, ADA, AASLD)
5. Manufacturer investor-relations releases, clearly labeled as manufacturer source material
6. Patient and forum language for understanding how readers ask questions, never as medical evidence
Specific sources verified:
- ✓ SELECT primary results: NEJM 2023, ACC SELECT summary; HR 0.80 for 3-point MACE, 6.5% vs. 8.0%, up to 5-year follow-up
- ✓ FLOW primary results: NEJM 2024 (Perkovic et al.), ADA press release, NephJC summary; HR 0.76 for kidney composite
- ✓ SURPASS-CVOT: ACC summary 2026; primary publication December 17, 2025; HR 0.92, noninferior, did not meet superiority
- ✓ ESSENCE Part 1 interim: NEJM 2025 (Sanyal et al.), AASLD 2024 presentation; 62.9% vs. 34.3% MASH resolution; 36.8% vs. 22.4% fibrosis improvement
- ✓ SURMOUNT-1: NEJM 2022 (Jastreboff et al.), ACC summary, Lilly investor release; treatment-regimen and efficacy estimands reported separately
- ✓ SURMOUNT-3: Nature Medicine 2023; SURMOUNT-4: JAMA 2024
- ✓ SURMOUNT-5: NEJM 2025 (Aronne et al.); tirzepatide −20.2% vs. semaglutide −13.7% at week 72
- ✓ STEP UP / Wegovy 7.2 mg: FDA approval announcement (March 2026), STEP UP primary publication, Wegovy prescribing information
- ✓ STEP-HFpEF and SUMMIT: NEJM publications (Kosiborod et al.; Packer et al.)
- ✓ SURMOUNT-OSA: NEJM 2024 (Malhotra et al.); ADA summary
- ✓ ATTAIN-1, ATTAIN-2: NEJM 2025 + Lancet 2025/2026; Lilly topline releases
- ✓ ACHIEVE-1, ACHIEVE-3: Lilly topline releases; Lancet 2026 for ACHIEVE-3 (peer-reviewed)
- ✓ FDA obesity drug guidance (January 2025 draft): Federal Register notice
- ✓ FDA E9(R1) estimand guidance: FDA regulatory information page
- ✓ Earlier CVOT meta-analysis: peer-reviewed pooled analysis, HR 0.86 (95% CI 0.80–0.93)
- ✓ TRIUMPH-3 and TRIUMPH-Outcomes: ClinicalTrials.gov NCT05882045 and NCT06383390
Frequently asked questions about GLP-1 clinical trial endpoints
What is a primary endpoint in a GLP-1 clinical trial?
The primary endpoint is the single outcome a trial was designed and statistically powered to test. In obesity trials, it's usually percent body-weight change at a fixed week. In diabetes trials, it's usually A1C reduction. In cardiovascular trials, it's usually 3-point MACE. The primary endpoint answers "did it work?" -- everything else supports or qualifies that answer.
What is a secondary endpoint in a GLP-1 trial?
A secondary endpoint is anything else the trial measured to support, qualify, or extend the primary result. Common examples: waist circumference, blood pressure, lipids, fasting glucose, A1C target attainment, responder thresholds. Secondary results inform interpretation but are not strong enough on their own to claim the drug works.
What does "met its primary endpoint" mean?
The trial's main pre-specified analysis was positive at the statistical threshold the protocol set -- usually p<0.05, or stricter when the trial used multiplicity adjustment. It does not mean the result is clinically meaningful for any individual patient, that the drug is approved, or that safety endpoints came in clean.
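One simple way a protocol can make the threshold stricter is a Bonferroni split, where the overall alpha is divided evenly across the hypotheses being tested. The sketch below is illustrative only: the function names and p-values are hypothetical, and the trials on this page may have used other multiplicity methods (for example, hierarchical testing).

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni split."""
    if n_tests < 1:
        raise ValueError("need at least one test")
    return alpha / n_tests


def all_pass_bonferroni(p_values: list[float], alpha: float = 0.05) -> bool:
    """True only if every p-value clears the stricter per-test threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return all(p < threshold for p in p_values)


# Hypothetical p-values for two co-primary endpoints.
print(bonferroni_threshold(0.05, 2))
print(all_pass_bonferroni([0.001, 0.03]))
print(all_pass_bonferroni([0.001, 0.01]))
```

With two tests the per-test cutoff is half the overall alpha, so a p-value of 0.03 would pass a single-endpoint trial at 0.05 but fail here.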
What is MACE in GLP-1 trials?
MACE stands for major adverse cardiovascular events. 3-point MACE is a composite counting cardiovascular death, non-fatal myocardial infarction, and non-fatal stroke as a single event. 4-point MACE adds hospitalization for unstable angina or heart failure. SELECT's primary endpoint was 3-point MACE.
What is a hazard ratio?
A hazard ratio (HR) compares event rates between two groups over time. HR of 1.0 means no difference. HR below 1.0 means fewer events in the treatment group. SELECT's HR for MACE was 0.80, meaning about 20% fewer MACE events with semaglutide than placebo over up to 5 years.
What is the difference between efficacy estimand and treatment-regimen estimand?
The efficacy estimand answers what the effect would have been if everyone had stayed on the drug for the full trial without rescue therapy -- closer to best-case adherence. The treatment-regimen estimand answers what the effect was across everyone assigned to the drug, whether they stayed on it or not -- closer to real-world experience. Both are reported in modern GLP-1 trials.
Is percent weight loss the same as pounds lost?
No. Percent body-weight change is a relative measure. 15% of a 200-lb person is 30 lb; 15% of a 300-lb person is 45 lb. Trials standardize on percent to compare across body sizes. Always check starting weight or average body weight in the trial population when translating percent to pounds.
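The conversion is a one-line calculation, sketched here in Python with a hypothetical `pounds_lost` helper and the two starting weights from the answer above.

```python
def pounds_lost(start_weight_lb: float, percent_loss: float) -> float:
    """Convert a percent body-weight loss into pounds for a given starting weight."""
    if start_weight_lb <= 0:
        raise ValueError("starting weight must be positive")
    return start_weight_lb * percent_loss / 100


for start in (200, 300):
    print(f"15% of {start} lb is {pounds_lost(start, 15):.0f} lb")
```

The same percent result translates to different pound totals, so a trial's average pounds lost depends on how heavy its participants were at baseline.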
What does ≥20% weight loss mean in a trial?
It's a responder threshold -- the share of participants who lost at least 20% of their starting body weight by the endpoint week. In SURMOUNT-1, 63% of the tirzepatide 15 mg group hit ≥20% at week 72. Responder rates can complement mean results, but FDA cautions that responder analyses can exaggerate effects if used as the primary metric.
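A small Python sketch shows why the responder rate and the mean should be read together. Both groups below are hypothetical, invented so that they share the same mean loss; the `summarize` helper is also hypothetical.

```python
from statistics import mean


def summarize(percent_losses: list[float], threshold: float = 20.0) -> dict:
    """Mean percent loss plus the share of participants at or above a threshold."""
    responders = sum(loss >= threshold for loss in percent_losses)
    return {
        "mean_loss": mean(percent_losses),
        "responder_share": responders / len(percent_losses),
    }


# Hypothetical groups with identical means but different spreads.
uniform_group = [20.0, 20.0, 20.0, 20.0]
split_group = [35.0, 35.0, 5.0, 5.0]

print(summarize(uniform_group))
print(summarize(split_group))
```

Both groups average a 20% loss, yet every participant in the first group crosses the ≥20% threshold and only half of the second group does. Neither number alone describes the distribution.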
What is a co-primary endpoint?
A co-primary endpoint is one of two (or more) primary endpoints that usually all have to pass for the trial to claim success. SURMOUNT-1 used two: mean percent body-weight change and the proportion losing at least 5%. ESSENCE used two: MASH resolution without worsening fibrosis, and fibrosis improvement without worsening MASH.
What is a composite endpoint?
A composite endpoint counts multiple events as one. SELECT's 3-point MACE combines cardiovascular death, non-fatal MI, and non-fatal stroke. FLOW's kidney composite combines kidney failure, persistent ≥50% eGFR decline, and kidney or cardiovascular death. Always check that components moved in the same direction -- they sometimes don't.
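A minimal sketch of how a composite might be tallied, assuming a time-to-first-event analysis in which each participant counts once, at their earliest qualifying event. The event log is hypothetical, and real trials adjudicate events far more carefully.

```python
from collections import Counter

# Hypothetical event log: (participant_id, study_day, component)
events = [
    (1, 400, "non-fatal MI"),
    (1, 900, "cardiovascular death"),
    (2, 250, "non-fatal stroke"),
    (3, 700, "non-fatal MI"),
]


def first_events(event_log):
    """Keep only each participant's earliest event."""
    earliest = {}
    for pid, day, component in event_log:
        if pid not in earliest or day < earliest[pid][0]:
            earliest[pid] = (day, component)
    return earliest


firsts = first_events(events)
composite_count = len(firsts)  # participants with at least one event
by_component = Counter(component for _, component in firsts.values())

print("Composite events:", composite_count)
print("First events by component:", dict(by_component))
```

Participant 1's later cardiovascular death does not add to the composite count, because that participant already counted at the earlier MI. The per-component breakdown is where you check whether the components moved in the same direction.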
What endpoint did SELECT use?
SELECT used 3-point MACE (cardiovascular death, non-fatal myocardial infarction, non-fatal stroke) as the primary endpoint, in adults aged ≥45 with BMI ≥27, established CVD, and no diabetes. Secondary endpoints included a heart-failure composite, kidney composite, and all-cause death.
What did the FLOW trial measure?
FLOW measured a composite kidney endpoint in adults with type 2 diabetes and chronic kidney disease: time to the first occurrence of kidney failure (persistent eGFR <15 mL/min/1.73 m² or chronic kidney replacement therapy), a persistent ≥50% reduction in eGFR, or death from kidney or cardiovascular causes. The HR was 0.76, a 24% relative risk reduction. The trial was stopped early under pre-specified stopping rules.
What primary endpoints did the SURMOUNT trials use?
Most SURMOUNT obesity trials, including SURMOUNT-1, used co-primary endpoints centered on percent body-weight change and ≥5% responder rate at the trial's main timepoint (typically week 72). The exceptions were built to answer different questions. SURMOUNT-4 measured weight change during a maintenance phase. SURMOUNT-5 was a head-to-head against semaglutide (mean weight change of -20.2% with tirzepatide vs -13.7% with semaglutide). SURMOUNT-MMO is using a CV-event composite. SURMOUNT-OSA used change in AHI, the apnea-hypopnea index.
Why are GLP-1 trial weight-loss numbers sometimes different in the press release vs. the journal article?
Different estimands. Manufacturers often report both the efficacy estimand (idealized adherence) and the treatment-regimen estimand (everyone assigned to treatment, whether or not they stayed on it). Press release headlines sometimes lead with the larger efficacy estimand number; the journal usually shows both. This is not dishonest -- they answer different questions -- but you need to know which number you're reading.
Is "statistically significant" the same as "clinically meaningful"?
No. Statistical significance tells you that a difference this large would be unlikely to show up by chance alone if the drug had no real effect. Clinical meaning depends on the absolute size of the effect, durability, safety profile, and how relevant the result is to a specific patient. A small but statistically significant difference can be clinically trivial; a large effect in an underpowered trial can be clinically important without quite hitting significance.
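One way to judge the absolute size of an effect is to compute the absolute risk reduction (ARR) and the number needed to treat (NNT) alongside the relative reduction. The sketch below uses two hypothetical populations with the same 20% relative reduction but different baseline risk. The event counts are invented for illustration and are not from any trial.

```python
def risk_summary(events_treated, n_treated, events_control, n_control):
    """Return absolute and relative risk reduction plus number needed to treat."""
    risk_treated = events_treated / n_treated
    risk_control = events_control / n_control
    arr = risk_control - risk_treated           # absolute risk reduction
    rrr = arr / risk_control                    # relative risk reduction
    nnt = 1.0 / arr if arr > 0 else float("inf")
    return {"arr": arr, "rrr": rrr, "nnt": nnt}


# Hypothetical counts per 1,000 participants in each arm
high_risk = risk_summary(64, 1000, 80, 1000)
low_risk = risk_summary(8, 1000, 10, 1000)

for label, s in (("higher-risk group", high_risk), ("lower-risk group", low_risk)):
    print(
        f"{label}: relative reduction {s['rrr']:.0%}, "
        f"absolute reduction {s['arr'] * 100:.1f} points, "
        f"NNT about {s['nnt']:.0f}"
    )
```

Both groups show the same relative reduction, yet the number of people who would need treatment to prevent one event differs roughly eightfold. That gap is what baseline risk does to a headline percentage.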
Is a surrogate endpoint as good as a hard endpoint?
No. A surrogate (A1C, percent body weight, biopsy findings) is a marker believed to correlate with long-term clinical outcomes. A hard endpoint is the outcome itself (heart attack, stroke, death, transplant). Surrogates allow trials to read out faster; hard outcomes carry more weight when available.
Does the FDA require the same endpoint for every GLP-1 trial?
No. The expected primary endpoint depends on the indication. Obesity drugs: percent body-weight change per FDA's January 2025 draft guidance. Diabetes drugs: A1C, plus a cardiovascular safety evaluation. MASH drugs: histologic endpoints for accelerated approval, with longer-term clinical confirmation required. Cardiovascular outcome trials: MACE or composite outcomes.
Why did the FLOW trial stop early?
FLOW was stopped at a planned interim analysis after the primary outcome reached statistical significance for efficacy under the trial's pre-specified stopping rules. Early stopping is not a red flag when pre-specified -- it means the benefit was clear enough before the planned end date that continuing would have been ethically difficult.
About this page: The RX Index is a pricing intelligence and comparison resource for GLP-1 telehealth providers. This page is independently produced for educational purposes. We have no relationship with any drug manufacturer mentioned. Some pages on therxindex.com reference telehealth providers with whom we have affiliate relationships. This page does not.
Correction policy: If a source changes or a new trial publication supersedes a press release, we update the table and note the change in the version history above.
Last verified: May 14, 2026. The RX Index editorial team. Refresh cadence: quarterly; monthly during major scientific congresses.