Analysis · Clinical data analysis · June 2026

STEP-4: the trial everyone misquotes.

Bulls say it proves the drug keeps working. Skeptics say it proves you gain it all back. Both misread the same withdrawal design — and the truth in between is the whole point.

STEP-4 is the most-cited semaglutide trial that almost nobody describes correctly. Enthusiasts quote it to prove the drug keeps working. Skeptics quote it to prove you gain it all back. I think both readings are wrong — and the gap between them is the most useful thing in the paper.

Here is the thesis. STEP-4, published in JAMA in 2021, is not an efficacy trial in the ordinary sense. It is a randomized-withdrawal trial, a design built to answer one specific question: once the weight is off, what does the drug do for maintenance? That single structural fact changes what every number in it means — and it is the fact that gets dropped first when the trial is summarized on social media or in a sales deck.

Read the design correctly and STEP-4 stops being ammunition for either camp. It becomes the cleanest evidence we have that obesity behaves like a chronic, relapsing condition rather than a one-time problem you fix and walk away from. Let me take the numbers in the order the trial actually produced them.

The design is the argument

STEP-4 did not randomize people at the start. Every one of the 902 enrolled participants first went through a 20-week open-label run-in on semaglutide, titrated up to the full 2.4 mg maintenance dose. Only the 803 who reached that dose were then randomized, 2:1, either to continue semaglutide for another 48 weeks or to switch, blinded, to placebo.

That is why STEP-4 cannot be quoted like STEP-1 or a SURMOUNT arm. Everyone in the randomized phase had already lost weight and already tolerated the drug. The trial is not measuring whether semaglutide causes weight loss. It is measuring what happens after week 20, when you either keep it going or take it away. Miss that and every figure below gets misread.

Number one: 10.6 percent, before anyone was randomized

During the 20-week run-in, the mean weight loss was about 10.6 percent. This is the number that quietly disqualifies STEP-4 as a general-efficacy citation. The randomized population is responder-enriched by construction, though the gate was reaching the full dose rather than a weight-loss threshold: it excludes the people who dropped out during the run-in or could not tolerate titration. So when you see the eye-catching continuation figure later, remember it describes people who were already winning at week 20, not the average patient walking into a clinic.
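To put a size on that enrichment, here is a minimal sketch using only the two headcounts quoted above. The variable names are mine, and the 2:1 arm sizes are the nominal split implied by the ratio, not figures taken from the paper.

```python
# Participant flow as described in the text above.
enrolled = 902      # entered the 20-week open-label run-in
randomized = 803    # reached the maintenance dose and were randomized

# People who entered the run-in but never reached randomization.
not_randomized = enrolled - randomized
share_not_randomized = not_randomized / enrolled

# Nominal 2:1 allocation. Real arms are whole people, so treat these
# as approximations implied by the ratio, not reported arm sizes.
nominal_continue = randomized * 2 / 3
nominal_placebo = randomized * 1 / 3

print(f"Left before randomization: {not_randomized} ({share_not_randomized:.1%})")
print(f"Nominal 2:1 arms: about {nominal_continue:.0f} continuing, about {nominal_placebo:.0f} on placebo")
```

Roughly one in nine people who started the drug never made it into the randomized comparison, so every later number is silent about them.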

Number two: the curve that kept falling

From week 20 to week 68, the group that stayed on semaglutide lost a further 7.9 percent on average — reaching roughly 17.4 percent total weight loss from baseline. The bulls love this number, and it is real: at 68 weeks the maintenance group had not plateaued in the way you might expect after such a long run.
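The two percentages in that paragraph do not share a denominator, which is easy to miss. The sketch below assumes the 7.9 percent is measured against week-20 weight rather than baseline, and it compounds two trial-level means, which is only an approximation of averaging each participant's own trajectory. The names are illustrative.

```python
# Figures quoted in the text above; both are trial-level means.
run_in_loss = 0.106    # week 0 to 20, as a fraction of baseline weight
further_loss = 0.079   # week 20 to 68, ASSUMED to be a fraction of week-20 weight

# Naive reading: add the two percentages as if they shared a denominator.
naive_total = run_in_loss + further_loss

# Compounded reading: apply the second loss to the already-reduced weight.
weight_at_week_20 = 1.0 - run_in_loss
weight_at_week_68 = weight_at_week_20 * (1.0 - further_loss)
compounded_total = 1.0 - weight_at_week_68

print(f"naive sum:  {naive_total:.1%}")
print(f"compounded: {compounded_total:.1%}")
```

The compounded figure comes out a little under 18 percent, closer to the reported 17.4 than the naive sum of 18.5. The remaining gap is what you would expect when means are combined instead of individual trajectories.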

But the honest framing is narrow. This is continued loss in a population pre-selected for response, not proof that the drug "keeps working" for everyone indefinitely. It tells you the maintenance effect is durable in responders over 68 weeks. That is genuinely useful — and it is a much smaller claim than the one usually built on top of it.

The number everyone fights over is the regain figure. Read it carefully and it refutes both the people who cite it and the people who fear it.

Number three: 6.9 — not everything

This is the misquote that matters. The group switched to placebo at week 20 regained about 6.9 percent over the following 48 weeks. The estimated treatment difference between the two arms was roughly 14.8 percentage points — one of the largest divergences you will see in a maintenance trial.

Now do the arithmetic the headlines skip. The placebo-switch group had lost 10.6 percent in run-in, then regained 6.9. They did not return to baseline. They ended the trial still meaningfully below where they started — a net loss preserved, not erased. "You gain it all back" is not what STEP-4 shows. What it shows is partial regain after abrupt withdrawal of a full maintenance dose, with the trajectory bending upward but not all the way home over 48 weeks.
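That arithmetic can be written out as a short sketch. It assumes the 6.9 percent regain is measured against week-20 weight, not baseline, and it combines trial-level means, so treat the outputs as approximations and not as figures reported by the trial. The variable names are mine.

```python
# Figures quoted in the text above; both are trial-level means.
run_in_loss = 0.106   # week 0 to 20, as a fraction of baseline weight
regain = 0.069        # week 20 to 68, ASSUMED to be a fraction of week-20 weight

# Track weight as a fraction of baseline (baseline = 1.0).
weight_at_week_20 = 1.0 - run_in_loss
weight_at_week_68 = weight_at_week_20 * (1.0 + regain)

# Net loss still preserved at week 68, relative to baseline.
net_loss = 1.0 - weight_at_week_68

# Share of the run-in loss that came back after the switch to placebo.
share_regained = (weight_at_week_68 - weight_at_week_20) / run_in_loss

# Naive versions that ignore the change of denominator, for comparison.
naive_net_loss = run_in_loss - regain
naive_share_regained = regain / run_in_loss

print(f"net loss at week 68: {net_loss:.1%} (naive: {naive_net_loss:.1%})")
print(f"share of loss regained: {share_regained:.0%} (naive: {naive_share_regained:.0%})")
```

Either way of counting lands in the same place: well over half of the run-in loss returns, and a net loss of a few percent of baseline weight survives to week 68.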

So the skeptic's quote is false on its face, and the enthusiast's quote — that it "just keeps working" — leans on a responder-enriched arm. The accurate sentence is duller and more important: stop the drug and the body starts defending its old setpoint again — partially, and on a timescale of months.

What actually reverted: the cardiometabolic numbers

The part almost nobody quotes at all is what happened to the rest of the panel. In the placebo-switch group, the improvements in waist circumference, blood pressure, and several cardiometabolic markers drifted back toward baseline alongside the weight. The benefits, in other words, tracked the drug, not some permanent reset of physiology achieved during the first 20 weeks.

I think this is the most underrated finding in the trial. It reframes the whole class. The weight number is the headline, but STEP-4 quietly says the metabolic improvements are a maintained state, not a cure banked once and kept. That is a statement about the nature of the condition, and it is far more consequential than any single percentage.

The counter-argument I take seriously

Here is the case against leaning too hard on STEP-4. The withdrawal was abrupt — full dose to placebo, by design. That is not how most real-world discontinuation happens; people taper, lose coverage gradually, or stop alongside other life changes. A cliff-edge withdrawal probably produces a sharper regain curve than a managed one, so the 6.9 percent should not be read as the universal shape of stopping.

And 68 weeks is short against a condition people live with for decades. STEP-4 tells you what the first year after a stop looks like in trial conditions. It does not tell you what five years of any pattern — continuous, intermittent, drug-plus-structure — would look like, because nobody ran that trial. Anyone extrapolating STEP-4 to "lifetime" certainty is filling a gap the data leaves open.

What STEP-4 actually proved

Strip out both spins and what remains is precise. In responders, semaglutide maintenance keeps weight coming off through 68 weeks. Withdraw it abruptly and more than half of the run-in loss comes back over the next year, while the cardiometabolic gains fade in step, but participants still end below baseline, not back at it. That is not a story about willpower or about a miracle. It is a story about a chronic condition with a pharmacological maintenance phase, behaving the way chronic conditions with maintenance therapies tend to behave.

The reason STEP-4 gets misquoted in both directions is that neither slogan is comfortable. "It keeps working" oversells a selected arm; "you gain it all back" is simply false. The accurate reading — partial, drug-dependent, time-limited evidence about maintenance — does not fit on a graphic. Which is exactly why it is worth reading the trial instead of the quote.

Ozemback — June 2026

Editorial analysis, not medical advice. Ozemback is an independent magazine. This essay reports and analyzes published clinical-trial data for educational purposes only and is not medical advice. The magazine does not recommend, endorse, or discourage any medication, dose, protocol, supplement, or intervention. Always consult a qualified, licensed healthcare professional for any medical question or decision.