What’s new in GPT-5, comparing with GPT-4

gpt5

GPT-5 vs GPT-4o vs GPT-4 — Comparison (read-only)
code
Area GPT-5 GPT-4o GPT-4
Overall positioning Current flagship; system with routing between fast and thinking paths Fast multimodal generalist; UX speed focus Prior flagship; older generation
Reasoning options Auto-router; direct access to thinking / mini / nano / Pro modes Single family; no router tiers Single family; no router
Context window Larger contexts; better long-task retention Strong; optimised for responsiveness Good for its time; smaller than newer models
Coding Best-in-class; GPT-5-Codex for refactor, review, agentic steps Capable but not specialised for agentic coding Strong legacy; behind newer stacks
Multimodal reasoning Improved text+image reasoning; stronger on image-grounded tasks Very capable multimodal chat; speed/UX focus Limited vs 4o/5
Safety & reliability Reduced hallucinations; tighter controls Mature guardrails from 4-series lineage Older safety stack
Latency / cost trade-offs Router picks fast vs thinking; thinking paths are heavier Low-latency defaults; fewer deep chains Heavier than minis; simpler lineup
Availability ChatGPT & API; multiple SKUs + router ChatGPT & API ChatGPT & API (legacy in many orgs)

What’s new / better in GPT-5

1) Unified model + real-time router
– Routes simple tasks to a fast path and harder ones to deeper “thinking” paths.
– In ChatGPT you can still force modes when needed.

2) Direct access to “thinking” tiers
– API access to gpt-5-thinking (plus mini/nano); deeper mode exposed in ChatGPT.

3) Coding jump (GPT-5-Codex)
– Multi-file refactors, larger diffs, agentic execution, stricter code review.

4) Higher utility across domains
– More consistent results across coding, maths, writing, health, visual tasks.

5) Packaging & pricing variety
– Router + multiple SKUs lets you trade latency/cost for depth without leaving GPT-5.

Remaining limitations

– Hallucinations not eliminated
Reduced but not gone, especially in scarce/ambiguous domains.

– Latency / cost trade-offs
Thinking paths are heavier; routing helps but physics still applies.

– Tone / creative feel varies
Some prefer 4o for creative copy; subjective.

– Complex lineup
Multiple variants can confuse selection unless standardised internally.

When to choose which

– Choose GPT-5
For correctness-heavy work, deeper reasoning, serious coding/analysis, and router-driven “think when needed”.

– Choose GPT-4o
For fast multimodal chat and snappy drafting where responsiveness matters more than deep reasoning.

– Choose GPT-4
Only if your stack is locked to it; otherwise upgrade.