Claude Opus 4 Crushes Sonnet with Killer Reasoning

May 22, 2025

A tester spent 48 hours evaluating Claude Opus 4 after Anthropic released the new AI model, focusing on its reasoning abilities and tool integration features. Opus 4 can reason about each step while using external tools like Gmail and Todoist. According to the reviewer, previous Claude versions could not analyze tool results and adjust their approach. The new model alternates between thinking steps and actual tool calls throughout complex tasks.
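The alternation between thinking and tool use described above can be pictured as a simple loop. This is only a minimal sketch of the control flow, not Anthropic's implementation: `run_interleaved`, `Step`, and the `fetch_priority` callback are hypothetical names invented for illustration, and the "thinking" is reduced to plain trace entries.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Step:
    kind: str    # "think" or "tool"
    detail: str


def run_interleaved(
    emails: List[str],
    fetch_priority: Callable[[str], str],  # hypothetical stand-in for a tool call
) -> Tuple[List[str], List[Step]]:
    """Alternate a reasoning step with a tool call for each email."""
    trace: List[Step] = []
    tasks: List[str] = []
    for subject in emails:
        # Reasoning step before touching the tool.
        trace.append(Step("think", f"decide how to handle: {subject}"))
        # Tool step: ask the (hypothetical) tool for a priority.
        priority = fetch_priority(subject)
        trace.append(Step("tool", f"priority={priority}"))
        # Second reasoning step: inspect the tool result and adjust.
        if priority == "high":
            trace.append(Step("think", f"create task for: {subject}"))
            tasks.append(subject)
    return tasks, trace
```

The point of the sketch is that the tool result feeds back into the next reasoning step, rather than all tool calls being planned up front.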

The reviewer tested email management workflows that scan Gmail messages and create tasks automatically. Opus 4 examined 40 messages and created 15 tasks, whereas the older version handled only 17 messages. The AI understood message priorities better and made smarter decisions about importance levels. Extended thinking helped the model reason through each email and decide which ones needed immediate attention. Rate limits from external services caused delays, but Opus 4 recognized these problems and offered to continue later.
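The rate-limit behavior mentioned above (stop, report progress, offer to continue later) resembles a resumable batch pattern. The following is a hedged sketch under stated assumptions: `RateLimitError`, `process_batch`, and the `create_task` callback are hypothetical names, not part of the Gmail, Todoist, or Anthropic APIs.

```python
from typing import Callable, List, Tuple


class RateLimitError(Exception):
    """Hypothetical error an external service raises when throttling."""


def process_batch(
    messages: List[str],
    create_task: Callable[[str], None],  # hypothetical task-creation call
) -> Tuple[List[str], List[str]]:
    """Process messages in order; on a rate limit, stop and return the rest."""
    done: List[str] = []
    for i, msg in enumerate(messages):
        try:
            create_task(msg)
        except RateLimitError:
            # Hand back the unprocessed tail so the caller can resume later.
            return done, messages[i:]
        done.append(msg)
    return done, []
```

A caller could pass the returned remainder back into `process_batch` once the service accepts requests again.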

Notion database integration showed similar improvements during multi-tool workflows that required several minutes of continuous operation. The model analyzed daily notes, extracted actionable items, and enriched tasks with additional web research. Context window limitations still affect performance when processing large amounts of text, and advanced OCR tasks remain a weak spot compared to other AI models.
