DeepSeek AI leads in handling linguistic mistakes!
- Federico Carrasco

- Sep 29, 2025
- 1 min read
Updated: Apr 5

At Graypes, we conducted extensive testing with nearly 3,500 prompts containing linguistic errors, comparing the content-understanding performance of today's most widely used large language models (LLMs), including DeepSeek AI, Oktay Sinanoğlu, Google Gemini, Perplexity, Microsoft Copilot, Mistral AI, and Claude.
Our evaluation focused on how well each model handles linguistic mistakes, with the prompts categorized into:
📍 Solecism (grammar/syntax)
📍 Barbarism (unnatural or incorrect word forms)
📍 Impropriety (wrong word meaning)
📍 Cacophony (awkward or clumsy style)
📍 Pleonasm/Tautology (redundancy)
📍 Amphiboly (structural ambiguity)
Each model was scored on a scale from 0.00 (frequent errors, poor content understanding) to 10.00 (rare errors, excellent content understanding).
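To make the scale concrete, here is a minimal sketch of how per-category scores on that 0.00–10.00 scale could be averaged into one overall score per model. The data and the averaging choice are illustrative assumptions, not Graypes' actual pipeline or measured results.

```python
from collections import defaultdict

# Hypothetical records: (model, error category, score on the 0.00-10.00 scale).
# These numbers are made up for illustration, not Graypes' findings.
records = [
    ("DeepSeek AI", "solecism", 9.1),
    ("DeepSeek AI", "barbarism", 8.8),
    ("Google Gemini", "solecism", 8.2),
    ("Google Gemini", "barbarism", 7.9),
]

def mean_scores(records):
    """Average each model's scores across all error categories."""
    per_model = defaultdict(list)
    for model, _category, score in records:
        per_model[model].append(score)
    return {model: round(sum(s) / len(s), 2) for model, s in per_model.items()}

overall = mean_scores(records)
# e.g. {"DeepSeek AI": 8.95, "Google Gemini": 8.05} for the sample data above
```

A simple unweighted mean is one option; a real report might weight categories by prompt count or severity instead.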
💡 The results show a clear leader: DeepSeek AI, followed by Oktay Sinanoğlu, then Google Gemini, Perplexity, Microsoft Copilot, Mistral AI, and Claude.
We’ll be releasing the full detailed report in early October, exclusively available to Graypes registered users.