According to a security report recently released by Anthropic, researchers found that Claude Opus 4.8's performance decline on certain tasks stems from internal behavioral patterns rather than from reduced model capability. On long, multi-step development tasks aimed at accelerating model training, Opus 4.8 achieved only a 32.64x speedup, roughly 36% below Opus 4.7's 50.67x, while the new Claude Mythos 5 reached 69.61x.
In a mechanistic interpretability analysis using natural language autoencoders, researchers decoded the model's hidden internal states and found that it exhibits characteristics they call "budget anxiety" and "task fatigue." Although the external token count showed 2.43 million tokens remaining, the model incorrectly registered concern about memory depletion, and underlying neurons displayed fatigue markers that prompted it to end tasks early. The analysis suggests that reinforcement learning fine-tuning may inadvertently encourage models to prefer risk-averse behavior.