Is Claude Dumb Today?
Daily HumanEvalPlus-CC164 benchmark for Claude Code (Opus 4.6)
Score | Model | Cost | Runtime
Score History (last 90 runs)
Per-Task Results
Task | Function | Result | Attempts | Turns | Cost | Error