Is Claude Dumb Today?

Daily HumanEvalPlus-CC164 benchmark for Claude Code (Opus 4.6)



Score | Model | Cost | Runtime

Score History (last 90 runs)

Per-Task Results

Task | Function | Result | Attempts | Turns | Cost | Error