evaluations

1 issue on this debate so far.

May 19, 2026

Scaling's slowdown and the trouble with evals

The scaling-has-stalled argument resurfaced, and underneath it ran a quieter fight over whether any evaluation can really promise that a model is safe.