METR Research Finds SWE-Bench-Passing PRs Often Fail Human Code Review Standards

Thursday, March 12, 2026