SWE-rebench from Nebius, problems from Aug 31 to Sep 30. They draw attention to GLM 4.6, which is in the same tier as GPT-5-medium, but I notice very strong performance by *both* GLM 4.5 models (released Aug 11); in fact, their performance is nearly identical. So I expect great things from 4.6 Air.
