Implications of GPT-5 Benchmark Saturation