MiniMax M2.7 scores worse than M2.5 on BridgeBench. M2.5 ranked #12. Overall 92.3. M2.7 ranked #19. Overall 88.1. UI dropped from 76.6 to 61.9. Refactor from 97.3 to 90.7. Gen from 94.3 to 89.2. #1 on Multi-SWE Bench. #19 on BridgeBench. That's a 18 rank difference between synthetic benchmarks and real vibe coding evaluations. This is why BridgeBench exists.