it is 2026. everyone is using claude code and generating a trillion tokens a day. meanwhile, the majority of open-model providers still don't pass cache-hit savings on to their customers, making every new message in a long session a costly endeavor.
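
a rough sketch of why this hurts, assuming each turn resends the full history as input. the prices (`FULL_PRICE`, `CACHED_PRICE`) and the function `conversation_input_cost` are made up for illustration and are not any real provider's rates or api. it also ignores output tokens and any cache-write surcharge.

```python
# Hypothetical prices in dollars per million input tokens (illustrative only).
FULL_PRICE = 1.00
CACHED_PRICE = 0.10  # assumed discounted rate for tokens served from cache


def conversation_input_cost(turn_tokens, cached_price=None):
    """Total input cost of a chat where every turn resends the whole history.

    turn_tokens: number of new input tokens added on each turn.
    cached_price: per-million price for tokens already sent on an earlier
        turn; None means the provider bills them at the full price.
    """
    prefix_price = FULL_PRICE if cached_price is None else cached_price
    total = 0.0
    history = 0  # tokens already sent on previous turns
    for new in turn_tokens:
        # Old tokens are billed at the prefix rate, new ones at full price.
        total += (history * prefix_price + new * FULL_PRICE) / 1_000_000
        history += new
    return total


turns = [2_000] * 50  # 50 turns, 2,000 new tokens each
no_cache = conversation_input_cost(turns)
with_cache = conversation_input_cost(turns, CACHED_PRICE)
print(f"without cache pricing: ${no_cache:.3f}")
print(f"with cache pricing:    ${with_cache:.3f}")
```

under these assumed numbers the same 50-turn session costs about $2.55 in input tokens without cache pricing and about $0.35 with it, and the gap widens as the session gets longer because the resent history grows with every turn.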