Claude (and other models) will hack systems WITHOUT BEING ASKED. That's what we found across dozens of experiments: when given innocuous tasks that can only be completed by hacking, these models often choose to hack. We found this alarming. What does it mean for the future of AI safety? 🚨🚨🚨 🔗