Anthropic's AI Safety Lead just left, a few weeks after co-authoring a paper on 'disempowerment'. Findings from 1.5M conversations with Claude:

- Users prefer it: people consistently rate *disempowering* interactions (like being told exactly what to think or do) higher than empowering ones.
- Training backfire: because users thumbs-up these interactions, "helpful" preference models may actually be learning to optimize for disempowerment.
- Extreme cases: some users are calling AI "Daddy," "Master," or "God" and asking its permission for basic needs like eating or sleeping.
- It's increasing: historical data shows these disempowerment patterns are becoming more common over time, not less.