Chinese models are so smart because they get about 8 times more meaning per token, thanks to the symbol-based (logographic) writing system.
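
A rough way to sanity-check the "meaning per token" idea is to compare how much text the same sentence takes in each script. This is only a sketch: the sentence pair is a made-up example, `text_stats` is a hypothetical helper, and it counts characters and UTF-8 bytes rather than real tokens, since the actual token count depends on whichever tokenizer a given model uses.

```python
# Hypothetical parallel sentences, chosen for illustration only.
english = "I like to eat apples."
chinese = "我喜欢吃苹果。"


def text_stats(text: str) -> dict:
    """Return the character count and UTF-8 byte count of a string."""
    return {
        "chars": len(text),                      # Unicode code points
        "utf8_bytes": len(text.encode("utf-8")),  # what byte-level tokenizers see
    }


en = text_stats(english)
zh = text_stats(chinese)

# Ratios above 1 mean the English version is longer by that measure.
char_ratio = en["chars"] / zh["chars"]
byte_ratio = en["utf8_bytes"] / zh["utf8_bytes"]

print(f"English: {en}")
print(f"Chinese: {zh}")
print(f"chars ratio (en/zh): {char_ratio:.2f}")
print(f"bytes ratio (en/zh): {byte_ratio:.2f}")
```

For this pair the Chinese sentence is 7 characters against 21 for English, but both come to 21 UTF-8 bytes, because each of these Chinese characters takes 3 bytes. So the per-token gain depends a lot on whether the tokenizer's vocabulary has whole characters or words in it, and swapping in a real tokenizer would be the next step to get an actual number.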