The GLM team is now using MLA!! This is a pretty insane model: 30B total params and only about 4B active. Very nice release. Structurally it's approximately the same depth as GLM-4.5 Air and Qwen3 30B A3B, with 64 total experts instead of 128, but they only activate 5 instead of 9 if you count the shared expert (rough accounting sketch below).
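
As a back-of-the-envelope illustration of why so few of the total params are active per token, here's a minimal sketch. All dimensions (hidden size, expert FFN size, depth) are hypothetical placeholders I'm assuming for the exercise, not the released config; only the expert counts come from the post above.

```python
# Rough MoE parameter accounting: total vs. active experts.
# Hidden/FFN/depth values below are hypothetical, chosen only so the
# orders of magnitude land in the 30B-total / few-B-active regime.

hidden = 2048            # hypothetical model width
expert_ffn = 1408        # hypothetical per-expert FFN intermediate size
n_layers = 47            # hypothetical depth (GLM-4.5 Air / Qwen3 30B A3B class)

n_experts_total = 64     # vs. 128 in the comparison models
n_experts_active = 5     # routed + shared experts per token (vs. 9)

# One SwiGLU-style expert: gate + up + down projections.
params_per_expert = 3 * hidden * expert_ffn

moe_total = n_layers * n_experts_total * params_per_expert
moe_active = n_layers * n_experts_active * params_per_expert

print(f"expert params, total : {moe_total / 1e9:.1f}B")
print(f"expert params, active: {moe_active / 1e9:.1f}B")
# Attention (MLA), embeddings, etc. run for every token, so the gap between
# total and active parameters comes almost entirely from the expert pool.
```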