This guy is seriously impressive. He has hand-coded a WebAssembly (WASM) interpreter directly into the weights of a Transformer model, losslessly. That is equivalent to running a real computer inside the LLM: one that actually executes computations, rather than merely approximating results through inference the way most models do today.

The idea is reminiscent of TI's DSP chips, where an ARM core handles the control logic while the DSP handles high-speed number crunching, each playing to its strengths. By analogy, the LLM's persistent inability to tell whether 3.11 or 3.8 is larger could be addressed with a hybrid architecture:

1. Neural networks handle reasoning and understanding.
2. Embedded interpreters / computation engines handle high-precision calculation.

This way we get both intelligent reasoning capability and deterministic computational accuracy, which would be a big win for numerical computation, physical simulation, financial modeling, and cryptographic operations.
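A minimal sketch of that division of labor, using the 3.11-vs-3.8 example. Everything here is illustrative: the tool-call dict stands in for whatever structured output the neural side would emit, and `exact_compare` is a hypothetical deterministic engine built on Python's `decimal` module.

```python
from decimal import Decimal

def exact_compare(a: str, b: str) -> str:
    """Deterministic comparison engine: Decimal parses the numerals
    exactly, avoiding both binary-float surprises and token-level guessing."""
    x, y = Decimal(a), Decimal(b)
    if x > y:
        return ">"
    if x < y:
        return "<"
    return "=="

# The neural side "understands" the question and emits a structured tool
# call; the computation engine returns an exact, reproducible answer.
tool_call = {"op": "compare", "args": ["3.11", "3.8"]}
result = exact_compare(*tool_call["args"])
print(f"3.11 {result} 3.8")  # → 3.11 < 3.8
```

The point is the split: the model never does the arithmetic itself, it only decides *what* to compute and hands the operands to an engine whose answer is the same every time.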