Ollama v0.23.1 Accelerates Gemma 4 MLX Inference with MTP Speculative Decoding
Ollama has released version 0.23.1, introducing Gemma 4 Multi-Token Prediction (MTP) speculative decoding for its MLX runner. This update, primarily benefiting macOS users on Apple Silicon, can prov...