Recently, Google DeepMind unveiled Gemma 4, the highly anticipated successor to the popular Gemma 3 model lineup. We’re excited to explore how it performs when run locally, especially under vLLM at full capacity. Along the way, we’ll also share insights on setting up your own local AI environment to put Gemma 4 through its paces.
Based on content from Digital Spaceport
Technical Setup
For those eager to replicate our setup, we recommend checking out the Hermes OpenwebUI Setup guide and the 8 GPU Rack build video for detailed instructions; a rough sketch of loading a model with vLLM on this kind of rig follows the parts list below. Here’s the hardware we used:
- GPUs: NVIDIA RTX 3090 (24GB), RTX 5060 Ti (16GB), RTX 4090 (24GB)
- Motherboard: Gigabyte MZ32-AR0
- CPU: AMD EPYC 7702
- RAM: 256GB DDR4 DIMMs
- Power Supplies: Corsair HX1500i, Seasonic PRIME PX1600
- Riser Cables and Rack: x16 PCIe risers, PCIe 3.0 x1 USB risers, plastic rack
Visit Digital Spaceport for a comprehensive DIY guide.
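For reference, here is a minimal sketch of what loading a Gemma-class model through vLLM’s Python API looks like on a multi-GPU box. The model ID, parallelism degree, and context length below are placeholders rather than the exact values from our run, so substitute whatever checkpoint and settings match your hardware.

```python
from vllm import LLM, SamplingParams

# Placeholder model ID -- swap in the actual Gemma 4 checkpoint you downloaded.
MODEL_ID = "google/gemma-4-31b-it"

llm = LLM(
    model=MODEL_ID,
    tensor_parallel_size=2,        # split weights across two matched 24GB cards (e.g. the 3090 + 4090)
    max_model_len=32768,           # cap the context so the KV cache fits in VRAM; raise if memory allows
    gpu_memory_utilization=0.90,   # leave a little headroom for CUDA graphs and activation buffers
    dtype="bfloat16",
)

params = SamplingParams(temperature=0.7, top_p=0.95, max_tokens=512)
outputs = llm.generate(["Summarize why long context windows matter for local LLMs."], params)
print(outputs[0].outputs[0].text)
```

If you would rather expose an OpenAI-compatible endpoint for OpenWebUI instead of calling the engine directly, `vllm serve` with the same model ID and flags gets you there.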
Exploring Gemma 4’s Features
Gemma 4 introduces several enhancements, including support for up to 140 languages and a context window of up to 256K tokens. The lineup ranges from lightweight variants like E2B and E4B, optimized for low-end hardware, up to the largest 31B model. A standout feature is strong reasoning and multimodal capability across diverse AI tasks, even in the smaller models.
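As a rough illustration of how one of the lightweight variants might be pulled onto modest hardware, here is a minimal Hugging Face Transformers sketch. The model ID is a placeholder for whichever E2B/E4B-style checkpoint you actually have access to, and the settings are only a starting point.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder ID -- replace with the lightweight Gemma 4 variant you want to test.
MODEL_ID = "google/gemma-4-e4b-it"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,   # halves memory vs. fp32; fits far more comfortably on a 16GB card
    device_map="auto",            # spill layers to CPU RAM automatically if VRAM runs out
)

prompt = "Écris une phrase de bienvenue."  # quick multilingual sanity check
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=60)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```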
Benchmarking and Performance
The improved context window holds quality at long lengths rather than deteriorating, a significant upgrade over its predecessor. Notably, our tests showed strong jumps in MMLU and code-evaluation scenarios, a considerable leap over the Gemma 3 series. We’re still running more nuanced benchmarks, but early results are promising.
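We haven’t published our harness yet, but the quick throughput checks we run look roughly like the sketch below, which hits a local OpenAI-compatible endpoint (such as the one vLLM exposes). The URL, API key, and model name are assumptions you would replace with your own, and the tok/s figure includes prompt processing, so treat it as a ballpark rather than a pure generation rate.

```python
import time
from openai import OpenAI

# Assumed local endpoint -- vLLM's OpenAI-compatible server listens on port 8000 by default.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

prompt = "Write a Python function that checks whether a string is a palindrome."
start = time.perf_counter()
resp = client.chat.completions.create(
    model="google/gemma-4-31b-it",   # placeholder; must match the model the server actually loaded
    messages=[{"role": "user", "content": prompt}],
    max_tokens=256,
)
elapsed = time.perf_counter() - start

completion_tokens = resp.usage.completion_tokens
print(f"{completion_tokens} tokens in {elapsed:.1f}s "
      f"(~{completion_tokens / elapsed:.1f} tok/s, prompt time included)")
```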
The Ethical Dimension
In exploring AI capabilities, ethical considerations remain paramount. One of our tests posed a classic ethical dilemma, and Gemma 4 reasoned through it commendably, though its built-in safety protocols constrained some of its answers. The scenario underscores how much room remains for improving AI ethics training so models can navigate complex situations more reliably on their own.
Conclusion
Gemma 4 represents a promising stride in local AI deployment, offering versatility and power across a range of hardware configurations. Whether you want to harness it for coding tasks or probe its safety behavior, it holds real potential for hobbyists and professionals alike.
To stay updated with our latest AI explorations, consider supporting us through membership, Patreon, or purchasing via our affiliate links. For more details on the Gemma 4 model and associated resources, visit the links provided.








