MiniCPM-V & o Cookbook
Cook up amazing AI applications effortlessly with MiniCPM-V, MiniCPM-o, and the MiniCPM LLM series, bringing text, vision, speech, and live-streaming capabilities right to your fingertips.
What's new
- MiniCPM-V 4.6 released: Instruct + Thinking variants, Qwen3.5 hybrid backbone, 256K context, simplified vision merger.
- Inference: Single-image QA · Multi-image QA · Video · OCR · PDF · Grounding
- Deployment: vLLM · SGLang · llama.cpp · Ollama
- Quantization: GGUF · BNB · AWQ
Pick the right recipe
Individuals
Effortless inference on your own machine: runs on CPU + GPU, macOS / Linux / Windows, even on phones.
Enterprises
High-throughput, scalable serving:
Researchers
Train / fine-tune / customize:
Versions
This cookbook tracks all currently supported MiniCPM releases:
MiniCPM-V
| Version | Status | Modality | Size | Highlights | Context |
|---|---|---|---|---|---|
| 4.6 (latest) | Recommended | Image, Video | ~1.2B | Phone-ready edge MLLM, LLaVA-UHD v4 vision tower | 256K |
| 4.5 | Stable | Image, Video | 9B | Image + video understanding, optional thinking mode | 32K |
MiniCPM-o
| Version | Status | Modality | Size | Highlights | Context |
|---|---|---|---|---|---|
| 4.5 (latest) | Recommended | Image, Video, Audio | 9B | End-to-end omnimodal (vision + speech + TTS), full-duplex streaming | 32K |
MiniCPM (LLM)
| Version | Status | Modality | Size | Highlights | Context |
|---|---|---|---|---|---|
| 4.1 (latest) | Recommended | Text | 8B | Hybrid reasoning, EAGLE3, InfLLM-V2 | 128K |
| 4 | Stable | Text | 0.5B / 8B | InfLLM-V2, FRSpec speculative decoding | 128K |
| SALA | Research | Text | 8B | Sparse + linear hybrid attention | 1M+ |
Use the version switcher in the sidebar to jump between releases.
Resources
- HuggingFace
- ModelScope
- Technical Blog
- Discord
- Open an Issue