LFM2-24B-A2B
Deploy LFM2-24B-A2B by Liquid AI on Clore.ai: a hybrid SSM + attention architecture with 24B total and 2B active parameters
At a Glance
Why LFM2-24B-A2B?
GPU Recommendations
| GPU | VRAM | Performance | Daily Cost* |
| --- | --- | --- | --- |
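The VRAM figures above can be sanity-checked with a back-of-the-envelope weight-memory estimate (parameter count × bytes per parameter; this ignores KV cache, activations, and runtime overhead, so real usage will be higher):

```shell
# Rough weight-only memory for a 24B-parameter model (weights only).
total_params_b=24                      # billions of parameters
fp16_gb=$(( total_params_b * 2 ))      # FP16: 2 bytes/param -> ~48 GB
int8_gb=$(( total_params_b * 1 ))      # INT8: 1 byte/param  -> ~24 GB
int4_gb=$(( total_params_b / 2 ))      # INT4: 0.5 byte/param -> ~12 GB
echo "FP16: ~${fp16_gb} GB, INT8: ~${int8_gb} GB, INT4: ~${int4_gb} GB"
```

Only ~2B parameters are active per token, which helps speed, but all 24B parameters must still fit in memory.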
Deploy with vLLM
Install vLLM
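A typical install in a fresh Python environment (vLLM ships CUDA wheels via pip; the minimum vLLM version with support for this architecture may differ, so treat this as a sketch):

```shell
# Create an isolated environment and install vLLM from PyPI.
python3 -m venv vllm-env && source vllm-env/bin/activate
pip install --upgrade pip
pip install vllm
# Verify the install
python -c "import vllm; print(vllm.__version__)"
```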
Single GPU Setup
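Assuming the model is published under the Hugging Face ID `LiquidAI/LFM2-24B-A2B` (an assumption — verify the actual repo name) and your vLLM build supports the LFM2 architecture, a minimal single-GPU launch looks like:

```shell
# OpenAI-compatible server on port 8000; model ID is an assumption.
vllm serve LiquidAI/LFM2-24B-A2B \
  --max-model-len 8192 \
  --gpu-memory-utilization 0.90 \
  --port 8000
```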
Query the Server
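Once the server is up, it exposes the standard OpenAI-compatible chat endpoint:

```shell
# Send a chat request to the local vLLM server.
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "LiquidAI/LFM2-24B-A2B",
    "messages": [{"role": "user", "content": "Explain SSMs in one paragraph."}],
    "max_tokens": 256
  }'
```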
Deploy with Ollama
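Ollama installs with its official script. Whether an official LFM2-24B-A2B tag exists in the Ollama library is an assumption — check the library and substitute the real tag, or import a local GGUF build:

```shell
# Install Ollama (Linux).
curl -fsSL https://ollama.com/install.sh | sh
# Model tag below is an assumption; check the Ollama model library first.
ollama run lfm2-24b-a2b
```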
Ollama API Usage
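Ollama's native REST API listens on port 11434 by default (model tag assumed, as above):

```shell
# Non-streaming generation via Ollama's /api/generate endpoint.
curl http://localhost:11434/api/generate -d '{
  "model": "lfm2-24b-a2b",
  "prompt": "Why do SSM layers use less memory than attention?",
  "stream": false
}'
```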
Docker Template
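On a Clore.ai instance you can run the official vLLM OpenAI-server image directly (model ID again an assumption; the cache mount avoids re-downloading weights on restart):

```shell
# vLLM OpenAI-compatible server in Docker with the HF cache persisted.
docker run --gpus all -p 8000:8000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  vllm/vllm-openai:latest \
  --model LiquidAI/LFM2-24B-A2B \
  --max-model-len 8192
```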
Speed Benchmark
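A crude single-request throughput check against the running server: time one fixed-length completion and divide tokens by elapsed seconds (this ignores prompt processing and batching, so treat it as a lower-bound sketch):

```shell
# Time a single 256-token completion; tokens/sec ~= 256 / elapsed seconds.
time curl -s http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "LiquidAI/LFM2-24B-A2B", "prompt": "Hello", "max_tokens": 256}' \
  > /dev/null
```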
Quantization for Lower VRAM
GPTQ Quantization
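vLLM loads GPTQ-quantized checkpoints with `--quantization gptq`. This assumes a GPTQ build of the model has been published — the repo ID below is hypothetical:

```shell
# Repo ID is hypothetical; point at an actual GPTQ build of the model.
vllm serve LiquidAI/LFM2-24B-A2B-GPTQ \
  --quantization gptq \
  --max-model-len 8192
```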
AWQ Quantization
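AWQ works the same way via `--quantization awq`, again assuming a published AWQ checkpoint (hypothetical repo ID below):

```shell
# Repo ID is hypothetical; point at an actual AWQ build of the model.
vllm serve LiquidAI/LFM2-24B-A2B-AWQ \
  --quantization awq \
  --max-model-len 8192
```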
Advanced Configuration
Memory-Optimized Setup
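For cards with little headroom, trade throughput for memory: lower the memory-utilization target, shorten the context, disable CUDA graphs with `--enforce-eager`, and give vLLM CPU swap space:

```shell
# Memory-lean launch: smaller context, eager mode, CPU swap for overflow.
vllm serve LiquidAI/LFM2-24B-A2B \
  --gpu-memory-utilization 0.85 \
  --max-model-len 4096 \
  --enforce-eager \
  --swap-space 8
```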
High-Throughput Setup
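For batch-serving workloads, push memory utilization up and let the scheduler pack more concurrent sequences and batched tokens per step:

```shell
# Throughput-oriented launch: larger batches, more concurrent sequences.
vllm serve LiquidAI/LFM2-24B-A2B \
  --gpu-memory-utilization 0.95 \
  --max-num-seqs 256 \
  --max-num-batched-tokens 8192
```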
SSM Architecture Benefits
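The memory argument can be made concrete: attention layers keep a KV cache that grows linearly with context length, while SSM layers carry a fixed-size recurrent state. With hypothetical layer dimensions (not this model's actual config), the per-sequence KV cache for the attention layers alone is:

```shell
# Hypothetical dims for illustration (NOT this model's real config).
attn_layers=8; kv_heads=8; head_dim=64; seq_len=32768; fp16=2
# 2x for K and V, times layers, heads, head dim, sequence length, bytes.
kv_bytes=$(( 2 * attn_layers * kv_heads * head_dim * seq_len * fp16 ))
echo "KV cache per sequence: $(( kv_bytes / 1024 / 1024 )) MiB"
# SSM layers instead keep a constant-size state, independent of seq_len.
```

Doubling `seq_len` doubles the attention KV cache, but leaves the SSM state unchanged — which is why the hybrid design handles long contexts with less VRAM.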
Tips for Clore.ai Users
Troubleshooting
| Issue | Solution |
| --- | --- |
Performance Comparison
| Model | Active Params | VRAM (FP16) | Speed (RTX 4090) |
| --- | --- | --- | --- |
Resources