
Bringing high‑quality reasoning to smartphones democratizes AI agents, cutting latency and inference costs while keeping sensitive data on the device for enterprise and consumer apps.
Edge AI has reached a new milestone with Liquid AI's LFM2.5-1.2B‑Thinking. By compressing a 1.2‑billion‑parameter network into a sub‑gigabyte footprint, the model makes sophisticated reasoning accessible on standard smartphones and embedded devices. This shift reduces dependence on data‑center inference, slashing latency and operational expenses while preserving user privacy, factors that are critical in finance, healthcare, and on‑device assistants.
The model’s core innovation lies in its "thinking" architecture. Trained with multi‑stage reinforcement learning, it produces step‑by‑step reasoning traces before delivering a final answer, which improves transparency and tool integration. A dedicated training pipeline, consisting of mid‑training on reasoning prompts, supervised fine‑tuning on synthetic reasoning chains, preference alignment, and reinforcement learning with verifiable rewards (RLVR) using n‑gram repetition penalties, cuts doom‑loop occurrences (runaway repetitive generation) from 15.7% to 0.36%, ensuring reliable interactive experiences.
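The n‑gram penalty is easiest to picture as reward shaping: traces that keep repeating the same token spans score lower, even when a verifier accepts the final answer. Below is a minimal sketch of that idea; the function names, the penalty formula, and the weight `lam` are illustrative assumptions, not Liquid AI's published recipe.

```python
def ngram_repetition_fraction(tokens: list[int], n: int = 4) -> float:
    """Fraction of n-grams in a token sequence that are repeats of earlier ones."""
    grams = [tuple(tokens[i : i + n]) for i in range(len(tokens) - n + 1)]
    if not grams:
        return 0.0
    return (len(grams) - len(set(grams))) / len(grams)

def shaped_reward(verifier_reward: float, tokens: list[int],
                  lam: float = 0.5, n: int = 4) -> float:
    """Verifiable reward minus a penalty proportional to n-gram repetition."""
    return verifier_reward - lam * ngram_repetition_fraction(tokens, n=n)

# A trace stuck in a loop is penalized even if the verifier passes it.
looping = [1, 2, 3, 4] * 10        # one 4-gram repeated over and over
clean = list(range(40))            # no repeated 4-grams
print(shaped_reward(1.0, looping)) # ~0.55, heavily penalized
print(shaped_reward(1.0, clean))   # 1.0, untouched
```

Shaping the reward this way makes looping traces strictly less attractive during RL, which is one plausible mechanism behind the reported drop in doom‑loop rates.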
Performance metrics underscore its practicality: roughly 239 tokens per second on an AMD CPU and 82 tokens per second on a Qualcomm NPU, all while staying under 1 GB of RAM. Compatibility with the llama.cpp, MLX, and vLLM runtimes, plus GGUF and ONNX export formats, simplifies deployment across cloud APIs, edge platforms, and self‑hosted environments. As enterprises seek to embed intelligent agents directly into products, LFM2.5-1.2B‑Thinking offers a compelling blend of reasoning depth, efficiency, and on‑device accessibility, likely accelerating the adoption of edge‑first AI strategies.
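For local deployment, the GGUF route is the most portable. Here is a minimal sketch using the llama-cpp-python bindings; the checkpoint filename is hypothetical, so substitute whichever quantized GGUF file you actually download.

```python
from llama_cpp import Llama

# Load a quantized GGUF checkpoint; filename below is a placeholder.
llm = Llama(
    model_path="LFM2.5-1.2B-Thinking-Q4_K_M.gguf",  # hypothetical filename
    n_ctx=4096,
)

out = llm.create_chat_completion(
    messages=[{
        "role": "user",
        "content": "A train leaves at 9:15 and arrives at 11:40. How long is the trip?",
    }],
    max_tokens=512,
)

# For a thinking model, the output contains the reasoning trace
# followed by the final answer.
print(out["choices"][0]["message"]["content"])
```

The same GGUF file runs unmodified on laptops, phones (via llama.cpp mobile builds), and servers, which is what makes the sub‑gigabyte footprint more than a benchmark curiosity.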