Breaking News • AI • Technology • Startups • Cybersecurity • Future Tech

Liquid AI’s LFM2.5-230M: Powering Efficient On-Device AI and Agentic Tasks

Liquid AI's LFM2.5-230M: Powering Efficient On-Device AI and Agentic Tasks

Introduction

The central development is this: The promise of artificial intelligence running directly on our devices – from smartphones to robots – is rapidly becoming a reality. However, achieving powerful AI capabilities within the constraints of edge hardware requires highly specialized models.

Liquid AI is stepping into this space with its latest innovation, the LFM2.5-230M. This compact yet potent model is designed specifically to excel at agentic tasks and data extraction on-device, marking a significant step towards more autonomous and private AI applications.

What is LFM2.5-230M?

Meanwhile, Liquid AI’s LFM2.5-230M is their smallest and most targeted model to date, featuring 230 million parameters. Built upon the robust LFM2 architecture, this text-only model is engineered for efficiency. It boasts a hybrid layer design, incorporating eight double-gated LIV convolution blocks and six grouped-query attention (GQA) blocks.

This unique structure is optimized for rapid CPU inference, making it ideal for devices where computational resources are limited. The model also offers an impressive context length of 32,768 tokens and supports a vocabulary of 65,536, encompassing ten languages including English, Chinese, Arabic, and Japanese.

Designed for the Edge: Performance and Specialization

Unlike general-purpose reasoning models, LFM2.5-230M has a laser focus: enabling agentic tasks and precise data extraction directly on edge hardware. This includes everything from mobile phones and smart devices to industrial robots.

The model demonstrates remarkable speed, achieving 213 tokens per second on a Galaxy S25 Ultra and 42 tokens per second even on a Raspberry Pi 5. Its compact footprint, ranging from 293 to 375 MB, further underscores its suitability for on-device deployment.

In practical terms, Crucially, LFM2.5-230M has shown competitive performance against larger models like Qwen3.5-0.8B and Gemma 3 1B IT in its specialized domains, particularly instruction following and data extraction. However, Liquid AI is transparent about its limitations; it’s not intended for complex mathematical reasoning, extensive code generation, or creative writing tasks. Its strength lies in its ability to follow instructions and extract information efficiently.

Under the Hood: Architecture and Training

The model’s impressive capabilities stem from a sophisticated training regimen. LFM2.5-230M was pre-trained on a massive 19 trillion tokens, including an extended context phase. This was followed by a three-stage post-training process:

  • Supervised Fine-Tuning (SFT): This stage included distillation from the larger LFM2.5-350M model, allowing the smaller model to inherit advanced behaviors for targeted tasks.
  • Direct Preference Optimization (DPO): Enhancing the model’s ability to align with human preferences.
  • Multi-Domain Reinforcement Learning: This final stage helps maintain flexibility for future specialization and diverse applications.

For example, This distillation step is particularly key to enabling the 230M parameter model to compete effectively with significantly larger checkpoints on specific tasks.

Benchmarking Success: Where it Shines

Evaluations across ten benchmarks confirm LFM2.5-230M’s targeted strengths. It excels in instruction following and data extraction, outperforming competitors in these areas. For example, on the IFEval benchmark, it scored 71.71, surpassing Qwen3.5-0.8B (59.94) and Gemma 3 1B IT (63.49).

Similarly, it led on IFBench and CaseReportBench, a clinical data-extraction test. While it doesn’t aim for broad general knowledge (scoring lower on MMLU-Pro), its focused performance makes it a powerful tool for its intended applications.

Real-World Applications: Practical Use Cases

LFM2.5-230M is perfectly suited for two primary application areas:

  1. Large-Scale Data Extraction Pipelines

    Imagine processing hundreds of thousands of clinical reports or legal documents, extracting specific data points into structured formats. With LFM2.5-230M, this can be done locally on commodity CPUs, eliminating per-token API costs and enhancing data privacy. Its minimal memory footprint makes this highly feasible.

  2. Lightweight On-Device Agentic Workloads

    Interestingly, This includes applications like smart home hubs that convert spoken commands into device actions, or mobile assistants that route user requests to the correct functions. Liquid AI demonstrated this by deploying the model on a Unitree G1 humanoid robot, where it served as a skill-selection layer. It translated natural language instructions into sequences of tool calls, leveraging NVIDIA’s SONIC framework for low-level skills.

Seamless Tool Integration

A core strength of LFM2.5 is its robust support for function calling, enabling seamless interaction with external tools and services. The process involves four straightforward steps:

  1. Defining available tools as JSON within the system prompt.
  2. The model generates a Pythonic function call, encapsulated by special tokens (<|tool_call_start|> and <|tool_call_end|>).
  3. The application executes this generated tool call and returns the result.
  4. The model then processes the result and provides a plain-text answer to the user.

However, This structured approach facilitates the creation of highly interactive and functional AI agents.

Getting Started: Implementation and Fine-Tuning

For developers eager to integrate LFM2.5-230M, it works seamlessly with Transformers library versions 5.0.0 and above. Liquid AI recommends specific generation settings for optimal performance, including a temperature of 0.1, top_k 50, and a repetition penalty of 1.05, with do_sample=True enabled.

Meanwhile, Furthermore, Liquid AI provides comprehensive fine-tuning recipes. These cover Supervised Fine-Tuning (SFT), Direct Preference Optimization (DPO), and Grouped Reinforcement Learning with Preference Optimization (GRPO) using LoRA, all accessible via Unsloth and TRL in convenient Colab notebooks. This makes it easier for developers to adapt and specialize the model for their unique requirements.

The Future of On-Device AI

LFM2.5-230M represents a calculated and effective move by Liquid AI to address the growing demand for efficient, specialized AI on edge devices. By focusing on agentic tasks and data extraction, and providing robust support for on-device inference and tool use, this model paves the way for a new generation of smart, responsive, and privacy-preserving AI applications across a multitude of hardware platforms.

Expert Perspective

A practical read on LFM2.5-230M on-device inference starts with model. That is where the earliest effects are likely to show up if this development keeps building.

What happens next will come down to adoption speed, policy response, and execution quality. That combination could make LFM2.5-230M on-device inference a meaningful reference point across lfm2.

For decision-makers, the useful lens is not the headline alone but how 230m changes priorities once organizations have to respond.

Frequently Asked Questions

Why is LFM2.5-230M on-device inference important?

IntroductionThe central development is this: The promise of artificial intelligence running directly on our devices – from smartphones to robots – is rapidly becoming a reality.

What impact could LFM2.5-230M on-device inference have?

However, achieving powerful AI capabilities within the constraints of edge hardware requires highly specialized models.Liquid AI is stepping into this space with its latest innovation, the LFM2.5-230M.

What should readers watch next with LFM2.5-230M on-device inference?

This compact yet potent model is designed specifically to excel at agentic tasks and data extraction on-device, marking a significant step towards more autonomous and private AI applications.What is LFM2.5-230M?Meanwhile, Liquid AI’s LFM2.5-230M is their smallest and most targeted model to date, featuring 230 million parameters.

How does this relate to model?

It connects because the article frames model as one of the clearest areas where the topic may be felt in practice.

Source: https://www.marktechpost.com/2026/06/27/liquid-ai-ships-lfm2-5-230m-with-llama-cpp-mlx-vllm-sglang-and-onnx-support-for-on-device-inference/

Share this article

Subscribe

By pressing the Subscribe button, you confirm that you have read our Privacy Policy.

Latest News

More Articles