Instructions for using LiquidAI/LFM2.5-1.2B-JP with libraries and local apps.

Libraries

How to use LiquidAI/LFM2.5-1.2B-JP with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="LiquidAI/LFM2.5-1.2B-JP")
messages = [
    {"role": "user", "content": "Who are you?"},
]
# Print the result; outside a notebook the return value is otherwise discarded
print(pipe(messages))

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("LiquidAI/LFM2.5-1.2B-JP")
model = AutoModelForCausalLM.from_pretrained("LiquidAI/LFM2.5-1.2B-JP")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
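
The slice in the final `print` call relies on the fact that, for a decoder-only model, `generate` returns the prompt tokens followed by the newly generated ones. The following minimal sketch illustrates the same slicing with plain Python lists; the token ids are hypothetical placeholders, not real tokenizer output.

```python
# Hypothetical token ids standing in for inputs["input_ids"][0]
prompt_ids = [1, 15, 27, 42]

# Hypothetical stand-in for outputs[0]: the prompt followed by new tokens
generated_ids = prompt_ids + [88, 93, 7]

# Drop the prompt prefix so that only the newly generated tokens are decoded
new_token_ids = generated_ids[len(prompt_ids):]
print(new_token_ids)
```

Without the slice, the decoded string would start with the rendered chat prompt.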

Local Apps

vLLM

How to use LiquidAI/LFM2.5-1.2B-JP with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "LiquidAI/LFM2.5-1.2B-JP"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "LiquidAI/LFM2.5-1.2B-JP",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
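
The same request can be built from Python with only the standard library. This is a sketch that assumes the vLLM server from the commands above is listening on `localhost:8000`; the helper names `build_chat_request` and `send_chat_request` are illustrative and not part of vLLM.

```python
import json
import urllib.request

MODEL_ID = "LiquidAI/LFM2.5-1.2B-JP"


def build_chat_request(prompt, base_url="http://localhost:8000", model=MODEL_ID):
    """Build a POST request for the OpenAI-compatible chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request, timeout=60):
    """Send the request and return the decoded JSON body (needs a running server)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request does not touch the network
request = build_chat_request("What is the capital of France?")
```

Once the server is up, `send_chat_request(request)` performs the call. Passing `base_url="http://localhost:30000"` would target a server on a different port.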

SGLang

How to use LiquidAI/LFM2.5-1.2B-JP with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "LiquidAI/LFM2.5-1.2B-JP" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "LiquidAI/LFM2.5-1.2B-JP",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
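
An OpenAI-compatible server replies with a JSON object whose `choices` list carries the assistant message. The sketch below shows one way to pull the reply text out of such a response. The `sample_response` dictionary is hand-written for illustration and is not real model output, and `extract_reply` is a hypothetical helper name.

```python
def extract_reply(response):
    """Return the assistant text from an OpenAI-style chat completion response."""
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["message"]["content"]


# Hand-written example shaped like a chat completion response (not real output)
sample_response = {
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Paris"},
            "finish_reason": "stop",
        }
    ]
}

reply = extract_reply(sample_response)
print(reply)
```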

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "LiquidAI/LFM2.5-1.2B-JP" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API): use the same request as in the pip-based setup above.

Docker Model Runner
How to use LiquidAI/LFM2.5-1.2B-JP with Docker Model Runner:
```
docker model run hf.co/LiquidAI/LFM2.5-1.2B-JP
```
Browse the available quantizations to use this model in llama.cpp, Ollama, LM Studio, or any compatible app.

LFM2.5-1.2B-JP

LFM2.5-1.2B-JP is a chat model optimized specifically for Japanese. While LFM2 already supported Japanese as one of eight languages, LFM2.5-JP pushes the state of the art in Japanese knowledge and instruction following at its scale. This model is ideal for developers building Japanese-language applications where cultural and linguistic nuance matters.

Find more information about LFM2.5 in our blog post.

🏃 Inference

LFM2.5 is supported by many inference frameworks. See the Inference documentation for the full list.

Name	Description	Docs
Transformers	Simple inference with direct access to model internals.	Link
vLLM	High-throughput production deployments with GPU.	Link
llama.cpp	Cross-platform inference with CPU offloading.	Link

Here's a quick start example with transformers:

from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

model_id = "LiquidAI/LFM2.5-1.2B-JP"

# Load the weights in bfloat16 and place them on the available device(s)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    dtype="bfloat16",
    # attn_implementation="flash_attention_2",  # uncomment on a compatible GPU
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Stream decoded text to stdout as it is generated, without echoing the prompt
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

prompt = "What is C. elegans?"

# Render the chat template; return_dict=True yields input_ids and attention_mask
inputs = tokenizer.apply_chat_template(
    [{"role": "user", "content": prompt}],
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

# Sample with the recommended generation parameters
output = model.generate(
    **inputs,
    do_sample=True,
    temperature=0.3,
    min_p=0.15,
    repetition_penalty=1.05,
    max_new_tokens=512,
    streamer=streamer,
)

Recommended generation parameters:
- temperature: 0.3
- min_p: 0.15
- repetition_penalty: 1.05
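
To make these three parameters concrete, here is a simplified pure-Python sketch of what each one does to a next-token distribution. It is an illustration under stated assumptions, not the Transformers implementation: the logits are hypothetical, the function names are invented for this example, and the processing order (repetition penalty, then temperature, then min-p) is a simplification.

```python
import math


def apply_repetition_penalty(logits, seen_token_ids, penalty=1.05):
    """Make already-generated tokens less likely: shrink positive logits, push negative ones lower."""
    penalized = list(logits)
    for token_id in set(seen_token_ids):
        score = penalized[token_id]
        penalized[token_id] = score / penalty if score > 0 else score * penalty
    return penalized


def softmax(logits, temperature=0.3):
    """Convert logits to probabilities; a temperature below 1 sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def min_p_filter(probs, min_p=0.15):
    """Zero out tokens whose probability is below min_p times the top probability, then renormalize."""
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    total = sum(kept)
    return [p / total for p in kept]


logits = [2.0, 1.9, 0.0, -1.0]  # hypothetical next-token logits over a 4-token vocabulary
penalized = apply_repetition_penalty(logits, seen_token_ids=[0, 3])
probs = min_p_filter(softmax(penalized))
print(probs)
```

In this toy case the two low-scoring tokens fall below the min-p threshold and are removed, so sampling chooses only between the two strongest candidates.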

🔧 Fine-Tuning

We recommend fine-tuning LFM2.5 for your specific use case to achieve the best results.

Name	Description	Docs
SFT (Unsloth)	Supervised Fine-Tuning with LoRA using Unsloth.	Link
SFT (TRL)	Supervised Fine-Tuning with LoRA using TRL.	Link
DPO (TRL)	Direct Preference Optimization with LoRA using TRL.	Link
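
Supervised fine-tuning on chat data typically starts from records in the conversational `messages` format, which TRL's SFT tooling accepts. The sketch below builds such records and serializes them as JSON Lines. The helper names and the example pair are hypothetical, and the linked guides remain the authoritative source for the exact dataset layout they expect.

```python
import json


def to_sft_record(user_text, assistant_text):
    """Wrap one prompt/response pair in the conversational 'messages' format."""
    return {
        "messages": [
            {"role": "user", "content": user_text},
            {"role": "assistant", "content": assistant_text},
        ]
    }


def to_jsonl(records):
    """Serialize records as JSON Lines, keeping Japanese text unescaped."""
    return "\n".join(json.dumps(record, ensure_ascii=False) for record in records)


# Hypothetical training pair, for illustration only
records = [to_sft_record("「猫」を英語に訳してください。", "cat")]
jsonl_text = to_jsonl(records)
print(jsonl_text)
```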

📊 Performance

Model	JMMLU	M-IFEval (ja)	GSM8K (ja)
LFM2.5-1.2B-JP	50.7	58.1	56.0
LFM2.5-1.2B-Instruct	47.7	41.8	46.8
Qwen3-1.7B (Instruct mode)	47.7	40.3	46.0
Llama 3.2 1B Instruct	34.0	24.1	25.2
TinySwallow-1.5B-Instruct	48.0	36.5	47.2
Gemma-2-Llama-Swallow-2b-it-v0.1	48.1	33.4	34.4
Gemma-3-1b-it	34.5	26.3	33.6
Granite-4.0-h-1b	42.2	39.3	42.8
Sarashina2.2-1b-instruct-v0.1	40.2	21.9	44.4
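
As a quick reading aid, the gap between LFM2.5-1.2B-JP and LFM2.5-1.2B-Instruct can be computed directly from the table. The sketch below copies only those two rows; `score_deltas` is an illustrative helper, not part of any evaluation harness.

```python
# Scores copied from the two LFM2.5 rows of the table above
SCORES = {
    "LFM2.5-1.2B-JP": {"JMMLU": 50.7, "M-IFEval (ja)": 58.1, "GSM8K (ja)": 56.0},
    "LFM2.5-1.2B-Instruct": {"JMMLU": 47.7, "M-IFEval (ja)": 41.8, "GSM8K (ja)": 46.8},
}


def score_deltas(model, baseline, scores=SCORES):
    """Return per-benchmark differences (model minus baseline), rounded to one decimal."""
    return {
        benchmark: round(scores[model][benchmark] - scores[baseline][benchmark], 1)
        for benchmark in scores[model]
    }


deltas = score_deltas("LFM2.5-1.2B-JP", "LFM2.5-1.2B-Instruct")
print(deltas)
```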

Evaluation Notes

- All results are zero-shot evaluations using greedy decoding.
- M-IFEval (ja) scores correspond to the loose evaluation setting.
- JMMLU was evaluated with a prompt format modeled on the Artificial Analysis methodology, with corresponding answer-parsing logic. The Japanese prompt template used is shown below:

PROMPT_TEMPLATE = """与えられた選択問題に答えてください。回答の最後の行に「答え：{valid_options}」のように出力してください（例：「答え：X」）。

{question}

{options}"""
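
The parsing logic itself is not given here, so the following is only a sketch of how the template might be filled and how the final answer line might be parsed. The option layout (`A. ...`), the `/` separator for `valid_options`, the regular expression, and the sample question are all assumptions made for illustration.

```python
import re

PROMPT_TEMPLATE = """与えられた選択問題に答えてください。回答の最後の行に「答え：{valid_options}」のように出力してください（例：「答え：X」）。

{question}

{options}"""

LETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

# Assumed answer format: "答え：" followed by a single option letter
ANSWER_PATTERN = re.compile(r"答え[：:]\s*([A-Z])")


def build_prompt(question, choices):
    """Fill the template with a question and lettered options (assumed layout)."""
    letters = LETTERS[: len(choices)]
    options = "\n".join(f"{letter}. {text}" for letter, text in zip(letters, choices))
    return PROMPT_TEMPLATE.format(
        valid_options="/".join(letters), question=question, options=options
    )


def parse_answer(completion, num_choices):
    """Return the option letter on the last line, or None if it is missing or invalid."""
    lines = completion.strip().splitlines()
    if not lines:
        return None
    match = ANSWER_PATTERN.search(lines[-1])
    if match and match.group(1) in LETTERS[:num_choices]:
        return match.group(1)
    return None


# Hypothetical question, for illustration only
prompt = build_prompt("日本の首都はどこですか？", ["大阪", "東京", "京都", "名古屋"])
predicted = parse_answer("東京が日本の首都です。\n答え：B", num_choices=4)
print(predicted)
```

Returning `None` for unparseable completions lets an evaluation loop count them as incorrect instead of crashing.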

📬 Contact

Got questions or want to connect? Join our Discord community.
If you are interested in custom solutions with edge deployment, please contact our sales team.

Citation

@article{liquidai2025lfm2,
  title={LFM2 Technical Report},
  author={Liquid AI},
  journal={arXiv preprint arXiv:2511.23404},
  year={2025}
}

Downloads last month: 2,374

Safetensors
Model size: 1B params
Tensor type: BF16

Model tree for LiquidAI/LFM2.5-1.2B-JP
Base model: LiquidAI/LFM2.5-1.2B-Base
Finetuned (35): this model
Finetunes: 5 models
Quantizations: 26 models

Collection including LiquidAI/LFM2.5-1.2B-JP

💧 LFM2.5: Collection of post-trained and base LFM2.5 models (35 items).

Papers for LiquidAI/LFM2.5-1.2B-JP

- LFM2 Technical Report. arXiv:2511.23404, published Nov 28, 2025.
- M-IFEval: Multilingual Instruction-Following Evaluation. arXiv:2502.04688, published Feb 7, 2025.
- Should We Respect LLMs? A Cross-Lingual Study on the Influence of Prompt Politeness on LLM Performance. arXiv:2402.14531, published Feb 22, 2024.