Tacita

Gemma‑4‑Tacita‑E4B · GGUF

The desktop build of Tacita's on‑device fine‑tune of Gemma 4 E4B.


What this is

A QLoRA fine‑tune of Gemma 4 E4B (≈4B effective) that bakes Tacita's per‑turn director behaviors directly into the weights — preamble, native thinking, end‑to‑end search reasoning, honesty, and Tacita identity — so the assistant needs far fewer serial passes per turn and runs fully offline.

This is the GGUF / llama.cpp build, consumed by tacita-desktop (Rust runtime). For mobile (flutter_gemma), use the LiteRT‑LM build: Gemma‑4‑Tacita‑E4B‑litert‑lm.

What's baked in

  • 🗣️ Inline preamble + native thinking: route‑declaration preamble, then enable_thinking honored natively; emits nothing when thinking is off (fixes the stock‑Gemma llama.cpp #21338 class of bugs).
  • 🔎 End‑to‑end web search: decides when to search, emits a multilingual query plan inside the tool call ({queries:[{q,lang}]}), grounds answers in the results, and retries when results are off‑topic.
  • 🤝 Honesty: refuses unanswerable live‑data questions without inventing numbers.
  • 🪪 Tacita identity: knows it is Tacita, built on Gemma 4, private on‑device.
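
As an illustration of the query‑plan shape mentioned above, here is a minimal Python sketch of how a host application might validate it. It assumes the tool‑call arguments arrive as a JSON string with quoted keys, i.e. {"queries": [{"q": ..., "lang": ...}]}; the names PlannedQuery and parse_query_plan are hypothetical and not part of any Tacita API.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class PlannedQuery:
    q: str     # the search string
    lang: str  # language tag for that query, e.g. "en"


def parse_query_plan(arguments_json: str) -> list[PlannedQuery]:
    """Parse tool-call arguments shaped like {"queries": [{"q": ..., "lang": ...}]}.

    Assumption: the arguments are delivered as a JSON string.
    """
    payload = json.loads(arguments_json)
    if not isinstance(payload, dict) or not isinstance(payload.get("queries"), list):
        raise ValueError("expected an object with a 'queries' list")

    plan = []
    for item in payload["queries"]:
        # Every entry must carry both the query text and its language.
        if not isinstance(item, dict) or "q" not in item or "lang" not in item:
            raise ValueError(f"malformed query entry: {item!r}")
        plan.append(PlannedQuery(q=str(item["q"]), lang=str(item["lang"])))
    return plan
```

Rejecting malformed entries early keeps a bad plan from reaching the search backend; a real runtime might instead choose to drop the offending entry and continue.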

Capability metadata

The GGUF bundle is stamped with tacita.* keys in its GGUF metadata, so the Tacita desktop runtime can read them once at load time and bypass the matching director call. A stock llama.cpp runtime ignores them and runs the model normally.

tacita.model        = gemma-4-tacita
tacita.variant      = E4B
tacita.tier         = desktop-pro
tacita.capabilities = [preamble_inline, thinking_native, search_plan_inline, ...]
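
The load‑time check described above could look roughly like the following Python sketch. It is not the tacita-desktop implementation (that runtime is written in Rust); it assumes the metadata has already been read into a plain dict, and the function name skippable_director_calls is hypothetical. Only the three capability names visible in the listing are used, since the rest of the list is elided.

```python
# Capability names copied from the listing above; the full list is elided
# there, so only these three are treated as known in this sketch.
KNOWN_CAPABILITIES = {"preamble_inline", "thinking_native", "search_plan_inline"}


def skippable_director_calls(metadata: dict) -> set[str]:
    """Return the capabilities whose matching director call can be bypassed.

    Assumption: `metadata` maps GGUF key names to already-decoded values,
    with tacita.capabilities decoded as a list of strings.
    """
    declared = metadata.get("tacita.capabilities", [])
    # Unknown names are ignored so that newer bundles still load.
    return {name for name in declared if name in KNOWN_CAPABILITIES}
```

A model without any tacita.* keys yields an empty set, which matches the behavior described for stock runtimes: nothing is bypassed and the model runs normally.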

Quantizations

File            Bits     Size (approx)   Use
*-Q4_K_M.gguf   4‑bit    ~2.6 GB         default desktop
*-Q8_0.gguf     8‑bit    ~4.3 GB         quality‑first
*-F16.gguf      16‑bit   ~8 GB           reference

Run with a recent llama.cpp build or any other GGUF runtime. The eos_token is <end_of_turn> (Gemma‑4 turn format).
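
If a runtime hands back raw text that still contains the end‑of‑turn marker, a small post‑processing step can cut the reply there. The Python sketch below assumes the marker appears literally in the decoded string; whether that happens depends on the runtime and its settings, and the helper name truncate_at_eos is hypothetical.

```python
EOS_TOKEN = "<end_of_turn>"  # as stated above for this model


def truncate_at_eos(text: str, eos: str = EOS_TOKEN) -> str:
    """Keep only the text before the first end-of-turn marker."""
    head, _, _ = text.partition(eos)
    # Drop trailing whitespace left in front of the marker.
    return head.rstrip()
```

Text without the marker passes through unchanged apart from trailing whitespace, so the helper is safe to apply to every reply.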

License

Base Gemma 4 is Apache 2.0; this derivative inherits Apache 2.0.

Provenance

Training labels come from adapted, permissively licensed open datasets and an open‑weights teacher model, never from frontier‑API model outputs.


Built with ❤️ for on‑device privacy · TacitaModels