Runnable with vLLM
Jan-v3-4B-base-instruct is a 4B-parameter model obtained via post-training distillation from a larger teacher, transferring capabilities while preserving general-purpose performance on standard benchmarks. The result is a compact, ownable base that is straightforward to fine-tune, broadly applicable and minimizing the usual capacity–capability trade-offs.
Building on this base, Jan-Code, a code-tuned variant, will be released soon.
This repo contains the BF16 version of Jan-v3-4B-base-instruct, which has the following features:
Intended Use

Jan-v3 demo is hosted on Jan Browser at chat.jan.ai. It is also optimized for direct integration with Jan Desktop, select the model in the app to start using it.
Using vLLM:
vllm serve janhq/Jan-v3-4B-base-instruct \
--host 0.0.0.0 \
--port 1234 \
--enable-auto-tool-choice \
--tool-call-parser hermes
Using llama.cpp:
llama-server --model Jan-v3-4B-base-instruct-Q8_0.gguf \
--host 0.0.0.0 \
--port 1234 \
--jinja \
--no-context-shift
For optimal performance in agentic and general tasks, we recommend the following inference parameters:
temperature: 0.7
top_p: 0.8
top_k: 20
Updated Soon