Treatment load
ml/models/mistral::load
Configuration
⬡ mistral: ml/models/mistral::Mistral
Inputs
⇥ safetensors: Stream<string>
⇥ tokenizer: Block<string>
Outputs
↦ error: Block<string>
↦ failed: Block<void>
↦ loaded: Block<void>
Load weights and tokenizer into a Mistral model.
Collects all .safetensors shard paths from the safetensors input (the stream closes once
every shard has been emitted), then waits for the single tokenizer path on the tokenizer
input. Once both have been received, memory-maps the weight shards and starts the
inference worker thread inside the Mistral model.
The loaded output is emitted when the model is ready to accept prompts. If any step fails
(file not found, incompatible weights, tokenizer parse error), the failed and error outputs
are emitted instead and loaded is never sent.
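The sequencing and output contract described above can be modelled in a few lines. The following is a sketch in Python, not Mélodium code and not the treatment's actual implementation: `LoadResult`, `model_load`, and the `open_model` callback (standing in for memory-mapping the shards and starting the worker thread) are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class LoadResult:
    """Outputs emitted by one run of the modelled load step (hypothetical type)."""
    loaded: bool = False
    failed: bool = False
    error: Optional[str] = None


def model_load(
    safetensors: Iterable[str],
    tokenizer: str,
    open_model: Callable[[List[str], str], None],
) -> LoadResult:
    """Model the documented contract: either loaded, or failed plus error."""
    # Step 1: drain the stream; every shard path is collected before anything else.
    shards = list(safetensors)
    # Step 2: with the single tokenizer path in hand, attempt to open the model.
    # open_model is an assumed stand-in for mapping weights and starting the worker.
    try:
        open_model(shards, tokenizer)
    except Exception as exc:
        # Any failure yields failed and error; loaded is never set on this path.
        return LoadResult(failed=True, error=str(exc))
    return LoadResult(loaded=True)


if __name__ == "__main__":
    def always_ok(shards: List[str], tokenizer: str) -> None:
        return None

    result = model_load(iter(["a.safetensors", "b.safetensors"]), "tokenizer.json", always_ok)
    print(result)
```

The point of the sketch is that the two outcomes are mutually exclusive: a caller can wait on loaded or on failed, and should never expect both from the same run.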
ℹ️ Wire safetensors and tokenizer directly from a fetch treatment.
⚠️ generate will silently drop prompts until load has successfully completed.
graph LR
T("load()")
S["🟩 🟩 🟩 …"] -->|safetensors| T
K["〈🟨〉"] -->|tokenizer| T
T -->|loaded| L["〈🟦〉"]
T -->|failed| F["〈🟦〉"]
T -->|error| E["〈🟨〉"]
style S fill:#ffff,stroke:#ffff
style K fill:#ffff,stroke:#ffff
style L fill:#ffff,stroke:#ffff
style F fill:#ffff,stroke:#ffff
style E fill:#ffff,stroke:#ffff
use ml/repos/hf::HfHub
use ml/repos/hf::fetch
use ml/models/mistral::Mistral
use ml/models/mistral::load
use std/engine/util::startup
treatment example()
  model hub: HfHub(repo_id = "mistralai/Mistral-7B-v0.1")
  model mistral: Mistral()
{
    startup()
    fetch[hub=hub]()
    load[mistral=mistral]()

    startup.trigger -> fetch.trigger
    fetch.safetensors -> load.safetensors
    fetch.tokenizer -> load.tokenizer
}