Treatment load

ml/models/mistral::load


Configuration

⬡ mistral: ml/models/mistral::Mistral

Inputs

⇥ safetensors: Stream<string>
⇥ tokenizer: Block<string>

Outputs

↦ error: Block<string>
↦ failed: Block<void>
↦ loaded: Block<void>


Load weights and tokenizer into a Mistral model.

Collects every .safetensors shard path from safetensors until that stream closes (it closes once all shards have been emitted), then waits for the single tokenizer path on tokenizer. Once both have been received, load memory-maps the weight shards and starts the inference worker thread inside the Mistral model.

loaded is emitted when the model is ready to accept prompts. If any step fails — file not found, incompatible weights, tokenizer parse error — failed and error are emitted instead and loaded is never sent.

ℹ️ Wire safetensors and tokenizer directly from a fetch treatment.

⚠️ generate will silently drop prompts until load has successfully completed.

graph LR
     T("load()")
     S["🟩 🟩 🟩 …"] -->|safetensors| T
     K["〈🟨〉"]       -->|tokenizer|   T
     T -->|loaded| L["〈🟦〉"]
     T -->|failed| F["〈🟦〉"]
     T -->|error|  E["〈🟨〉"]

     style S fill:#ffff,stroke:#ffff
     style K fill:#ffff,stroke:#ffff
     style L fill:#ffff,stroke:#ffff
     style F fill:#ffff,stroke:#ffff
     style E fill:#ffff,stroke:#ffff

use ml/repos/hf::HfHub
use ml/repos/hf::fetch
use ml/models/mistral::Mistral
use ml/models/mistral::load
use std/engine/util::startup

treatment example()
  model hub:     HfHub(repo_id = "mistralai/Mistral-7B-v0.1")
  model mistral: Mistral()
{
    startup()
    fetch[hub=hub]()
    load[mistral=mistral]()

    startup.trigger   -> fetch.trigger
    fetch.safetensors -> load.safetensors
    fetch.tokenizer   -> load.tokenizer
}
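
The example above leaves loaded, failed and error unconnected. Since generate silently drops prompts until load has completed, it can help to expose these three outputs to the calling treatment. The following is a minimal sketch, not the library's prescribed pattern: the treatment name exampleWithStatus and its output name ready are hypothetical, and the placement of the output declarations next to the model declarations is an assumption about Mélodium's declaration syntax.

```
use ml/repos/hf::HfHub
use ml/repos/hf::fetch
use ml/models/mistral::Mistral
use ml/models/mistral::load
use std/engine/util::startup

// Hypothetical variant of example() that forwards load's outcome to its caller.
treatment exampleWithStatus()
  output ready:  Block<void>
  output failed: Block<void>
  output error:  Block<string>
  model hub:     HfHub(repo_id = "mistralai/Mistral-7B-v0.1")
  model mistral: Mistral()
{
    startup()
    fetch[hub=hub]()
    load[mistral=mistral]()

    startup.trigger   -> fetch.trigger
    fetch.safetensors -> load.safetensors
    fetch.tokenizer   -> load.tokenizer

    // Emitted once the model is ready to accept prompts.
    load.loaded -> Self.ready
    // Emitted instead of loaded when any loading step fails.
    load.failed -> Self.failed
    load.error  -> Self.error
}
```

A caller can then hold back prompts until ready is emitted, and route error to whatever reporting it already uses.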