Treatment fetch

ml/repos/hf::fetch


Configuration

⬡ hub: ml/repos/hf::HfHub

Inputs

⇥ trigger: Block<void>

Outputs

↦ error: Block<string>
↦ failed: Block<void>
↦ safetensors: Stream<string>
↦ tokenizer: Block<string>


Resolve and download files from a HuggingFace Hub repository.

On trigger, the treatment queries the Hub API (or reads the local cache if the files are already present), lists every .safetensors shard and tokenizer.json in the configured repository, downloads any file not already cached, then emits the local filesystem paths.
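The resolution step above can be illustrated with a rough Python sketch. This is not the treatment's implementation, which this page does not show. `resolve_files`, the `download` callback, and the flat cache layout are hypothetical, and the Hub file listing is passed in as a plain list instead of being fetched over the network.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def resolve_files(
    listing: Iterable[str],
    cache_dir: Path,
    download: Callable[[str, Path], None],
) -> List[Path]:
    """Return local paths for every .safetensors shard and tokenizer.json.

    `listing` stands in for the file names reported by the Hub API.
    `download` is a hypothetical callback that writes one file into the cache.
    """
    # Keep only the files the treatment is documented to care about.
    wanted = [
        name for name in listing
        if name.endswith(".safetensors") or name == "tokenizer.json"
    ]
    paths = []
    for name in wanted:
        local = cache_dir / name  # assumption: flat cache, one file per name
        if not local.exists():
            # Only files missing from the cache are downloaded.
            download(name, local)
        paths.append(local)
    return paths
```

Called a second time with the same cache directory, the sketch performs no downloads and only returns paths, which matches the cached-run behaviour described on this page.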

safetensors emits one path per shard in sorted order, so single-file and multi-shard models are handled the same way. tokenizer emits the single path to tokenizer.json. If a network or cache error occurs, failed and error are emitted instead, and safetensors and tokenizer emit nothing.
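The either/or shape of the outputs can be sketched in Python as follows. `FetchOutputs` and `emit` are hypothetical names that only mirror the four documented outputs. The plain lexicographic sort is an assumption about what "sorted order" means, and the page does not say what happens when tokenizer.json is absent, so the sketch simply leaves that field unset.

```python
import os
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FetchOutputs:
    # Hypothetical container mirroring the treatment's four outputs.
    safetensors: List[str] = field(default_factory=list)
    tokenizer: Optional[str] = None
    failed: bool = False
    error: Optional[str] = None


def emit(paths: List[str], problem: Optional[str] = None) -> FetchOutputs:
    """Build either the success outputs or the failure outputs, never both."""
    if problem is not None:
        # On failure, safetensors and tokenizer stay empty.
        return FetchOutputs(failed=True, error=problem)
    # Assumption: shards are ordered by a plain lexicographic sort.
    shards = sorted(p for p in paths if p.endswith(".safetensors"))
    tokenizer = next(
        (p for p in paths if os.path.basename(p) == "tokenizer.json"), None
    )
    return FetchOutputs(safetensors=shards, tokenizer=tokenizer)
```

A single-file model yields a one-element safetensors list, so a downstream consumer does not need a separate code path for sharded checkpoints.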

ℹ️ Wire safetensors and tokenizer directly into a load treatment to initialise a Mistral model.

⚠️ The Hub API calls are synchronous and may block for several minutes on the first run while shards are downloaded. Subsequent runs return cached paths immediately.

graph LR
     T("fetch()")
     B["〈🟦〉"]           -->|trigger|     T
     T -->|safetensors|   S["🟩 🟩 🟩 …"]
     T -->|tokenizer|     K["〈🟨〉"]
     T -->|failed|        F["〈🟦〉"]
     T -->|error|         E["〈🟨〉"]

     style B fill:#ffff,stroke:#ffff
     style S fill:#ffff,stroke:#ffff
     style K fill:#ffff,stroke:#ffff
     style F fill:#ffff,stroke:#ffff
     style E fill:#ffff,stroke:#ffff

use ml/repos/hf::HfHub
use ml/repos/hf::fetch
use ml/models/mistral::Mistral
use ml/models/mistral::load
use ml/models/mistral::generate
use std/engine/util::startup

treatment example()
  model hub:     HfHub(repo_id = "mistralai/Mistral-7B-v0.1")
  model mistral: Mistral()
  input  prompt:    Stream<string>
  output generated: Stream<string>
{
    startup()
    fetch[hub=hub]()
    load[mistral=mistral]()
    generate[mistral=mistral]()

    startup.trigger   -> fetch.trigger
    fetch.safetensors -> load.safetensors
    fetch.tokenizer   -> load.tokenizer
    load.loaded       -> generate.ready
    Self.prompt       -> generate.prompt
    generate.generated -> Self.generated
}