Treatment decode
ml/models/whisper::decode
Configuration
⬡ whisper: ml/models/whisper::Whisper
Inputs
⇥ audio: Stream<f32>
⇥ ready: Block<void>
Outputs
↦ transcribed: Stream<string>
Decode a continuous stream of PCM audio samples into text using a Whisper model.
Forwards incoming f32 sample batches to the worker thread as they arrive; the
worker decodes each complete 30-second window (480 000 samples at 16 kHz) into
text and emits the result on transcribed without waiting for the stream to end.
Any remaining samples shorter than one window are flushed and decoded when the
audio stream closes.
ℹ️ load must have completed successfully before audio is sent; otherwise the
audio is silently discarded.
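The windowing behaviour described above can be illustrated with a minimal Python sketch. This is not the treatment's actual implementation, only a model of the buffering logic under the stated figures (30-second windows at 16 kHz). The names `windows`, `WINDOW_SAMPLES`, and the plain-list buffer are hypothetical and chosen for illustration.

```python
from typing import Iterable, Iterator, List

SAMPLE_RATE_HZ = 16_000   # sample rate stated in the description
WINDOW_SECONDS = 30       # window length stated in the description
WINDOW_SAMPLES = SAMPLE_RATE_HZ * WINDOW_SECONDS  # 480 000 samples


def windows(batches: Iterable[List[float]],
            window: int = WINDOW_SAMPLES) -> Iterator[List[float]]:
    """Yield full windows as soon as they are available, then flush the rest.

    Hypothetical helper: models how incoming sample batches could be
    accumulated into fixed-size windows before being decoded.
    """
    buffer: List[float] = []
    for batch in batches:
        buffer.extend(batch)
        # Emit every complete window without waiting for the stream to end.
        while len(buffer) >= window:
            yield buffer[:window]
            del buffer[:window]
    # The stream has closed: flush any remainder shorter than one window.
    if buffer:
        yield buffer
```

With a small window for readability, `[len(w) for w in windows([[0.0] * 10], window=4)]` gives two full windows followed by a shorter flushed remainder. Each yielded window would then be decoded and its text emitted on transcribed.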
graph LR
T("decode()")
R["〈🟦〉"] -->|ready| T
A["🟩 🟩 🟩 …"] -->|audio| T
T -->|transcribed| X["🟩 🟩 …"]
style R fill:#ffff,stroke:#ffff
style A fill:#ffff,stroke:#ffff
style X fill:#ffff,stroke:#ffff
use ml/repos/hf::HfHub
use ml/repos/hf::fetch
use ml/models/whisper::Whisper
use ml/models/whisper::load
use ml/models/whisper::decode
use std/engine/util::startup
treatment example()
  model hub: HfHub(repo_id = "openai/whisper-tiny")
  model whisper: Whisper()
  input audio: Stream<f32>
  output transcribed: Stream<string>
{
    startup()
    fetch[hub=hub]()
    load[whisper=whisper]()
    decode[whisper=whisper]()

    startup.trigger -> fetch.trigger
    fetch.safetensors -> load.safetensors
    load.loaded -> decode.ready
    Self.audio -> decode.audio
    decode.transcribed -> Self.transcribed
}