Treatment transcribe

ml/remote/stt::transcribe


Configuration

⬡ stt: ml/remote/stt::RemoteStt

Inputs

⇥ audio: Stream<byte>

Outputs

↦ error: Block<string>
↦ failed: Block<void>
↦ transcript: Block<string>


Transcribe audio bytes to text using a remote speech-to-text service.

Collects all bytes from the audio stream into a single buffer, then sends them to the configured provider for transcription. The resulting text is emitted on transcript. If the request fails, failed and error are emitted instead.

ℹ️ The sender should close the audio stream once all audio data has been sent; transcribe waits for the stream to close before submitting the request. Commonly used audio formats include WAV, MP3, and FLAC, but actual format support depends on the backend.
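The collect-then-submit behaviour described above can be illustrated with a minimal Python sketch. This is not the Mélodium implementation: `transcribe_sketch` and the `send_request` callable are hypothetical names introduced only for illustration, and the dictionary keys simply mirror the documented outputs (transcript, failed, error).

```python
from typing import Callable, Dict, Iterable, Optional


def transcribe_sketch(
    audio_chunks: Iterable[bytes],
    send_request: Callable[[bytes], str],
) -> Dict[str, Optional[str]]:
    """Hypothetical model of the documented behaviour, not the real treatment."""
    # Collect every chunk until the stream is exhausted (i.e. closed by the sender).
    buffer = bytearray()
    for chunk in audio_chunks:
        buffer.extend(chunk)

    # Only after the stream has ended is a single request submitted.
    try:
        text = send_request(bytes(buffer))
    except Exception as exc:
        # On failure, `failed` (a void signal, modelled as None) and `error` are emitted.
        return {"failed": None, "error": str(exc)}

    # On success, only `transcript` is emitted.
    return {"transcript": text}
```

Because the request is only issued once the input is exhausted, a stream that is never closed would never produce a transcript in this model, which is why the sender is expected to close it.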

graph LR
     T("transcribe()")
     A["🟩 🟩 🟩 …"] -->|audio| T
     T -->|transcript| R["〈🟨〉"]
     T -->|failed| F["〈🟦〉"]
     T -->|error| E["〈🟨〉"]

     style A fill:#ffffff,stroke:#ffffff
     style R fill:#ffffff,stroke:#ffffff
     style F fill:#ffffff,stroke:#ffffff
     style E fill:#ffffff,stroke:#ffffff

use ml/remote/stt::RemoteStt
use ml/remote/stt::transcribe

treatment example()
  model stt: RemoteStt(backend = "openai", api_key = "sk-...", model = "whisper-1")
  input  audio:      Stream<byte>
  output transcript: Block<string>
{
    transcribe[stt=stt]()
    Self.audio -> transcribe.audio,transcript -> Self.transcript
}