Treatment transcribe

ml/remote/stt::transcribe

Outputs

↦ error: Block<string>
↦ failed: Block<void>
↦ transcript: Block<string>

Transcribe audio bytes to text using a remote speech-to-text service.

Collects all bytes from the audio stream into a single buffer, then sends them to the configured provider for transcription. The resulting text is emitted on transcript. If the request fails, failed and error are emitted instead.

ℹ️ The sender must close the audio stream once all audio data has been sent: transcribe waits for the stream to close before submitting the request. Commonly used audio formats include WAV, MP3, and FLAC, but actual format support depends on the backend.

graph LR
     T("transcribe()")
     A["🟩 🟩 🟩 …"] -->|audio| T
     T -->|transcript| R["〈🟨〉"]
     T -->|failed| F["〈🟦〉"]
     T -->|error| E["〈🟨〉"]

     style A fill:#ffffff,stroke:#ffffff
     style R fill:#ffffff,stroke:#ffffff
     style F fill:#ffffff,stroke:#ffffff
     style E fill:#ffffff,stroke:#ffffff

use ml/remote/stt::RemoteStt
use ml/remote/stt::transcribe

treatment example()
  model stt: RemoteStt(backend = "openai", api_key = "sk-...", model = "whisper-1")
  input  audio:      Stream<byte>
  output transcript: Block<string>
{
    transcribe[stt=stt]()
    Self.audio -> transcribe.audio,transcript -> Self.transcript
}
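
The example above wires only transcript. Because failed and error are emitted when the request fails, a caller will often want to expose those outputs as well. The following is a minimal sketch under stated assumptions: the treatment name exampleWithErrors and its two extra outputs are hypothetical, and the direct connections are assumed to follow the same pattern as the example above.

```
use ml/remote/stt::RemoteStt
use ml/remote/stt::transcribe

// Hypothetical variant of the example above that also forwards the failure outputs.
treatment exampleWithErrors()
  model stt: RemoteStt(backend = "openai", api_key = "sk-...", model = "whisper-1")
  input  audio:      Stream<byte>
  output transcript: Block<string>
  output error:      Block<string>
  output failed:     Block<void>
{
    transcribe[stt=stt]()
    Self.audio -> transcribe.audio,transcript -> Self.transcript

    // Assumed wiring: forward both failure outputs unchanged.
    transcribe.error -> Self.error
    transcribe.failed -> Self.failed
}
```

With this wiring, a downstream treatment can react to failed as a simple signal and read the string carried by error, instead of silently receiving no transcript.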

Mélodium Standard Reference
