Model RemoteLlm

ml/remote/llm::RemoteLlm

Parameters

↳ const api_key: Option<string>
↳ const backend: string = "mistral"
↳ const base_url: string = ""
↳ const max_tokens: Option<u64>
↳ const model: string = ""
↳ const system: string = ""
↳ const temperature: Option<f32>
↳ const timeout: Option<u64>
↳ const top_p: Option<f32>

Remote LLM provider configuration.

Holds connection and inference parameters for a remote large language model service.

backend: provider name (default "mistral"); see the table below.
api_key: API key for authentication (omit for Ollama or unauthenticated endpoints).
base_url: override the provider base URL (required for Ollama, Azure OpenAI, or custom OpenAI-compatible endpoints; see the table below).
model: model identifier; see the table below for recommended values per backend.
system: system prompt injected at the start of every conversation.
max_tokens: maximum tokens to generate per response (omit to use the backend default).
temperature: sampling temperature from 0.0 to 2.0 (omit to use the backend default). Some backends reject requests that set both temperature and top_p.
top_p: nucleus sampling cutoff from 0.0 to 1.0 (omit to use the backend default).
timeout: request timeout in seconds (omit to use the backend default).
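
As a minimal sketch of the optional parameters described above, the declaration below sets a low temperature and a request timeout while leaving top_p unset, since some backends reject both sampling parameters together. It assumes optional parameters accept plain values, as max_tokens does in the reference example on this page, and that `//` line comments are accepted. The treatment name `tunedExample` and the chosen values are illustrative, not taken from the reference.

```
use ml/remote/llm::RemoteLlm
use ml/remote/llm::stream

// Hypothetical treatment name; values below are illustrative only.
treatment tunedExample()
  model llm: RemoteLlm(
    backend     = "mistral",
    api_key     = "...",
    model       = "mistral-small-latest",
    // Assumption: Option parameters take plain values, like max_tokens.
    temperature = 0.2,
    // Timeout is expressed in seconds.
    timeout     = 30
    // top_p and max_tokens are omitted, so the backend defaults apply.
  )
  input  prompt: Stream<string>
  output token:  Stream<string>
{
    stream[llm=llm]()
    Self.prompt -> stream.prompt,token -> Self.token
}
```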

Backends

`backend`	`api_key`	`base_url`	Example `model`	Model list
`"mistral"`	Mistral API key	(built-in)	`"mistral-small-latest"`	https://docs.mistral.ai/getting-started/models/models_overview/
`"openai"`	OpenAI API key	(built-in)	`"gpt-4.1-nano"`	https://platform.openai.com/docs/models
`"anthropic"`	Anthropic API key	(built-in)	`"claude-sonnet-4-6"`	https://docs.anthropic.com/en/docs/about-claude/models/all-models
`"google"`	Google API key	(built-in)	`"gemini-2.5-flash"`	https://ai.google.dev/gemini-api/docs/models
`"groq"`	Groq API key	(built-in)	`"llama-3.3-70b-versatile"`	https://console.groq.com/docs/models
`"deepseek"`	DeepSeek API key	(built-in)	`"deepseek-chat"`	https://api-docs.deepseek.com/quick_start/pricing
`"xai"`	xAI API key	(built-in)	`"grok-2-latest"`	https://docs.x.ai/docs/models
`"cohere"`	Cohere API key	(built-in)	`"command-a-03-2025"`	https://docs.cohere.com/docs/models
`"openrouter"`	OpenRouter key	(built-in)	`"provider/model-name"`	https://openrouter.ai/models
`"huggingface"`	HF token	(built-in)	any HF Inference model ID	https://huggingface.co/models?pipeline_tag=text-generation&inference=warm
`"ollama"`	(omit)	`"http://localhost:11434"`	`"llama3.2"`	https://ollama.com/library
`"azure-openai"`	Azure API key	`"https://<resource>.openai.azure.com"`	`"gpt-4o"`	https://learn.microsoft.com/azure/ai-services/openai/concepts/models
`"aws-bedrock"`	(AWS env vars)	(built-in)	Bedrock model ARN or ID	https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html

ℹ️ Use RemoteLlm together with chat, stream, or visionChat.

use ml/remote/llm::RemoteLlm
use ml/remote/llm::stream
use std/engine/util::startup

treatment example()
  model llm: RemoteLlm(
    backend    = "mistral",
    api_key    = "...",
    model      = "mistral-small-latest",
    system     = "You are a helpful assistant.",
    max_tokens = 2048
  )
  input  prompt: Stream<string>
  output token:  Stream<string>
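{
    // Instantiate the stream treatment, bound to the llm model declared above.
    stream[llm=llm]()
    // Route incoming prompts to stream, then forward its generated tokens to the output.
    Self.prompt -> stream.prompt,token -> Self.token
}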
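
For a backend that needs base_url, the same pattern should apply with the values from the Backends table. The sketch below targets a local Ollama server: api_key is omitted and base_url points at the address listed in the table. The treatment name `localExample` is hypothetical, and the sketch assumes an Ollama instance is already serving the `llama3.2` model at that address.

```
use ml/remote/llm::RemoteLlm
use ml/remote/llm::stream

// Hypothetical treatment name; parameter values come from the "ollama" row of the Backends table.
treatment localExample()
  model llm: RemoteLlm(
    backend  = "ollama",
    // base_url is required for Ollama; api_key is omitted.
    base_url = "http://localhost:11434",
    model    = "llama3.2"
  )
  input  prompt: Stream<string>
  output token:  Stream<string>
{
    stream[llm=llm]()
    Self.prompt -> stream.prompt,token -> Self.token
}
```

Only the model declaration differs from the Mistral example; the connections between prompt, stream, and token are unchanged.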

Mélodium Standard Reference
