Model RemoteLlm
ml/remote/llm::RemoteLlm
Parameters
↳ const api_key: Option<string>
↳ const backend: string = "mistral"
↳ const base_url: string = ""
↳ const max_tokens: Option<u64>
↳ const model: string = ""
↳ const system: string = ""
↳ const temperature: Option<f32>
↳ const timeout: Option<u64>
↳ const top_p: Option<f32>
Remote LLM provider configuration.
Holds connection and inference parameters for a remote large language model service.
- backend: provider name (default "mistral"); see the table below.
- api_key: API key for authentication (omit for Ollama or unauthenticated endpoints).
- base_url: override for the provider base URL; required for Ollama, Azure OpenAI, or custom OpenAI-compatible endpoints (see the table below).
- model: model identifier; see the table below for recommended values per backend.
- system: system prompt injected at the start of every conversation.
- max_tokens: maximum number of tokens to generate per response (omit to use the backend default).
- temperature: sampling temperature, 0.0–2.0 (omit to use the backend default; some backends reject temperature and top_p being set simultaneously).
- top_p: nucleus sampling cutoff, 0.0–1.0 (omit to use the backend default).
- timeout: request timeout in seconds (omit to use the backend default).
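As a rough sketch of how the optional parameters above combine, the declaration below sets a sampling temperature and a timeout while leaving top_p unset, since some backends reject both sampling parameters together. The treatment name tunedExample and the specific values are illustrative assumptions, not recommendations; it is also assumed that plain numeric literals are accepted for the Option parameters, in the same way max_tokens is passed in the example at the end of this page.

```melodium
use ml/remote/llm::RemoteLlm
use ml/remote/llm::stream

// Hypothetical treatment name; all values are illustrative only.
// top_p is intentionally left unset so that the backend default applies.
treatment tunedExample()
  model llm: RemoteLlm(
    backend = "mistral",
    api_key = "...",
    model = "mistral-small-latest",
    max_tokens = 512,
    temperature = 0.2,
    timeout = 30
  )
  input prompt: Stream<string>
  output token: Stream<string>
{
    stream[llm=llm]()
    Self.prompt -> stream.prompt,token -> Self.token
}
```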
Backends
| backend | api_key | base_url | Example model | Model list |
|---|---|---|---|---|
| "mistral" | Mistral API key | (built-in) | "mistral-small-latest" | https://docs.mistral.ai/getting-started/models/models_overview/ |
| "openai" | OpenAI API key | (built-in) | "gpt-4.1-nano" | https://platform.openai.com/docs/models |
| "anthropic" | Anthropic API key | (built-in) | "claude-sonnet-4-6" | https://docs.anthropic.com/en/docs/about-claude/models/all-models |
| "google" | Google API key | (built-in) | "gemini-2.5-flash" | https://ai.google.dev/gemini-api/docs/models |
| "groq" | Groq API key | (built-in) | "llama-3.3-70b-versatile" | https://console.groq.com/docs/models |
| "deepseek" | DeepSeek API key | (built-in) | "deepseek-chat" | https://api-docs.deepseek.com/quick_start/pricing |
| "xai" | xAI API key | (built-in) | "grok-2-latest" | https://docs.x.ai/docs/models |
| "cohere" | Cohere API key | (built-in) | "command-a-03-2025" | https://docs.cohere.com/docs/models |
| "openrouter" | OpenRouter key | (built-in) | "provider/model-name" | https://openrouter.ai/models |
| "huggingface" | HF token | (built-in) | any HF Inference model ID | https://huggingface.co/models?pipeline_tag=text-generation&inference=warm |
| "ollama" | (omit) | "http://localhost:11434" | "llama3.2" | https://ollama.com/library |
| "azure-openai" | Azure API key | "https://<resource>.openai.azure.com" | "gpt-4o" | https://learn.microsoft.com/azure/ai-services/openai/concepts/models |
| "aws-bedrock" | (AWS env vars) | (built-in) | Bedrock model ARN or ID | https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html |
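For backends that need base_url, the "ollama" row of the table can be read as the following configuration sketch: api_key is omitted and base_url points at a local server. The treatment name ollamaExample is hypothetical, and the sketch assumes an Ollama instance is listening on the address shown in the table with the "llama3.2" model already available.

```melodium
use ml/remote/llm::RemoteLlm
use ml/remote/llm::stream

// Hypothetical treatment name; assumes a local Ollama server at the
// address listed in the table. No api_key is given for this backend.
treatment ollamaExample()
  model llm: RemoteLlm(
    backend = "ollama",
    base_url = "http://localhost:11434",
    model = "llama3.2"
  )
  input prompt: Stream<string>
  output token: Stream<string>
{
    stream[llm=llm]()
    Self.prompt -> stream.prompt,token -> Self.token
}
```

The same shape should apply to "azure-openai", with api_key set and base_url replaced by the resource URL pattern from the table.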
ℹ️ Use RemoteLlm together with chat, stream, or visionChat.
use ml/remote/llm::RemoteLlm
use ml/remote/llm::stream
use std/engine/util::startup
treatment example()
  model llm: RemoteLlm(
    backend = "mistral",
    api_key = "...",
    model = "mistral-small-latest",
    system = "You are a helpful assistant.",
    max_tokens = 2048
  )
  input prompt: Stream<string>
  output token: Stream<string>
{
    stream[llm=llm]()
    Self.prompt -> stream.prompt,token -> Self.token
}