# LLM Provider Guide

This is the canonical source for LLM provider configuration. The system supports both local and cloud-based Large Language Model (LLM) providers.
Currently supported providers:
- Ollama — local models (default and preferred)
- OpenAI — cloud-based models
The system uses a unified configuration approach, allowing you to switch providers without changing application code.
## Required fields

Set these values in `.env` and reference them from `config.yaml`:

```bash
LLM_PROVIDER=openai
LLM_BASE_URL=https://api.openai.com/v1
LLM_API_KEY=your-api-key
LLM_EMBEDDING_MODEL=text-embedding-3-small
LLM_CHAT_MODEL=gpt-4o-mini
VECTOR_SIZE=1536
```
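As a rough illustration of how the `.env` values above are read, here is a minimal parser sketch. It assumes simple `KEY=VALUE` lines with optional `#` comments; a real project would typically use a library such as python-dotenv instead.

```python
# Minimal .env parser sketch (assumption: plain KEY=VALUE lines,
# no quoting or multi-line values; real loaders handle more cases).
def parse_env(text: str) -> dict:
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and full-line comments
        key, _, value = line.partition("=")
        # Drop an inline comment such as "ollama  # no authentication required"
        value = value.split("#", 1)[0]
        env[key.strip()] = value.strip()
    return env

sample = """LLM_PROVIDER=openai
LLM_BASE_URL=https://api.openai.com/v1
VECTOR_SIZE=1536"""
print(parse_env(sample)["LLM_PROVIDER"])  # openai
```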
Corresponding `config.yaml` section:

```yaml
global:
  llm:
    provider: "${LLM_PROVIDER}"
    base_url: "${LLM_BASE_URL}"
    api_key: "${LLM_API_KEY}"
    models:
      embeddings: "${LLM_EMBEDDING_MODEL}"
      chat: "${LLM_CHAT_MODEL}"
    embeddings:
      vector_size: ${VECTOR_SIZE}
```
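The `${VAR}` placeholders above imply environment-variable interpolation before the YAML is parsed. A minimal sketch of that substitution step, assuming unknown names are left untouched:

```python
import os
import re

# Sketch of ${NAME} interpolation (assumption: the loader expands
# placeholders from the process environment before parsing YAML;
# names with no value are left as-is rather than erased).
_VAR = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")

def expand(text, env=None):
    lookup = dict(os.environ) if env is None else env
    return _VAR.sub(lambda m: lookup.get(m.group(1), m.group(0)), text)

print(expand('provider: "${LLM_PROVIDER}"', {"LLM_PROVIDER": "ollama"}))
# provider: "ollama"
```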
## Provider profiles

### OpenAI

```bash
LLM_PROVIDER=openai
LLM_BASE_URL=https://api.openai.com/v1
LLM_API_KEY=sk-your-openai-key
LLM_EMBEDDING_MODEL=text-embedding-3-small
LLM_CHAT_MODEL=gpt-4o-mini
VECTOR_SIZE=1536
```
### Azure OpenAI

```bash
LLM_PROVIDER=azure_openai
LLM_BASE_URL=https://<resource>.openai.azure.com
LLM_API_KEY=your-azure-key
LLM_API_VERSION=2024-05-01-preview
LLM_EMBEDDING_MODEL=<embedding-deployment-name>
LLM_CHAT_MODEL=<chat-deployment-name>
VECTOR_SIZE=1536
```
Notes:

- Use the resource root (no path) in `LLM_BASE_URL`.
- Use deployment names, not model names, for the `LLM_*_MODEL` values.
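The deployment-name note matters because Azure OpenAI routes requests by deployment rather than model. A sketch of how the REST URL is assembled from the values above (the resource and deployment names here are placeholders):

```python
# Azure OpenAI REST path sketch: the deployment name and api-version
# become part of the URL, unlike the plain OpenAI API where the model
# is sent only in the request body.
def azure_chat_url(base_url: str, deployment: str, api_version: str) -> str:
    return (f"{base_url.rstrip('/')}/openai/deployments/{deployment}"
            f"/chat/completions?api-version={api_version}")

print(azure_chat_url("https://myres.openai.azure.com",
                     "chat-deploy", "2024-05-01-preview"))
# https://myres.openai.azure.com/openai/deployments/chat-deploy/chat/completions?api-version=2024-05-01-preview
```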
### Ollama (Local Models – Recommended)
Ollama is the default and recommended provider, especially for local development and privacy-sensitive environments.
#### Prerequisites

To use local models with Ollama, ensure the following:

- Ollama is installed and running on your machine (installation guide: https://ollama.com/download).
- Ollama is accessible at the configured endpoint (default: `http://localhost:11434`).
- Required models are pulled locally before running the system. For embedding generation, the following model is required:

```bash
ollama pull argus-ai/pplx-embed-v1-0.6b:fp32
```
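To verify the pull succeeded, you can inspect the JSON that Ollama's `/api/tags` endpoint returns. This sketch keeps the check offline by taking the response body as a string; fetching it from the running server is left to the caller.

```python
import json

# Offline sketch: given the JSON body from Ollama's /api/tags endpoint,
# check that the required embedding model has been pulled locally.
def model_available(tags_json: str, model_name: str) -> bool:
    tags = json.loads(tags_json)
    return any(m.get("name") == model_name for m in tags.get("models", []))

sample = '{"models": [{"name": "argus-ai/pplx-embed-v1-0.6b:fp32"}]}'
print(model_available(sample, "argus-ai/pplx-embed-v1-0.6b:fp32"))  # True
```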
```bash
LLM_PROVIDER=ollama
LLM_BASE_URL=http://localhost:11434/v1
LLM_API_KEY=ollama  # no authentication required
LLM_EMBEDDING_MODEL=argus-ai/pplx-embed-v1-0.6b:fp32
LLM_CHAT_MODEL=llama3.1:8b
VECTOR_SIZE=1024
```
Notes:

- If your deployment uses native Ollama APIs, keep model names valid for your server.
- For local/no-auth setups, `LLM_API_KEY` may be ignored by backend logic.
### OpenAI-compatible endpoint

```bash
LLM_PROVIDER=openai_compat
LLM_BASE_URL=https://your-endpoint.example.com/v1
LLM_API_KEY=your-endpoint-key
LLM_EMBEDDING_MODEL=your-embedding-model
LLM_CHAT_MODEL=your-chat-model
```
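Any endpoint that speaks the OpenAI wire format accepts the same request shape. A stdlib-only sketch of building (not sending) a chat request against such an endpoint; the URL, key, and model below are the placeholders from the profile above, not real values:

```python
import json
import urllib.request

# Sketch: assemble a POST request for an OpenAI-compatible
# /v1/chat/completions endpoint. Sending it is left to the caller
# (urllib.request.urlopen(req)) since it needs a live server.
def build_chat_request(base_url: str, api_key: str, model: str, prompt: str):
    payload = {"model": model,
               "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("https://your-endpoint.example.com/v1",
                         "your-endpoint-key", "your-chat-model", "ping")
print(req.full_url)  # https://your-endpoint.example.com/v1/chat/completions
```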
## Vector size mapping

Set `global.llm.embeddings.vector_size` to match the embedding model's output dimension.

Common examples:

- `text-embedding-3-small` → 1536
- `text-embedding-3-large` → 3072
- Ollama model dimensions vary; verify before collection creation
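A mismatched `VECTOR_SIZE` typically surfaces only at collection creation or first insert, so it is worth validating early. A sketch of a startup check built from the dimensions listed in this guide (unknown models are skipped, since their dimensions must be verified by hand):

```python
# Dimensions taken from this guide; assumption: used as a lookup table
# for an early sanity check rather than an exhaustive registry.
EXPECTED_VECTOR_SIZE = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
    "argus-ai/pplx-embed-v1-0.6b:fp32": 1024,
}

def check_vector_size(model: str, configured: int) -> None:
    expected = EXPECTED_VECTOR_SIZE.get(model)
    if expected is not None and expected != configured:
        raise ValueError(
            f"VECTOR_SIZE={configured} does not match {model} ({expected})")

check_vector_size("text-embedding-3-small", 1536)  # ok, no exception
```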
## Legacy compatibility

`OPENAI_API_KEY` is still supported in some flows, but canonical configuration should use `LLM_API_KEY`.
## Related references
- Variables: Environment Variables Reference
- YAML schema: Configuration File Reference
- Security: Security Considerations