This comprehensive guide covers all configuration options available in OllamaFlow, including the main settings file (ollamaflow.json) and the structure of key objects like frontends and backends.
Main Configuration File
OllamaFlow uses a JSON configuration file (ollamaflow.json) that defines server settings, logging, and authentication. On first startup, a default configuration is created if none exists.
Default Configuration Location
- Docker: /app/ollamaflow.json
- Bare Metal: Same directory as executable
- Custom: Specify with the OLLAMAFLOW_CONFIG environment variable
Complete Configuration Example
{
"Logging": {
"Servers": [
{
"Hostname": "127.0.0.1",
"Port": 514,
"RandomizePorts": false,
"MinimumPort": 65000,
"MaximumPort": 65535
}
],
"LogDirectory": "./logs/",
"LogFilename": "ollamaflow.log",
"ConsoleLogging": true,
"EnableColors": true,
"MinimumSeverity": 1
},
"Webserver": {
"Hostname": "*",
"Port": 43411,
"IO": {
"StreamBufferSize": 65536,
"MaxRequests": 1024,
"ReadTimeoutMs": 10000,
"MaxIncomingHeadersSize": 65536,
"EnableKeepAlive": false
},
"Ssl": {
"Enable": false,
"MutuallyAuthenticate": false,
"AcceptInvalidCertificates": true,
"CertificateFile": "",
"CertificatePassword": ""
},
"Headers": {
"IncludeContentLength": true,
"DefaultHeaders": {
"Access-Control-Allow-Origin": "*",
"Access-Control-Allow-Methods": "OPTIONS, HEAD, GET, PUT, POST, DELETE, PATCH",
"Access-Control-Allow-Headers": "*",
"Access-Control-Expose-Headers": "",
"Accept": "*/*",
"Accept-Language": "en-US, en",
"Accept-Charset": "ISO-8859-1, utf-8",
"Cache-Control": "no-cache",
"Connection": "close",
"Host": "localhost:43411"
}
},
"AccessControl": {
"DenyList": {},
"PermitList": {},
"Mode": "DefaultPermit"
},
"Debug": {
"AccessControl": false,
"Routing": false,
"Requests": false,
"Responses": false
}
},
"DatabaseFilename": "ollamaflow.db",
"AdminBearerTokens": [
"your-secure-admin-token"
]
}
Configuration Sections
Logging Settings
Controls how OllamaFlow logs information and errors.
Setting | Type | Default | Description |
---|---|---|---|
LogDirectory | string | "./logs/" | Directory for log files |
LogFilename | string | "ollamaflow.log" | Base filename for logs |
ConsoleLogging | boolean | true | Enable console output |
EnableColors | boolean | true | Enable colored console output |
MinimumSeverity | integer | 1 | Minimum log level (0=Debug, 1=Info, 2=Warn, 3=Error) |
Syslog Servers
Optional remote logging configuration:
{
"Servers": [
{
"Hostname": "syslog.company.com",
"Port": 514,
"RandomizePorts": false,
"MinimumPort": 65000,
"MaximumPort": 65535
}
]
}
Webserver Settings
Configures the HTTP server that handles all requests.
Basic Settings
Setting | Type | Default | Description |
---|---|---|---|
Hostname | string | "*" | Bind hostname (* for all interfaces) |
Port | integer | 43411 | TCP port to listen on |
IO Settings
Controls request handling and performance:
Setting | Type | Default | Description |
---|---|---|---|
StreamBufferSize | integer | 65536 | Buffer size for streaming responses |
MaxRequests | integer | 1024 | Maximum concurrent requests |
ReadTimeoutMs | integer | 10000 | Request read timeout in milliseconds |
MaxIncomingHeadersSize | integer | 65536 | Maximum size of request headers |
EnableKeepAlive | boolean | false | Enable HTTP keep-alive connections |
SSL Settings
HTTPS configuration for secure connections:
Setting | Type | Default | Description |
---|---|---|---|
Enable | boolean | false | Enable HTTPS |
MutuallyAuthenticate | boolean | false | Require client certificates |
AcceptInvalidCertificates | boolean | true | Accept self-signed certificates |
CertificateFile | string | "" | Path to SSL certificate file |
CertificatePassword | string | "" | Certificate password if required |
Headers Settings
Default HTTP headers and CORS configuration:
{
"IncludeContentLength": true,
"DefaultHeaders": {
"Access-Control-Allow-Origin": "*",
"Access-Control-Allow-Methods": "OPTIONS, HEAD, GET, PUT, POST, DELETE, PATCH",
"Access-Control-Allow-Headers": "*"
}
}
Access Control
IP-based access control (optional):
{
"DenyList": {
"192.168.1.100": "Blocked IP",
"10.0.0.0/8": "Blocked network"
},
"PermitList": {
"192.168.1.0/24": "Allowed network"
},
"Mode": "DefaultPermit"
}
Modes:
- DefaultPermit: Allow all except denied IPs
- DefaultDeny: Deny all except permitted IPs
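For locked-down deployments, the same structure works in DefaultDeny mode, where only the listed entries may connect. The addresses below are illustrative:

```json
{
  "DenyList": {},
  "PermitList": {
    "10.10.0.0/16": "Internal services",
    "192.168.1.50": "Admin workstation"
  },
  "Mode": "DefaultDeny"
}
```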
Database Settings
Setting | Type | Default | Description |
---|---|---|---|
DatabaseFilename | string | "ollamaflow.db" | SQLite database file path |
Authentication Settings
Setting | Type | Default | Description |
---|---|---|---|
AdminBearerTokens | array | ["ollamaflowadmin"] | Valid bearer tokens for admin APIs |
Frontend Configuration
Frontends are virtual Ollama endpoints that clients connect to. They are stored in the database and managed via the API.
Frontend Object Structure
{
"Identifier": "production-frontend",
"Name": "Production AI Inference",
"Hostname": "ai.company.com",
"TimeoutMs": 90000,
"LoadBalancing": "RoundRobin",
"BlockHttp10": true,
"MaxRequestBodySize": 1073741824,
"Backends": ["gpu-1", "gpu-2", "gpu-3"],
"RequiredModels": ["llama3:8b", "mistral:7b"],
"AllowEmbeddings": true,
"AllowCompletions": true,
"PinnedEmbeddingsProperties": {
"model": "nomic-embed-text",
"options": {
"temperature": 0.1
}
},
"PinnedCompletionsProperties": {
"options": {
"temperature": 0.7,
"num_ctx": 2048
}
},
"LogRequestFull": false,
"LogRequestBody": false,
"LogResponseBody": false,
"UseStickySessions": false,
"StickySessionExpirationMs": 1800000,
"Active": true
}
Frontend Properties
Property | Type | Default | Description |
---|---|---|---|
Identifier | string | Required | Unique identifier for this frontend |
Name | string | Required | Human-readable name |
Hostname | string | "*" | Hostname pattern (* for catch-all) |
TimeoutMs | integer | 60000 | Request timeout in milliseconds |
LoadBalancing | enum | "RoundRobin" | Load balancing algorithm |
BlockHttp10 | boolean | true | Reject HTTP/1.0 requests |
MaxRequestBodySize | integer | 536870912 | Max request size in bytes (512MB) |
Backends | array | [] | List of backend identifiers |
RequiredModels | array | [] | Models that must be available |
AllowEmbeddings | boolean | true | Allow embeddings API requests |
AllowCompletions | boolean | true | Allow completions API requests |
PinnedEmbeddingsProperties | object | {} | Key-value pairs merged into embeddings requests |
PinnedCompletionsProperties | object | {} | Key-value pairs merged into completions requests |
UseStickySessions | boolean | false | Enable session stickiness |
StickySessionExpirationMs | integer | 1800000 | Session timeout (30 minutes, min: 10s, max: 24h) |
LogRequestFull | boolean | false | Log complete requests |
LogRequestBody | boolean | false | Log request bodies |
LogResponseBody | boolean | false | Log response bodies |
Active | boolean | true | Whether frontend is active |
Load Balancing Options
- "RoundRobin": Cycle through backends sequentially
- "Random": Randomly select from healthy backends
Hostname Patterns
- "*": Match all hostnames (catch-all)
- "api.company.com": Exact hostname match
- Multiple frontends can exist with different hostname patterns
Security Controls
Frontend security controls enable fine-grained access control and request parameter enforcement:
Request Type Controls
- AllowEmbeddings: Controls whether embeddings API endpoints are accessible through this frontend
  - Ollama API: /api/embed
  - OpenAI API: /v1/embeddings
- AllowCompletions: Controls whether completion API endpoints are accessible through this frontend
  - Ollama API: /api/generate, /api/chat
  - OpenAI API: /v1/completions, /v1/chat/completions
For a request to succeed, both the frontend and at least one assigned backend must allow the request type.
Pinned Properties
Pinned properties allow administrators to enforce specific parameters in requests, providing security compliance and standardization:
- PinnedEmbeddingsProperties: Key-value pairs automatically merged into all embeddings requests
- PinnedCompletionsProperties: Key-value pairs automatically merged into all completion requests
Common use cases:
- Enforce maximum context size: {"options": {"num_ctx": 2048}}
- Standardize temperature settings: {"options": {"temperature": 0.7}}
- Override model selection: {"model": "approved-model:latest"}
- Set organizational defaults: {"options": {"top_p": 0.9, "top_k": 40}}
Properties are merged with client requests, with pinned properties taking precedence over client-specified values.
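This behaviour can be illustrated with jq, which later examples in this guide already use. Its recursive merge operator (*) gives the right-hand object precedence, mirroring "pinned properties take precedence". Whether OllamaFlow merges nested options deeply or replaces them wholesale is not specified here, so treat this as a sketch of the deep-merge case:

```shell
# Illustrative only: simulate pinned-property merging with jq's recursive
# merge operator (*). The right-hand (pinned) object overrides client values.
CLIENT='{"model": "llama3:8b", "options": {"temperature": 1.5, "num_ctx": 8192}}'
PINNED='{"options": {"temperature": 0.7, "num_ctx": 2048}}'
echo "$CLIENT" | jq --argjson pinned "$PINNED" '. * $pinned'
```

The client keeps its model choice, but both pinned options win over the client-specified values.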
Backend Configuration
Backends represent physical Ollama instances in your infrastructure.
Backend Object Structure
{
"Identifier": "gpu-server-1",
"Name": "Primary GPU Server",
"Hostname": "192.168.1.100",
"Port": 11434,
"Ssl": false,
"UnhealthyThreshold": 3,
"HealthyThreshold": 2,
"HealthCheckMethod": "GET",
"HealthCheckUrl": "/",
"MaxParallelRequests": 8,
"RateLimitRequestsThreshold": 20,
"AllowEmbeddings": true,
"AllowCompletions": true,
"PinnedEmbeddingsProperties": {
"options": {
"num_ctx": 512
}
},
"PinnedCompletionsProperties": {
"options": {
"num_ctx": 4096,
"temperature": 0.8
}
},
"LogRequestFull": false,
"LogRequestBody": false,
"LogResponseBody": false,
"Active": true
}
Backend Properties
Property | Type | Default | Description |
---|---|---|---|
Identifier | string | Required | Unique identifier for this backend |
Name | string | Required | Human-readable name |
Hostname | string | Required | Backend server hostname/IP |
Port | integer | 11434 | Backend server port |
Ssl | boolean | false | Use HTTPS for backend communication |
UnhealthyThreshold | integer | 2 | Failed checks before marking unhealthy |
HealthyThreshold | integer | 2 | Successful checks before marking healthy |
HealthCheckMethod | string | "GET" | HTTP method for health checks, either GET or HEAD |
HealthCheckUrl | string | "/" | URL path for health checks |
MaxParallelRequests | integer | 4 | Maximum concurrent requests |
RateLimitRequestsThreshold | integer | 10 | Rate limiting threshold |
AllowEmbeddings | boolean | true | Allow embeddings API requests |
AllowCompletions | boolean | true | Allow completions API requests |
PinnedEmbeddingsProperties | object | {} | Key-value pairs merged into embeddings requests |
PinnedCompletionsProperties | object | {} | Key-value pairs merged into completions requests |
LogRequestFull | boolean | false | Log complete requests |
LogRequestBody | boolean | false | Log request bodies |
LogResponseBody | boolean | false | Log response bodies |
Active | boolean | true | Whether backend is active |
Health Check Configuration
Health checks validate backend availability:
- Method: HTTP method (GET or HEAD)
- URL: Path to check (e.g., /, /api/version, /health)
- Thresholds: Number of consecutive successes/failures to change state
Common health check endpoints:
- HEAD /: Basic connectivity check for Ollama
- GET /health: Basic connectivity check for vLLM
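The threshold rule can be sketched as a small state machine: a backend changes state only after the configured number of consecutive results of the opposite kind. This is an illustration of the rule described above, not OllamaFlow's implementation:

```shell
# Sketch of consecutive-threshold health transitions (illustrative only).
simulate_health() {
  local unhealthy_threshold=$1 healthy_threshold=$2
  shift 2
  local state="healthy" fails=0 oks=0
  for result in "$@"; do
    if [ "$result" = "ok" ]; then
      # A success resets the failure streak.
      oks=$((oks + 1)); fails=0
      if [ "$state" = "unhealthy" ] && [ "$oks" -ge "$healthy_threshold" ]; then
        state="healthy"
      fi
    else
      # A failure resets the success streak.
      fails=$((fails + 1)); oks=0
      if [ "$state" = "healthy" ] && [ "$fails" -ge "$unhealthy_threshold" ]; then
        state="unhealthy"
      fi
    fi
  done
  echo "$state"
}

# With UnhealthyThreshold=3 and HealthyThreshold=2: three straight failures
# flip the backend to unhealthy; two straight successes bring it back.
simulate_health 3 2 ok fail fail fail      # prints "unhealthy"
simulate_health 3 2 fail fail fail ok ok   # prints "healthy"
```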
Rate Limiting
Backends can enforce rate limits:
- Requests exceeding RateLimitRequestsThreshold receive HTTP 429
- Rate limiting is per backend, not global
- Helps protect individual Ollama instances from overload
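On the client side, a simple retry wrapper can absorb transient 429 responses. The helper below is a generic shell sketch, not part of OllamaFlow; the commented curl call shows how it might wrap a request (endpoint and payload are illustrative):

```shell
# Generic helper: retry a command with exponential backoff until it succeeds
# or the attempt budget is exhausted.
retry_with_backoff() {
  local max_attempts=$1 delay=1 attempt=1
  shift
  while true; do
    "$@" && return 0
    [ "$attempt" -ge "$max_attempts" ] && return 1
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}

# Hypothetical usage: curl -f exits non-zero on 4xx/5xx, including 429.
# retry_with_backoff 5 curl -sf -X POST http://localhost:43411/api/generate \
#   -d '{"model": "llama3:8b", "prompt": "hello"}'
```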
Security Controls
Backend security controls provide additional layers of request filtering and parameter enforcement:
Request Type Controls
- AllowEmbeddings: Controls whether this backend can process embeddings requests
- AllowCompletions: Controls whether this backend can process completion requests
Requests are only routed to backends that allow the specific request type. This enables:
- Dedicated embeddings servers that only handle embeddings requests:
  - Ollama API: /api/embed
  - OpenAI API: /v1/embeddings
- Completion-only servers that only handle completion requests:
  - Ollama API: /api/generate, /api/chat
  - OpenAI API: /v1/completions, /v1/chat/completions
- Multi-tenant isolation by request type
Pinned Properties
Backend pinned properties provide server-level parameter enforcement:
- PinnedEmbeddingsProperties: Applied to all embeddings requests routed to this backend
- PinnedCompletionsProperties: Applied to all completion requests routed to this backend
Backend pinned properties are merged after frontend pinned properties, allowing for:
- Server-specific resource limits: {"options": {"num_ctx": 1024}}
- Hardware-optimized settings: {"options": {"num_gpu": 2}}
- Backend-specific model overrides: {"model": "server-optimized-model"}
The merge order is: Client Request → Frontend Pinned Properties → Backend Pinned Properties, with later values taking precedence.
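Assuming a deep merge of nested options, the full chain can be approximated with jq's recursive merge operator (*), where later (right-hand) objects win. This is an illustration of the precedence order, not OllamaFlow's code:

```shell
# Illustrative only: backend pinned > frontend pinned > client request.
CLIENT='{"options": {"temperature": 1.2, "num_ctx": 8192, "top_p": 0.95}}'
FRONTEND_PINNED='{"options": {"temperature": 0.7}}'
BACKEND_PINNED='{"options": {"num_ctx": 4096}}'
echo "$CLIENT" | jq --argjson f "$FRONTEND_PINNED" --argjson b "$BACKEND_PINNED" \
  '. * $f * $b'
```

The frontend caps temperature, the backend caps context size, and untouched client values such as top_p pass through.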
Environment Variables
Override configuration with environment variables:
Variable | Description | Example |
---|---|---|
OLLAMAFLOW_CONFIG | Configuration file path | /etc/ollamaflow/config.json |
OLLAMAFLOW_PORT | Override webserver port | 8080 |
OLLAMAFLOW_HOSTNAME | Override webserver hostname | 0.0.0.0 |
OLLAMAFLOW_DATABASE | Override database file path | /data/ollamaflow.db |
OLLAMAFLOW_ADMIN_TOKEN | Override admin token | secure-production-token |
ASPNETCORE_ENVIRONMENT | .NET environment | Production |
Docker Environment Example
docker run -d \
-e OLLAMAFLOW_PORT=8080 \
-e OLLAMAFLOW_ADMIN_TOKEN=my-secure-token \
-e ASPNETCORE_ENVIRONMENT=Production \
-p 8080:8080 \
jchristn/ollamaflow
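The same container settings can be expressed in Compose form. This fragment is a hypothetical equivalent of the docker run command above, not a file shipped with OllamaFlow:

```yaml
services:
  ollamaflow:
    image: jchristn/ollamaflow
    ports:
      - "8080:8080"
    environment:
      OLLAMAFLOW_PORT: "8080"
      OLLAMAFLOW_ADMIN_TOKEN: "my-secure-token"
      ASPNETCORE_ENVIRONMENT: "Production"
```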
Configuration Examples
Basic Single Backend
Minimal configuration for testing:
{
"Webserver": {
"Port": 43411
},
"AdminBearerTokens": ["test-token"]
}
Frontend/Backend via API:
# Create backend
curl -X PUT -H "Authorization: Bearer test-token" \
-H "Content-Type: application/json" \
-d '{"Identifier": "local", "Hostname": "localhost", "Port": 11434}' \
http://localhost:43411/v1.0/backends
# Create frontend
curl -X PUT -H "Authorization: Bearer test-token" \
-H "Content-Type: application/json" \
-d '{"Identifier": "main", "Backends": ["local"]}' \
http://localhost:43411/v1.0/frontends
Production Multi-Backend
Production configuration with multiple GPU servers:
{
"Webserver": {
"Hostname": "*",
"Port": 43411,
"Ssl": {
"Enable": true,
"CertificateFile": "/etc/ssl/ollamaflow.crt",
"CertificatePassword": "cert-password"
}
},
"Logging": {
"LogDirectory": "/var/log/ollamaflow/",
"ConsoleLogging": false,
"MinimumSeverity": 1
},
"DatabaseFilename": "/var/lib/ollamaflow/ollamaflow.db",
"AdminBearerTokens": [
"secure-production-token-1",
"secure-production-token-2"
]
}
Backends configuration:
# GPU servers
for i in {1..4}; do
curl -X PUT -H "Authorization: Bearer secure-production-token-1" \
-H "Content-Type: application/json" \
-d "{
\"Identifier\": \"gpu-$i\",
\"Name\": \"GPU Server $i\",
\"Hostname\": \"gpu$i.company.internal\",
\"Port\": 11434,
\"MaxParallelRequests\": 8,
\"HealthCheckUrl\": \"/api/version\"
}" \
http://localhost:43411/v1.0/backends
done
# Production frontend
curl -X PUT -H "Authorization: Bearer secure-production-token-1" \
-H "Content-Type: application/json" \
-d '{
"Identifier": "production",
"Name": "Production AI Inference",
"Hostname": "ai.company.com",
"LoadBalancing": "RoundRobin",
"Backends": ["gpu-1", "gpu-2", "gpu-3", "gpu-4"],
"RequiredModels": ["llama3:8b", "mistral:7b", "codellama:13b"],
"TimeoutMs": 120000
}' \
http://localhost:43411/v1.0/frontends
Development Environment
Development setup with debugging enabled:
{
"Webserver": {
"Port": 43411,
"Debug": {
"Routing": true,
"Requests": true,
"Responses": false
}
},
"Logging": {
"ConsoleLogging": true,
"EnableColors": true,
"MinimumSeverity": 0
},
"AdminBearerTokens": ["dev-token"]
}
Configuration Validation
OllamaFlow validates configuration on startup:
Common Validation Errors
- Invalid Port Range: Ports must be 1-65535
- Missing Required Fields: Identifier, Name, and Hostname are required for backends
- Duplicate Identifiers: Frontend/Backend IDs must be unique
- Invalid Load Balancing: Must be "RoundRobin" or "Random"
- Invalid Hostnames: Must be valid hostname or "*"
Configuration Test
Validate configuration without starting the server:
# Test configuration file
dotnet OllamaFlow.Server.dll --validate-config
# Test with specific config
OLLAMAFLOW_CONFIG=/path/to/config.json dotnet OllamaFlow.Server.dll --validate-config
Migration and Backup
Database Backup
Backup the SQLite database regularly:
# Simple copy (stop OllamaFlow first)
cp ollamaflow.db ollamaflow.db.backup
# Online backup (while running)
sqlite3 ollamaflow.db ".backup /path/to/backup.db"
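The online backup can be scheduled; this hypothetical crontab entry (paths are examples) runs it nightly at 02:00, with % escaped as cron requires:

```
0 2 * * * sqlite3 /var/lib/ollamaflow/ollamaflow.db ".backup /backups/ollamaflow-$(date +\%F).db"
```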
Configuration Migration
When upgrading OllamaFlow:
- Backup Configuration: Save current ollamaflow.json
- Backup Database: Save current ollamaflow.db
- Review Changes: Check for new configuration options
- Test Upgrade: Test in non-production environment first
Export/Import Configuration
Export current configuration for replication:
# Export all frontends
curl -H "Authorization: Bearer token" \
http://localhost:43411/v1.0/frontends > frontends.json
# Export all backends
curl -H "Authorization: Bearer token" \
http://localhost:43411/v1.0/backends > backends.json
Import configuration to new instance:
# Import backends first
cat backends.json | jq -c '.[]' | while read -r backend; do
curl -X PUT -H "Authorization: Bearer token" \
-H "Content-Type: application/json" \
-d "$backend" \
http://new-host:43411/v1.0/backends
done
# Then import frontends
cat frontends.json | jq -c '.[]' | while read -r frontend; do
curl -X PUT -H "Authorization: Bearer token" \
-H "Content-Type: application/json" \
-d "$frontend" \
http://new-host:43411/v1.0/frontends
done
Next Steps
- Review API Reference for programmatic configuration
- Explore Deployment Options for your infrastructure
- Check Monitoring and Observability for production insights
-d "$frontend" \
http://new-host:43411/v1.0/frontends
done
Next Steps
- Review API Reference for programmatic configuration
- Explore Deployment Options for your infrastructure
- Check Monitoring and Observability for production insights
Main Configuration File
OllamaFlow uses a JSON configuration file (`ollamaflow.json`) that defines server settings, logging, and authentication. On first startup, a default configuration is created if none exists.
Default Configuration Location
- Docker: `/app/ollamaflow.json`
- Bare Metal: Same directory as executable
- Custom: Specify with the `OLLAMAFLOW_CONFIG` environment variable
Complete Configuration Example
{
"Logging": {
"Servers": [
{
"Hostname": "127.0.0.1",
"Port": 514,
"RandomizePorts": false,
"MinimumPort": 65000,
"MaximumPort": 65535
}
],
"LogDirectory": "./logs/",
"LogFilename": "ollamaflow.log",
"ConsoleLogging": true,
"EnableColors": true,
"MinimumSeverity": 1
},
"Webserver": {
"Hostname": "*",
"Port": 43411,
"IO": {
"StreamBufferSize": 65536,
"MaxRequests": 1024,
"ReadTimeoutMs": 10000,
"MaxIncomingHeadersSize": 65536,
"EnableKeepAlive": false
},
"Ssl": {
"Enable": false,
"MutuallyAuthenticate": false,
"AcceptInvalidCertificates": true,
"CertificateFile": "",
"CertificatePassword": ""
},
"Headers": {
"IncludeContentLength": true,
"DefaultHeaders": {
"Access-Control-Allow-Origin": "*",
"Access-Control-Allow-Methods": "OPTIONS, HEAD, GET, PUT, POST, DELETE, PATCH",
"Access-Control-Allow-Headers": "*",
"Access-Control-Expose-Headers": "",
"Accept": "*/*",
"Accept-Language": "en-US, en",
"Accept-Charset": "ISO-8859-1, utf-8",
"Cache-Control": "no-cache",
"Connection": "close",
"Host": "localhost:43411"
}
},
"AccessControl": {
"DenyList": {},
"PermitList": {},
"Mode": "DefaultPermit"
},
"Debug": {
"AccessControl": false,
"Routing": false,
"Requests": false,
"Responses": false
}
},
"DatabaseFilename": "ollamaflow.db",
"AdminBearerTokens": [
"your-secure-admin-token"
],
"StickyHeaders": [
"x-conversation-id",
"x-thread-id"
]
}
Configuration Sections
Logging Settings
Controls how OllamaFlow logs information and errors.
Setting | Type | Default | Description |
---|---|---|---|
LogDirectory | string | "./logs/" | Directory for log files |
LogFilename | string | "ollamaflow.log" | Base filename for logs |
ConsoleLogging | boolean | true | Enable console output |
EnableColors | boolean | true | Enable colored console output |
MinimumSeverity | integer | 1 | Minimum log level (0=Debug, 1=Info, 2=Warn, 3=Error) |
Syslog Servers
Optional remote logging configuration:
{
"Servers": [
{
"Hostname": "syslog.company.com",
"Port": 514,
"RandomizePorts": false,
"MinimumPort": 65000,
"MaximumPort": 65535
}
]
}
Webserver Settings
Configures the HTTP server that handles all requests.
Basic Settings
Setting | Type | Default | Description |
---|---|---|---|
Hostname | string | "*" | Bind hostname (* for all interfaces) |
Port | integer | 43411 | TCP port to listen on |
IO Settings
Controls request handling and performance:
Setting | Type | Default | Description |
---|---|---|---|
StreamBufferSize | integer | 65536 | Buffer size for streaming responses |
MaxRequests | integer | 1024 | Maximum concurrent requests |
ReadTimeoutMs | integer | 10000 | Request read timeout in milliseconds |
MaxIncomingHeadersSize | integer | 65536 | Maximum size of request headers |
EnableKeepAlive | boolean | false | Enable HTTP keep-alive connections |
SSL Settings
HTTPS configuration for secure connections:
Setting | Type | Default | Description |
---|---|---|---|
Enable | boolean | false | Enable HTTPS |
MutuallyAuthenticate | boolean | false | Require client certificates |
AcceptInvalidCertificates | boolean | true | Accept self-signed certificates |
CertificateFile | string | "" | Path to SSL certificate file |
CertificatePassword | string | "" | Certificate password if required |
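For local testing, a self-signed certificate can be generated with OpenSSL. The commands below are a sketch: the exact certificate format OllamaFlow expects is not specified here, so adjust to match your deployment; the `CertificatePassword` setting suggests a password-protected bundle such as PKCS#12 may be expected.

```shell
# Generate a self-signed key and certificate for testing (the CN is illustrative).
openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
  -keyout ollamaflow.key -out ollamaflow.crt \
  -subj "/CN=ai.company.com"

# Optionally bundle key and certificate into a password-protected PKCS#12 file.
openssl pkcs12 -export -out ollamaflow.pfx \
  -inkey ollamaflow.key -in ollamaflow.crt \
  -passout pass:cert-password
```

Point `CertificateFile` at the resulting file and set `CertificatePassword` accordingly; leave `AcceptInvalidCertificates` set to true only while using self-signed test certificates.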
Headers Settings
Default HTTP headers and CORS configuration:
{
"IncludeContentLength": true,
"DefaultHeaders": {
"Access-Control-Allow-Origin": "*",
"Access-Control-Allow-Methods": "OPTIONS, HEAD, GET, PUT, POST, DELETE, PATCH",
"Access-Control-Allow-Headers": "*"
}
}
Access Control
IP-based access control (optional):
{
"DenyList": {
"192.168.1.100": "Blocked IP",
"10.0.0.0/8": "Blocked network"
},
"PermitList": {
"192.168.1.0/24": "Allowed network"
},
"Mode": "DefaultPermit"
}
Modes:
- `DefaultPermit`: Allow all except denied IPs
- `DefaultDeny`: Deny all except permitted IPs
Database Settings
Setting | Type | Default | Description |
---|---|---|---|
DatabaseFilename | string | "ollamaflow.db" | SQLite database file path |
Authentication Settings
Setting | Type | Default | Description |
---|---|---|---|
AdminBearerTokens | array | ["ollamaflowadmin"] | Valid bearer tokens for admin APIs |
Sticky Headers
The `StickyHeaders` string array specifies which request headers are used to uniquely identify a client when session stickiness is enabled. If you are not using session stickiness, set this to an empty array. Header names are compared case-insensitively, so `x-conversation-id` and `X-Conversation-ID` are treated as the same header.
If no sticky headers are defined and session stickiness is enabled, the client IP address is used as the client identifier.
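Sticky headers only take effect when a frontend has stickiness enabled. A minimal frontend fragment sketch (identifiers and values are illustrative; the properties are described under Frontend Configuration below):

```json
{
  "Identifier": "chat",
  "Backends": ["gpu-1", "gpu-2"],
  "UseStickySessions": true,
  "StickySessionExpirationMs": 1800000
}
```

Clients then send a consistent value, e.g. `curl -H "x-conversation-id: abc123" ...`, and are pinned to the same backend until the session expires.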
Frontend Configuration
Frontends are virtual Ollama endpoints that clients connect to. They are stored in the database and managed via the API.
Frontend Object Structure
{
"Identifier": "production-frontend",
"Name": "Production AI Inference",
"Hostname": "ai.company.com",
"TimeoutMs": 90000,
"LoadBalancing": "RoundRobin",
"BlockHttp10": true,
"MaxRequestBodySize": 1073741824,
"Backends": ["gpu-1", "gpu-2", "gpu-3"],
"RequiredModels": ["llama3:8b", "mistral:7b"],
"LogRequestFull": false,
"LogRequestBody": false,
"LogResponseBody": false,
"UseStickySessions": false,
"StickySessionExpirationMs": 1800000,
"Active": true
}
Frontend Properties
Property | Type | Default | Description |
---|---|---|---|
Identifier | string | Required | Unique identifier for this frontend |
Name | string | Required | Human-readable name |
Hostname | string | "*" | Hostname pattern (* for catch-all) |
TimeoutMs | integer | 60000 | Request timeout in milliseconds |
LoadBalancing | enum | "RoundRobin" | Load balancing algorithm |
BlockHttp10 | boolean | true | Reject HTTP/1.0 requests |
MaxRequestBodySize | integer | 536870912 | Max request size in bytes (512MB) |
Backends | array | [] | List of backend identifiers |
RequiredModels | array | [] | Models that must be available |
UseStickySessions | boolean | false | Enable session stickiness |
StickySessionExpirationMs | integer | 1800000 | Session timeout (30 minutes, min: 10s, max: 24h) |
LogRequestFull | boolean | false | Log complete requests |
LogRequestBody | boolean | false | Log request bodies |
LogResponseBody | boolean | false | Log response bodies |
Active | boolean | true | Whether frontend is active |
Load Balancing Options
- `"RoundRobin"`: Cycle through backends sequentially
- `"Random"`: Randomly select from healthy backends
Hostname Patterns
- `"*"`: Match all hostnames (catch-all)
- `"api.company.com"`: Exact hostname match
- Multiple frontends can exist with different hostname patterns
Backend Configuration
Backends represent physical Ollama instances in your infrastructure.
Backend Object Structure
{
"Identifier": "gpu-server-1",
"Name": "Primary GPU Server",
"Hostname": "192.168.1.100",
"Port": 11434,
"Ssl": false,
"UnhealthyThreshold": 3,
"HealthyThreshold": 2,
"HealthCheckMethod": "GET",
"HealthCheckUrl": "/api/version",
"MaxParallelRequests": 8,
"RateLimitRequestsThreshold": 20,
"ApiFormat": "Ollama",
"LogRequestFull": false,
"LogRequestBody": false,
"LogResponseBody": false,
"Active": true
}
Backend Properties
Property | Type | Default | Description |
---|---|---|---|
Identifier | string | Required | Unique identifier for this backend |
Name | string | Required | Human-readable name |
Hostname | string | Required | Backend server hostname/IP |
Port | integer | 11434 | Backend server port |
Ssl | boolean | false | Use HTTPS for backend communication |
UnhealthyThreshold | integer | 2 | Failed checks before marking unhealthy |
HealthyThreshold | integer | 2 | Successful checks before marking healthy |
HealthCheckMethod | string | "GET" | HTTP method for health checks |
HealthCheckUrl | string | "/" | URL path for health checks |
MaxParallelRequests | integer | 4 | Maximum concurrent requests |
RateLimitRequestsThreshold | integer | 10 | Rate limiting threshold |
ApiFormat | string | "Ollama" | Backend API format, either "Ollama" or "OpenAI" |
LogRequestFull | boolean | false | Log complete requests |
LogRequestBody | boolean | false | Log request bodies |
LogResponseBody | boolean | false | Log response bodies |
Active | boolean | true | Whether backend is active |
Health Check Configuration
Health checks validate backend availability:
- Method: HTTP method (GET, HEAD, POST)
- URL: Path to check (e.g., `/`, `/api/version`, `/health`)
- Thresholds: Number of consecutive successes/failures to change state
IMPORTANT: vLLM expects health checks on `GET /health`
Common health check endpoints:
- `/`: Basic connectivity check
- `/api/version`: Ollama version endpoint
- `/api/tags`: Model listing endpoint
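To debug a backend that OllamaFlow has marked unhealthy, you can run the same probe by hand. A sketch; the hostname and port are illustrative:

```shell
# Probe a backend the way a GET health check would (expects a 2xx response).
BACKEND="http://gpu1.company.internal:11434"
if curl -sf -o /dev/null "$BACKEND/api/version"; then
  echo "healthy"
else
  echo "unhealthy"
fi
```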
Rate Limiting
Backends can enforce rate limits:
- Requests exceeding `RateLimitRequestsThreshold` receive HTTP 429
- Rate limiting is per backend, not global
- Helps protect individual Ollama instances from overload
Environment Variables
Override configuration with environment variables:
Variable | Description | Example |
---|---|---|
OLLAMAFLOW_CONFIG | Configuration file path | /etc/ollamaflow/config.json |
OLLAMAFLOW_PORT | Override webserver port | 8080 |
OLLAMAFLOW_HOSTNAME | Override webserver hostname | 0.0.0.0 |
OLLAMAFLOW_DATABASE | Override database file path | /data/ollamaflow.db |
OLLAMAFLOW_ADMIN_TOKEN | Override admin token | secure-production-token |
ASPNETCORE_ENVIRONMENT | .NET environment | Production |
Docker Environment Example
docker run -d \
-e OLLAMAFLOW_PORT=8080 \
-e OLLAMAFLOW_ADMIN_TOKEN=my-secure-token \
-e ASPNETCORE_ENVIRONMENT=Production \
-p 8080:8080 \
jchristn/ollamaflow
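For repeatable deployments, the same environment variables can be captured in a Compose file. A sketch using the image shown above; the volume mapping is an assumption based on the documented Docker config location:

```yaml
services:
  ollamaflow:
    image: jchristn/ollamaflow
    ports:
      - "8080:8080"
    environment:
      OLLAMAFLOW_PORT: "8080"
      OLLAMAFLOW_ADMIN_TOKEN: "my-secure-token"
      ASPNETCORE_ENVIRONMENT: "Production"
    volumes:
      - ./ollamaflow.json:/app/ollamaflow.json
    restart: unless-stopped
```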
Configuration Examples
Basic Single Backend
Minimal configuration for testing:
{
"Webserver": {
"Port": 43411
},
"AdminBearerTokens": ["test-token"]
}
Frontend/Backend via API:
# Create backend
curl -X PUT -H "Authorization: Bearer test-token" \
-H "Content-Type: application/json" \
-d '{"Identifier": "local", "Hostname": "localhost", "Port": 11434}' \
http://localhost:43411/v1.0/backends
# Create frontend
curl -X PUT -H "Authorization: Bearer test-token" \
-H "Content-Type: application/json" \
-d '{"Identifier": "main", "Backends": ["local"]}' \
http://localhost:43411/v1.0/frontends
Production Multi-Backend
Production configuration with multiple GPU servers:
{
"Webserver": {
"Hostname": "*",
"Port": 43411,
"Ssl": {
"Enable": true,
"CertificateFile": "/etc/ssl/ollamaflow.crt",
"CertificatePassword": "cert-password"
}
},
"Logging": {
"LogDirectory": "/var/log/ollamaflow/",
"ConsoleLogging": false,
"MinimumSeverity": 1
},
"DatabaseFilename": "/var/lib/ollamaflow/ollamaflow.db",
"AdminBearerTokens": [
"secure-production-token-1",
"secure-production-token-2"
]
}
Backends configuration:
# GPU servers
for i in {1..4}; do
curl -X PUT -H "Authorization: Bearer secure-production-token-1" \
-H "Content-Type: application/json" \
-d "{
\"Identifier\": \"gpu-$i\",
\"Name\": \"GPU Server $i\",
\"Hostname\": \"gpu$i.company.internal\",
\"Port\": 11434,
\"MaxParallelRequests\": 8,
\"HealthCheckUrl\": \"/api/version\"
}" \
http://localhost:43411/v1.0/backends
done
# Production frontend
curl -X PUT -H "Authorization: Bearer secure-production-token-1" \
-H "Content-Type: application/json" \
-d '{
"Identifier": "production",
"Name": "Production AI Inference",
"Hostname": "ai.company.com",
"LoadBalancing": "RoundRobin",
"Backends": ["gpu-1", "gpu-2", "gpu-3", "gpu-4"],
"RequiredModels": ["llama3:8b", "mistral:7b", "codellama:13b"],
"TimeoutMs": 120000
}' \
http://localhost:43411/v1.0/frontends
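The hand-escaped JSON in the backend loop above is easy to get wrong. As an alternative sketch, each payload can be built with jq (assumed installed):

```shell
# Build each backend payload with jq instead of hand-escaped JSON.
for i in 1 2 3 4; do
  payload=$(jq -n --arg i "$i" '{
    Identifier: ("gpu-" + $i),
    Name: ("GPU Server " + $i),
    Hostname: ("gpu" + $i + ".company.internal"),
    Port: 11434,
    MaxParallelRequests: 8,
    HealthCheckUrl: "/api/version"
  }')
  curl -X PUT -H "Authorization: Bearer secure-production-token-1" \
    -H "Content-Type: application/json" \
    -d "$payload" \
    http://localhost:43411/v1.0/backends
done
```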
Development Environment
Development setup with debugging enabled:
{
"Webserver": {
"Port": 43411,
"Debug": {
"Routing": true,
"Requests": true,
"Responses": false
}
},
"Logging": {
"ConsoleLogging": true,
"EnableColors": true,
"MinimumSeverity": 0
},
"AdminBearerTokens": ["dev-token"]
}
Configuration Validation
OllamaFlow validates configuration on startup:
Common Validation Errors
- Invalid Port Range: Ports must be 1-65535
- Missing Required Fields: Identifier, Hostname required for backends
- Duplicate Identifiers: Frontend/Backend IDs must be unique
- Invalid Load Balancing: Must be "RoundRobin" or "Random"
- Invalid Hostnames: Must be valid hostname or "*"
Configuration Test
Validate configuration without starting the server:
# Test configuration file
dotnet OllamaFlow.Server.dll --validate-config
# Test with specific config
OLLAMAFLOW_CONFIG=/path/to/config.json dotnet OllamaFlow.Server.dll --validate-config
Migration and Backup
Database Backup
Backup the SQLite database regularly:
# Simple copy (stop OllamaFlow first)
cp ollamaflow.db ollamaflow.db.backup
# Online backup (while running)
sqlite3 ollamaflow.db ".backup /path/to/backup.db"
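Before relying on a backup, verify it is a readable SQLite database (sqlite3 CLI assumed):

```shell
# A healthy backup prints "ok".
sqlite3 /path/to/backup.db "PRAGMA integrity_check;"
```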
Configuration Migration
When upgrading OllamaFlow:
- Backup Configuration: Save current `ollamaflow.json`
- Backup Database: Save current `ollamaflow.db`
- Review Changes: Check for new configuration options
- Test Upgrade: Test in a non-production environment first
Export/Import Configuration
Export current configuration for replication:
# Export all frontends
curl -H "Authorization: Bearer token" \
http://localhost:43411/v1.0/frontends > frontends.json
# Export all backends
curl -H "Authorization: Bearer token" \
http://localhost:43411/v1.0/backends > backends.json
Import configuration to new instance:
# Import backends first
# (jq -c emits one compact object per line, which "read" can consume safely)
jq -c '.[]' backends.json | while read -r backend; do
curl -X PUT -H "Authorization: Bearer token" \
-H "Content-Type: application/json" \
-d "$backend" \
http://new-host:43411/v1.0/backends
done
# Then import frontends
jq -c '.[]' frontends.json | while read -r frontend; do
curl -X PUT -H "Authorization: Bearer token" \
-H "Content-Type: application/json" \
-d "$frontend" \
http://new-host:43411/v1.0/frontends
done
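To sanity-check an import, re-export from the new instance and compare identifier sets; `new-backends.json` below is a hypothetical re-export (jq assumed):

```shell
# Compare identifier sets between the original export and a re-export.
jq -S '[.[].Identifier] | sort' backends.json > old-ids.json
jq -S '[.[].Identifier] | sort' new-backends.json > new-ids.json
diff old-ids.json new-ids.json && echo "identifiers match"
```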