This comprehensive guide covers all configuration options available in OllamaFlow, including the main settings file (`ollamaflow.json`) and the structure of key objects like frontends and backends.

## Main Configuration File

OllamaFlow uses a JSON configuration file (`ollamaflow.json`) that defines server settings, logging, and authentication. On first startup, a default configuration is created if none exists.

### Default Configuration Location

* Docker: `/app/ollamaflow.json`
* Bare metal: same directory as the executable
### Complete Configuration Example

```json
{
  "Logging": {
    "Servers": [
      {
        "Hostname": "127.0.0.1",
        "Port": 514,
        "RandomizePorts": false,
        "MinimumPort": 65000,
        "MaximumPort": 65535
      }
    ],
    "LogDirectory": "./logs/",
    "LogFilename": "ollamaflow.log",
    "ConsoleLogging": true,
    "EnableColors": true,
    "MinimumSeverity": 1
  },
  "Webserver": {
    "Hostname": "*",
    "Port": 43411,
    "IO": {
      "StreamBufferSize": 65536,
      "MaxRequests": 1024,
      "ReadTimeoutMs": 10000,
      "MaxIncomingHeadersSize": 65536,
      "EnableKeepAlive": false
    },
    "Ssl": {
      "Enable": false,
      "MutuallyAuthenticate": false,
      "AcceptInvalidCertificates": true,
      "CertificateFile": "",
      "CertificatePassword": ""
    },
    "Headers": {
      "IncludeContentLength": true,
      "DefaultHeaders": {
        "Access-Control-Allow-Origin": "*",
        "Access-Control-Allow-Methods": "OPTIONS, HEAD, GET, PUT, POST, DELETE, PATCH",
        "Access-Control-Allow-Headers": "*",
        "Access-Control-Expose-Headers": "",
        "Accept": "*/*",
        "Accept-Language": "en-US, en",
        "Accept-Charset": "ISO-8859-1, utf-8",
        "Cache-Control": "no-cache",
        "Connection": "close",
        "Host": "localhost:43411"
      }
    },
    "AccessControl": {
      "DenyList": {},
      "PermitList": {},
      "Mode": "DefaultPermit"
    },
    "Debug": {
      "AccessControl": false,
      "Routing": false,
      "Requests": false,
      "Responses": false
    }
  },
  "DatabaseFilename": "ollamaflow.db",
  "AdminBearerTokens": [
    "your-secure-admin-token"
  ]
}
```

## Configuration Sections
### Logging Settings

Controls how OllamaFlow logs information and errors.

| Setting | Type | Default | Description |
| --- | --- | --- | --- |
| `LogDirectory` | string | `"./logs/"` | Directory for log files |
| `LogFilename` | string | `"ollamaflow.log"` | Base filename for logs |
| `ConsoleLogging` | boolean | `true` | Enable console output |
| `EnableColors` | boolean | `true` | Enable colored console output |
| `MinimumSeverity` | integer | `1` | Minimum log level (0=Debug, 1=Info, 2=Warn, 3=Error) |
#### Syslog Servers
Optional remote logging configuration:
```json
{
  "Servers": [
    {
      "Hostname": "syslog.company.com",
      "Port": 514,
      "RandomizePorts": false,
      "MinimumPort": 65000,
      "MaximumPort": 65535
    }
  ]
}
```

### Webserver Settings
Configures the HTTP server that handles all requests.
#### Basic Settings

| Setting | Type | Default | Description |
| --- | --- | --- | --- |
| `Hostname` | string | `"*"` | Bind hostname (`*` for all interfaces) |
| `Port` | integer | `43411` | TCP port to listen on |
#### IO Settings

Controls request handling and performance:

| Setting | Type | Default | Description |
| --- | --- | --- | --- |
| `StreamBufferSize` | integer | `65536` | Buffer size for streaming responses |
| `MaxRequests` | integer | `1024` | Maximum concurrent requests |
| `ReadTimeoutMs` | integer | `10000` | Request read timeout in milliseconds |
| `MaxIncomingHeadersSize` | integer | `65536` | Maximum size of request headers |
| `EnableKeepAlive` | boolean | `false` | Enable HTTP keep-alive connections |
#### SSL Settings

HTTPS configuration for secure connections:

| Setting | Type | Default | Description |
| --- | --- | --- | --- |
| `Enable` | boolean | `false` | Enable HTTPS |
| `MutuallyAuthenticate` | boolean | `false` | Require client certificates |
| `AcceptInvalidCertificates` | boolean | `true` | Accept self-signed certificates |
| `CertificateFile` | string | `""` | Path to SSL certificate file |
| `CertificatePassword` | string | `""` | Certificate password if required |
#### Headers Settings
Default HTTP headers and CORS configuration:
```json
{
  "IncludeContentLength": true,
  "DefaultHeaders": {
    "Access-Control-Allow-Origin": "*",
    "Access-Control-Allow-Methods": "OPTIONS, HEAD, GET, PUT, POST, DELETE, PATCH",
    "Access-Control-Allow-Headers": "*"
  }
}
```

#### Access Control
IP-based access control (optional):
```json
{
  "DenyList": {
    "192.168.1.100": "Blocked IP",
    "10.0.0.0/8": "Blocked network"
  },
  "PermitList": {
    "192.168.1.0/24": "Allowed network"
  },
  "Mode": "DefaultPermit"
}
```

Modes:

* `DefaultPermit`: Allow all except denied IPs
* `DefaultDeny`: Deny all except permitted IPs
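For example, a locked-down deployment could invert the default and permit only known networks. This is a sketch; the addresses and reason strings are illustrative:

```json
{
  "DenyList": {},
  "PermitList": {
    "10.20.0.0/16": "Internal services network",
    "192.168.50.10": "Monitoring host"
  },
  "Mode": "DefaultDeny"
}
```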
### Database Settings

| Setting | Type | Default | Description |
| --- | --- | --- | --- |
| `DatabaseFilename` | string | `"ollamaflow.db"` | SQLite database file path |
### Authentication Settings

| Setting | Type | Default | Description |
| --- | --- | --- | --- |
| `AdminBearerTokens` | array | `["ollamaflowadmin"]` | Valid bearer tokens for admin APIs |
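Replace the default token before exposing the admin API. Any secure random generator works; as one option (assuming `openssl` is available), a 64-character hex token can be produced with:

```shell
# Generate a random 64-character hex string to use in AdminBearerTokens
openssl rand -hex 32
```

Put the generated value into the `AdminBearerTokens` array and restart OllamaFlow.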
## Frontend Configuration

Frontends are virtual Ollama endpoints that clients connect to. They are stored in the database and managed via the API.

### Frontend Object Structure
```json
{
  "Identifier": "production-frontend",
  "Name": "Production AI Inference",
  "Hostname": "ai.company.com",
  "TimeoutMs": 90000,
  "LoadBalancing": "RoundRobin",
  "BlockHttp10": true,
  "MaxRequestBodySize": 1073741824,
  "Backends": ["gpu-1", "gpu-2", "gpu-3"],
  "RequiredModels": ["llama3:8b", "mistral:7b"],
  "AllowEmbeddings": true,
  "AllowCompletions": true,
  "PinnedEmbeddingsProperties": {
    "model": "nomic-embed-text",
    "options": {
      "temperature": 0.1
    }
  },
  "PinnedCompletionsProperties": {
    "options": {
      "temperature": 0.7,
      "num_ctx": 2048
    }
  },
  "LogRequestFull": false,
  "LogRequestBody": false,
  "LogResponseBody": false,
  "UseStickySessions": false,
  "StickySessionExpirationMs": 1800000,
  "Active": true
}
```
### Frontend Properties
| Property | Type | Default | Description |
| ----------------------------- | ------- | -------------- | ------------------------------------------------ |
| `Identifier` | string | Required | Unique identifier for this frontend |
| `Name` | string | Required | Human-readable name |
| `Hostname` | string | `"*"` | Hostname pattern (* for catch-all) |
| `TimeoutMs` | integer | `60000` | Request timeout in milliseconds |
| `LoadBalancing` | enum | `"RoundRobin"` | Load balancing algorithm |
| `BlockHttp10` | boolean | `true` | Reject HTTP/1.0 requests |
| `MaxRequestBodySize` | integer | `536870912` | Max request size in bytes (512MB) |
| `Backends` | array | `[]` | List of backend identifiers |
| `RequiredModels` | array | `[]` | Models that must be available |
| `AllowEmbeddings` | boolean | `true` | Allow embeddings API requests |
| `AllowCompletions` | boolean | `true` | Allow completions API requests |
| `PinnedEmbeddingsProperties` | object | `{}` | Key-value pairs merged into embeddings requests |
| `PinnedCompletionsProperties` | object | `{}` | Key-value pairs merged into completions requests |
| `UseStickySessions` | boolean | `false` | Enable session stickiness |
| `StickySessionExpirationMs` | integer | `1800000` | Session timeout (30 minutes, min: 10s, max: 24h) |
| `LogRequestFull` | boolean | `false` | Log complete requests |
| `LogRequestBody` | boolean | `false` | Log request bodies |
| `LogResponseBody` | boolean | `false` | Log response bodies |
| `Active` | boolean | `true` | Whether frontend is active |
### Load Balancing Options
* `"RoundRobin"`: Cycle through backends sequentially
* `"Random"`: Randomly select from healthy backends
### Hostname Patterns
* `"*"`: Match all hostnames (catch-all)
* `"api.company.com"`: Exact hostname match
* Multiple frontends can exist with different hostname patterns
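As an illustration, a catch-all frontend can coexist with a hostname-specific one; each object below is a separate `PUT /v1.0/frontends` request body (identifiers and backend names are hypothetical):

```json
{
  "Identifier": "default",
  "Name": "Catch-All Frontend",
  "Hostname": "*",
  "Backends": ["gpu-1"]
}
```

```json
{
  "Identifier": "api",
  "Name": "API Frontend",
  "Hostname": "api.company.com",
  "Backends": ["gpu-2"]
}
```

Requests for `api.company.com` match the second frontend; everything else falls through to the catch-all.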
### Security Controls
Frontend security controls enable fine-grained access control and request parameter enforcement:
#### Request Type Controls
* **`AllowEmbeddings`**: Controls whether embeddings API endpoints are accessible through this frontend
* Ollama API: `/api/embed`
* OpenAI API: `/v1/embeddings`
* **`AllowCompletions`**: Controls whether completion API endpoints are accessible through this frontend
* Ollama API: `/api/generate`, `/api/chat`
* OpenAI API: `/v1/completions`, `/v1/chat/completions`
For a request to succeed, both the frontend and at least one assigned backend must allow the request type.
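For example, a frontend that exposes only embeddings traffic could disable completions outright (identifier and backend list are illustrative):

```json
{
  "Identifier": "embeddings-frontend",
  "Name": "Embeddings Only",
  "AllowEmbeddings": true,
  "AllowCompletions": false,
  "Backends": ["embed-1"]
}
```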
#### Pinned Properties
Pinned properties allow administrators to enforce specific parameters in requests, providing security compliance and standardization:
* **`PinnedEmbeddingsProperties`**: Key-value pairs automatically merged into all embeddings requests
* **`PinnedCompletionsProperties`**: Key-value pairs automatically merged into all completion requests
Common use cases:
* Enforce maximum context size: `{"options": {"num_ctx": 2048}}`
* Standardize temperature settings: `{"options": {"temperature": 0.7}}`
* Override model selection: `{"model": "approved-model:latest"}`
* Set organizational defaults: `{"options": {"top_p": 0.9, "top_k": 40}}`
Properties are merged with client requests, with pinned properties taking precedence over client-specified values.
## Backend Configuration
Backends represent physical Ollama instances in your infrastructure.
### Backend Object Structure
```json
{
  "Identifier": "gpu-server-1",
  "Name": "Primary GPU Server",
  "Hostname": "192.168.1.100",
  "Port": 11434,
  "Ssl": false,
  "UnhealthyThreshold": 3,
  "HealthyThreshold": 2,
  "HealthCheckMethod": "GET",
  "HealthCheckUrl": "/",
  "MaxParallelRequests": 8,
  "RateLimitRequestsThreshold": 20,
  "AllowEmbeddings": true,
  "AllowCompletions": true,
  "PinnedEmbeddingsProperties": {
    "options": {
      "num_ctx": 512
    }
  },
  "PinnedCompletionsProperties": {
    "options": {
      "num_ctx": 4096,
      "temperature": 0.8
    }
  },
  "BearerToken": "sk-your-api-key",
  "Querystring": "api_version=2024-01",
  "Headers": {
    "X-Custom-Header": "custom-value"
  },
  "LogRequestFull": false,
  "LogRequestBody": false,
  "LogResponseBody": false,
  "Active": true
}
```

### Backend Properties
| Property | Type | Default | Description |
| --- | --- | --- | --- |
| `Identifier` | string | Required | Unique identifier for this backend |
| `Name` | string | Required | Human-readable name |
| `Hostname` | string | Required | Backend server hostname/IP |
| `Port` | integer | `11434` | Backend server port |
| `Ssl` | boolean | `false` | Use HTTPS for backend communication |
| `UnhealthyThreshold` | integer | `2` | Failed checks before marking unhealthy |
| `HealthyThreshold` | integer | `2` | Successful checks before marking healthy |
| `HealthCheckMethod` | string | `"GET"` | HTTP method for health checks (`GET` or `HEAD`) |
| `HealthCheckUrl` | string | `"/"` | URL path for health checks |
| `MaxParallelRequests` | integer | `4` | Maximum concurrent requests |
| `RateLimitRequestsThreshold` | integer | `10` | Request threshold above which HTTP 429 is returned |
| `AllowEmbeddings` | boolean | `true` | Allow embeddings API requests |
| `AllowCompletions` | boolean | `true` | Allow completions API requests |
| `PinnedEmbeddingsProperties` | object | `{}` | Key-value pairs merged into embeddings requests |
| `PinnedCompletionsProperties` | object | `{}` | Key-value pairs merged into completions requests |
| `BearerToken` | string | `null` | Bearer token for the Authorization header |
| `Querystring` | string | `null` | Querystring appended to backend URLs |
| `Headers` | object | `{}` | Custom headers added to backend requests |
| `LogRequestFull` | boolean | `false` | Log complete requests |
| `LogRequestBody` | boolean | `false` | Log request bodies |
| `LogResponseBody` | boolean | `false` | Log response bodies |
| `Active` | boolean | `true` | Whether backend is active |
### Health Check Configuration

Health checks validate backend availability:

* **Method**: HTTP method (`GET`, `HEAD`)
* **URL**: Path to check (e.g., `/`, `/api/version`, `/health`)
* **Thresholds**: Number of consecutive successes/failures required to change state

Common health check endpoints:

* `HEAD /`: Basic connectivity check for Ollama
* `GET /health`: Basic connectivity check for vLLM
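For instance, a backend entry for a vLLM instance (identifier, hostname, and thresholds are illustrative) might point its health check at `/health`:

```json
{
  "Identifier": "vllm-1",
  "Name": "vLLM Server 1",
  "Hostname": "vllm1.company.internal",
  "Port": 8000,
  "HealthCheckMethod": "GET",
  "HealthCheckUrl": "/health",
  "UnhealthyThreshold": 3,
  "HealthyThreshold": 2
}
```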
### Rate Limiting

Backends can enforce rate limits:

* Requests exceeding `RateLimitRequestsThreshold` receive HTTP 429
* Rate limiting is per backend, not global
* Helps protect individual Ollama instances from overload
### Security Controls

Backend security controls provide additional layers of request filtering and parameter enforcement:

#### Request Type Controls

* **`AllowEmbeddings`**: Controls whether this backend can process embeddings requests
* **`AllowCompletions`**: Controls whether this backend can process completion requests

Requests are only routed to backends that allow the specific request type. This enables:

* Dedicated embeddings servers that only handle embeddings requests:
  * Ollama API: `/api/embed`
  * OpenAI API: `/v1/embeddings`
* Completion-only servers that only handle completion requests:
  * Ollama API: `/api/generate`, `/api/chat`
  * OpenAI API: `/v1/completions`, `/v1/chat/completions`
* Multi-tenant isolation by request type
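For instance, a dedicated embeddings backend (identifier and hostname are illustrative) could disable completions at the server level:

```json
{
  "Identifier": "embed-only-1",
  "Name": "Dedicated Embeddings Server",
  "Hostname": "embed1.company.internal",
  "Port": 11434,
  "AllowEmbeddings": true,
  "AllowCompletions": false
}
```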
#### Pinned Properties

Backend pinned properties provide server-level parameter enforcement:

* **`PinnedEmbeddingsProperties`**: Applied to all embeddings requests routed to this backend
* **`PinnedCompletionsProperties`**: Applied to all completion requests routed to this backend

Backend pinned properties are merged after frontend pinned properties, allowing for:

* Server-specific resource limits: `{"options": {"num_ctx": 1024}}`
* Hardware-optimized settings: `{"options": {"num_gpu": 2}}`
* Backend-specific model overrides: `{"model": "server-optimized-model"}`

The merge order is: Client Request → Frontend Pinned Properties → Backend Pinned Properties, with later values taking precedence.
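The precedence can be sketched with jq's recursive object merge operator (`*`), where the right-hand operand wins, mirroring client → frontend → backend order. The property values here are illustrative, not OllamaFlow defaults:

```shell
# Client sets a model and options; frontend pins temperature; backend pins num_ctx.
client='{"model": "llama3:8b", "options": {"temperature": 0.2, "num_ctx": 8192}}'
frontend='{"options": {"temperature": 0.7}}'
backend='{"options": {"num_ctx": 4096}}'

# jq's * merges objects recursively with the right-hand side taking precedence
echo "$client $frontend $backend" | jq -cs '.[0] * .[1] * .[2]'
# -> {"model":"llama3:8b","options":{"temperature":0.7,"num_ctx":4096}}
```

The client's `temperature` and `num_ctx` are both overridden, while its `model` survives because neither pinned set specifies one.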
### Request Customization

Backends support additional request customization options for communicating with upstream services that require authentication or special parameters:

* **`BearerToken`**: If set, OllamaFlow automatically adds an `Authorization: Bearer {token}` header to all requests sent to this backend. Useful for:
  * OpenAI-compatible APIs requiring API keys
  * Azure OpenAI Service authentication
  * Custom inference endpoints with bearer authentication
* **`Querystring`**: If set, the specified querystring is appended to all URLs when communicating with this backend. Do not include the leading `?` character, and separate multiple key-value pairs with ampersands (e.g., `foo=bar&key=val`). Useful for:
  * API versioning: `api-version=2024-01`
  * Deployment targeting: `deployment=gpt-4`
  * Custom routing parameters
* **`Headers`**: A dictionary of custom headers added to all requests sent to this backend. Useful for:
  * Organization identification: `X-Organization-Id: org-123`
  * Custom routing headers: `X-Custom-Region: us-east`
  * Compliance headers: `X-Audit-Id: audit-456`
Example configuration for Azure OpenAI:
```json
{
  "Identifier": "azure-openai",
  "Name": "Azure OpenAI Service",
  "Hostname": "my-resource.openai.azure.com",
  "Port": 443,
  "Ssl": true,
  "BearerToken": "your-azure-api-key",
  "Querystring": "api-version=2024-02-15-preview",
  "Headers": {
    "X-MS-Region": "eastus"
  }
}
```

## Configuration Examples
### Basic Single Backend
Minimal configuration for testing:
```json
{
  "Webserver": {
    "Port": 43411
  },
  "AdminBearerTokens": ["test-token"]
}
```

Frontend/Backend via API:
```shell
# Create backend
curl -X PUT -H "Authorization: Bearer test-token" \
  -H "Content-Type: application/json" \
  -d '{"Identifier": "local", "Hostname": "localhost", "Port": 11434}' \
  http://localhost:43411/v1.0/backends

# Create frontend
curl -X PUT -H "Authorization: Bearer test-token" \
  -H "Content-Type: application/json" \
  -d '{"Identifier": "main", "Backends": ["local"]}' \
  http://localhost:43411/v1.0/frontends
```

### Production Multi-Backend
Production configuration with multiple GPU servers:
```json
{
  "Webserver": {
    "Hostname": "*",
    "Port": 43411,
    "Ssl": {
      "Enable": true,
      "CertificateFile": "/etc/ssl/ollamaflow.crt",
      "CertificatePassword": "cert-password"
    }
  },
  "Logging": {
    "LogDirectory": "/var/log/ollamaflow/",
    "ConsoleLogging": false,
    "MinimumSeverity": 1
  },
  "DatabaseFilename": "/var/lib/ollamaflow/ollamaflow.db",
  "AdminBearerTokens": [
    "secure-production-token-1",
    "secure-production-token-2"
  ]
}
```

Backends configuration:
```shell
# GPU servers
for i in {1..4}; do
  curl -X PUT -H "Authorization: Bearer secure-production-token-1" \
    -H "Content-Type: application/json" \
    -d "{
      \"Identifier\": \"gpu-$i\",
      \"Name\": \"GPU Server $i\",
      \"Hostname\": \"gpu$i.company.internal\",
      \"Port\": 11434,
      \"MaxParallelRequests\": 8,
      \"HealthCheckUrl\": \"/api/version\"
    }" \
    http://localhost:43411/v1.0/backends
done

# Production frontend
curl -X PUT -H "Authorization: Bearer secure-production-token-1" \
  -H "Content-Type: application/json" \
  -d '{
    "Identifier": "production",
    "Name": "Production AI Inference",
    "Hostname": "ai.company.com",
    "LoadBalancing": "RoundRobin",
    "Backends": ["gpu-1", "gpu-2", "gpu-3", "gpu-4"],
    "RequiredModels": ["llama3:8b", "mistral:7b", "codellama:13b"],
    "TimeoutMs": 120000
  }' \
  http://localhost:43411/v1.0/frontends
```

### Development Environment
Development setup with debugging enabled:
```json
{
  "Webserver": {
    "Port": 43411,
    "Debug": {
      "Routing": true,
      "Requests": true,
      "Responses": false
    }
  },
  "Logging": {
    "ConsoleLogging": true,
    "EnableColors": true,
    "MinimumSeverity": 0
  },
  "AdminBearerTokens": ["dev-token"]
}
```

## Configuration Validation
OllamaFlow validates configuration on startup.

### Common Validation Errors

* **Invalid Port Range**: Ports must be 1-65535
* **Missing Required Fields**: `Identifier` and `Hostname` are required for backends
* **Duplicate Identifiers**: Frontend/backend IDs must be unique
* **Invalid Load Balancing**: Must be `"RoundRobin"` or `"Random"`
* **Invalid Hostnames**: Must be a valid hostname or `"*"`
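Independent of the built-in validation, a quick local syntax check with jq (already used elsewhere in this guide) catches malformed JSON before startup. The sample config written here is only for demonstration:

```shell
# Write a minimal config, then verify it parses; jq exits nonzero
# with a parse error if the JSON is malformed.
cat > ollamaflow.json <<'EOF'
{
  "Webserver": { "Port": 43411 },
  "AdminBearerTokens": ["test-token"]
}
EOF
jq empty ollamaflow.json && echo "syntax OK"
```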
### Configuration Test

Validate configuration without starting the server:

```shell
# Test configuration file
dotnet OllamaFlow.Server.dll --validate-config

# Test with a specific config file
OLLAMAFLOW_CONFIG=/path/to/config.json dotnet OllamaFlow.Server.dll --validate-config
```

## Migration and Backup
### Database Backup

Backup the SQLite database regularly:

```shell
# Simple copy (stop OllamaFlow first)
cp ollamaflow.db ollamaflow.db.backup

# Online backup (while running)
sqlite3 ollamaflow.db ".backup /path/to/backup.db"
```

### Configuration Migration
When upgrading OllamaFlow:

1. **Backup Configuration**: Save the current `ollamaflow.json`
2. **Backup Database**: Save the current `ollamaflow.db`
3. **Review Changes**: Check for new configuration options
4. **Test Upgrade**: Test in a non-production environment first
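The two backup steps can be sketched as a small script. The filenames below are the defaults from this guide; adjust the paths to your install:

```shell
# Copy the config file and database into a dated backup directory.
# Files that do not exist at the default locations are skipped.
backup_dir="ollamaflow-backup-$(date +%Y%m%d)"
mkdir -p "$backup_dir"
for f in ollamaflow.json ollamaflow.db; do
  if [ -f "$f" ]; then
    cp "$f" "$backup_dir/"
  fi
done
```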
### Export/Import Configuration
Export current configuration for replication:
```shell
# Export all frontends
curl -H "Authorization: Bearer token" \
  http://localhost:43411/v1.0/frontends > frontends.json

# Export all backends
curl -H "Authorization: Bearer token" \
  http://localhost:43411/v1.0/backends > backends.json
```

Import configuration to new instance:
```shell
# Import backends first (jq -c emits one compact object per line,
# so each `read -r` picks up a complete JSON object)
jq -c '.[]' backends.json | while read -r backend; do
  curl -X PUT -H "Authorization: Bearer token" \
    -H "Content-Type: application/json" \
    -d "$backend" \
    http://new-host:43411/v1.0/backends
done

# Then import frontends
jq -c '.[]' frontends.json | while read -r frontend; do
  curl -X PUT -H "Authorization: Bearer token" \
    -H "Content-Type: application/json" \
    -d "$frontend" \
    http://new-host:43411/v1.0/frontends
done
```

## Next Steps
* Review the API Reference for programmatic configuration
* Explore Deployment Options for your infrastructure
* Check Monitoring and Observability for production insights