# Configuration Reference

This comprehensive guide covers all configuration options available in OllamaFlow, including the main settings file (`ollamaflow.json`) and the structure of key objects like frontends and backends.

## Main Configuration File

OllamaFlow uses a JSON configuration file (`ollamaflow.json`) that defines server settings, logging, and authentication. On first startup, a default configuration is created if none exists.

### Default Configuration Location

* **Docker**: `/app/ollamaflow.json`
* **Bare Metal**: Same directory as executable
* **Custom**: Specify with the `OLLAMAFLOW_CONFIG` environment variable

### Complete Configuration Example

```json
{
  "Logging": {
    "Servers": [
      {
        "Hostname": "127.0.0.1",
        "Port": 514,
        "RandomizePorts": false,
        "MinimumPort": 65000,
        "MaximumPort": 65535
      }
    ],
    "LogDirectory": "./logs/",
    "LogFilename": "ollamaflow.log",
    "ConsoleLogging": true,
    "EnableColors": true,
    "MinimumSeverity": 1
  },
  "Webserver": {
    "Hostname": "*",
    "Port": 43411,
    "IO": {
      "StreamBufferSize": 65536,
      "MaxRequests": 1024,
      "ReadTimeoutMs": 10000,
      "MaxIncomingHeadersSize": 65536,
      "EnableKeepAlive": false
    },
    "Ssl": {
      "Enable": false,
      "MutuallyAuthenticate": false,
      "AcceptInvalidCertificates": true,
      "CertificateFile": "",
      "CertificatePassword": ""
    },
    "Headers": {
      "IncludeContentLength": true,
      "DefaultHeaders": {
        "Access-Control-Allow-Origin": "*",
        "Access-Control-Allow-Methods": "OPTIONS, HEAD, GET, PUT, POST, DELETE, PATCH",
        "Access-Control-Allow-Headers": "*",
        "Access-Control-Expose-Headers": "",
        "Accept": "*/*",
        "Accept-Language": "en-US, en",
        "Accept-Charset": "ISO-8859-1, utf-8",
        "Cache-Control": "no-cache",
        "Connection": "close",
        "Host": "localhost:43411"
      }
    },
    "AccessControl": {
      "DenyList": {},
      "PermitList": {},
      "Mode": "DefaultPermit"
    },
    "Debug": {
      "AccessControl": false,
      "Routing": false,
      "Requests": false,
      "Responses": false
    }
  },
  "DatabaseFilename": "ollamaflow.db",
  "AdminBearerTokens": [
    "your-secure-admin-token"
  ]
}
```

## Configuration Sections

### Logging Settings

Controls how OllamaFlow logs information and errors.
| Setting           | Type    | Default            | Description                                          |
| ----------------- | ------- | ------------------ | ---------------------------------------------------- |
| `LogDirectory`    | string  | `"./logs/"`        | Directory for log files                              |
| `LogFilename`     | string  | `"ollamaflow.log"` | Base filename for logs                               |
| `ConsoleLogging`  | boolean | `true`             | Enable console output                                |
| `EnableColors`    | boolean | `true`             | Enable colored console output                        |
| `MinimumSeverity` | integer | `1`                | Minimum log level (0=Debug, 1=Info, 2=Warn, 3=Error) |

#### Syslog Servers

Optional remote logging configuration:

```json
{
  "Servers": [
    {
      "Hostname": "syslog.company.com",
      "Port": 514,
      "RandomizePorts": false,
      "MinimumPort": 65000,
      "MaximumPort": 65535
    }
  ]
}
```

### Webserver Settings

Configures the HTTP server that handles all requests.

#### Basic Settings

| Setting    | Type    | Default | Description                            |
| ---------- | ------- | ------- | -------------------------------------- |
| `Hostname` | string  | `"*"`   | Bind hostname (`*` for all interfaces) |
| `Port`     | integer | `43411` | TCP port to listen on                  |

#### IO Settings

Controls request handling and performance:

| Setting                  | Type    | Default | Description                          |
| ------------------------ | ------- | ------- | ------------------------------------ |
| `StreamBufferSize`       | integer | `65536` | Buffer size for streaming responses  |
| `MaxRequests`            | integer | `1024`  | Maximum concurrent requests          |
| `ReadTimeoutMs`          | integer | `10000` | Request read timeout in milliseconds |
| `MaxIncomingHeadersSize` | integer | `65536` | Maximum size of request headers      |
| `EnableKeepAlive`        | boolean | `false` | Enable HTTP keep-alive connections   |

#### SSL Settings

HTTPS configuration for secure connections:

| Setting                     | Type    | Default | Description                      |
| --------------------------- | ------- | ------- | -------------------------------- |
| `Enable`                    | boolean | `false` | Enable HTTPS                     |
| `MutuallyAuthenticate`      | boolean | `false` | Require client certificates      |
| `AcceptInvalidCertificates` | boolean | `true`  | Accept self-signed certificates  |
| `CertificateFile`           | string  | `""`    | Path to SSL certificate file     |
| `CertificatePassword`       | string  | `""`    | Certificate password if required |

#### Headers Settings

Default HTTP headers and CORS configuration:

```json
{
  "IncludeContentLength": true,
  "DefaultHeaders": {
    "Access-Control-Allow-Origin": "*",
    "Access-Control-Allow-Methods": "OPTIONS, HEAD, GET, PUT, POST, DELETE, PATCH",
    "Access-Control-Allow-Headers": "*"
  }
}
```

#### Access Control

IP-based access control (optional):

```json
{
  "DenyList": {
    "192.168.1.100": "Blocked IP",
    "10.0.0.0/8": "Blocked network"
  },
  "PermitList": {
    "192.168.1.0/24": "Allowed network"
  },
  "Mode": "DefaultPermit"
}
```

Modes:

* `DefaultPermit`: Allow all except denied IPs
* `DefaultDeny`: Deny all except permitted IPs

### Database Settings

| Setting            | Type   | Default           | Description               |
| ------------------ | ------ | ----------------- | ------------------------- |
| `DatabaseFilename` | string | `"ollamaflow.db"` | SQLite database file path |

### Authentication Settings

| Setting             | Type  | Default               | Description                        |
| ------------------- | ----- | --------------------- | ---------------------------------- |
| `AdminBearerTokens` | array | `["ollamaflowadmin"]` | Valid bearer tokens for admin APIs |

## Frontend Configuration

Frontends are virtual Ollama endpoints that clients connect to. They are stored in the database and managed via the API.
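Since frontends (and backends) are managed through the authenticated admin API, you will need a token from `AdminBearerTokens`; the default `ollamaflowadmin` value should be replaced before exposing the API. A minimal sketch for minting a strong replacement, assuming `openssl` is available (any cryptographic random source works):

```bash
# Generate a 64-character hex token suitable for the AdminBearerTokens array
openssl rand -hex 32
```

Put the resulting value into `AdminBearerTokens` in `ollamaflow.json` and send it as `Authorization: Bearer <token>` on admin requests.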
### Frontend Object Structure

```json
{
  "Identifier": "production-frontend",
  "Name": "Production AI Inference",
  "Hostname": "ai.company.com",
  "TimeoutMs": 90000,
  "LoadBalancing": "RoundRobin",
  "BlockHttp10": true,
  "MaxRequestBodySize": 1073741824,
  "Backends": ["gpu-1", "gpu-2", "gpu-3"],
  "RequiredModels": ["llama3:8b", "mistral:7b"],
  "AllowEmbeddings": true,
  "AllowCompletions": true,
  "PinnedEmbeddingsProperties": {
    "model": "nomic-embed-text",
    "options": { "temperature": 0.1 }
  },
  "PinnedCompletionsProperties": {
    "options": { "temperature": 0.7, "num_ctx": 2048 }
  },
  "LogRequestFull": false,
  "LogRequestBody": false,
  "LogResponseBody": false,
  "UseStickySessions": false,
  "StickySessionExpirationMs": 1800000,
  "Active": true
}
```

### Frontend Properties

| Property                      | Type    | Default        | Description                                      |
| ----------------------------- | ------- | -------------- | ------------------------------------------------ |
| `Identifier`                  | string  | Required       | Unique identifier for this frontend              |
| `Name`                        | string  | Required       | Human-readable name                              |
| `Hostname`                    | string  | `"*"`          | Hostname pattern (`*` for catch-all)             |
| `TimeoutMs`                   | integer | `60000`        | Request timeout in milliseconds                  |
| `LoadBalancing`               | enum    | `"RoundRobin"` | Load balancing algorithm                         |
| `BlockHttp10`                 | boolean | `true`         | Reject HTTP/1.0 requests                         |
| `MaxRequestBodySize`          | integer | `536870912`    | Max request size in bytes (512 MB)               |
| `Backends`                    | array   | `[]`           | List of backend identifiers                      |
| `RequiredModels`              | array   | `[]`           | Models that must be available                    |
| `AllowEmbeddings`             | boolean | `true`         | Allow embeddings API requests                    |
| `AllowCompletions`            | boolean | `true`         | Allow completions API requests                   |
| `PinnedEmbeddingsProperties`  | object  | `{}`           | Key-value pairs merged into embeddings requests  |
| `PinnedCompletionsProperties` | object  | `{}`           | Key-value pairs merged into completions requests |
| `UseStickySessions`           | boolean | `false`        | Enable session stickiness                        |
| `StickySessionExpirationMs`   | integer | `1800000`      | Session timeout (30 minutes; min: 10s, max: 24h) |
| `LogRequestFull`              | boolean | `false`        | Log complete requests                            |
| `LogRequestBody`              | boolean | `false`        | Log request bodies                               |
| `LogResponseBody`             | boolean | `false`        | Log response bodies                              |
| `Active`                      | boolean | `true`         | Whether frontend is active                       |

### Load Balancing Options

* `"RoundRobin"`: Cycle through backends sequentially
* `"Random"`: Randomly select from healthy backends

### Hostname Patterns

* `"*"`: Match all hostnames (catch-all)
* `"api.company.com"`: Exact hostname match
* Multiple frontends can exist with different hostname patterns

### Security Controls

Frontend security controls enable fine-grained access control and request parameter enforcement:

#### Request Type Controls

* **`AllowEmbeddings`**: Controls whether embeddings API endpoints are accessible through this frontend
  * Ollama API: `/api/embed`
  * OpenAI API: `/v1/embeddings`
* **`AllowCompletions`**: Controls whether completion API endpoints are accessible through this frontend
  * Ollama API: `/api/generate`, `/api/chat`
  * OpenAI API: `/v1/completions`, `/v1/chat/completions`

For a request to succeed, both the frontend and at least one assigned backend must allow the request type.
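Because several frontends can share one listener and be selected by hostname pattern, it can be convenient to build creation payloads programmatically. A sketch using `jq` (the identifier, name, and hostname below are illustrative, not defaults):

```bash
# Build a frontend-creation payload; requests whose Host header matches
# "chat.company.com" would be routed to this frontend (illustrative values)
jq -n '{
  Identifier: "chat",
  Name: "Chat Frontend",
  Hostname: "chat.company.com",
  LoadBalancing: "Random",
  Backends: ["gpu-1", "gpu-2"]
}'
```

`PUT` the resulting JSON to `/v1.0/frontends` with an admin bearer token, as in the API examples later in this guide.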
#### Pinned Properties

Pinned properties allow administrators to enforce specific parameters in requests, providing security compliance and standardization:

* **`PinnedEmbeddingsProperties`**: Key-value pairs automatically merged into all embeddings requests
* **`PinnedCompletionsProperties`**: Key-value pairs automatically merged into all completion requests

Common use cases:

* Enforce maximum context size: `{"options": {"num_ctx": 2048}}`
* Standardize temperature settings: `{"options": {"temperature": 0.7}}`
* Override model selection: `{"model": "approved-model:latest"}`
* Set organizational defaults: `{"options": {"top_p": 0.9, "top_k": 40}}`

Properties are merged with client requests, with pinned properties taking precedence over client-specified values.

## Backend Configuration

Backends represent physical Ollama instances in your infrastructure.

### Backend Object Structure

```json
{
  "Identifier": "gpu-server-1",
  "Name": "Primary GPU Server",
  "Hostname": "192.168.1.100",
  "Port": 11434,
  "Ssl": false,
  "UnhealthyThreshold": 3,
  "HealthyThreshold": 2,
  "HealthCheckMethod": "GET",
  "HealthCheckUrl": "/",
  "MaxParallelRequests": 8,
  "RateLimitRequestsThreshold": 20,
  "AllowEmbeddings": true,
  "AllowCompletions": true,
  "PinnedEmbeddingsProperties": {
    "options": { "num_ctx": 512 }
  },
  "PinnedCompletionsProperties": {
    "options": { "num_ctx": 4096, "temperature": 0.8 }
  },
  "LogRequestFull": false,
  "LogRequestBody": false,
  "LogResponseBody": false,
  "Active": true
}
```

### Backend Properties

| Property                      | Type    | Default  | Description                                           |
| ----------------------------- | ------- | -------- | ----------------------------------------------------- |
| `Identifier`                  | string  | Required | Unique identifier for this backend                    |
| `Name`                        | string  | Required | Human-readable name                                   |
| `Hostname`                    | string  | Required | Backend server hostname/IP                            |
| `Port`                        | integer | `11434`  | Backend server port                                   |
| `Ssl`                         | boolean | `false`  | Use HTTPS for backend communication                   |
| `UnhealthyThreshold`          | integer | `2`      | Failed checks before marking unhealthy                |
| `HealthyThreshold`            | integer | `2`      | Successful checks before marking healthy              |
| `HealthCheckMethod`           | string  | `"GET"`  | HTTP method for health checks, either `GET` or `HEAD` |
| `HealthCheckUrl`              | string  | `"/"`    | URL path for health checks                            |
| `MaxParallelRequests`         | integer | `4`      | Maximum concurrent requests                           |
| `RateLimitRequestsThreshold`  | integer | `10`     | Rate limiting threshold                               |
| `AllowEmbeddings`             | boolean | `true`   | Allow embeddings API requests                         |
| `AllowCompletions`            | boolean | `true`   | Allow completions API requests                        |
| `PinnedEmbeddingsProperties`  | object  | `{}`     | Key-value pairs merged into embeddings requests       |
| `PinnedCompletionsProperties` | object  | `{}`     | Key-value pairs merged into completions requests      |
| `LogRequestFull`              | boolean | `false`  | Log complete requests                                 |
| `LogRequestBody`              | boolean | `false`  | Log request bodies                                    |
| `LogResponseBody`             | boolean | `false`  | Log response bodies                                   |
| `Active`                      | boolean | `true`   | Whether backend is active                             |

### Health Check Configuration

Health checks validate backend availability:

* **Method**: HTTP method (`GET` or `HEAD`)
* **URL**: Path to check (e.g., `/`, `/api/version`, `/health`)
* **Thresholds**: Number of consecutive successes/failures required to change state

Common health check endpoints:

* `HEAD /`: Basic connectivity check for Ollama
* `GET /health`: Basic connectivity check for vLLM

### Rate Limiting

Backends can enforce rate limits:

* Requests exceeding `RateLimitRequestsThreshold` receive HTTP 429
* Rate limiting is per backend, not global
* Helps protect individual Ollama instances from overload

### Security Controls

Backend security controls provide additional layers of request filtering and parameter enforcement:

#### Request Type Controls

* **`AllowEmbeddings`**: Controls whether this backend can process embeddings requests
* **`AllowCompletions`**: Controls whether this backend can process completion requests

Requests are only routed to backends that allow the specific request type. This enables:

* Dedicated embeddings servers that only handle embeddings requests:
  * Ollama API: `/api/embed`
  * OpenAI API: `/v1/embeddings`
* Completion-only servers that only handle completion requests:
  * Ollama API: `/api/generate`, `/api/chat`
  * OpenAI API: `/v1/completions`, `/v1/chat/completions`
* Multi-tenant isolation by request type

#### Pinned Properties

Backend pinned properties provide server-level parameter enforcement:

* **`PinnedEmbeddingsProperties`**: Applied to all embeddings requests routed to this backend
* **`PinnedCompletionsProperties`**: Applied to all completion requests routed to this backend

Backend pinned properties are merged after frontend pinned properties, allowing for:

* Server-specific resource limits: `{"options": {"num_ctx": 1024}}`
* Hardware-optimized settings: `{"options": {"num_gpu": 2}}`
* Backend-specific model overrides: `{"model": "server-optimized-model"}`

The merge order is: Client Request → Frontend Pinned Properties → Backend Pinned Properties, with later values taking precedence.
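The precedence described above can be sketched with `jq`, whose `*` operator performs a recursive object merge in which right-hand values win. This is an illustration of the stated merge order, not OllamaFlow's actual implementation; the request values are made up:

```bash
# Client request, then frontend pins, then backend pins; later objects win
client='{"model":"llama3:8b","options":{"temperature":0.2,"num_ctx":8192}}'
frontend='{"options":{"temperature":0.7}}'
backend='{"options":{"num_ctx":4096}}'

# temperature ends up from the frontend pin, num_ctx from the backend pin,
# and the untouched model field survives from the client request
printf '%s\n' "$client" "$frontend" "$backend" | jq -s '.[0] * .[1] * .[2]'
```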
## Environment Variables

Override configuration with environment variables:

| Variable                 | Description                 | Example                       |
| ------------------------ | --------------------------- | ----------------------------- |
| `OLLAMAFLOW_CONFIG`      | Configuration file path     | `/etc/ollamaflow/config.json` |
| `OLLAMAFLOW_PORT`        | Override webserver port     | `8080`                        |
| `OLLAMAFLOW_HOSTNAME`    | Override webserver hostname | `0.0.0.0`                     |
| `OLLAMAFLOW_DATABASE`    | Override database file path | `/data/ollamaflow.db`         |
| `OLLAMAFLOW_ADMIN_TOKEN` | Override admin token        | `secure-production-token`     |
| `ASPNETCORE_ENVIRONMENT` | .NET environment            | `Production`                  |

### Docker Environment Example

```bash
docker run -d \
  -e OLLAMAFLOW_PORT=8080 \
  -e OLLAMAFLOW_ADMIN_TOKEN=my-secure-token \
  -e ASPNETCORE_ENVIRONMENT=Production \
  -p 8080:8080 \
  jchristn/ollamaflow
```

## Configuration Examples

### Basic Single Backend

Minimal configuration for testing:

```json
{
  "Webserver": { "Port": 43411 },
  "AdminBearerTokens": ["test-token"]
}
```

Frontend/backend via API:

```bash
# Create backend
curl -X PUT -H "Authorization: Bearer test-token" \
  -H "Content-Type: application/json" \
  -d '{"Identifier": "local", "Hostname": "localhost", "Port": 11434}' \
  http://localhost:43411/v1.0/backends

# Create frontend
curl -X PUT -H "Authorization: Bearer test-token" \
  -H "Content-Type: application/json" \
  -d '{"Identifier": "main", "Backends": ["local"]}' \
  http://localhost:43411/v1.0/frontends
```

### Production Multi-Backend

Production configuration with multiple GPU servers:

```json
{
  "Webserver": {
    "Hostname": "*",
    "Port": 43411,
    "Ssl": {
      "Enable": true,
      "CertificateFile": "/etc/ssl/ollamaflow.crt",
      "CertificatePassword": "cert-password"
    }
  },
  "Logging": {
    "LogDirectory": "/var/log/ollamaflow/",
    "ConsoleLogging": false,
    "MinimumSeverity": 1
  },
  "DatabaseFilename": "/var/lib/ollamaflow/ollamaflow.db",
  "AdminBearerTokens": [
    "secure-production-token-1",
    "secure-production-token-2"
  ]
}
```

Backends configuration:

```bash
# GPU servers
for i in {1..4}; do
  curl -X PUT -H "Authorization: Bearer secure-production-token-1" \
    -H "Content-Type: application/json" \
    -d "{
      \"Identifier\": \"gpu-$i\",
      \"Name\": \"GPU Server $i\",
      \"Hostname\": \"gpu$i.company.internal\",
      \"Port\": 11434,
      \"MaxParallelRequests\": 8,
      \"HealthCheckUrl\": \"/api/version\"
    }" \
    http://localhost:43411/v1.0/backends
done

# Production frontend
curl -X PUT -H "Authorization: Bearer secure-production-token-1" \
  -H "Content-Type: application/json" \
  -d '{
    "Identifier": "production",
    "Name": "Production AI Inference",
    "Hostname": "ai.company.com",
    "LoadBalancing": "RoundRobin",
    "Backends": ["gpu-1", "gpu-2", "gpu-3", "gpu-4"],
    "RequiredModels": ["llama3:8b", "mistral:7b", "codellama:13b"],
    "TimeoutMs": 120000
  }' \
  http://localhost:43411/v1.0/frontends
```

### Development Environment

Development setup with debugging enabled:

```json
{
  "Webserver": {
    "Port": 43411,
    "Debug": {
      "Routing": true,
      "Requests": true,
      "Responses": false
    }
  },
  "Logging": {
    "ConsoleLogging": true,
    "EnableColors": true,
    "MinimumSeverity": 0
  },
  "AdminBearerTokens": ["dev-token"]
}
```

## Configuration Validation

OllamaFlow validates configuration on startup.

### Common Validation Errors

1. **Invalid Port Range**: Ports must be 1-65535
2. **Missing Required Fields**: `Identifier` and `Hostname` are required for backends
3. **Duplicate Identifiers**: Frontend/backend IDs must be unique
4. **Invalid Load Balancing**: Must be `"RoundRobin"` or `"Random"`
5. **Invalid Hostnames**: Must be a valid hostname or `"*"`

### Configuration Test

Validate configuration without starting the server:

```bash
# Test configuration file
dotnet OllamaFlow.Server.dll --validate-config

# Test with a specific config
OLLAMAFLOW_CONFIG=/path/to/config.json dotnet OllamaFlow.Server.dll --validate-config
```

## Migration and Backup

### Database Backup

Back up the SQLite database regularly:

```bash
# Simple copy (stop OllamaFlow first)
cp ollamaflow.db ollamaflow.db.backup

# Online backup (while running)
sqlite3 ollamaflow.db ".backup /path/to/backup.db"
```

### Configuration Migration

When upgrading OllamaFlow:

1. **Backup Configuration**: Save the current `ollamaflow.json`
2. **Backup Database**: Save the current `ollamaflow.db`
3. **Review Changes**: Check for new configuration options
4. **Test Upgrade**: Test in a non-production environment first

### Export/Import Configuration

Export the current configuration for replication:

```bash
# Export all frontends
curl -H "Authorization: Bearer token" \
  http://localhost:43411/v1.0/frontends > frontends.json

# Export all backends
curl -H "Authorization: Bearer token" \
  http://localhost:43411/v1.0/backends > backends.json
```

Import the configuration into a new instance (`jq -c` emits one compact object per line, so each record survives the `read` loop intact):

```bash
# Import backends first
jq -c '.[]' backends.json | while read -r backend; do
  curl -X PUT -H "Authorization: Bearer token" \
    -H "Content-Type: application/json" \
    -d "$backend" \
    http://new-host:43411/v1.0/backends
done

# Then import frontends
jq -c '.[]' frontends.json | while read -r frontend; do
  curl -X PUT -H "Authorization: Bearer token" \
    -H "Content-Type: application/json" \
    -d "$frontend" \
    http://new-host:43411/v1.0/frontends
done
```

## Next Steps

* Review [API Reference](api-reference.md) for programmatic configuration
* Explore [Deployment Options](deployment-options.md) for your infrastructure
* Check [Monitoring and Observability](monitoring.md) for production insights
**Invalid Hostnames**: Must be valid hostname or "*" ### Configuration Test Validate configuration without starting the server: ```bash # Test configuration file dotnet OllamaFlow.Server.dll --validate-config # Test with specific config OLLAMAFLOW_CONFIG=/path/to/config.json dotnet OllamaFlow.Server.dll --validate-config ``` ## Migration and Backup ### Database Backup Backup the SQLite database regularly: ```bash # Simple copy (stop OllamaFlow first) cp ollamaflow.db ollamaflow.db.backup # Online backup (while running) sqlite3 ollamaflow.db ".backup /path/to/backup.db" ``` ### Configuration Migration When upgrading OllamaFlow: 1. **Backup Configuration**: Save current `ollamaflow.json` 2. **Backup Database**: Save current `ollamaflow.db` 3. **Review Changes**: Check for new configuration options 4. **Test Upgrade**: Test in non-production environment first ### Export/Import Configuration Export current configuration for replication: ```bash # Export all frontends curl -H "Authorization: Bearer token" \ http://localhost:43411/v1.0/frontends > frontends.json # Export all backends curl -H "Authorization: Bearer token" \ http://localhost:43411/v1.0/backends > backends.json ``` Import configuration to new instance: ```bash # Import backends first cat backends.json | jq '.[]' | while read backend; do curl -X PUT -H "Authorization: Bearer token" \ -H "Content-Type: application/json" \ -d "$backend" \ http://new-host:43411/v1.0/backends done # Then import frontends cat frontends.json | jq '.[]' | while read frontend; do curl -X PUT -H "Authorization: Bearer token" \ -H "Content-Type: application/json" \ -d "$frontend" \ http://new-host:43411/v1.0/frontends done ``` ## Next Steps * Review [API Reference](api-reference.md) for programmatic configuration * Explore [Deployment Options](deployment-options.md) for your infrastructure * Check [Monitoring and Observability](monitoring.md) for production insights ## Main Configuration File OllamaFlow uses a JSON configuration file 
(`ollamaflow.json`) that defines server settings, logging, and authentication. On first startup, a default configuration is created if none exists. ### Default Configuration Location * **Docker**: `/app/ollamaflow.json` * **Bare Metal**: Same directory as executable * **Custom**: Specify with `OLLAMAFLOW_CONFIG` environment variable ### Complete Configuration Example ```json { "Logging": { "Servers": [ { "Hostname": "127.0.0.1", "Port": 514, "RandomizePorts": false, "MinimumPort": 65000, "MaximumPort": 65535 } ], "LogDirectory": "./logs/", "LogFilename": "ollamaflow.log", "ConsoleLogging": true, "EnableColors": true, "MinimumSeverity": 1 }, "Webserver": { "Hostname": "*", "Port": 43411, "IO": { "StreamBufferSize": 65536, "MaxRequests": 1024, "ReadTimeoutMs": 10000, "MaxIncomingHeadersSize": 65536, "EnableKeepAlive": false }, "Ssl": { "Enable": false, "MutuallyAuthenticate": false, "AcceptInvalidCertificates": true, "CertificateFile": "", "CertificatePassword": "" }, "Headers": { "IncludeContentLength": true, "DefaultHeaders": { "Access-Control-Allow-Origin": "*", "Access-Control-Allow-Methods": "OPTIONS, HEAD, GET, PUT, POST, DELETE, PATCH", "Access-Control-Allow-Headers": "*", "Access-Control-Expose-Headers": "", "Accept": "*/*", "Accept-Language": "en-US, en", "Accept-Charset": "ISO-8859-1, utf-8", "Cache-Control": "no-cache", "Connection": "close", "Host": "localhost:43411" } }, "AccessControl": { "DenyList": {}, "PermitList": {}, "Mode": "DefaultPermit" }, "Debug": { "AccessControl": false, "Routing": false, "Requests": false, "Responses": false } }, "DatabaseFilename": "ollamaflow.db", "AdminBearerTokens": [ "your-secure-admin-token" ], "StickyHeaders": [ "x-conversation-id", "x-thread-id" ] } ``` ## Configuration Sections ### Logging Settings Controls how OllamaFlow logs information and errors. 
| Setting | Type | Default | Description | | ----------------- | ------- | ------------------ | ---------------------------------------------------- | | `LogDirectory` | string | `"./logs/"` | Directory for log files | | `LogFilename` | string | `"ollamaflow.log"` | Base filename for logs | | `ConsoleLogging` | boolean | `true` | Enable console output | | `EnableColors` | boolean | `true` | Enable colored console output | | `MinimumSeverity` | integer | `1` | Minimum log level (0=Debug, 1=Info, 2=Warn, 3=Error) | #### Syslog Servers Optional remote logging configuration: ```json { "Servers": [ { "Hostname": "syslog.company.com", "Port": 514, "RandomizePorts": false, "MinimumPort": 65000, "MaximumPort": 65535 } ] } ``` ### Webserver Settings Configures the HTTP server that handles all requests. #### Basic Settings | Setting | Type | Default | Description | | ---------- | ------- | ------- | ------------------------------------ | | `Hostname` | string | `"*"` | Bind hostname (* for all interfaces) | | `Port` | integer | `43411` | TCP port to listen on | #### IO Settings Controls request handling and performance: | Setting | Type | Default | Description | | ------------------------ | ------- | ------- | ------------------------------------ | | `StreamBufferSize` | integer | `65536` | Buffer size for streaming responses | | `MaxRequests` | integer | `1024` | Maximum concurrent requests | | `ReadTimeoutMs` | integer | `10000` | Request read timeout in milliseconds | | `MaxIncomingHeadersSize` | integer | `65536` | Maximum size of request headers | | `EnableKeepAlive` | boolean | `false` | Enable HTTP keep-alive connections | #### SSL Settings HTTPS configuration for secure connections: | Setting | Type | Default | Description | | --------------------------- | ------- | ------- | -------------------------------- | | `Enable` | boolean | `false` | Enable HTTPS | | `MutuallyAuthenticate` | boolean | `false` | Require client certificates | | `AcceptInvalidCertificates` | 
boolean | `true` | Accept self-signed certificates |
| `CertificateFile` | string | `""` | Path to SSL certificate file |
| `CertificatePassword` | string | `""` | Certificate password if required |

#### Headers Settings

Default HTTP headers and CORS configuration:

```json
{
  "IncludeContentLength": true,
  "DefaultHeaders": {
    "Access-Control-Allow-Origin": "*",
    "Access-Control-Allow-Methods": "OPTIONS, HEAD, GET, PUT, POST, DELETE, PATCH",
    "Access-Control-Allow-Headers": "*"
  }
}
```

#### Access Control

IP-based access control (optional):

```json
{
  "DenyList": {
    "192.168.1.100": "Blocked IP",
    "10.0.0.0/8": "Blocked network"
  },
  "PermitList": {
    "192.168.1.0/24": "Allowed network"
  },
  "Mode": "DefaultPermit"
}
```

Modes:

* `DefaultPermit`: Allow all except denied IPs
* `DefaultDeny`: Deny all except permitted IPs

### Database Settings

| Setting | Type | Default | Description |
| ------------------ | ------ | ----------------- | ------------------------- |
| `DatabaseFilename` | string | `"ollamaflow.db"` | SQLite database file path |

### Authentication Settings

| Setting | Type | Default | Description |
| ------------------- | ----- | --------------------- | ---------------------------------- |
| `AdminBearerTokens` | array | `["ollamaflowadmin"]` | Valid bearer tokens for admin APIs |

### Sticky Headers

The `StickyHeaders` string array lists the request headers used to uniquely identify a client when session stickiness is enabled. If you are not using session stickiness, set this to an empty array. Header names are compared case-insensitively, so `x-conversation-id` and `X-Conversation-ID` are treated as the same header. If session stickiness is enabled but no sticky headers are defined, the client IP address is used as the client identifier.

## Frontend Configuration

Frontends are virtual Ollama endpoints that clients connect to. They are stored in the database and managed via the API.
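With `UseStickySessions` enabled on a frontend, clients are identified by the headers listed in the top-level `StickyHeaders` array of `ollamaflow.json` (or by client IP if that list is empty). A minimal fragment showing the shape of that setting (the header names are illustrative):

```json
{
  "StickyHeaders": [
    "x-conversation-id",
    "x-thread-id"
  ]
}
```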
### Frontend Object Structure ```json { "Identifier": "production-frontend", "Name": "Production AI Inference", "Hostname": "ai.company.com", "TimeoutMs": 90000, "LoadBalancing": "RoundRobin", "BlockHttp10": true, "MaxRequestBodySize": 1073741824, "Backends": ["gpu-1", "gpu-2", "gpu-3"], "RequiredModels": ["llama3:8b", "mistral:7b"], "LogRequestFull": false, "LogRequestBody": false, "LogResponseBody": false, "UseStickySessions": false, "StickySessionExpirationMs": 1800000, "Active": true } ``` ### Frontend Properties | Property | Type | Default | Description | | --------------------------- | ------- | -------------- | ------------------------------------------------ | | `Identifier` | string | Required | Unique identifier for this frontend | | `Name` | string | Required | Human-readable name | | `Hostname` | string | `"*"` | Hostname pattern (* for catch-all) | | `TimeoutMs` | integer | `60000` | Request timeout in milliseconds | | `LoadBalancing` | enum | `"RoundRobin"` | Load balancing algorithm | | `BlockHttp10` | boolean | `true` | Reject HTTP/1.0 requests | | `MaxRequestBodySize` | integer | `536870912` | Max request size in bytes (512MB) | | `Backends` | array | `[]` | List of backend identifiers | | `RequiredModels` | array | `[]` | Models that must be available | | `UseStickySessions` | boolean | `false` | Enable session stickiness | | `StickySessionExpirationMs` | integer | `1800000` | Session timeout (30 minutes, min: 10s, max: 24h) | | `LogRequestFull` | boolean | `false` | Log complete requests | | `LogRequestBody` | boolean | `false` | Log request bodies | | `LogResponseBody` | boolean | `false` | Log response bodies | | `Active` | boolean | `true` | Whether frontend is active | ### Load Balancing Options * `"RoundRobin"`: Cycle through backends sequentially * `"Random"`: Randomly select from healthy backends ### Hostname Patterns * `"*"`: Match all hostnames (catch-all) * `"api.company.com"`: Exact hostname match * Multiple frontends can exist with 
different hostname patterns

## Backend Configuration

Backends represent physical Ollama instances in your infrastructure.

### Backend Object Structure

```json
{
  "Identifier": "gpu-server-1",
  "Name": "Primary GPU Server",
  "Hostname": "192.168.1.100",
  "Port": 11434,
  "Ssl": false,
  "UnhealthyThreshold": 3,
  "HealthyThreshold": 2,
  "HealthCheckMethod": "GET",
  "HealthCheckUrl": "/api/version",
  "MaxParallelRequests": 8,
  "RateLimitRequestsThreshold": 20,
  "ApiFormat": "Ollama",
  "LogRequestFull": false,
  "LogRequestBody": false,
  "LogResponseBody": false,
  "Active": true
}
```

### Backend Properties

| Property | Type | Default | Description |
| ---------------------------- | ------- | -------- | ----------------------------------------------- |
| `Identifier` | string | Required | Unique identifier for this backend |
| `Name` | string | Required | Human-readable name |
| `Hostname` | string | Required | Backend server hostname/IP |
| `Port` | integer | `11434` | Backend server port |
| `Ssl` | boolean | `false` | Use HTTPS for backend communication |
| `UnhealthyThreshold` | integer | `2` | Failed checks before marking unhealthy |
| `HealthyThreshold` | integer | `2` | Successful checks before marking healthy |
| `HealthCheckMethod` | string | `"GET"` | HTTP method for health checks |
| `HealthCheckUrl` | string | `"/"` | URL path for health checks |
| `MaxParallelRequests` | integer | `4` | Maximum concurrent requests |
| `RateLimitRequestsThreshold` | integer | `10` | Rate limiting threshold |
| `ApiFormat` | string | `"Ollama"` | Backend API format, either `Ollama` or `OpenAI` |
| `AllowEmbeddings` | boolean | `true` | Allow embeddings API requests |
| `AllowCompletions` | boolean | `true` | Allow completions API requests |
| `PinnedEmbeddingsProperties` | object | `{}` | Key-value pairs merged into embeddings requests |
| `PinnedCompletionsProperties` | object | `{}` | Key-value pairs merged into completions requests |
| `LogRequestFull` | boolean | `false` | Log complete requests |
| `LogRequestBody` | boolean | `false` | Log request bodies |
| `LogResponseBody` | boolean | `false` | Log response bodies |
| `Active` | boolean | `true` | Whether backend is active |

### Health Check Configuration

Health checks validate backend availability:

* **Method**: HTTP method (GET, HEAD, POST)
* **URL**: Path to check (e.g., `/`, `/api/version`, `/health`)
* **Thresholds**: Number of consecutive successes/failures required to change state

**Important**: vLLM backends expect health checks at `GET /health`.

Common health check endpoints:

* `/`: Basic connectivity check
* `/api/version`: Ollama version endpoint
* `/api/tags`: Model listing endpoint

### Rate Limiting

Backends can enforce rate limits:

* Requests exceeding `RateLimitRequestsThreshold` receive HTTP 429
* Rate limiting is per backend, not global
* Helps protect individual Ollama instances from overload

### Security Controls

Backend security controls provide additional layers of request filtering and parameter enforcement.

#### Request Type Controls

* **`AllowEmbeddings`**: Controls whether this backend can process embeddings requests
* **`AllowCompletions`**: Controls whether this backend can process completion requests

Requests are only routed to backends that allow the specific request type. This enables:

* Dedicated embeddings servers that only handle embeddings requests:
  * Ollama API: `/api/embed`
  * OpenAI API: `/v1/embeddings`
* Completion-only servers that only handle completion requests:
  * Ollama API: `/api/generate`, `/api/chat`
  * OpenAI API: `/v1/completions`, `/v1/chat/completions`
* Multi-tenant isolation by request type

#### Pinned Properties

Backend pinned properties provide server-level parameter enforcement:

* **`PinnedEmbeddingsProperties`**: Applied to all embeddings requests routed to this backend
* **`PinnedCompletionsProperties`**: Applied to all completion requests routed to this backend

Backend pinned properties are merged after frontend pinned properties, allowing for:

* Server-specific resource limits: `{"options": {"num_ctx": 1024}}`
* Hardware-optimized settings: `{"options": {"num_gpu": 2}}`
* Backend-specific model overrides: `{"model": "server-optimized-model"}`

The merge order is: Client Request → Frontend Pinned Properties → Backend Pinned Properties, with later values taking precedence.

## Environment Variables

Override configuration with environment variables:

| Variable | Description | Example |
| ------------------------ | --------------------------- | ----------------------------- |
| `OLLAMAFLOW_CONFIG` | Configuration file path | `/etc/ollamaflow/config.json` |
| `OLLAMAFLOW_PORT` | Override webserver port | `8080` |
| `OLLAMAFLOW_HOSTNAME` | Override webserver hostname | `0.0.0.0` |
| `OLLAMAFLOW_DATABASE` | Override database file path | `/data/ollamaflow.db` |
| `OLLAMAFLOW_ADMIN_TOKEN` | Override admin token | `secure-production-token` |
| `ASPNETCORE_ENVIRONMENT` | .NET environment | `Production` |

### Docker Environment Example

```bash
docker run -d \
  -e OLLAMAFLOW_PORT=8080 \
  -e OLLAMAFLOW_ADMIN_TOKEN=my-secure-token \
  -e ASPNETCORE_ENVIRONMENT=Production \
  -p 8080:8080 \
  jchristn/ollamaflow
```

## Configuration Examples

### Basic Single Backend

Minimal configuration for testing:

```json
{
  "Webserver": {
    "Port": 43411
  },
  "AdminBearerTokens": ["test-token"]
}
```

Frontend/Backend via API:

```bash
# Create backend
curl -X PUT -H "Authorization: Bearer test-token" \
  -H "Content-Type: application/json" \
  -d '{"Identifier": "local", "Hostname": "localhost", "Port": 11434}' \
  http://localhost:43411/v1.0/backends

# Create frontend
curl -X PUT -H "Authorization: Bearer test-token" \
  -H "Content-Type: application/json" \
  -d '{"Identifier": "main", "Backends": ["local"]}' \
  http://localhost:43411/v1.0/frontends
```

### Production
Multi-Backend Production configuration with multiple GPU servers: ```json { "Webserver": { "Hostname": "*", "Port": 43411, "Ssl": { "Enable": true, "CertificateFile": "/etc/ssl/ollamaflow.crt", "CertificatePassword": "cert-password" } }, "Logging": { "LogDirectory": "/var/log/ollamaflow/", "ConsoleLogging": false, "MinimumSeverity": 1 }, "DatabaseFilename": "/var/lib/ollamaflow/ollamaflow.db", "AdminBearerTokens": [ "secure-production-token-1", "secure-production-token-2" ] } ``` Backends configuration: ```bash # GPU servers for i in {1..4}; do curl -X PUT -H "Authorization: Bearer secure-production-token-1" \ -H "Content-Type: application/json" \ -d "{ \"Identifier\": \"gpu-$i\", \"Name\": \"GPU Server $i\", \"Hostname\": \"gpu$i.company.internal\", \"Port\": 11434, \"MaxParallelRequests\": 8, \"HealthCheckUrl\": \"/api/version\" }" \ http://localhost:43411/v1.0/backends done # Production frontend curl -X PUT -H "Authorization: Bearer secure-production-token-1" \ -H "Content-Type: application/json" \ -d '{ "Identifier": "production", "Name": "Production AI Inference", "Hostname": "ai.company.com", "LoadBalancing": "RoundRobin", "Backends": ["gpu-1", "gpu-2", "gpu-3", "gpu-4"], "RequiredModels": ["llama3:8b", "mistral:7b", "codellama:13b"], "TimeoutMs": 120000 }' \ http://localhost:43411/v1.0/frontends ``` ### Development Environment Development setup with debugging enabled: ```json { "Webserver": { "Port": 43411, "Debug": { "Routing": true, "Requests": true, "Responses": false } }, "Logging": { "ConsoleLogging": true, "EnableColors": true, "MinimumSeverity": 0 }, "AdminBearerTokens": ["dev-token"] } ``` ## Configuration Validation OllamaFlow validates configuration on startup: ### Common Validation Errors 1. **Invalid Port Range**: Ports must be 1-65535 2. **Missing Required Fields**: Identifier, Hostname required for backends 3. **Duplicate Identifiers**: Frontend/Backend IDs must be unique 4. **Invalid Load Balancing**: Must be "RoundRobin" or "Random" 5. 
**Invalid Hostnames**: Must be valid hostname or "*"

### Configuration Test

Validate configuration without starting the server:

```bash
# Test configuration file
dotnet OllamaFlow.Server.dll --validate-config

# Test with specific config
OLLAMAFLOW_CONFIG=/path/to/config.json dotnet OllamaFlow.Server.dll --validate-config
```

## Migration and Backup

### Database Backup

Backup the SQLite database regularly:

```bash
# Simple copy (stop OllamaFlow first)
cp ollamaflow.db ollamaflow.db.backup

# Online backup (while running)
sqlite3 ollamaflow.db ".backup /path/to/backup.db"
```

### Configuration Migration

When upgrading OllamaFlow:

1. **Backup Configuration**: Save current `ollamaflow.json`
2. **Backup Database**: Save current `ollamaflow.db`
3. **Review Changes**: Check for new configuration options
4. **Test Upgrade**: Test in non-production environment first

### Export/Import Configuration

Export current configuration for replication:

```bash
# Export all frontends
curl -H "Authorization: Bearer token" \
  http://localhost:43411/v1.0/frontends > frontends.json

# Export all backends
curl -H "Authorization: Bearer token" \
  http://localhost:43411/v1.0/backends > backends.json
```

Import configuration to new instance:

```bash
# Import backends first (-c emits one compact JSON object per line)
cat backends.json | jq -c '.[]' | while read -r backend; do
  curl -X PUT -H "Authorization: Bearer token" \
    -H "Content-Type: application/json" \
    -d "$backend" \
    http://new-host:43411/v1.0/backends
done

# Then import frontends
cat frontends.json | jq -c '.[]' | while read -r frontend; do
  curl -X PUT -H "Authorization: Bearer token" \
    -H "Content-Type: application/json" \
    -d "$frontend" \
    http://new-host:43411/v1.0/frontends
done
```

## Next Steps

* Review [API Reference](api-reference.md) for programmatic configuration
* Explore [Deployment Options](deployment-options.md) for your infrastructure
* Check [Monitoring and Observability](monitoring.md) for production insights
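A closing note on piping jq into `while read` loops like the import loops above: plain `jq '.[]'` pretty-prints each element across multiple lines, which breaks line-by-line reads. The `-c` (compact output) flag emits exactly one object per line. A standalone illustration with sample data (not real exports):

```bash
# -c prints each array element as a single compact line, ready for `while read -r`
printf '[{"Identifier": "gpu-1"}, {"Identifier": "gpu-2"}]' | jq -c '.[]'
# → {"Identifier":"gpu-1"}
#   {"Identifier":"gpu-2"}
```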