Configuration Reference

This comprehensive guide covers all configuration options available in OllamaFlow, including the main settings file (ollamaflow.json) and the structure of key objects like frontends and backends.

Main Configuration File

OllamaFlow uses a JSON configuration file (ollamaflow.json) that defines server settings, logging, and authentication. On first startup, a default configuration is created if none exists.

Default Configuration Location

  • Docker: /app/ollamaflow.json
  • Bare Metal: Same directory as executable
  • Custom: Specify with OLLAMAFLOW_CONFIG environment variable

Complete Configuration Example

{
  "Logging": {
    "Servers": [
      {
        "Hostname": "127.0.0.1",
        "Port": 514,
        "RandomizePorts": false,
        "MinimumPort": 65000,
        "MaximumPort": 65535
      }
    ],
    "LogDirectory": "./logs/",
    "LogFilename": "ollamaflow.log",
    "ConsoleLogging": true,
    "EnableColors": true,
    "MinimumSeverity": 1
  },
  "Webserver": {
    "Hostname": "*",
    "Port": 43411,
    "IO": {
      "StreamBufferSize": 65536,
      "MaxRequests": 1024,
      "ReadTimeoutMs": 10000,
      "MaxIncomingHeadersSize": 65536,
      "EnableKeepAlive": false
    },
    "Ssl": {
      "Enable": false,
      "MutuallyAuthenticate": false,
      "AcceptInvalidCertificates": true,
      "CertificateFile": "",
      "CertificatePassword": ""
    },
    "Headers": {
      "IncludeContentLength": true,
      "DefaultHeaders": {
        "Access-Control-Allow-Origin": "*",
        "Access-Control-Allow-Methods": "OPTIONS, HEAD, GET, PUT, POST, DELETE, PATCH",
        "Access-Control-Allow-Headers": "*",
        "Access-Control-Expose-Headers": "",
        "Accept": "*/*",
        "Accept-Language": "en-US, en",
        "Accept-Charset": "ISO-8859-1, utf-8",
        "Cache-Control": "no-cache",
        "Connection": "close",
        "Host": "localhost:43411"
      }
    },
    "AccessControl": {
      "DenyList": {},
      "PermitList": {},
      "Mode": "DefaultPermit"
    },
    "Debug": {
      "AccessControl": false,
      "Routing": false,
      "Requests": false,
      "Responses": false
    }
  },
  "DatabaseFilename": "ollamaflow.db",
  "AdminBearerTokens": [
    "your-secure-admin-token"
  ]
}

Configuration Sections

Logging Settings

Controls how OllamaFlow logs information and errors.

| Setting | Type | Default | Description |
|---------|------|---------|-------------|
| LogDirectory | string | "./logs/" | Directory for log files |
| LogFilename | string | "ollamaflow.log" | Base filename for logs |
| ConsoleLogging | boolean | true | Enable console output |
| EnableColors | boolean | true | Enable colored console output |
| MinimumSeverity | integer | 1 | Minimum log level (0=Debug, 1=Info, 2=Warn, 3=Error) |

Syslog Servers

Optional remote logging configuration:

{
  "Servers": [
    {
      "Hostname": "syslog.company.com",
      "Port": 514,
      "RandomizePorts": false,
      "MinimumPort": 65000,
      "MaximumPort": 65535
    }
  ]
}

Webserver Settings

Configures the HTTP server that handles all requests.

Basic Settings

| Setting | Type | Default | Description |
|---------|------|---------|-------------|
| Hostname | string | "*" | Bind hostname (* for all interfaces) |
| Port | integer | 43411 | TCP port to listen on |

IO Settings

Controls request handling and performance:

| Setting | Type | Default | Description |
|---------|------|---------|-------------|
| StreamBufferSize | integer | 65536 | Buffer size for streaming responses |
| MaxRequests | integer | 1024 | Maximum concurrent requests |
| ReadTimeoutMs | integer | 10000 | Request read timeout in milliseconds |
| MaxIncomingHeadersSize | integer | 65536 | Maximum size of request headers |
| EnableKeepAlive | boolean | false | Enable HTTP keep-alive connections |

SSL Settings

HTTPS configuration for secure connections:

| Setting | Type | Default | Description |
|---------|------|---------|-------------|
| Enable | boolean | false | Enable HTTPS |
| MutuallyAuthenticate | boolean | false | Require client certificates |
| AcceptInvalidCertificates | boolean | true | Accept self-signed certificates |
| CertificateFile | string | "" | Path to SSL certificate file |
| CertificatePassword | string | "" | Certificate password if required |

Headers Settings

Default HTTP headers and CORS configuration:

{
  "IncludeContentLength": true,
  "DefaultHeaders": {
    "Access-Control-Allow-Origin": "*",
    "Access-Control-Allow-Methods": "OPTIONS, HEAD, GET, PUT, POST, DELETE, PATCH",
    "Access-Control-Allow-Headers": "*"
  }
}

Access Control

IP-based access control (optional):

{
  "DenyList": {
    "192.168.1.100": "Blocked IP",
    "10.0.0.0/8": "Blocked network"
  },
  "PermitList": {
    "192.168.1.0/24": "Allowed network"
  },
  "Mode": "DefaultPermit"
}

Modes:

  • DefaultPermit: Allow all except denied IPs
  • DefaultDeny: Deny all except permitted IPs
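
For a locked-down deployment, a DefaultDeny configuration permits only the listed networks and rejects everything else. The addresses below are illustrative placeholders:

```json
{
  "DenyList": {},
  "PermitList": {
    "10.1.0.0/16": "Internal services",
    "192.168.1.0/24": "Office network"
  },
  "Mode": "DefaultDeny"
}
```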

Database Settings

| Setting | Type | Default | Description |
|---------|------|---------|-------------|
| DatabaseFilename | string | "ollamaflow.db" | SQLite database file path |

Authentication Settings

| Setting | Type | Default | Description |
|---------|------|---------|-------------|
| AdminBearerTokens | array | ["ollamaflowadmin"] | Valid bearer tokens for admin APIs |

Frontend Configuration

Frontends are virtual Ollama endpoints that clients connect to. They are stored in the database and managed via the API.

Frontend Object Structure

{
  "Identifier": "production-frontend",
  "Name": "Production AI Inference",
  "Hostname": "ai.company.com",
  "TimeoutMs": 90000,
  "LoadBalancing": "RoundRobin",
  "BlockHttp10": true,
  "MaxRequestBodySize": 1073741824,
  "Backends": ["gpu-1", "gpu-2", "gpu-3"],
  "RequiredModels": ["llama3:8b", "mistral:7b"],
  "AllowEmbeddings": true,
  "AllowCompletions": true,
  "PinnedEmbeddingsProperties": {
    "model": "nomic-embed-text",
    "options": {
      "temperature": 0.1
    }
  },
  "PinnedCompletionsProperties": {
    "options": {
      "temperature": 0.7,
      "num_ctx": 2048
    }
  },
  "LogRequestFull": false,
  "LogRequestBody": false,
  "LogResponseBody": false,
  "UseStickySessions": false,
  "StickySessionExpirationMs": 1800000,
  "Active": true
}

Frontend Properties

| Property | Type | Default | Description |
|----------|------|---------|-------------|
| Identifier | string | Required | Unique identifier for this frontend |
| Name | string | Required | Human-readable name |
| Hostname | string | "*" | Hostname pattern (* for catch-all) |
| TimeoutMs | integer | 60000 | Request timeout in milliseconds |
| LoadBalancing | enum | "RoundRobin" | Load balancing algorithm |
| BlockHttp10 | boolean | true | Reject HTTP/1.0 requests |
| MaxRequestBodySize | integer | 536870912 | Max request size in bytes (512 MB) |
| Backends | array | [] | List of backend identifiers |
| RequiredModels | array | [] | Models that must be available |
| AllowEmbeddings | boolean | true | Allow embeddings API requests |
| AllowCompletions | boolean | true | Allow completions API requests |
| PinnedEmbeddingsProperties | object | {} | Key-value pairs merged into embeddings requests |
| PinnedCompletionsProperties | object | {} | Key-value pairs merged into completions requests |
| UseStickySessions | boolean | false | Enable session stickiness |
| StickySessionExpirationMs | integer | 1800000 | Session timeout (30 minutes; min 10s, max 24h) |
| LogRequestFull | boolean | false | Log complete requests |
| LogRequestBody | boolean | false | Log request bodies |
| LogResponseBody | boolean | false | Log response bodies |
| Active | boolean | true | Whether frontend is active |

Load Balancing Options

  • "RoundRobin": Cycle through backends sequentially
  • "Random": Randomly select from healthy backends

Hostname Patterns

  • "*": Match all hostnames (catch-all)
  • "api.company.com": Exact hostname match
  • Multiple frontends can exist with different hostname patterns

Security Controls

Frontend security controls enable fine-grained access control and request parameter enforcement:

Request Type Controls

  • AllowEmbeddings: Controls whether embeddings API endpoints are accessible through this frontend
    • Ollama API: /api/embed
    • OpenAI API: /v1/embeddings
  • AllowCompletions: Controls whether completion API endpoints are accessible through this frontend
    • Ollama API: /api/generate, /api/chat
    • OpenAI API: /v1/completions, /v1/chat/completions

For a request to succeed, both the frontend and at least one assigned backend must allow the request type.

Pinned Properties

Pinned properties allow administrators to enforce specific parameters in requests, providing security compliance and standardization:

  • PinnedEmbeddingsProperties: Key-value pairs automatically merged into all embeddings requests
  • PinnedCompletionsProperties: Key-value pairs automatically merged into all completion requests

Common use cases:

  • Enforce maximum context size: {"options": {"num_ctx": 2048}}
  • Standardize temperature settings: {"options": {"temperature": 0.7}}
  • Override model selection: {"model": "approved-model:latest"}
  • Set organizational defaults: {"options": {"top_p": 0.9, "top_k": 40}}

Properties are merged with client requests, with pinned properties taking precedence over client-specified values.
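
This merge behavior can be approximated with jq's recursive-merge operator (`*`), where the right-hand operand wins on conflicts. The request and pinned values below are illustrative, not OllamaFlow defaults:

```shell
# Approximate pinned-property merging with jq's recursive merge (*):
# right-hand (pinned) values override client-supplied values,
# while untouched client keys such as "model" pass through.
client='{"model":"llama3:8b","options":{"temperature":1.5,"num_ctx":8192}}'
pinned='{"options":{"temperature":0.7,"num_ctx":2048}}'
printf '%s\n%s\n' "$client" "$pinned" | jq -s -c '.[0] * .[1]'
```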

Backend Configuration

Backends represent physical Ollama instances in your infrastructure.

Backend Object Structure

{
  "Identifier": "gpu-server-1",
  "Name": "Primary GPU Server",
  "Hostname": "192.168.1.100",
  "Port": 11434,
  "Ssl": false,
  "UnhealthyThreshold": 3,
  "HealthyThreshold": 2,
  "HealthCheckMethod": "GET",
  "HealthCheckUrl": "/",
  "MaxParallelRequests": 8,
  "RateLimitRequestsThreshold": 20,
  "AllowEmbeddings": true,
  "AllowCompletions": true,
  "PinnedEmbeddingsProperties": {
    "options": {
      "num_ctx": 512
    }
  },
  "PinnedCompletionsProperties": {
    "options": {
      "num_ctx": 4096,
      "temperature": 0.8
    }
  },
  "LogRequestFull": false,
  "LogRequestBody": false,
  "LogResponseBody": false,
  "Active": true
}

Backend Properties

| Property | Type | Default | Description |
|----------|------|---------|-------------|
| Identifier | string | Required | Unique identifier for this backend |
| Name | string | Required | Human-readable name |
| Hostname | string | Required | Backend server hostname/IP |
| Port | integer | 11434 | Backend server port |
| Ssl | boolean | false | Use HTTPS for backend communication |
| UnhealthyThreshold | integer | 2 | Failed checks before marking unhealthy |
| HealthyThreshold | integer | 2 | Successful checks before marking healthy |
| HealthCheckMethod | string | "GET" | HTTP method for health checks (GET or HEAD) |
| HealthCheckUrl | string | "/" | URL path for health checks |
| MaxParallelRequests | integer | 4 | Maximum concurrent requests |
| RateLimitRequestsThreshold | integer | 10 | Rate limiting threshold |
| AllowEmbeddings | boolean | true | Allow embeddings API requests |
| AllowCompletions | boolean | true | Allow completions API requests |
| PinnedEmbeddingsProperties | object | {} | Key-value pairs merged into embeddings requests |
| PinnedCompletionsProperties | object | {} | Key-value pairs merged into completions requests |
| LogRequestFull | boolean | false | Log complete requests |
| LogRequestBody | boolean | false | Log request bodies |
| LogResponseBody | boolean | false | Log response bodies |
| Active | boolean | true | Whether backend is active |

Health Check Configuration

Health checks validate backend availability:

  • Method: HTTP method (GET, HEAD)
  • URL: Path to check (e.g., /, /api/version, /health)
  • Thresholds: Number of consecutive successes/failures to change state

Common health check endpoints:

  • HEAD /: Basic connectivity check for Ollama
  • GET /health: Basic connectivity check for vLLM
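
These probes can be reproduced by hand when troubleshooting a backend that OllamaFlow reports as unhealthy. The helper below is our own sketch, not part of OllamaFlow; it treats any 2xx response as healthy:

```shell
# Hypothetical stand-in for an OllamaFlow health probe: a backend is
# "healthy" when the configured method and URL return a 2xx status.
check_backend() {
  local method="$1" url="$2"
  if [ "$method" = "HEAD" ]; then
    # curl --head issues a real HEAD request (curl -X HEAD would wait for a body)
    curl -sf --head -o /dev/null "$url" && echo healthy || echo unhealthy
  else
    curl -sf -X "$method" -o /dev/null "$url" && echo healthy || echo unhealthy
  fi
}

check_backend HEAD "http://localhost:11434/"       # Ollama-style probe
check_backend GET  "http://localhost:8000/health"  # vLLM-style probe
```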

Rate Limiting

Backends can enforce rate limits:

  • Requests exceeding RateLimitRequestsThreshold receive HTTP 429
  • Rate limiting is per backend, not global
  • Helps protect individual Ollama instances from overload
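
Clients can cooperate with rate limiting by backing off when they see 429. A minimal sketch (the helper name and timings are illustrative, not an OllamaFlow API):

```shell
# Hypothetical client-side backoff: retry up to three times when the proxy
# answers HTTP 429, then report the last status code seen.
request_with_backoff() {
  local url="$1" code
  for attempt in 1 2 3; do
    code=$(curl -s -o /dev/null -w '%{http_code}' "$url")
    if [ "$code" != "429" ]; then
      echo "$code"
      return 0
    fi
    sleep "$attempt"   # linear backoff: 1s, 2s, 3s
  done
  echo "$code"
  return 1
}

request_with_backoff "http://localhost:43411/api/tags"
```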

Security Controls

Backend security controls provide additional layers of request filtering and parameter enforcement:

Request Type Controls

  • AllowEmbeddings: Controls whether this backend can process embeddings requests
  • AllowCompletions: Controls whether this backend can process completion requests

Requests are only routed to backends that allow the specific request type. This enables:

  • Dedicated embeddings servers that only handle embeddings requests:
    • Ollama API: /api/embed
    • OpenAI API: /v1/embeddings
  • Completion-only servers that only handle completion requests:
    • Ollama API: /api/generate, /api/chat
    • OpenAI API: /v1/completions, /v1/chat/completions
  • Multi-tenant isolation by request type

Pinned Properties

Backend pinned properties provide server-level parameter enforcement:

  • PinnedEmbeddingsProperties: Applied to all embeddings requests routed to this backend
  • PinnedCompletionsProperties: Applied to all completion requests routed to this backend

Backend pinned properties are merged after frontend pinned properties, allowing for:

  • Server-specific resource limits: {"options": {"num_ctx": 1024}}
  • Hardware-optimized settings: {"options": {"num_gpu": 2}}
  • Backend-specific model overrides: {"model": "server-optimized-model"}

The merge order is: Client Request → Frontend Pinned Properties → Backend Pinned Properties, with later values taking precedence.
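
The three-layer precedence can be sketched with jq's recursive merge (`*`), where later operands win. The values below are illustrative: the frontend caps temperature and context, then the backend tightens the context further, while an untouched client option survives:

```shell
# Illustrative merge order: client request, then frontend pinned properties,
# then backend pinned properties; later operands take precedence.
client='{"options":{"temperature":1.2,"num_ctx":8192,"top_p":0.95}}'
frontend='{"options":{"temperature":0.7,"num_ctx":2048}}'
backend='{"options":{"num_ctx":1024}}'
printf '%s\n%s\n%s\n' "$client" "$frontend" "$backend" | jq -s -c '.[0] * .[1] * .[2]'
```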


Environment Variables

Override configuration with environment variables:

| Variable | Description | Example |
|----------|-------------|---------|
| OLLAMAFLOW_CONFIG | Configuration file path | /etc/ollamaflow/config.json |
| OLLAMAFLOW_PORT | Override webserver port | 8080 |
| OLLAMAFLOW_HOSTNAME | Override webserver hostname | 0.0.0.0 |
| OLLAMAFLOW_DATABASE | Override database file path | /data/ollamaflow.db |
| OLLAMAFLOW_ADMIN_TOKEN | Override admin token | secure-production-token |
| ASPNETCORE_ENVIRONMENT | .NET environment | Production |

Docker Environment Example

docker run -d \
  -e OLLAMAFLOW_PORT=8080 \
  -e OLLAMAFLOW_ADMIN_TOKEN=my-secure-token \
  -e ASPNETCORE_ENVIRONMENT=Production \
  -p 8080:8080 \
  jchristn/ollamaflow

Configuration Examples

Basic Single Backend

Minimal configuration for testing:

{
  "Webserver": {
    "Port": 43411
  },
  "AdminBearerTokens": ["test-token"]
}

Frontend/Backend via API:

# Create backend
curl -X PUT -H "Authorization: Bearer test-token" \
  -H "Content-Type: application/json" \
  -d '{"Identifier": "local", "Hostname": "localhost", "Port": 11434}' \
  http://localhost:43411/v1.0/backends

# Create frontend
curl -X PUT -H "Authorization: Bearer test-token" \
  -H "Content-Type: application/json" \
  -d '{"Identifier": "main", "Backends": ["local"]}' \
  http://localhost:43411/v1.0/frontends

Production Multi-Backend

Production configuration with multiple GPU servers:

{
  "Webserver": {
    "Hostname": "*",
    "Port": 43411,
    "Ssl": {
      "Enable": true,
      "CertificateFile": "/etc/ssl/ollamaflow.crt",
      "CertificatePassword": "cert-password"
    }
  },
  "Logging": {
    "LogDirectory": "/var/log/ollamaflow/",
    "ConsoleLogging": false,
    "MinimumSeverity": 1
  },
  "DatabaseFilename": "/var/lib/ollamaflow/ollamaflow.db",
  "AdminBearerTokens": [
    "secure-production-token-1",
    "secure-production-token-2"
  ]
}

Backends configuration:

# GPU servers
for i in {1..4}; do
  curl -X PUT -H "Authorization: Bearer secure-production-token-1" \
    -H "Content-Type: application/json" \
    -d "{
      \"Identifier\": \"gpu-$i\",
      \"Name\": \"GPU Server $i\",
      \"Hostname\": \"gpu$i.company.internal\",
      \"Port\": 11434,
      \"MaxParallelRequests\": 8,
      \"HealthCheckUrl\": \"/api/version\"
    }" \
    http://localhost:43411/v1.0/backends
done

# Production frontend
curl -X PUT -H "Authorization: Bearer secure-production-token-1" \
  -H "Content-Type: application/json" \
  -d '{
    "Identifier": "production",
    "Name": "Production AI Inference",
    "Hostname": "ai.company.com",
    "LoadBalancing": "RoundRobin",
    "Backends": ["gpu-1", "gpu-2", "gpu-3", "gpu-4"],
    "RequiredModels": ["llama3:8b", "mistral:7b", "codellama:13b"],
    "TimeoutMs": 120000
  }' \
  http://localhost:43411/v1.0/frontends

Development Environment

Development setup with debugging enabled:

{
  "Webserver": {
    "Port": 43411,
    "Debug": {
      "Routing": true,
      "Requests": true,
      "Responses": false
    }
  },
  "Logging": {
    "ConsoleLogging": true,
    "EnableColors": true,
    "MinimumSeverity": 0
  },
  "AdminBearerTokens": ["dev-token"]
}

Configuration Validation

OllamaFlow validates configuration on startup:

Common Validation Errors

  1. Invalid Port Range: Ports must be 1-65535
  2. Missing Required Fields: Identifier, Hostname required for backends
  3. Duplicate Identifiers: Frontend/Backend IDs must be unique
  4. Invalid Load Balancing: Must be "RoundRobin" or "Random"
  5. Invalid Hostnames: Must be valid hostname or "*"
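
Some of these checks can be caught before startup with a jq pre-flight pass. This sketch validates only JSON syntax and the port range, against a sample file rather than a live config:

```shell
# Hypothetical pre-flight check: confirm the file parses as JSON and the
# webserver port is in range before handing it to OllamaFlow.
cat > /tmp/sample-ollamaflow.json <<'EOF'
{
  "Webserver": { "Port": 43411 },
  "AdminBearerTokens": ["test-token"]
}
EOF

# jq -e exits non-zero on parse errors or when the expression is false.
jq -e '.Webserver.Port >= 1 and .Webserver.Port <= 65535' /tmp/sample-ollamaflow.json
```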

Configuration Test

Validate configuration without starting the server:

# Test configuration file
dotnet OllamaFlow.Server.dll --validate-config

# Test with specific config
OLLAMAFLOW_CONFIG=/path/to/config.json dotnet OllamaFlow.Server.dll --validate-config

Migration and Backup

Database Backup

Backup the SQLite database regularly:

# Simple copy (stop OllamaFlow first)
cp ollamaflow.db ollamaflow.db.backup

# Online backup (while running)
sqlite3 ollamaflow.db ".backup /path/to/backup.db"
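
After taking an online backup, it is worth confirming the copy is intact before relying on it. This round-trip sketch uses a scratch database rather than a live ollamaflow.db:

```shell
# Hypothetical round-trip: create a scratch database, take an online backup,
# then verify the copy with SQLite's integrity check (prints "ok" when sound).
sqlite3 /tmp/scratch.db "CREATE TABLE IF NOT EXISTS demo(x INTEGER); INSERT INTO demo VALUES (1);"
sqlite3 /tmp/scratch.db ".backup /tmp/scratch-backup.db"
sqlite3 /tmp/scratch-backup.db "PRAGMA integrity_check;"
```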

Configuration Migration

When upgrading OllamaFlow:

  1. Backup Configuration: Save current ollamaflow.json
  2. Backup Database: Save current ollamaflow.db
  3. Review Changes: Check for new configuration options
  4. Test Upgrade: Test in non-production environment first

Export/Import Configuration

Export current configuration for replication:

# Export all frontends
curl -H "Authorization: Bearer token" \
  http://localhost:43411/v1.0/frontends > frontends.json

# Export all backends
curl -H "Authorization: Bearer token" \
  http://localhost:43411/v1.0/backends > backends.json

Import configuration to new instance:

# Import backends first (jq -c emits one compact object per line,
# so the while loop reads one whole record at a time)
jq -c '.[]' backends.json | while read -r backend; do
  curl -X PUT -H "Authorization: Bearer token" \
    -H "Content-Type: application/json" \
    -d "$backend" \
    http://new-host:43411/v1.0/backends
done

# Then import frontends
jq -c '.[]' frontends.json | while read -r frontend; do
  curl -X PUT -H "Authorization: Bearer token" \
    -H "Content-Type: application/json" \
    -d "$frontend" \
    http://new-host:43411/v1.0/frontends
done

Main Configuration File

OllamaFlow uses a JSON configuration file (ollamaflow.json) that defines server settings, logging, and authentication. On first startup, a default configuration is created if none exists.

Default Configuration Location

  • Docker: /app/ollamaflow.json
  • Bare Metal: Same directory as executable
  • Custom: Specify with OLLAMAFLOW_CONFIG environment variable

Complete Configuration Example

{
  "Logging": {
    "Servers": [
      {
        "Hostname": "127.0.0.1",
        "Port": 514,
        "RandomizePorts": false,
        "MinimumPort": 65000,
        "MaximumPort": 65535
      }
    ],
    "LogDirectory": "./logs/",
    "LogFilename": "ollamaflow.log",
    "ConsoleLogging": true,
    "EnableColors": true,
    "MinimumSeverity": 1
  },
  "Webserver": {
    "Hostname": "*",
    "Port": 43411,
    "IO": {
      "StreamBufferSize": 65536,
      "MaxRequests": 1024,
      "ReadTimeoutMs": 10000,
      "MaxIncomingHeadersSize": 65536,
      "EnableKeepAlive": false
    },
    "Ssl": {
      "Enable": false,
      "MutuallyAuthenticate": false,
      "AcceptInvalidCertificates": true,
      "CertificateFile": "",
      "CertificatePassword": ""
    },
    "Headers": {
      "IncludeContentLength": true,
      "DefaultHeaders": {
        "Access-Control-Allow-Origin": "*",
        "Access-Control-Allow-Methods": "OPTIONS, HEAD, GET, PUT, POST, DELETE, PATCH",
        "Access-Control-Allow-Headers": "*",
        "Access-Control-Expose-Headers": "",
        "Accept": "*/*",
        "Accept-Language": "en-US, en",
        "Accept-Charset": "ISO-8859-1, utf-8",
        "Cache-Control": "no-cache",
        "Connection": "close",
        "Host": "localhost:43411"
      }
    },
    "AccessControl": {
      "DenyList": {},
      "PermitList": {},
      "Mode": "DefaultPermit"
    },
    "Debug": {
      "AccessControl": false,
      "Routing": false,
      "Requests": false,
      "Responses": false
    }
  },
  "DatabaseFilename": "ollamaflow.db",
  "AdminBearerTokens": [
    "your-secure-admin-token"
  ]
}

Configuration Sections

Logging Settings

Controls how OllamaFlow logs information and errors.

SettingTypeDefaultDescription
LogDirectorystring"./logs/"Directory for log files
LogFilenamestring"ollamaflow.log"Base filename for logs
ConsoleLoggingbooleantrueEnable console output
EnableColorsbooleantrueEnable colored console output
MinimumSeverityinteger1Minimum log level (0=Debug, 1=Info, 2=Warn, 3=Error)

Syslog Servers

Optional remote logging configuration:

{
  "Servers": [
    {
      "Hostname": "syslog.company.com",
      "Port": 514,
      "RandomizePorts": false,
      "MinimumPort": 65000,
      "MaximumPort": 65535
    }
  ]
}

Webserver Settings

Configures the HTTP server that handles all requests.

Basic Settings

SettingTypeDefaultDescription
Hostnamestring"*"Bind hostname (* for all interfaces)
Portinteger43411TCP port to listen on

IO Settings

Controls request handling and performance:

SettingTypeDefaultDescription
StreamBufferSizeinteger65536Buffer size for streaming responses
MaxRequestsinteger1024Maximum concurrent requests
ReadTimeoutMsinteger10000Request read timeout in milliseconds
MaxIncomingHeadersSizeinteger65536Maximum size of request headers
EnableKeepAlivebooleanfalseEnable HTTP keep-alive connections

SSL Settings

HTTPS configuration for secure connections:

SettingTypeDefaultDescription
EnablebooleanfalseEnable HTTPS
MutuallyAuthenticatebooleanfalseRequire client certificates
AcceptInvalidCertificatesbooleantrueAccept self-signed certificates
CertificateFilestring""Path to SSL certificate file
CertificatePasswordstring""Certificate password if required

Headers Settings

Default HTTP headers and CORS configuration:

{
  "IncludeContentLength": true,
  "DefaultHeaders": {
    "Access-Control-Allow-Origin": "*",
    "Access-Control-Allow-Methods": "OPTIONS, HEAD, GET, PUT, POST, DELETE, PATCH",
    "Access-Control-Allow-Headers": "*"
  }
}

Access Control

IP-based access control (optional):

{
  "DenyList": {
    "192.168.1.100": "Blocked IP",
    "10.0.0.0/8": "Blocked network"
  },
  "PermitList": {
    "192.168.1.0/24": "Allowed network"
  },
  "Mode": "DefaultPermit"
}

Modes:

  • DefaultPermit: Allow all except denied IPs
  • DefaultDeny: Deny all except permitted IPs

Database Settings

SettingTypeDefaultDescription
DatabaseFilenamestring"ollamaflow.db"SQLite database file path

Authentication Settings

SettingTypeDefaultDescription
AdminBearerTokensarray["ollamaflowadmin"]Valid bearer tokens for admin APIs

Frontend Configuration

Frontends are virtual Ollama endpoints that clients connect to. They are stored in the database and managed via the API.

Frontend Object Structure

{
  "Identifier": "production-frontend",
  "Name": "Production AI Inference",
  "Hostname": "ai.company.com",
  "TimeoutMs": 90000,
  "LoadBalancing": "RoundRobin",
  "BlockHttp10": true,
  "MaxRequestBodySize": 1073741824,
  "Backends": ["gpu-1", "gpu-2", "gpu-3"],
  "RequiredModels": ["llama3:8b", "mistral:7b"],
  "AllowEmbeddings": true,
  "AllowCompletions": true,
  "PinnedEmbeddingsProperties": {
    "model": "nomic-embed-text",
    "options": {
      "temperature": 0.1
    }
  },
  "PinnedCompletionsProperties": {
    "options": {
      "temperature": 0.7,
      "num_ctx": 2048
    }
  },
  "LogRequestFull": false,
  "LogRequestBody": false,
  "LogResponseBody": false,
  "UseStickySessions": false,
  "StickySessionExpirationMs": 1800000,
  "Active": true
}

Frontend Properties

PropertyTypeDefaultDescription
IdentifierstringRequiredUnique identifier for this frontend
NamestringRequiredHuman-readable name
Hostnamestring"*"Hostname pattern (* for catch-all)
TimeoutMsinteger60000Request timeout in milliseconds
LoadBalancingenum"RoundRobin"Load balancing algorithm
BlockHttp10booleantrueReject HTTP/1.0 requests
MaxRequestBodySizeinteger536870912Max request size in bytes (512MB)
Backendsarray[]List of backend identifiers
RequiredModelsarray[]Models that must be available
AllowEmbeddingsbooleantrueAllow embeddings API requests
AllowCompletionsbooleantrueAllow completions API requests
PinnedEmbeddingsPropertiesobject{}Key-value pairs merged into embeddings requests
PinnedCompletionsPropertiesobject{}Key-value pairs merged into completions requests
UseStickySessionsbooleanfalseEnable session stickiness
StickySessionExpirationMsinteger1800000Session timeout (30 minutes, min: 10s, max: 24h)
LogRequestFullbooleanfalseLog complete requests
LogRequestBodybooleanfalseLog request bodies
LogResponseBodybooleanfalseLog response bodies
ActivebooleantrueWhether frontend is active

Load Balancing Options

  • "RoundRobin": Cycle through backends sequentially
  • "Random": Randomly select from healthy backends

Hostname Patterns

  • "*": Match all hostnames (catch-all)
  • "api.company.com": Exact hostname match
  • Multiple frontends can exist with different hostname patterns

Security Controls

Frontend security controls enable fine-grained access control and request parameter enforcement:

Request Type Controls

  • AllowEmbeddings: Controls whether embeddings API endpoints are accessible through this frontend
    • Ollama API: /api/embed
    • OpenAI API: /v1/embeddings
  • AllowCompletions: Controls whether completion API endpoints are accessible through this frontend
    • Ollama API: /api/generate, /api/chat
    • OpenAI API: /v1/completions, /v1/chat/completions

For a request to succeed, both the frontend and at least one assigned backend must allow the request type.

Pinned Properties

Pinned properties allow administrators to enforce specific parameters in requests, providing security compliance and standardization:

  • PinnedEmbeddingsProperties: Key-value pairs automatically merged into all embeddings requests
  • PinnedCompletionsProperties: Key-value pairs automatically merged into all completion requests

Common use cases:

  • Enforce maximum context size: {"options": {"num_ctx": 2048}}
  • Standardize temperature settings: {"options": {"temperature": 0.7}}
  • Override model selection: {"model": "approved-model:latest"}
  • Set organizational defaults: {"options": {"top_p": 0.9, "top_k": 40}}

Properties are merged with client requests, with pinned properties taking precedence over client-specified values.

Backend Configuration

Backends represent physical Ollama instances in your infrastructure.

Backend Object Structure

{
  "Identifier": "gpu-server-1",
  "Name": "Primary GPU Server",
  "Hostname": "192.168.1.100",
  "Port": 11434,
  "Ssl": false,
  "UnhealthyThreshold": 3,
  "HealthyThreshold": 2,
  "HealthCheckMethod": "GET",
  "HealthCheckUrl": "/api/version",
  "MaxParallelRequests": 8,
  "RateLimitRequestsThreshold": 20,
  "AllowEmbeddings": true,
  "AllowCompletions": true,
  "PinnedEmbeddingsProperties": {
    "options": {
      "num_ctx": 512
    }
  },
  "PinnedCompletionsProperties": {
    "options": {
      "num_ctx": 4096,
      "temperature": 0.8
    }
  },
  "LogRequestFull": false,
  "LogRequestBody": false,
  "LogResponseBody": false,
  "Active": true
}

Backend Properties

PropertyTypeDefaultDescription
IdentifierstringRequiredUnique identifier for this backend
NamestringRequiredHuman-readable name
HostnamestringRequiredBackend server hostname/IP
Portinteger11434Backend server port
SslbooleanfalseUse HTTPS for backend communication
UnhealthyThresholdinteger2Failed checks before marking unhealthy
HealthyThresholdinteger2Successful checks before marking healthy
HealthCheckMethodstring"GET"HTTP method for health checks
HealthCheckUrlstring"/"URL path for health checks
MaxParallelRequestsinteger4Maximum concurrent requests
RateLimitRequestsThresholdinteger10Rate limiting threshold
AllowEmbeddingsbooleantrueAllow embeddings API requests
AllowCompletionsbooleantrueAllow completions API requests
PinnedEmbeddingsPropertiesobject{}Key-value pairs merged into embeddings requests
PinnedCompletionsPropertiesobject{}Key-value pairs merged into completions requests
LogRequestFullbooleanfalseLog complete requests
LogRequestBodybooleanfalseLog request bodies
LogResponseBodybooleanfalseLog response bodies
ActivebooleantrueWhether backend is active

Health Check Configuration

Health checks validate backend availability:

  • Method: HTTP method (GET, HEAD, POST)
  • URL: Path to check (e.g., /, /api/version, /health)
  • Thresholds: Number of consecutive successes/failures to change state

Common health check endpoints:

  • /: Basic connectivity check
  • /api/version: Ollama version endpoint
  • /api/tags: Model listing endpoint

Rate Limiting

Backends can enforce rate limits:

  • Requests exceeding RateLimitRequestsThreshold receive HTTP 429
  • Rate limiting is per backend, not global
  • Helps protect individual Ollama instances from overload

Security Controls

Backend security controls provide additional layers of request filtering and parameter enforcement:

Request Type Controls

  • AllowEmbeddings: Controls whether this backend can process embeddings requests
  • AllowCompletions: Controls whether this backend can process completion requests

Requests are only routed to backends that allow the specific request type. This enables:

  • Dedicated embeddings servers that only handle embeddings requests:
    • Ollama API: /api/embed
    • OpenAI API: /v1/embeddings
  • Completion-only servers that only handle completion requests:
    • Ollama API: /api/generate, /api/chat
    • OpenAI API: /v1/completions, /v1/chat/completions
  • Multi-tenant isolation by request type

Pinned Properties

Backend pinned properties provide server-level parameter enforcement:

  • PinnedEmbeddingsProperties: Applied to all embeddings requests routed to this backend
  • PinnedCompletionsProperties: Applied to all completion requests routed to this backend

Backend pinned properties are merged after frontend pinned properties, allowing for:

  • Server-specific resource limits: {"options": {"num_ctx": 1024}}
  • Hardware-optimized settings: {"options": {"num_gpu": 2}}
  • Backend-specific model overrides: {"model": "server-optimized-model"}

The merge order is: Client Request → Frontend Pinned Properties → Backend Pinned Properties, with later values taking precedence.

Environment Variables

Override configuration with environment variables:

VariableDescriptionExample
OLLAMAFLOW_CONFIGConfiguration file path/etc/ollamaflow/config.json
OLLAMAFLOW_PORTOverride webserver port8080
OLLAMAFLOW_HOSTNAMEOverride webserver hostname0.0.0.0
OLLAMAFLOW_DATABASEOverride database file path/data/ollamaflow.db
OLLAMAFLOW_ADMIN_TOKENOverride admin tokensecure-production-token
ASPNETCORE_ENVIRONMENT.NET environmentProduction

Docker Environment Example

docker run -d \
  -e OLLAMAFLOW_PORT=8080 \
  -e OLLAMAFLOW_ADMIN_TOKEN=my-secure-token \
  -e ASPNETCORE_ENVIRONMENT=Production \
  -p 8080:8080 \
  jchristn/ollamaflow

Configuration Examples

Basic Single Backend

Minimal configuration for testing:

{
  "Webserver": {
    "Port": 43411
  },
  "AdminBearerTokens": ["test-token"]
}

Frontend/Backend via API:

# Create backend
curl -X PUT -H "Authorization: Bearer test-token" \
  -H "Content-Type: application/json" \
  -d '{"Identifier": "local", "Hostname": "localhost", "Port": 11434}' \
  http://localhost:43411/v1.0/backends

# Create frontend
curl -X PUT -H "Authorization: Bearer test-token" \
  -H "Content-Type: application/json" \
  -d '{"Identifier": "main", "Backends": ["local"]}' \
  http://localhost:43411/v1.0/frontends

Production Multi-Backend

Production configuration with multiple GPU servers:

{
  "Webserver": {
    "Hostname": "*",
    "Port": 43411,
    "Ssl": {
      "Enable": true,
      "CertificateFile": "/etc/ssl/ollamaflow.crt",
      "CertificatePassword": "cert-password"
    }
  },
  "Logging": {
    "LogDirectory": "/var/log/ollamaflow/",
    "ConsoleLogging": false,
    "MinimumSeverity": 1
  },
  "DatabaseFilename": "/var/lib/ollamaflow/ollamaflow.db",
  "AdminBearerTokens": [
    "secure-production-token-1",
    "secure-production-token-2"
  ]
}

Backends configuration:

# GPU servers
for i in {1..4}; do
  curl -X PUT -H "Authorization: Bearer secure-production-token-1" \
    -H "Content-Type: application/json" \
    -d "{
      \"Identifier\": \"gpu-$i\",
      \"Name\": \"GPU Server $i\",
      \"Hostname\": \"gpu$i.company.internal\",
      \"Port\": 11434,
      \"MaxParallelRequests\": 8,
      \"HealthCheckUrl\": \"/api/version\"
    }" \
    http://localhost:43411/v1.0/backends
done

# Production frontend
curl -X PUT -H "Authorization: Bearer secure-production-token-1" \
  -H "Content-Type: application/json" \
  -d '{
    "Identifier": "production",
    "Name": "Production AI Inference",
    "Hostname": "ai.company.com",
    "LoadBalancing": "RoundRobin",
    "Backends": ["gpu-1", "gpu-2", "gpu-3", "gpu-4"],
    "RequiredModels": ["llama3:8b", "mistral:7b", "codellama:13b"],
    "TimeoutMs": 120000
  }' \
  http://localhost:43411/v1.0/frontends

Development Environment

Development setup with debugging enabled:

{
  "Webserver": {
    "Port": 43411,
    "Debug": {
      "Routing": true,
      "Requests": true,
      "Responses": false
    }
  },
  "Logging": {
    "ConsoleLogging": true,
    "EnableColors": true,
    "MinimumSeverity": 0
  },
  "AdminBearerTokens": ["dev-token"]
}

Configuration Validation

OllamaFlow validates configuration on startup:

Common Validation Errors

  1. Invalid Port Range: Ports must be 1-65535
  2. Missing Required Fields: Identifier, Hostname required for backends
  3. Duplicate Identifiers: Frontend/Backend IDs must be unique
  4. Invalid Load Balancing: Must be "RoundRobin" or "Random"
  5. Invalid Hostnames: Must be valid hostname or "*"
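Several of these rules can also be checked ahead of startup with jq against exported objects. A minimal sketch using illustrative sample data (in practice, point it at a backends file exported from the API):

```shell
# Illustrative sample data; substitute your exported backends file.
cat > sample-backends.json <<'EOF'
[
  {"Identifier": "gpu-1", "Hostname": "gpu1.company.internal", "Port": 11434},
  {"Identifier": "gpu-2", "Hostname": "gpu2.company.internal", "Port": 11434}
]
EOF

# Rule 1: ports must be in 1-65535
jq -e 'all(.[]; .Port >= 1 and .Port <= 65535)' sample-backends.json >/dev/null \
  && echo "ports ok"

# Rule 2: Identifier and Hostname are required for backends
jq -e 'all(.[]; has("Identifier") and has("Hostname"))' sample-backends.json >/dev/null \
  && echo "required fields ok"

# Rule 3: identifiers must be unique
[ -z "$(jq -r '.[].Identifier' sample-backends.json | sort | uniq -d)" ] \
  && echo "identifiers unique"
```

This catches errors 1, 2, and 3 before the server ever sees the objects; the remaining rules (load balancing values, hostname patterns) are still enforced by OllamaFlow itself on startup.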

Configuration Test

Validate configuration without starting the server:

# Test configuration file
dotnet OllamaFlow.Server.dll --validate-config

# Test with specific config
OLLAMAFLOW_CONFIG=/path/to/config.json dotnet OllamaFlow.Server.dll --validate-config

Migration and Backup

Database Backup

Backup the SQLite database regularly:

# Simple copy (stop OllamaFlow first)
cp ollamaflow.db ollamaflow.db.backup

# Online backup (while running)
sqlite3 ollamaflow.db ".backup /path/to/backup.db"
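For scheduled backups, a small wrapper around sqlite3's .backup with timestamping and rotation is sufficient. A sketch (paths and the retention count are illustrative, not OllamaFlow defaults):

```shell
DB=ollamaflow.db
BACKUP_DIR=./backups
KEEP=7  # number of backups to retain (illustrative)

mkdir -p "$BACKUP_DIR"
STAMP=$(date +%Y%m%d-%H%M%S)

# Online backup via the SQLite backup API; safe while OllamaFlow is running.
sqlite3 "$DB" ".backup $BACKUP_DIR/ollamaflow-$STAMP.db"

# Drop all but the most recent $KEEP backups.
ls -1t "$BACKUP_DIR"/ollamaflow-*.db | tail -n +"$((KEEP + 1))" | xargs -r rm --
```

Run it from cron (for example, daily) and copy the backup directory off-host.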

Configuration Migration

When upgrading OllamaFlow:

  1. Backup Configuration: Save current ollamaflow.json
  2. Backup Database: Save current ollamaflow.db
  3. Review Changes: Check for new configuration options
  4. Test Upgrade: Test in non-production environment first

Export/Import Configuration

Export current configuration for replication:

# Export all frontends
curl -H "Authorization: Bearer token" \
  http://localhost:43411/v1.0/frontends > frontends.json

# Export all backends
curl -H "Authorization: Bearer token" \
  http://localhost:43411/v1.0/backends > backends.json

Import configuration to new instance:

# Import backends first
# Import backends first (-c emits one compact object per line so read gets whole objects)
jq -c '.[]' backends.json | while read -r backend; do
  curl -X PUT -H "Authorization: Bearer token" \
    -H "Content-Type: application/json" \
    -d "$backend" \
    http://new-host:43411/v1.0/backends
done

# Then import frontends
jq -c '.[]' frontends.json | while read -r frontend; do
  curl -X PUT -H "Authorization: Bearer token" \
    -H "Content-Type: application/json" \
    -d "$frontend" \
    http://new-host:43411/v1.0/frontends
done
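After importing, a quick count comparison confirms nothing was dropped. A sketch (new-host is a placeholder carried over from the import commands above; the endpoints are assumed to return JSON arrays):

```shell
# Compare the number of exported objects against the new instance.
exported=$(jq 'length' backends.json)
imported=$(curl -s -H "Authorization: Bearer token" \
  http://new-host:43411/v1.0/backends | jq 'length')

if [ "$exported" = "$imported" ]; then
  echo "backend count matches ($exported)"
else
  echo "backend count mismatch: exported $exported, imported ${imported:-none}" >&2
fi
```

Repeat the same check against /v1.0/frontends with frontends.json.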
