Open WebUI gives your whole team a self-hosted, ChatGPT-like chat interface accessible from any browser. This guide shows you how to run Open WebUI and the Privatemode proxy together on a local endpoint, so every conversation is end-to-end encrypted before it leaves your local network.

Introduction
Open WebUI runs on your server as a web app that your team accesses from any browser. Connecting it directly to a cloud LLM provider, however, exposes every prompt and response to that provider. This guide replaces that direct connection with Confidential AI, so traffic stays end-to-end encrypted.
The Privatemode AI proxy runs alongside Open WebUI on your internal server and intercepts every prompt. It verifies the backend through remote attestation, encrypts the request, and only then sends it out. By the time your data reaches the cloud, no one upstream can read it. Not even us.
Requests are processed inside a confidential computing environment that isolates them at the hardware level, sealed off from the host, the hypervisor, and our own operators. The full stack is open source and remotely attestable, so the guarantees are verifiable, not just promised.
Benefits
Every prompt and response passes through the local proxy, which encrypts it before it leaves the machine. Inference runs inside a hardware-enforced confidential computing environment on the Privatemode backend. Neither the infrastructure provider nor Privatemode can access your conversations.
Open WebUI gets access to state-of-the-art LLMs through Privatemode, all running inside a confidential computing environment. No model hosting or GPU infrastructure needed on your side.
The proxy's shared prompt cache reduces latency for repeated or similar queries across all users. Teams see faster responses on common workloads without any additional configuration.
How to get started
This tutorial walks you through deploying the Privatemode AI proxy alongside Open WebUI on a server your team can reach over your local network. From the moment a prompt leaves the proxy until the response comes back, traffic is end-to-end encrypted and only decrypted inside the backend's confidential computing environment.
If you don't have a Privatemode API key yet, you can generate one for free here.
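The commands in the following steps are plain Docker CLI. They assume your API key is exported as an environment variable; the variable name here is our choice, not something the proxy requires:

```bash
# Store your Privatemode API key; the proxy command below reads it from here
export PRIVATEMODE_API_KEY="<your-api-key>"
```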
Both containers communicate over a dedicated bridge network. This keeps the Privatemode proxy off the host network, so it is only reachable by containers you explicitly add to this network.
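Creating the network is a single Docker command; the name privatemode-net is reused in the steps below:

```bash
# Dedicated bridge network; only containers attached to it can reach the proxy
docker network create privatemode-net
```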
The proxy performs remote attestation on startup to verify the Privatemode backend, then transparently encrypts all requests before they leave the server.
--sharedPromptCache enables a shared cache across all Open WebUI users, reducing latency on repeated queries.
The -v proxy-logs volume persists the attestation log so you can audit which backend versions were trusted over time. The proxy has no exposed host port: it is reachable only by containers on privatemode-net.
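A sketch of the proxy launch under these assumptions: the image name and --apiKey flag follow the Privatemode docs at the time of writing, and the mount target for the proxy-logs volume is illustrative, so check the current docs before copying:

```bash
# Start the Privatemode proxy. No -p flag: the proxy gets no host port
# and is reachable only via the privatemode-net bridge network.
# The /var/log/privatemode mount target is illustrative; see the docs
# for where the proxy actually writes its attestation log.
docker run -d \
  --name privatemode-proxy \
  --network privatemode-net \
  -v proxy-logs:/var/log/privatemode \
  ghcr.io/edgelesssys/privatemode/privatemode-proxy:latest \
  --apiKey "$PRIVATEMODE_API_KEY" \
  --sharedPromptCache
```

You can watch startup, including the remote attestation step, with docker logs privatemode-proxy.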
OPENAI_API_BASE_URL points Open WebUI at the proxy using the container name as the hostname, so no IP address is needed. OPENAI_API_KEY can be any non-empty string; the proxy handles authentication to Privatemode using your API key. The -v open-webui-data volume persists user accounts and chat history across container restarts.
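Putting those settings together, here is a sketch of the Open WebUI launch. The proxy's listen port (8080) and the /v1 suffix are assumptions based on its OpenAI-compatible API; verify both against the Privatemode docs:

```bash
# Start Open WebUI on the same bridge network, published on host port 3000
docker run -d \
  --name open-webui \
  --network privatemode-net \
  -p 3000:8080 \
  -e OPENAI_API_BASE_URL=http://privatemode-proxy:8080/v1 \
  -e OPENAI_API_KEY=unused \
  -v open-webui-data:/app/backend/data \
  ghcr.io/open-webui/open-webui:main
```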
Open http://your-server:3000 in a browser and create an admin account. Open WebUI auto-discovers available models from the Privatemode proxy. Select a model and start chatting.
FAQ
Can my whole team use this setup?
Yes. Open WebUI supports multiple user accounts, admin management, and separate chat histories. The shared prompt cache also reduces latency for queries that share common prefixes across users, making team-wide usage more efficient.