
GitLab Duo brings AI features like code suggestions, chat, code review, and agentic flows into your GitLab instance. With GitLab Duo Self-Hosted you can serve those features from a model you control. This guide enables you to back Duo with Privatemode, so prompts and source code stay end-to-end encrypted and confidential.
Introduction

GitLab Duo brings code suggestions, chat, code review, and agentic flows into your instance, but those features send your prompts and source code to a model provider. For teams working on proprietary code, regulated software, or sensitive IP, that means your codebase leaves your control.
Privatemode provides an OpenAI-compatible API backed by state-of-the-art models running inside confidential computing environments. With GitLab Duo Self-Hosted you point Duo's AI Gateway at the Privatemode proxy, and your prompts, source code, and AI responses are encrypted end-to-end. No third party can see your data.
The Privatemode proxy encrypts all data before it leaves your network. On the server side, inference runs inside hardware-isolated confidential computing environments (AMD SEV / Intel TDX together with Nvidia Confidential Computing), and the proxy verifies server integrity through remote attestation before every session. Your code is never stored and never used for model training.
Benefits
Code suggestions, chat, code review, and agentic flows all run through the Privatemode proxy, which encrypts every prompt and every file before it leaves your network. Inference runs inside confidential computing environments, and the proxy verifies the server through remote attestation before each session. Your source code stays confidential, even during inference.
With GitLab Duo Self-Hosted, you assign your own model to every feature, so nothing routes to a GitLab-managed gateway. The proxy exposes a standard OpenAI-compatible API, so Duo's AI Gateway connects to it like any other endpoint.
Because model selection lives on the instance, the GitLab Workflow extension for VS Code and JetBrains inherits the same setup. In-editor chat and code suggestions route through the same gateway and proxy.
Overview
Duo Self-Hosted talks to a self-hosted AI Gateway, which forwards inference to any OpenAI-compatible endpoint. The Privatemode proxy is that endpoint; it exposes a standard /v1 API and handles the encryption and attestation against the Privatemode service.
You run the AI Gateway and the proxy yourself. The steps below are deployment-agnostic (Linux package, Docker, Helm/Kubernetes); follow the linked docs for the specifics of your setup.
Step 0
Before you start, make sure you have the following ready:
Hybrid vs. fully self-hosted: any feature left on a GitLab-managed model routes to GitLab's hosted gateway, not yours. To keep everything confidential, assign your self-hosted model to every feature you use (step 3).
Step 1
If you don't have a Privatemode API key yet, you can generate one for free here.
Follow the Privatemode API quickstart. Enabling the shared prompt cache (--sharedPromptCache) is recommended for this workload.
The proxy serves an OpenAI-compatible API on :8080/v1. Make sure it's reachable from the AI Gateway and note that URL (e.g. http://privatemode-proxy:8080/v1 on a shared Docker network) — you'll need it in step 3.
Confirm the proxy is up and which models it serves before moving on. The returned model IDs are what you'll enter as the Model identifier in step 3.
Step 2
Install per Install the GitLab AI Gateway. Key points:
self-hosted-vX.Y.*-ee tag matching your GitLab major.minor(patch need not match). Re-check on each GitLab upgrade.localhost for the gateway.AIGW_SELF_SIGNED_JWT__* (the gateway itself, used by e.g. Duo Chat) and DUO_WORKFLOW_SELF_SIGNED_JWT__* (the Agent Platform service).REQUESTS_CA_BUNDLE or the container's CA bundle.The gateway prefetches GitLab's JWKS keys at startup; if GitLab isn't reachable yet, the Agent Platform (gRPC) half can fail to start. Bring GitLab up first, or have your orchestration wait for it.
Step 3
In Admin → GitLab Duo → Self-hosted (configure docs):

Set the Local AI Gateway URL (and the Agent Platform service URL if used). Leave TLS off only if the gateway serves plain HTTP.
If that URL is a private IP or internal hostname, add it to the outbound requests allowlist or GitLab may block it.

Add self-hosted model, with:
/v1 URL (e.g. http://privatemode-proxy:8080/v1), reachable from the gatewayAPIkimi-latest (see Privatemode models)Generic OpenAI-compatible models are a beta feature; GitLab doesn't guarantee support for model-specific issues.
Select the model for every feature you use (Code Suggestions, Chat, …) so none falls back to a GitLab-managed model.

Run the health check in Admin → GitLab Duo and/or gitlab-rake gitlab:duo:verify_self_hosted_setup. Then try Duo Chat or a Code Suggestion.
Step 4
Foundational flows like Code Review run as CI jobs, so they need a GitLab Runner:
gitlab--duo (two dashes) — flow jobs won't be picked up without it.Then enable Allow flow execution and Allow foundational flows at the instance/group/project level, plus the specific flow (e.g. Code Review, also under a project's Settings → Merge requests). Test by assigning @GitLabDuo as an MR reviewer, or assigning an issue to Duo and confirming it opens a merge request.
Step 5
Since model selection lives on the instance, the GitLab Workflow extension (VS Code, JetBrains, …) inherits this setup: point it at your instance, sign in, and in-editor Chat and Code Suggestions route through the same gateway and Privatemode proxy — confidential AI directly in the editor.
