Attach Gateway
Version 0.1 Author: Hammad Tariq
1 Problem statement
Multi‑agent apps need a single way to verify who the end‑user is and safely pass that identity (plus limited scopes) between very different LLM engines and orchestration frameworks.
– Google A2A Issue #19
Today:
Local engines (Ollama / vLLM) ship no auth → devs hide them behind random ports.
Cross‑agent hand‑off (A2A, MCP) lacks a portable identity token & memory handle.
Goal: one drop‑in side‑car that gives OIDC‑backed SSO and a session header that all agents can share – with a stubbed memory mirror for recall/eval demos.
2 MVP success criteria
End‑to‑end request (curl → gateway → Ollama) with JWT auth
✅
LangChain notebook works via gateway
✅
/tasks/send
A2A endpoint proxies to same engine
✅
Prompt/response mirrored to Weaviate
✅
Docker‑compose demo + README
✅
3 High‑level architecture
flowchart LR
subgraph Edge
C[Client / Agent]
C -->|Bearer JWT| GW(Attach Gateway)
end
GW -->|HTTP| E[LLM Engine]
GW -->|"X-Attach-\*"| E
GW -.->|async mirror| M["Memory Stub (Weaviate)"]
Gateway pipeline
Auth – verify JWT (OIDC) / HMAC.
Session –
session_id = sha256(user.sub + user‑agent)
Headers out –
X-Attach-User
,X-Attach-Session
,X-Attach-Agent?
.Mirror – non‑blocking stream → memory stub (also accepts
/v1/logs
).Proxy – reverse‑proxy to target engine.
4 Component breakdown
auth/
OIDC JWT validation, JWKS cache
python‑jose
, httpx
middleware/
session header injection & mirror trigger
fastapi
, asyncio
proxy/
streaming reverse‑proxy to engine
httpx.AsyncClient
a2a/
Minimal /tasks/send
& /tasks/status
handlers
fastapi
mem/
Async writer to Weaviate REST
weaviate‑client
4.1 Default engine target (Ollama)
Local dev:
ENGINE_URL=http://ollama:11434
— locks down a laptop's Ollama instance with OIDC JWT SSO in under a minute.Prod swap: repoint
ENGINE_URL
tohttp://vllm:8000
,https://api.openai.com
, etc.; no code change required.
4.2 Auth works independently
Auth can be used without enabling memory through
MEM_BACKEND=none
5 API surface (v0)
5.1 Ingress
/tasks/send
POST
Bearer JWT (chat:run
)
Body ⇒ proxy to /api/chat
on engine
/{path:path}
GET/POST
Bearer JWT (chat:run
)
Raw proxy
5.2 Headers added
X-Attach-User: <sub>
X-Attach-Session: <uuid>
### 5.3 Example curl against Ollama through gateway
```bash
# Fetch a short‑lived user JWT via device‑flow helper
export OLLAMA_TOKEN=$(./scripts/dev_login.sh)
# Make a protected request via the gateway (8080)
curl -H "Authorization: Bearer $OLLAMA_TOKEN" \
-d '{"prompt":"Hello"}' \
http://localhost:8080/api/chat
# Gateway ➜ validates JWT ➜ stamps X‑Attach‑User/Session ➜ proxies to Ollama :11434
5.4 Task queue schemes
http://<service> # default HTTP call
temporal://<Workflow> # execute Temporal workflow
6 Security notes
JWT signature (RS256) verified offline via JWKS.
aud
claim matched against envOIDC_AUD
.Session ID not guessable (
sha256
→ hex).Mirror process redacts secrets before storage.
7 Future work
Swap Weaviate for Attach Store v1 (Git‑like, policy guards).
Add DID‑JWT verifier module.
Rate‑limit & billing hooks.
Helm chart & K8s side‑car injector.
8 Open questions
Use separate header for Agent‑token or nested JWT?
Include streaming metrics in mirror payload now or later?
Naming:
attach-gateway
vsa2a-auth-gw
?
Note on A2A‑only mode
The same JWT validator powers both direct Ollama calls and A2A /tasks/send
requests. If you skip JWT wiring at Ollama, the A2A call cannot propagate a verifiable user identity downstream, defeating delegated auth.
Last updated