MCP, AWS, AI, OAuth 2.0, Zero-Trust

Enterprise AI agent integration

Enterprise Consulting, 2026

Problem

An energy trading client needed to verify whether vendor invoices matched the settlement data in their legacy enterprise platform (30+ years old). This reconciliation was entirely manual. The goal was to build an AI agent that could autonomously query the platform's SOAP/XML API, pull settlement records, and compare them against invoice data.
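As an illustration only, the comparison step can be sketched as below. The client's actual matching rules are not described here, so the `LineItem` type, the shared `reference` key, and the one-cent tolerance are all assumptions made for the example.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class LineItem:
    reference: str   # hypothetical key shared by an invoice line and a settlement record
    amount: Decimal  # Decimal avoids float rounding on currency values


def reconcile(invoices, settlements, tolerance=Decimal("0.01")):
    """Return (reference, issue) tuples for every invoice that does not match."""
    settled = {item.reference: item.amount for item in settlements}
    issues = []
    for inv in invoices:
        if inv.reference not in settled:
            issues.append((inv.reference, "missing settlement record"))
        elif abs(inv.amount - settled[inv.reference]) > tolerance:
            issues.append((inv.reference, "amount mismatch"))
    return issues
```

An empty result means every invoice line found a settlement record within tolerance; anything else is a candidate for human review.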

No infrastructure existed. No playbook existed. MCP (Model Context Protocol) had only been released months earlier. Connecting it to a decades-old enterprise system through enterprise security guardrails had no reference architecture, no tutorial, no precedent. Every decision was researched and built from scratch.

Approach

I designed, built, secured, and deployed the complete cloud infrastructure for this system. The architecture went through three major redesigns as real-world constraints surfaced. V1 used direct internet exposure with TLS and an Elastic IP; corporate security automation immediately reverted the public-facing security group. V2 used a Cloudflare Tunnel for outbound-only connectivity. V3 (final) uses a Cloudflare reverse proxy at the edge, with TLS terminated at the origin and inbound port 443 restricted to exactly two proxy IPs.

The application server runs on a Windows Server EC2 instance with a four-drive EBS storage layout (1.85 TB across 4 gp3 volumes) mirroring the client's production configuration: separate drives for OS/application, SQL Server installation, transaction logs, and data files. I provisioned the instance, configured drives, transferred and restored a 187 GB database backup, installed the required runtimes, and coordinated the application deployment with the senior infrastructure lead.

The MCP server runs on a separate Linux EC2 in the same VPC. It uses Streamable HTTP transport (not SSE or STDIO, which were the original assumptions and had to be corrected during integration). I deployed a TLS-terminating reverse proxy in front of it, built JWT validation middleware for OAuth 2.0 token verification against Cognito's JWKS endpoint, and created structured JSON logging that passed enterprise security review.
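A minimal sketch of one-JSON-object-per-line logging with the Python standard library, roughly the shape the structured logging described above could take. The field names, the `context` convention, and the `build_logger` helper are assumptions for illustration, not the format that went through review.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object on one line."""

    def format(self, record):
        entry = {
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Hypothetical convention: callers pass extra={"context": {...}}.
        context = getattr(record, "context", None)
        if context:
            entry["context"] = context
        if record.exc_info:
            entry["exception"] = self.formatException(record.exc_info)
        return json.dumps(entry, default=str)


def build_logger(name="mcp_server", stream=None):
    """Create a logger that writes JSON lines to `stream` (stderr if None)."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate plain-text lines from the root logger
    return logger
```

A call such as `build_logger().info("token validated", extra={"context": {"tool": "example"}})` emits one parseable line. Credentials and raw tokens should stay out of the `context` dictionary.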

All access to both instances uses AWS Systems Manager (Fleet Manager for the Windows box, Session Manager for Linux). Zero SSH, zero RDP, zero inbound ports for management. I set up the OAuth 2.0 authorization flow with PKCE through AWS Cognito, created the OIDC integration package for the AI platform team, and handled the full security review process including a 251-page vulnerability scan with 176 findings.

Infrastructure architecture

AWS VPC (single account)

  MCP server EC2 (Linux)
    Reverse proxy (TLS :443)
    MCP server (Streamable HTTP :3000)
    JWT validation (auth module)
    Structured JSON logging
    .env (credentials, gitignored)
    systemd service (auto-restart)

  Application server EC2 (Windows)
    IIS + enterprise application
    SOAP/XML web services
    SQL Server (database engine)
    EBS storage (1.85 TB, 4 gp3 volumes):
      C: 250 GB   OS + app binaries
      D: 300 GB   SQL install + backups
      E: 300 GB   Transaction logs
      F: 1 TB     Data files

  MCP server to application server: NTLM auth (service account), SOAP/XML (private VPC)

Cloudflare: WAF + TLS, DDoS filtering, :443

AI Platform: Tool calling

AWS Cognito: OAuth 2.0 / OIDC / PKCE
  Token request (PKCE flow)
  JWKS validation

AWS Systems Manager (zero inbound ports)

Security request flow

1. AI platform requests token: OAuth 2.0 Authorization Code + PKCE.
2. Cognito issues signed JWT: RS256, includes issuer + audience claims.
3. Request hits Cloudflare edge: WAF rules filter malicious traffic. DDoS protection. Managed TLS.
4. Security group allows only 2 proxy IPs: inbound :443 restricted to Cloudflare proxy addresses only.
5. TLS termination at origin, forward to :3000: reverse proxy decrypts, passes to MCP server on localhost.
6. MCP server validates JWT: fetches JWKS from Cognito, verifies RS256 signature, issuer, audience, expiry.
7. SOAP call to application server (private VPC): NTLM auth with service account, queries settlement data.

3 architecture redesigns
V1: Direct exposure (blocked by security)
V2: Cloudflare Tunnel (outbound-only)
V3: Reverse proxy + restricted SG (final)

Security review
251-page vulnerability scan (176 findings)
Passed on first submission
5 conditions, all addressed
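Step 6 of the request flow checks three claims after the signature. The sketch below covers only those claim checks and assumes the RS256 signature has already been verified against Cognito's JWKS by a JWT library, which is not shown. The `claim_problems` name, the 30-second leeway, and the example values are assumptions.

```python
import time


def claim_problems(claims, issuer, audience, now=None, leeway=30):
    """Return a list of reasons to reject already signature-verified claims; empty means OK."""
    now = time.time() if now is None else now
    problems = []

    if claims.get("iss") != issuer:
        problems.append("unexpected issuer")

    # "aud" may be a single string or a list of strings.
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    if audience not in audiences:
        problems.append("unexpected audience")

    exp = claims.get("exp")
    if not isinstance(exp, (int, float)):
        problems.append("missing exp")
    elif now > exp + leeway:
        problems.append("token expired")

    return problems
```

Returning the reasons rather than a bare boolean lets the middleware record why a token was rejected in the structured log without logging the token itself.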

Tech stack

Python, FastMCP, Streamable HTTP, AWS EC2, VPC, EBS, AWS Cognito, IAM, Systems Manager, Cloudflare WAF, OAuth 2.0 / OIDC / PKCE, JWT (RS256), SOAP/XML, NTLM, SQL Server, Windows Server, IIS, systemd

What made this hard

MCP was released in November 2024. Connecting it to a 30-year-old enterprise system through enterprise security guardrails had no reference architecture. The architecture was redesigned three times in response to constraints that surfaced during deployment: corporate security automation that auto-reverted public-facing security groups, the edge proxy team configuring a reverse proxy instead of the requested tunnel, and the AI platform team requiring OIDC/PKCE instead of API tokens.

Each redesign was handled without delays to the project timeline. The senior infrastructure lead estimated the application server setup would take 1-2 weeks. It was operational in days. The security review (251-page scan, 176 findings) passed on the first submission.

Status

End-to-end execution validated: MCP server authenticated via NTLM, sent SOAP POST to the application server, invoked a web method, and confirmed server-side execution through database log entries. PoC approved.
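To give a sense of what the SOAP POST body looks like, here is a minimal sketch that builds an envelope with the standard library. It assumes SOAP 1.1; the service namespace, the `GetSettlementRecord` method, and the `SettlementId` parameter in the usage note are placeholders, not the platform's real web methods. NTLM authentication and the HTTP request itself are not shown.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace


def build_envelope(method, service_ns, params):
    """Return a SOAP 1.1 request body (UTF-8 bytes) that invokes `method` with `params`."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{service_ns}}}{method}")
    for name, value in params.items():
        # Each parameter becomes a child element in the service namespace.
        ET.SubElement(call, f"{{{service_ns}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="utf-8", xml_declaration=True)
```

For example, `build_envelope("GetSettlementRecord", "http://example.com/settlement", {"SettlementId": 42})` yields bytes that would be posted with a `text/xml` content type. Building the XML through ElementTree rather than string formatting means parameter values are escaped automatically.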
