Enterprise AI Agent Integration

Problem

An energy trading client needed to verify whether vendor invoices matched the settlement data in their legacy enterprise platform (30+ years old). This reconciliation was entirely manual. The goal was to build an AI agent that could autonomously query the platform's SOAP/XML API, pull settlement records, and compare them against invoice data.
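
The comparison step can be illustrated with a minimal sketch. The `LineItem` shape, the `reconcile` helper, the `deal_id` key, and the one-cent tolerance are all assumptions made for illustration; this write-up does not describe the client's actual matching rules.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class LineItem:
    """Hypothetical shape shared by invoice lines and settlement records."""
    deal_id: str
    amount: Decimal


def reconcile(invoices, settlements, tolerance=Decimal("0.01")):
    """Compare invoice lines to settlement records by deal_id.

    Returns a dict mapping each deal_id to one of:
    "match", "amount_mismatch", "missing_settlement", "missing_invoice".
    """
    settled = {s.deal_id: s.amount for s in settlements}
    results = {}
    for inv in invoices:
        if inv.deal_id not in settled:
            results[inv.deal_id] = "missing_settlement"
        elif abs(inv.amount - settled[inv.deal_id]) <= tolerance:
            results[inv.deal_id] = "match"
        else:
            results[inv.deal_id] = "amount_mismatch"
    # Settlement records with no corresponding invoice line.
    invoiced = {inv.deal_id for inv in invoices}
    for deal_id in settled:
        if deal_id not in invoiced:
            results[deal_id] = "missing_invoice"
    return results
```

Using `Decimal` rather than `float` avoids rounding surprises when comparing currency amounts.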

No infrastructure existed. No playbook existed. MCP (Model Context Protocol) had only been released months earlier. Connecting it to a decades-old enterprise system through enterprise security guardrails had no reference architecture, no tutorial, no precedent. Every decision was researched and built from scratch.

Approach

I designed, built, secured, and deployed the complete cloud infrastructure for this system. The architecture went through three major redesigns as real-world constraints surfaced. V1 exposed the server directly to the internet with TLS and an Elastic IP; corporate security automation immediately reverted it. V2 used a Cloudflare Tunnel for outbound-only connectivity. V3 (final) uses a Cloudflare reverse proxy at the edge with TLS termination at origin and inbound port 443 restricted to exactly two proxy IPs.
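
The V3 ingress restriction can be sketched as a single security-group rule. The two addresses below are documentation-range placeholders (the real proxy IPs are not given here), and `build_https_ingress` and `is_allowed` are hypothetical helpers. The dictionary follows the `IpPermissions` shape accepted by boto3's `authorize_security_group_ingress`.

```python
import ipaddress

# Placeholder proxy addresses from the RFC 5737 documentation ranges.
PROXY_IPS = ["203.0.113.10", "198.51.100.20"]


def build_https_ingress(proxy_ips):
    """Build an IpPermissions entry allowing TCP 443 from the given hosts only."""
    ranges = []
    for ip in proxy_ips:
        # Validate the address and pin it to a single host (/32).
        host = ipaddress.IPv4Address(ip)
        ranges.append({"CidrIp": f"{host}/32", "Description": "edge proxy"})
    return {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443, "IpRanges": ranges}


def is_allowed(source_ip, permission):
    """Return True if source_ip falls inside any CIDR listed in the permission."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in ipaddress.ip_network(r["CidrIp"]) for r in permission["IpRanges"])


rule = build_https_ingress(PROXY_IPS)
# With boto3 installed, the rule could be applied roughly as follows
# (the group ID is deliberately left as a placeholder):
# boto3.client("ec2").authorize_security_group_ingress(
#     GroupId="sg-...", IpPermissions=[rule])
```

Pinning each source to a `/32` keeps the rule from widening silently if someone later pastes in a broader range.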

The application server runs on a Windows Server EC2 instance with a four-drive EBS storage layout (1.85 TB across 4 gp3 volumes) mirroring the client's production configuration: separate drives for OS/application, SQL Server installation, transaction logs, and data files. I provisioned the instance, configured drives, transferred and restored a 187 GB database backup, installed the required runtimes, and coordinated the application deployment with the senior infrastructure lead.

The MCP server runs on a separate Linux EC2 in the same VPC. It uses Streamable HTTP transport (not SSE or STDIO, which were the original assumptions and had to be corrected during integration). I deployed a TLS-terminating reverse proxy in front of it, built JWT validation middleware for OAuth 2.0 token verification against Cognito's JWKS endpoint, and created structured JSON logging that passed enterprise security review.
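
The token verification step might look roughly like the sketch below. It is not the deployed middleware: the function names are hypothetical, and the signature check assumes the third-party PyJWT package. The JWKS URL pattern and the use of `client_id` and `token_use` claims reflect how Cognito issues access tokens.

```python
import time


def cognito_issuer(region, user_pool_id):
    """Issuer URL for a Cognito user pool."""
    return f"https://cognito-idp.{region}.amazonaws.com/{user_pool_id}"


def jwks_url(region, user_pool_id):
    """Location of the pool's public signing keys (JWKS)."""
    return cognito_issuer(region, user_pool_id) + "/.well-known/jwks.json"


def check_claims(claims, issuer, client_id, now=None):
    """Validate the claims of an already signature-verified Cognito access token."""
    now = time.time() if now is None else now
    if claims.get("iss") != issuer:
        raise ValueError("unexpected issuer")
    if claims.get("token_use") != "access":
        raise ValueError("not an access token")
    # Cognito access tokens carry the app client in `client_id`, not `aud`.
    if claims.get("client_id") != client_id:
        raise ValueError("unexpected client")
    if claims.get("exp", 0) <= now:
        raise ValueError("token expired")
    return claims


def verify_token(token, region, user_pool_id, client_id):
    """Verify the RS256 signature against the JWKS, then check the claims.

    Assumes PyJWT (with its cryptography extra) is installed. A real
    middleware would create the PyJWKClient once and reuse it so keys
    are not fetched on every request.
    """
    import jwt  # PyJWT, imported lazily so the claim checks stay stdlib-only

    issuer = cognito_issuer(region, user_pool_id)
    key = jwt.PyJWKClient(jwks_url(region, user_pool_id)).get_signing_key_from_jwt(token)
    claims = jwt.decode(token, key.key, algorithms=["RS256"], issuer=issuer)
    return check_claims(claims, issuer, client_id)
```

Restricting `algorithms` to `["RS256"]` matters: it stops a caller from presenting a token signed with a different algorithm than the one the pool uses.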

All access to both instances uses AWS Systems Manager (Fleet Manager for the Windows box, Session Manager for Linux). Zero SSH, zero RDP, zero inbound ports for management. I set up the OAuth 2.0 authorization flow with PKCE through AWS Cognito, created the OIDC integration package for the AI platform team, and handled the full security review process including a 251-page vulnerability scan with 176 findings.

Infrastructure architecture

Security request flow

Tech stack

Python, FastMCP, Streamable HTTP, AWS EC2, VPC, EBS, AWS Cognito, IAM, Systems Manager, Cloudflare WAF, OAuth 2.0 / OIDC / PKCE, JWT (RS256), SOAP/XML, NTLM, SQL Server, Windows Server, IIS, systemd

What made this hard

MCP was released in November 2024, so there was no reference architecture for connecting it to a 30-year-old enterprise system through enterprise security guardrails. The architecture was redesigned three times in response to constraints that surfaced during deployment: corporate security automation that auto-reverted public-facing security groups, the edge proxy team configuring a reverse proxy instead of the requested tunnel, and the AI platform team requiring OIDC/PKCE instead of API tokens.

Each redesign was handled without delays to the project timeline. The senior infrastructure lead estimated the application server setup would take 1-2 weeks. It was operational in days. The security review (251-page scan, 176 findings) passed on the first submission.

Status

End-to-end execution validated: the MCP server authenticated via NTLM, sent a SOAP POST to the application server, invoked a web method, and confirmed server-side execution through database log entries. PoC approved.
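
The validated call path (NTLM authentication followed by a SOAP POST) can be sketched as follows. This is an illustration, not the project's code: the function names, service namespace, and method name are hypothetical, the `SOAPAction` value assumes the common namespace-plus-method convention, and the HTTP call assumes the third-party `requests` and `requests_ntlm` packages.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace


def build_envelope(method, service_ns, params):
    """Return a SOAP 1.1 envelope (bytes) calling `method` with simple string params."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{service_ns}}}{method}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{service_ns}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="utf-8", xml_declaration=True)


def call_web_method(url, method, service_ns, params, user, password):
    """POST the envelope with NTLM auth and return the parsed response root."""
    # Third-party packages, imported lazily so build_envelope stays stdlib-only.
    import requests
    from requests_ntlm import HttpNtlmAuth

    action = service_ns.rstrip("/") + "/" + method  # assumed SOAPAction convention
    response = requests.post(
        url,
        data=build_envelope(method, service_ns, params),
        headers={
            "Content-Type": "text/xml; charset=utf-8",
            "SOAPAction": f'"{action}"',
        },
        auth=HttpNtlmAuth(user, password),
        timeout=30,
    )
    response.raise_for_status()
    return ET.fromstring(response.content)
```

A call such as `build_envelope("GetSettlement", "http://example.com/svc", {"dealId": "42"})` (all three values hypothetical) produces the request body that `call_web_method` would send.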