Anomaly Detection + Governed Data Access
MCP Query Governance Platform
Challenge
Programmatic database access through MCP and APIs creates shadow query paths: abusive principals can fan out scans, bypass governance, and exfiltrate data without traditional BI audit trails catching them in time.
Solution
Designed a two-part platform. Sentinel (Part 1) aggregates per-principal features, trains an anomaly detector behind an interface that can be swapped for SageMaker Random Cut Forest (RCF), and emits notify-only alerts with explainable scores. Governed MCP (Part 2) serves a cataloged query path with parameter validation, server-side row-level security (RLS) injection, and full audit logging. The local MVP uses SQLite and IsolationForest; production maps to RDS, Lambda, EventBridge, and SageMaker.
Impact Metrics
Results
- 2/2 injected abusers detected (shadow-abuser-01 via ML + hard cap; shadow-abuser-02 ML-only under cap)
- 0 false positives on baseline analyst principals
- ~4σ anomaly scores on abusers with 3.0σ default threshold
- Governed query path demonstrates allowed + rejected catalog calls
- Notify-only enforcement with simulated Slack/EventBridge payloads
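The notify-only behavior in the last result can be illustrated with a minimal sketch that builds, but never sends, a Slack-style message and an EventBridge-style event entry. The function name, the "sentinel.anomaly" source string, the detail-type string, and the field names inside the detail are assumptions for illustration; only the general shapes (a "text" field for a Slack incoming webhook, and Source / DetailType / Detail for an EventBridge entry) follow the public formats.

```python
import json

def build_alert_payloads(principal, sigma, reasons, threshold=3.0):
    """Build simulated notification payloads; nothing is sent or blocked."""
    detail = {
        "principal": principal,
        "sigma": sigma,
        "threshold": threshold,
        "reasons": reasons,
        "action": "notify_only",  # hypothetical field: alerts never block queries
    }
    # Shape of a Slack incoming-webhook body.
    slack = {
        "text": (
            f"Sentinel alert: {principal} scored {sigma:.1f}σ "
            f"(threshold {threshold:.1f}σ); reasons: {', '.join(reasons)}"
        )
    }
    # Shape of one entry for EventBridge PutEvents; Detail must be a JSON string.
    eventbridge_entry = {
        "Source": "sentinel.anomaly",               # assumed source name
        "DetailType": "PrincipalAnomalyDetected",   # assumed detail type
        "Detail": json.dumps(detail),
    }
    return slack, eventbridge_entry

slack, entry = build_alert_payloads("shadow-abuser-01", 4.1, ["ml", "hard_cap"])
print(slack["text"])
print(entry["Detail"])
```

Keeping the payload builder free of network calls means the local MVP can print or log alerts, while a production deployment would hand the same dictionaries to real clients.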
Business Impact
Shows how to govern agent and API database access without blocking innovation: catch abusive patterns early, route legitimate queries through auditable catalogs, and scale to enterprise AWS services.
Architecture
Portfolio MVP runs locally with zero AWS credentials. Detector backend swaps via DETECTOR_BACKEND=sagemaker.
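One way the backend swap could be wired is sketched below. Only the DETECTOR_BACKEND variable and its "sagemaker" value come from the description above; the class names, the "local" default, and the fit/score method names are hypothetical, and the SageMaker class is a placeholder rather than a working client.

```python
import os
from typing import Mapping, Protocol, Sequence

class Detector(Protocol):
    """Hypothetical shared interface for both backends."""
    def fit(self, rows: Sequence[Sequence[float]]) -> None: ...
    def score(self, rows: Sequence[Sequence[float]]) -> list: ...

class LocalIsolationForestDetector:
    def __init__(self):
        # Imported lazily so the SageMaker path does not need scikit-learn.
        from sklearn.ensemble import IsolationForest
        self._model = IsolationForest(random_state=0)

    def fit(self, rows):
        self._model.fit(rows)

    def score(self, rows):
        # Negate so that larger values mean "more anomalous".
        return [-float(s) for s in self._model.score_samples(rows)]

class SageMakerRCFDetector:
    """Placeholder: a real version would call a deployed RCF endpoint."""
    def fit(self, rows):
        raise NotImplementedError("training runs in SageMaker, not locally")

    def score(self, rows):
        raise NotImplementedError("requires AWS credentials and an endpoint")

def make_detector(env: Mapping[str, str] = os.environ) -> Detector:
    backend = env.get("DETECTOR_BACKEND", "local")  # "local" default is assumed
    if backend == "sagemaker":
        return SageMakerRCFDetector()
    return LocalIsolationForestDetector()

detector = make_detector({})  # empty env selects the local backend
detector.fit([[20, 50], [22, 48], [19, 53], [21, 51]])
print(detector.score([[20, 50], [900, 5000]]))
```

Because callers depend only on fit and score, the rest of the pipeline would not need to change when the environment variable is flipped.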
Sentinel + Governed Access
An approach combining statistical anomaly detection, automation, and AWS best practices.
Feature Aggregation
Roll query_audit logs into per-principal features over 15-minute windows: query volume, bytes scanned, and off-hours fan-out.
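A minimal pandas sketch of this roll-up follows. The column names (principal, ts, table_name, bytes_scanned), the definition of off-hours as before 08:00 or from 18:00 onward, and the reading of "fan-out" as the number of distinct tables touched off-hours are all assumptions for illustration; the actual query_audit schema is not shown here.

```python
import pandas as pd

def aggregate_features(audit: pd.DataFrame) -> pd.DataFrame:
    """Roll raw audit rows into one feature row per principal per 15-minute window."""
    df = audit.copy()
    df["ts"] = pd.to_datetime(df["ts"])
    df["window"] = df["ts"].dt.floor("15min")
    # Assumed definition of off-hours: before 08:00 or from 18:00 onward.
    off_hours = (df["ts"].dt.hour < 8) | (df["ts"].dt.hour >= 18)
    # Keep the table name only for off-hours rows, so nunique() (which skips
    # missing values) counts distinct tables touched off-hours.
    df["off_hours_table"] = df["table_name"].where(off_hours)
    return (
        df.groupby(["principal", "window"])
        .agg(
            query_count=("ts", "size"),
            bytes_scanned=("bytes_scanned", "sum"),
            off_hours_tables=("off_hours_table", "nunique"),
        )
        .reset_index()
    )

# Tiny synthetic audit log: one daytime analyst, one off-hours scanner.
audit = pd.DataFrame({
    "principal": ["analyst-01", "analyst-01",
                  "shadow-abuser-01", "shadow-abuser-01", "shadow-abuser-01"],
    "ts": ["2024-01-01 10:01", "2024-01-01 10:09",
           "2024-01-01 02:00", "2024-01-01 02:05", "2024-01-01 02:14"],
    "table_name": ["orders", "orders", "orders", "customers", "payments"],
    "bytes_scanned": [1000, 2000, 50000, 60000, 70000],
})
features = aggregate_features(audit)
print(features)
```

Each output row is what a detector would consume: one principal, one window, three numeric features.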
Anomaly Detection
Train IsolationForest on a baseline window behind an interface that allows a swap to SageMaker RCF; combine the ML score with hard-cap rules in a hybrid decision.
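The hybrid decision can be sketched as follows. This is an illustration under stated assumptions, not the project's code: the baseline data is synthetic, the hard cap of 500 queries per window is a made-up value, and expressing the score in σ as a z-score of the IsolationForest score against the baseline score distribution is one plausible reading of the σ figures reported above.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
# Synthetic baseline windows: [query_count, mb_scanned, off_hours_tables].
baseline = np.column_stack([
    rng.normal(20, 3, 500),
    rng.normal(50, 8, 500),
    rng.poisson(0.2, 500),
])

model = IsolationForest(n_estimators=200, random_state=0).fit(baseline)

# score_samples is higher for normal points, so negate it: larger = more anomalous.
baseline_raw = -model.score_samples(baseline)
mu, sd = baseline_raw.mean(), baseline_raw.std()

SIGMA_THRESHOLD = 3.0    # default threshold from the results above
HARD_CAP_QUERIES = 500   # hypothetical per-window query cap

def evaluate(window):
    """Return the σ score and which rule(s), if any, fired for one window."""
    x = np.asarray(window, dtype=float).reshape(1, -1)
    sigma = float((-model.score_samples(x)[0] - mu) / sd)
    reasons = []
    if sigma >= SIGMA_THRESHOLD:
        reasons.append("ml")
    if x[0, 0] > HARD_CAP_QUERIES:
        reasons.append("hard_cap")
    return {"sigma": round(sigma, 2), "alert": bool(reasons), "reasons": reasons}

print(evaluate([20, 50, 0]))      # a typical analyst window
print(evaluate([900, 5000, 12]))  # far over the cap and far from the baseline
```

Running the two rules independently is what lets one abuser be caught by both paths while another, staying under the cap, is caught by the ML score alone.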
Governed Catalog
Cataloged queries with parameter validation, RLS injection, and an audit trail of the executed SQL; agents cannot submit ad-hoc SQL.
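A minimal sketch of such a governed query path, using the standard-library sqlite3 module, is shown below. The catalog entry, the orders and query_audit table layouts, the region-based row filter, and the function name are all hypothetical; they only illustrate the pattern of validating parameters, injecting the row-level filter on the server side, and recording the SQL that actually ran.

```python
import sqlite3

# Hypothetical catalog: each entry fixes the SQL text and declares typed parameters.
CATALOG = {
    "orders_by_status": {
        "sql": "SELECT id, region, status FROM orders WHERE status = :status",
        "params": {"status": str},
    },
}

def run_catalog_query(conn, principal, region, query_id, params):
    """Validate the call, inject the RLS filter, execute, and audit."""
    entry = CATALOG.get(query_id)
    if entry is None:
        raise PermissionError(f"query {query_id!r} is not in the catalog")
    if set(params) != set(entry["params"]):
        raise ValueError("unexpected or missing parameters")
    for name, expected in entry["params"].items():
        if not isinstance(params[name], expected):
            raise ValueError(f"parameter {name!r} must be {expected.__name__}")
    # RLS: the region comes from the caller's server-side identity, never from params.
    sql = f"SELECT * FROM ({entry['sql']}) WHERE region = :rls_region"
    rows = conn.execute(sql, {**params, "rls_region": region}).fetchall()
    conn.execute(
        "INSERT INTO query_audit (principal, query_id, executed_sql, row_count) "
        "VALUES (?, ?, ?, ?)",
        (principal, query_id, sql, len(rows)),
    )
    return rows

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER, region TEXT, status TEXT);
    INSERT INTO orders VALUES (1, 'emea', 'open'), (2, 'apac', 'open'), (3, 'emea', 'closed');
    CREATE TABLE query_audit (principal TEXT, query_id TEXT, executed_sql TEXT, row_count INTEGER);
""")

rows = run_catalog_query(conn, "analyst-01", "emea", "orders_by_status", {"status": "open"})
print(rows)
```

In this sketch only successful executions are written to query_audit; a fuller version would likely record rejected calls as well, so that probing attempts are visible to the detector.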
AWS Production Path
Documented CDK stack: RDS, Lambda feature_agg, SageMaker RCF, SNS/EventBridge, API Gateway + Cognito.
Scale & Scope
Technology Stack
- Python + pandas feature pipeline
- scikit-learn IsolationForest (SageMaker RCF–swappable)
- SQLite demo warehouse (RDS in production)
- Catalog + RLS executor (Lambda + API Gateway target)