Anomaly Detection + Governed Data Access

MCP Query Governance Platform

Designed a two-part platform. Sentinel (Part 1) aggregates per-principal features, trains an anomaly detector behind an interface that can be swapped for SageMaker RCF, and emits notify-only alerts with explainable scores. Governed MCP (Part 2) serves a cataloged query path with parameter validation, server-side RLS injection, and full audit logging. The local MVP uses SQLite + IsolationForest; production maps to RDS, Lambda, EventBridge, and SageMaker.

★2/2 Abusers Detected


Challenge

Programmatic database access through MCP and APIs creates shadow query paths: abusive principals can fan out scans, bypass governance, and exfiltrate data without traditional BI audit trails catching them in time.

Solution

Impact Metrics

• 2/2 abusers detected
• 0 false positives
• 3σ ML threshold (RCF-ready)
• Notify-only enforcement (MVP)

Results

• 2/2 injected abusers detected (shadow-abuser-01 via ML + hard cap; shadow-abuser-02 ML-only under cap)
• 0 false positives on baseline analyst principals
• ~4σ anomaly scores on abusers with 3.0σ default threshold
• Governed query path demonstrates allowed + rejected catalog calls
• Notify-only enforcement with simulated Slack/EventBridge payloads
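A minimal sketch of what a simulated notify-only alert could look like. The function name, the `Source` and `DetailType` values, and the fields inside `detail` are assumptions for illustration, not the project's actual payloads. Nothing is sent over the network; the function only builds a Slack-style message body and an entry shaped like an EventBridge `PutEvents` request entry.

```python
import json
from datetime import datetime, timezone


def build_alert_payloads(principal, sigma, threshold=3.0, reasons=("ml",)):
    """Build simulated alert payloads; nothing is sent and no access is blocked."""
    detail = {
        "principal": principal,
        "score_sigma": round(sigma, 2),
        "threshold_sigma": threshold,
        "reasons": list(reasons),          # which rules fired, for explainability
        "enforcement": "notify-only",
        "detected_at": datetime.now(timezone.utc).isoformat(),
    }
    slack = {
        "text": (
            f"Sentinel alert: {principal} scored {sigma:.1f} sigma "
            f"(threshold {threshold:.1f}); rules: {', '.join(reasons)}. "
            "Notify-only: no access was blocked."
        )
    }
    eventbridge = {
        "Source": "sentinel.anomaly",              # hypothetical source name
        "DetailType": "PrincipalAnomalyDetected",  # hypothetical detail type
        "Detail": json.dumps(detail),              # EventBridge expects a JSON string
        "EventBusName": "default",
    }
    return {"slack": slack, "eventbridge": eventbridge}


if __name__ == "__main__":
    # The 4.0 score is illustrative, in line with the ~4 sigma scores reported above.
    payloads = build_alert_payloads("shadow-abuser-01", 4.0, reasons=("ml", "hard_cap"))
    print(json.dumps(payloads, indent=2))
```

Carrying the `reasons` list in the payload lets a reviewer see whether the ML score, the hard cap, or both triggered the alert.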

Business Impact

Shows how to govern agent and API database access without blocking innovation—catch abusive patterns early, route legitimate queries through auditable catalogs, and scale to enterprise AWS services.

Architecture

The portfolio MVP runs locally with no AWS credentials. Setting DETECTOR_BACKEND=sagemaker swaps the detector backend.
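One way such a swap could be wired, sketched under assumptions: the `Detector` protocol, the class names, and the `"local"` default are hypothetical, and only the `DETECTOR_BACKEND=sagemaker` setting comes from the text above. The SageMaker class is a placeholder that raises instead of calling AWS.

```python
import os
from typing import Mapping, Protocol

import numpy as np
from sklearn.ensemble import IsolationForest


class Detector(Protocol):
    """Hypothetical interface that every detector backend implements."""

    def fit(self, features: np.ndarray) -> None: ...
    def score(self, features: np.ndarray) -> np.ndarray: ...


class LocalDetector:
    """Local backend: scikit-learn IsolationForest, no AWS credentials needed."""

    def __init__(self) -> None:
        self._model = IsolationForest(random_state=0)

    def fit(self, features: np.ndarray) -> None:
        self._model.fit(features)

    def score(self, features: np.ndarray) -> np.ndarray:
        # score_samples is lower for anomalies; negate so higher = more anomalous.
        return -self._model.score_samples(features)


class SageMakerDetector:
    """Placeholder for a SageMaker RCF backend; the AWS calls are left out."""

    def fit(self, features: np.ndarray) -> None:
        raise NotImplementedError("training would run as a SageMaker job")

    def score(self, features: np.ndarray) -> np.ndarray:
        raise NotImplementedError("scoring would call a SageMaker endpoint")


def make_detector(env: Mapping[str, str] = os.environ) -> Detector:
    """Pick a backend from DETECTOR_BACKEND; the 'local' default is an assumption."""
    backend = env.get("DETECTOR_BACKEND", "local")
    if backend == "local":
        return LocalDetector()
    if backend == "sagemaker":
        return SageMakerDetector()
    raise ValueError(f"unknown DETECTOR_BACKEND: {backend!r}")
```

Because callers only depend on `fit` and `score`, the rest of the pipeline would not need to change when the backend does.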

Sentinel + Governed Access

An approach in four parts, combining statistical anomaly detection, automation, and AWS best practices.

Feature Aggregation

Roll query_audit logs into per-principal / 15-minute features: volume, bytes scanned, off-hours fan-out.
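A rough pandas sketch of this roll-up. The column names (`principal`, `ts`, `query_id`, `bytes_scanned`, `table_name`) and the off-hours definition (before 07:00 or from 19:00) are assumptions; the real `query_audit` schema may differ. Fan-out is approximated here as the number of distinct tables touched off-hours in a window.

```python
import pandas as pd


def aggregate_features(audit: pd.DataFrame, window: str = "15min") -> pd.DataFrame:
    """Roll raw audit rows into one feature row per (principal, time window)."""
    df = audit.copy()
    df["ts"] = pd.to_datetime(df["ts"])

    # Assumed definition of off-hours: before 07:00 or from 19:00 onward.
    hour = df["ts"].dt.hour
    off_hours = (hour < 7) | (hour >= 19)

    # Keep the table name only for off-hours queries; others become missing
    # values, which nunique() ignores.
    df["off_hours_table"] = df["table_name"].where(off_hours)

    grouped = df.groupby(["principal", pd.Grouper(key="ts", freq=window)])
    features = grouped.agg(
        query_count=("query_id", "count"),
        bytes_scanned=("bytes_scanned", "sum"),
        off_hours_fanout=("off_hours_table", "nunique"),
    )
    return features.reset_index()


if __name__ == "__main__":
    audit = pd.DataFrame(
        {
            "principal": ["alice", "alice", "bob", "bob"],
            "ts": ["2024-01-01 09:01", "2024-01-01 09:10",
                   "2024-01-01 23:02", "2024-01-01 23:05"],
            "query_id": ["q1", "q2", "q3", "q4"],
            "bytes_scanned": [100, 200, 1000, 2000],
            "table_name": ["orders", "customers", "orders", "payments"],
        }
    )
    print(aggregate_features(audit))
```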

Anomaly Detection

Train IsolationForest on baseline window (RCF interface for SageMaker swap); hybrid ML + hard-cap rules.
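A sketch of how the hybrid check might look with scikit-learn. How the project converts IsolationForest scores into σ units is not stated, so this assumes a z-score against the baseline score distribution. The 3.0 threshold comes from the results above; the hard-cap value, the feature layout, and the function names are hypothetical.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

SIGMA_THRESHOLD = 3.0  # default threshold quoted in the results
HARD_CAP_BYTES = 5e9   # hypothetical per-window cap on bytes scanned


def fit_baseline(baseline: np.ndarray, seed: int = 0):
    """Fit on the baseline window and record its score distribution."""
    model = IsolationForest(n_estimators=200, random_state=seed).fit(baseline)
    raw = -model.score_samples(baseline)  # negate: higher = more anomalous
    return model, raw.mean(), raw.std()


def evaluate(model, mu, sd, features: np.ndarray, bytes_col: int = 1):
    """Return sigma scores plus separate ML and hard-cap flags per row."""
    sigma = (-model.score_samples(features) - mu) / sd
    ml_hit = sigma >= SIGMA_THRESHOLD
    cap_hit = features[:, bytes_col] > HARD_CAP_BYTES
    return sigma, ml_hit, cap_hit, ml_hit | cap_hit


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic baseline: columns are [query_count, bytes_scanned] per window.
    baseline = rng.normal(loc=[20, 1e8], scale=[5, 2e7], size=(500, 2))
    model, mu, sd = fit_baseline(baseline)

    windows = np.array([
        [20, 1e8],    # typical analyst window
        [400, 8e9],   # far above baseline and over the hard cap
        [300, 3e9],   # far above baseline but under the hard cap
    ])
    sigma, ml_hit, cap_hit, flagged = evaluate(model, mu, sd, windows)
    for row in zip(sigma.round(2), ml_hit, cap_hit, flagged):
        print(row)
```

Keeping `ml_hit` and `cap_hit` separate lets an alert state which rule fired, which is one simple way to make a score explainable.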

Governed Catalog

Cataloged queries with validation, RLS injection, and executed SQL audit trail—no ad-hoc SQL from agents.
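A minimal sketch of a cataloged query path using the standard-library `sqlite3` module. The catalog entry, the `orders` table, the `tenant_id` RLS column, and the `query_audit` column layout are all illustrative assumptions. The point it shows is that callers pass only a query name and parameters, while the RLS predicate is appended server-side and every call, allowed or rejected, is audited.

```python
import json
import sqlite3

# Hypothetical catalog: name -> SQL template, parameter validators, RLS column.
CATALOG = {
    "orders_by_status": {
        "sql": "SELECT id, tenant_id, status, amount FROM orders WHERE status = :status",
        "params": {"status": lambda v: v in ("open", "shipped", "cancelled")},
        "rls_column": "tenant_id",
    },
}


class GovernanceError(Exception):
    """Raised when a call is rejected before any SQL runs."""


def demo_connection() -> sqlite3.Connection:
    """In-memory demo warehouse with a small orders table and an audit table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, tenant_id TEXT, status TEXT, amount REAL)")
    conn.execute(
        "CREATE TABLE query_audit "
        "(principal TEXT, query_name TEXT, executed_sql TEXT, params TEXT, outcome TEXT)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?, ?)",
        [(1, "t1", "open", 10.0), (2, "t2", "open", 20.0), (3, "t1", "shipped", 30.0)],
    )
    return conn


def _audit(conn, principal, name, sql, params, outcome):
    conn.execute(
        "INSERT INTO query_audit VALUES (?, ?, ?, ?, ?)",
        (principal, name, sql, json.dumps(params, default=str), outcome),
    )


def run_cataloged(conn, principal, tenant_id, name, params):
    """Run a cataloged query for a principal, scoped to tenant_id."""
    entry = CATALOG.get(name)
    if entry is None:
        _audit(conn, principal, name, None, params, "rejected: not in catalog")
        raise GovernanceError(f"{name!r} is not a cataloged query")

    checks = entry["params"]
    if set(params) != set(checks) or not all(checks[k](params[k]) for k in checks):
        _audit(conn, principal, name, None, params, "rejected: invalid params")
        raise GovernanceError(f"invalid parameters for {name!r}")

    # The RLS predicate is added here, server-side; the caller never supplies it.
    sql = f"SELECT * FROM ({entry['sql']}) AS q WHERE q.{entry['rls_column']} = :rls_value"
    rows = conn.execute(sql, {**params, "rls_value": tenant_id}).fetchall()
    _audit(conn, principal, name, sql, params, "allowed")
    return rows
```

For example, `run_cataloged(demo_connection(), "analyst-01", "t1", "orders_by_status", {"status": "open"})` returns only tenant `t1` rows, while an unknown query name or an unlisted status value raises `GovernanceError` and still leaves an audit row.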

AWS Production Path

Documented CDK stack: RDS, Lambda feature_agg, SageMaker RCF, SNS/EventBridge, API Gateway + Cognito.

Scale & Scope

• 12 principals scored: synthetic principals, 10 normal + 2 injected abusers
• Hybrid detection: ML + hard-cap rules
• Governance path (MVP_SPEC): full AWS target architecture documented

Technology Stack

• Python + pandas feature pipeline
• scikit-learn IsolationForest (SageMaker RCF–swappable)
• SQLite demo warehouse (RDS in production)
• Catalog + RLS executor (Lambda + API Gateway target)
