Three production AI security systems. Each exists because an assumption was failing somewhere, and the fix needed to be enforced in code, not in a prompt.
Sentinel
Problem
Pattern-based vulnerability scanners produce unverified findings at volume. Analysts spend hours validating low-confidence alerts that should have been rejected at the source.
Approach
A multi-agent system that treats vulnerability research as a hypothesis-testing loop. Agents generate findings, probe them against live targets, and record the actual response. A formal state machine then classifies each finding; no result reaches a report until the evidence justifies the claim.
Proof
Every result is classified as confirmed, refuted, or inconclusive based on actual HTTP probe data, not probability scores. Produces evidence-backed JSON and Markdown reports mapped to OWASP ASVS/WSTG and MITRE ATT&CK.
Constraints
Zero destructive actions enforced at system level, not model level
Read-only probing constraints applied before agent dispatch
No severity inflation: the state machine rejects unverified escalations
Targets restricted to authorized VDP scope only
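The classification gate described above can be sketched as a small state machine. Everything here is illustrative: the `ProbeResult` fields and the specific status-code rules are assumptions for the sketch, not Sentinel's actual logic. The one property it demonstrates is the real one: a finding with no probe evidence can never be promoted.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class Status(Enum):
    UNVERIFIED = "unverified"      # default: never reaches a report
    CONFIRMED = "confirmed"
    REFUTED = "refuted"
    INCONCLUSIVE = "inconclusive"

@dataclass
class ProbeResult:
    status_code: int           # HTTP status returned by the live target
    marker_reflected: bool     # did the injected marker appear in the response?

def classify(probe: Optional[ProbeResult]) -> Status:
    """Classify a finding strictly from recorded probe evidence.

    No probe data means the finding stays UNVERIFIED; only an
    actual recorded response can move it forward.
    """
    if probe is None:
        return Status.UNVERIFIED
    if probe.marker_reflected and 200 <= probe.status_code < 300:
        return Status.CONFIRMED
    if probe.status_code in (403, 404):
        return Status.REFUTED
    return Status.INCONCLUSIVE
```

Because the gate is code, not a prompt, a model cannot talk its way past it: a low-confidence finding with no probe data is rejected deterministically.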
Python · Anthropic Claude API · Multi-agent · Docker · OWASP · MITRE ATT&CK
Problem
Prompt-based safety controls are fundamentally unreliable. They can be bypassed, they cannot be audited, and they fail silently. For a tool processing medical and high-risk guidance for vulnerable users, a failed safety layer has real consequences.
Approach
A three-layer safety pipeline where every stage is enforced in code. Azure AI Content Safety screens for harm before any LLM is called. Prompt Shields detects jailbreaks at the infrastructure layer. Azure OpenAI extracts five deterministic signals. Anthropic Claude performs final reasoning. Every response then passes through a Python validation layer that enforces medical safeguards, frequency expansion, leaked-warning detection, and risk-level logic, regardless of what the model returned.
Proof
Deployed across 10 integrated Azure services. 14 attack vectors tested live against the production URL, all blocked. 28 custom telemetry events fire across the request lifecycle, producing evidence in App Insights that each safety layer is firing in production, not just designed. Placed 3rd internationally in the Microsoft AI Innovation Challenge, March 2026.
Constraints
SelfHarm severity ≥ 4: hardcoded 988 response, Claude never called
Prompt injection attempts: infrastructure-level block, Claude never called
Every model output is schema-validated; validation failure returns a 500 before anything malformed reaches the user
Medical content cannot bypass the disclaimer, enforced in Python
Rate-limited on every write endpoint
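The layered short-circuit behavior above can be sketched in a few lines. This is a minimal sketch, not the production pipeline: the stage functions are passed in as plain callables standing in for the real services (Azure AI Content Safety, Prompt Shields, Claude, the Python validator), and the threshold constant mirrors the SelfHarm ≥ 4 rule stated in the constraints.

```python
SELF_HARM_THRESHOLD = 4  # mirrors the "severity >= 4" constraint above

def handle_request(text, screen, shield, reason, validate):
    """Run the request through each safety layer in order.

    Each layer is ordinary code, so a trip at any stage
    short-circuits the request before the LLM is ever called.
    """
    severity = screen(text)                      # stand-in for Content Safety
    if severity["SelfHarm"] >= SELF_HARM_THRESHOLD:
        return {"reply": "If you are in crisis, call or text 988.",
                "model_called": False}
    if shield(text):                             # stand-in for Prompt Shields
        return {"reply": "Request blocked.", "model_called": False}
    raw = reason(text)                           # the only place a model runs
    return validate(raw)                         # enforcement regardless of output
```

The point of the structure is that the crisis path and the jailbreak path return before `reason` is reached, so "Claude never called" is a property of control flow, not of model behavior.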
Python · Flask · Azure (10 services) · Azure OpenAI · Content Safety · Prompt Shields · Anthropic Claude · Cosmos DB
Problem
AI reasoning agents hallucinate attack paths. They promote unverified techniques as validated, mix suggestion with verification, and overwrite confidence with each new response. Operators then act on reasoning that was never evidence-backed, and there is no record of what the system actually knew versus what it guessed.
Approach
A five-agent reasoning engine where every technique exists under a controlled trust lifecycle. Agents do not share state directly; all inter-agent communication passes through an append-only evidence log enforced at the database layer. A deterministic scoring formula weights execution evidence, source quality, consensus, environment match, recency, and stability. Shadow reasoning proposals can never auto-promote; they must re-enter through ingestion, pass scoring, and survive human review before any use.
Proof
134 tests. DB-level UPDATE/DELETE triggers block mutation of evidence records. The lab_validated status requires explicit human approval; it is never auto-assigned. Known v1 limitations are documented openly in the repo, not hidden.
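The append-only guarantee can be demonstrated with SQLite triggers in a few lines. The table schema here is invented for the sketch; the mechanism, RAISE(ABORT) triggers that make UPDATE and DELETE fail at the database layer, is the one the proof describes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE evidence (id INTEGER PRIMARY KEY, technique TEXT, payload TEXT);

-- Any UPDATE or DELETE aborts: the log is append-only at the DB layer.
CREATE TRIGGER evidence_no_update BEFORE UPDATE ON evidence
BEGIN SELECT RAISE(ABORT, 'evidence is append-only'); END;

CREATE TRIGGER evidence_no_delete BEFORE DELETE ON evidence
BEGIN SELECT RAISE(ABORT, 'evidence is append-only'); END;
""")

# Inserts succeed; mutation of existing rows raises IntegrityError.
conn.execute("INSERT INTO evidence (technique, payload) VALUES (?, ?)",
             ("T1059", "observed execution"))
```

Because the constraint lives in the database rather than in agent code, no agent, and no prompt injection against an agent, can rewrite history: corrections arrive only as new appended records.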
Constraints
Evidence records cannot be updated or deleted, append-only at DB level
No agent can promote a technique to lab_validated, human review required
Shadow reasoning output is untrusted by default, re-ingestion required
Confidence scores derived from evidence, never stored as mutable state
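The deterministic scoring described in the approach can be sketched as a weighted sum over the six named factors. The weights here are hypothetical; the source names the factors but not their values. What the sketch preserves is the constraint above: confidence is a pure function of evidence signals, recomputed on demand, never stored where an agent could overwrite it.

```python
# Hypothetical weights; the six factors are from the source, the values are not.
WEIGHTS = {
    "execution_evidence": 0.30,
    "source_quality":     0.20,
    "consensus":          0.15,
    "environment_match":  0.15,
    "recency":            0.10,
    "stability":          0.10,
}

def confidence(signals: dict) -> float:
    """Deterministic weighted score over normalized [0, 1] signals.

    Derived from the evidence log on every read; there is no
    stored confidence field for an agent to mutate.
    """
    return round(sum(WEIGHTS[k] * signals[k] for k in WEIGHTS), 4)
```

Two identical evidence sets always score identically, which is what makes the trust lifecycle auditable: a changed score implies changed evidence, never a changed opinion.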
Python · Pydantic · SQLite · Anthropic Claude API · Multi-agent · Trust lifecycle
I spent nine years in enterprise environments where uptime, access control, and system reliability were not theoretical. At Spectrum, I administered identity and access lifecycle for 5,400+ field technicians across ARRIS, CSG, SageQuest, Kronos, Avaya, and Microsoft 365, and served as the final escalation point for 20-40+ incidents daily.
At Metro Net Fire Authority, I worked inside a controlled-access 24/7 facility supporting 911 dispatch, performed secure extraction of dispatch logs and communications in response to subpoenas within CJIS-aligned workflows, and re-engineered a fragmented QA reporting process into a VBA automation pipeline that reduced manual processing time by 75-80%. That background, where failure disrupts live operations, shaped how I design systems today. Every system I build reflects that accountability.