25+ years turning operational complexity into automated, measurable reliability. Today: petabyte/day SIEM & observability and production agentic-AI operations — autonomous systems that investigate, diagnose, and remediate, with humans kept in the loop for governance and audit.
Most data and AI platforms are the same problem underneath: ingest from many noisy sources, normalize and aggregate, turn signals into intent, and expose it all through a reliable, governable API. That pipeline — and the scale and compliance pressure around it — is my daily work.
High-volume ingestion, routing, and shaping across heterogeneous sources — controlling cost, noise, and fidelity. The difference between a demo and a product is what happens to the data on the way in.
I architect systems knowing what breaks first as volume and customers grow, and what it costs to fix before it bites. SLOs, observability, and graceful degradation by default.
Agentic systems that analyze, predict, and act — with safety rails. The layer that turns raw aggregated data into decisions a product can charge for.
I ship automation into regulated, audited environments. When a platform is built on sensitive or third-party data, the data-rights and governance posture matters as much as the model, and I flag gaps in it early.
Public, court, market & social sources — resilient collectors
Clean, dedupe, entity-resolve, standardize
Aggregate & model intent / demand trends
Reliable, versioned, governed access by geo
Campaigns, dashboards, behavioral analytics
Production autonomous systems with governance built in.
Petabyte-scale detection and telemetry.
High-volume ingestion to analytics.
Resilient, reproducible infrastructure.
Ship safely, roll back faster.
Senior judgment, prioritized roadmaps.
A working platform showcasing autonomous workflows across Confluence, JIRA, and Splunk: incident response, SRE on-call triage, change management, and knowledge sync — backed by a real observability stack (Grafana LGTM, Redis, queue orchestration).
Production frameworks for building AI-powered operational tooling — natural-language interfaces to Splunk and JIRA, with 2,000+ tests. The same agentic patterns that power autonomous investigation and remediation.
I work as a fractional technical advisor — architecture, SRE/reliability, and data. A typical start is a focused, fixed-fee discovery: current-state assessment, scalability & risk analysis, and a prioritized 90-day roadmap.