Research
Published Research
Original research and analysis on AI strategy, agentic systems, and enterprise AI adoption.
Agentic AI Eval Research: Phase 1, Part 1 · March 2026
The blind spot in every AI eval framework
RAGAS, DeepEval, LangSmith, and TruLens are mature, genuinely useful frameworks, but they were built for RAG. This piece covers what they systematically miss, why that matters at 97M+ monthly MCP downloads, and what the research landscape confirms.
Agentic AI Eval Research: Phase 1, Part 2 · March 2026
Your AI agent is lying to you — the taxonomy and the fix
19 failure modes across 4 layers of the tool execution pipeline, and 3 evaluation primitives to close the gap. This is the map your eval framework was never built to read, with the solution attached.