Research

Published Research

Original research and analysis on AI strategy, agentic systems, and enterprise AI adoption.

Agentic AI Eval Research: Phase 1, Part 1 · March 2026

The blind spot in every AI eval framework

RAGAS, DeepEval, LangSmith, and TruLens are mature, genuinely useful frameworks, but they were built for RAG. This piece covers what they systematically miss, why that matters at 97M+ monthly MCP downloads, and what the research landscape confirms.

Viola Cao

Agentic AI Eval Research: Phase 1, Part 2 · March 2026

APEX: Agentic Pipeline EXecution Diagnostic Framework

19 failure modes across 4 layers of the tool execution pipeline, plus 3 evaluation primitives to close the gap. The map your eval framework was never built to read, with the solution attached.

Viola Cao