Patronus AI Launches Scalable Supervision for AI Agents


June 19, 2025

Patronus AI has launched Percival, the first self-serve AI solution designed for scalable supervision of agentic systems. The tool automatically detects errors in increasingly autonomous AI workflows and suggests optimizations to address them, tackling the growing challenge of keeping these systems reliable as they scale.

Quick Intel

  • Patronus AI introduces Percival, a scalable supervision solution for agentic systems.

  • Percival automatically detects over 20 types of agentic system failures.

  • The solution analyzes execution traces to identify and help fix errors.

  • It uses an agent-based architecture for comprehensive error detection.

  • Percival's episodic memory system learns from past errors for improved detection.

  • The tool aims to reduce debugging time for AI engineers from hours to minutes.

Addressing the Challenges of Autonomous Agentic Systems

As AI systems evolve into autonomous agents capable of independently planning and executing complex tasks with minimal human supervision, organizations face new challenges in maintaining their reliability and control. While this advancement offers significant benefits across industries, the unpredictability of these systems poses serious concerns for developers and organizations.

Percival: An Intelligent Companion for AI Oversight

Percival acts as an intelligent companion, automatically identifying more than 20 different failure modes. These include incorrect tool usage, context misunderstanding, and planning errors. The tool analyzes execution traces to pinpoint long-term planning failures before they escalate into critical system breakdowns.

Streamlining Debugging and Maintaining Human Oversight

"AI agents are getting better at solving complex tasks, but their unpredictability presents serious challenges for developers and organizations," said Anand Kannappan, CEO and Co-founder of Patronus AI. "When developers spend hours tracing through agent workflows only to find that a decision made five steps ago caused the final error, they're not just losing time; they're potentially losing control over their systems. Percival gives developers the ability to instantly understand and fix their AI agents, turning weeks of debugging into minutes while helping maintain essential human oversight as these systems grow more sophisticated."

Comprehensive Error Detection Through Agent-Based Architecture

The Percival platform uses an agent-based architecture, which distinguishes it from approaches that rely on a single large language model (LLM) acting as a judge. This design enables comprehensive error detection across four major categories:

  • Reasoning Errors: Including hallucinations, information processing, decision-making, and output generation errors.

  • System Execution Errors: Configuration, API issues, and resource management failures.

  • Planning and Coordination Errors: Context management and task orchestration failures.

  • Domain-Specific Errors: Customized to specific workflow requirements.
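
To make the four categories concrete, here is a minimal Python sketch of how findings from an agent trace might be tagged and summarized by category. It is purely illustrative: the `ErrorCategory`, `Finding`, and `summarize` names and the sample findings are assumptions made for this example and do not reflect Percival's actual data model or API.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class ErrorCategory(Enum):
    # The four categories named in the announcement.
    REASONING = "Reasoning Errors"
    SYSTEM_EXECUTION = "System Execution Errors"
    PLANNING_COORDINATION = "Planning and Coordination Errors"
    DOMAIN_SPECIFIC = "Domain-Specific Errors"


@dataclass(frozen=True)
class Finding:
    # Hypothetical record of one problem spotted in a trace.
    step: int                 # index of the trace step the finding refers to
    category: ErrorCategory
    note: str


def summarize(findings):
    """Count findings per category and report the earliest flagged step."""
    counts = Counter(f.category for f in findings)
    first_step = min((f.step for f in findings), default=None)
    return counts, first_step


# Made-up findings for a single agent run.
findings = [
    Finding(2, ErrorCategory.PLANNING_COORDINATION, "lost task context"),
    Finding(5, ErrorCategory.SYSTEM_EXECUTION, "API call timed out"),
    Finding(7, ErrorCategory.REASONING, "hallucinated a field name"),
    Finding(7, ErrorCategory.REASONING, "malformed output"),
]
counts, first_step = summarize(findings)
```

Reporting the earliest flagged step alongside the counts reflects the point made in the article that an early decision can be the root cause of a failure that only surfaces later.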

Enhanced Reliability Through Episodic Memory

A key differentiator of Percival is its episodic memory system. This system learns from previous errors and adapts to changing input distributions, making future error detection more reliable and tailored to each organization's unique workflow.
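
The announcement does not describe how the episodic memory works internally. As a rough illustration of the general idea only, the Python sketch below stores past error episodes and recalls the most similar one for a new error using simple word overlap (Jaccard similarity). The `EpisodicMemory` class, its methods, the threshold, and the sample episodes are all hypothetical and are not Percival's implementation.

```python
from dataclasses import dataclass, field


def _tokens(text):
    """Lowercase a description and split it into a set of words."""
    return set(text.lower().split())


@dataclass
class EpisodicMemory:
    # Hypothetical store of (error description, fix) pairs from past runs.
    episodes: list = field(default_factory=list)

    def remember(self, description, fix):
        self.episodes.append((description, fix))

    def recall(self, description, threshold=0.3):
        """Return (score, past_description, fix) for the most similar
        past episode at or above the threshold, or None if none qualifies."""
        query = _tokens(description)
        best = None
        for past, fix in self.episodes:
            past_tokens = _tokens(past)
            union = query | past_tokens
            # Jaccard similarity: shared words divided by all distinct words.
            score = len(query & past_tokens) / len(union) if union else 0.0
            if score >= threshold and (best is None or score > best[0]):
                best = (score, past, fix)
        return best


memory = EpisodicMemory()
memory.remember("search tool called with empty query",
                "validate query before calling tool")
memory.remember("api rate limit exceeded during fetch",
                "add retry with backoff")
match = memory.recall("search tool called with missing query")
```

A production system would more plausibly use learned embeddings than word overlap; the overlap measure is used here only to keep the sketch self-contained.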

Addressing the Unique Challenges of Agentic Systems

Unlike traditional evaluations designed for standalone LLMs, Percival tackles the specific challenges presented by agentic systems, where decisions made early in the process can lead to errors in later stages. The platform retains a memory of past failures, enabling customized benchmarking of agent systems.

Automating Debugging and Accelerating Development

Currently, AI engineers often spend several hours each week debugging lengthy agentic execution traces. Percival automates this time-consuming process, significantly reducing the human effort required to analyze large agentic traces and thereby accelerating development cycles.

Advancing Human Oversight in AI Workflows

Patronus AI's vision of maintaining human oversight over AI workflows takes a significant step forward with Percival. This solution represents a crucial advancement towards reliable automated debugging of complex agentic systems.

Collaboration for Responsible AI Scaling

"Emergence's recent breakthrough, agents creating agents, marks a pivotal moment not only in the evolution of adaptive, self-generating systems, but also in how such systems are governed and scaled responsibly, which is precisely why we are collaborating with Patronus AI," said Satya Nitta, Co-founder and CEO of Emergence AI. "While innovation remains at our core, we have always been equally committed to governance, transparency, and responsible deployment. Our collaboration strengthens that commitment by adding further depth to how we interpret, evaluate, and refine our agent-based systems. Together, we're enhancing not just what's possible, but how safely and responsibly it's delivered at scale."

 

About Patronus AI

Patronus AI develops AI evaluation and optimization tools to help companies build top-tier AI products confidently. The company was founded by machine learning experts Anand Kannappan and Rebecca Qian.
