Workflow Agent vs Autonomous Agent architecture for RCA
When designing a root cause analysis system powered by large language models, two distinct approaches emerge: the Workflow Agent and the Autonomous Agent. I prototyped both to assess their viability for our current needs. However, given the rapid evolution of LLM technology, the insights presented here may soon require updating.
https://www.anthropic.com/engineering/building-effective-agents
The motivation
Why did I develop the RCA Agent? While my passion for programming drives me, the pressing needs of our live site demanded a more efficient approach to root cause analysis. Our team frequently investigates system issues — whether reported by customers, flagged by colleagues, or uncovered during routine development. Although the live site offers invaluable learning opportunities, repeatedly addressing similar incidents can become tedious.
Despite the subtle differences in each investigation, our underlying approach remains consistent. Leveraging LLM solutions now allows us to automate many scenarios that were once labor-intensive. The RCA Agent was created to streamline our workflow — saving precious time and enabling me to focus on more engaging programming challenges.
What Should Be the Next Step?
The RCA Agent has proven to be remarkably stable. I conducted numerous experiments — including discussions with a friend from Microsoft Research — to refine and stabilize the system. Our approach involved several key strategies:
- Workflow Agent Architecture: Implementing a structured workflow helped streamline operations.
- Single Responsibility Decomposition: Breaking down the agent’s tasks into focused, single-responsibility components enhanced manageability.
- Multi-Agent Architecture: Distributing responsibilities among multiple specialized agents further improved stability.
Working with LLMs in this space can become confusing quickly. Delivering an impressive demo is relatively easy; ensuring long-term stability is the true challenge.
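To make the decomposition concrete, here is a minimal sketch of the pattern. The `llm` helper, agent names, and prompts are placeholders for illustration, not the RCA Agent's actual implementation:

```python
# Minimal multi-agent workflow sketch. `llm` is a stand-in for any
# chat-completion call (OpenAI, Azure OpenAI, etc.); wire in your own client.
from dataclasses import dataclass

def llm(system: str, user: str) -> str:
    """Hypothetical wrapper around your chat-completion client."""
    raise NotImplementedError("plug in your LLM client here")

@dataclass
class Agent:
    """A single-responsibility agent: one prompt, one job."""
    name: str
    system_prompt: str

    def run(self, task: str) -> str:
        return llm(self.system_prompt, task)

# Each agent owns exactly one step of the investigation.
classifier = Agent("classifier", "Classify the incident type from the report.")
query_writer = Agent("query_writer", "Write a log query for the incident type.")
summarizer = Agent("summarizer", "Summarize query results into a root cause.")

def rca_workflow(incident_report: str, run_query) -> str:
    """Deterministic orchestration: the code, not the LLM, decides the order."""
    incident_type = classifier.run(incident_report)
    query = query_writer.run(incident_type)
    results = run_query(query)  # your logging backend goes here
    return summarizer.run(results)
```

The key property is that the control flow lives in ordinary code, so every run takes the same path even though each step calls an LLM.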
Downside of the Current Architecture
One notable limitation of our current setup is the need for domain knowledge injection. For instance, when working with logging data, you have to define the semantics of queries, WHERE clauses, and event names. Although I simplified this step by allowing configuration in plain English via a simple DSL derived directly from natural language, it can still be a hurdle. In today's AI-driven world, where users expect minimal manual setup, this requirement prompted me to experiment with alternative architectures.
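To give a flavor of what that configuration looks like, here is a hypothetical example. The table, fields, and event names are invented for illustration, and the comments show the plain-English statements the DSL entries are derived from:

```python
# Hypothetical domain-knowledge configuration for one logging table.
# All names are illustrative; the real DSL entries are derived from
# plain-English descriptions like the comments below.
login_failures = {
    "table": "AppEvents",
    # "Only look at sign-in problems" ->
    "where": "EventName == 'LoginFailed'",
    # "An incident is real if it affects many users" ->
    "aggregate": "count(distinct UserId) by bin(Timestamp, 5m)",
    "event_names": ["LoginFailed", "TokenExpired"],
}
```

Small as each entry is, someone still has to write it, and that is exactly the setup cost I wanted to eliminate.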
Autonomous Agent Architecture: Lessons Learned
Initially, my approach focused on achieving system stability before minimizing configuration requirements. At a colleague’s suggestion, I ventured into an extreme alternative: an autonomous agent architecture that leveraged AutoGen for a group chat–style interaction. The idea was to feed the agent the table metadata and let it operate independently.
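For reference, the setup looked roughly like the following sketch, written against the pyautogen 0.2-era API; the agent names, prompts, and metadata payload are illustrative rather than our exact configuration:

```python
import autogen

llm_config = {"config_list": [{"model": "gpt-4", "api_key": "YOUR_KEY"}]}

# The only domain input: raw table metadata pasted into the system prompt.
analyst = autogen.AssistantAgent(
    name="analyst",
    system_message="You investigate incidents. Table metadata: <schema here>",
    llm_config=llm_config,
)
critic = autogen.AssistantAgent(
    name="critic",
    system_message="Review the analyst's queries and conclusions.",
    llm_config=llm_config,
)
user_proxy = autogen.UserProxyAgent(
    name="user", human_input_mode="NEVER", code_execution_config=False
)

group = autogen.GroupChat(agents=[user_proxy, analyst, critic],
                          messages=[], max_round=12)
manager = autogen.GroupChatManager(groupchat=group, llm_config=llm_config)

# The agents decide among themselves what happens next; no fixed workflow.
user_proxy.initiate_chat(manager, message="Find the root cause of incident 123.")
```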
However, the results were far from encouraging. The system failed to converge on a reliable solution, deviating significantly from our intended outcomes. We attempted several stabilization strategies:
- Injecting Domain Knowledge: Using Retrieval Augmented Generation (RAG) to provide context.
- Template Initialization: Supplying an initial query template and accompanying hints.
- Additional Guidance: Offering further hints throughout the process.
Unfortunately, none of these methods delivered the desired stability.
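As a sketch of what the RAG attempt amounted to: retrieve the expert notes most similar to the question and prepend them to the agent's prompt. The `embed` function below is a dummy stand-in for a real embeddings API, and the notes are made up:

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Dummy embedding; swap in a real embeddings API call."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(16)

# Domain notes written by subject matter experts (invented examples).
notes = [
    "LoginFailed spikes usually trace back to the token service.",
    "Latency alerts on AppEvents correlate with cache evictions.",
]
note_vectors = [embed(n) for n in notes]

def retrieve(question: str, k: int = 1) -> list[str]:
    """Return the k notes closest to the question by cosine similarity."""
    q = embed(question)
    scores = [float(q @ v / (np.linalg.norm(q) * np.linalg.norm(v)))
              for v in note_vectors]
    ranked = sorted(zip(scores, notes), reverse=True)
    return [note for _, note in ranked[:k]]

# The retrieved notes were injected into the autonomous agent's prompt.
context = "\n".join(retrieve("Why are logins failing?"))
```

Even with relevant context injected this way, the group chat still wandered; retrieval improved the inputs, not the control flow.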
At a community event in Seattle, where I had the opportunity to speak and interact with industry experts, I learned that current LLMs may not be well suited to this type of autonomous architecture in our scenario.
Why Does Deep Research Work?
You might wonder why systems like Deep Research appear successful. The key difference is in their design: Deep Research is a specialized agent system, tuned specifically for such applications, as detailed in OpenAI's introduction of Deep Research. It incorporates planning and backtracking to reach a final answer, in stark contrast to the simple group chat mechanism we attempted. For our more complex use case, where substantial domain knowledge is essential, a more robust approach is needed before an autonomous agent model can be implemented effectively.
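To illustrate what planning with backtracking means in general terms (this is a toy sketch of the technique, not OpenAI's actual implementation), the essential move is that a failed branch is abandoned and an alternative is tried, instead of letting a conversation drift:

```python
from typing import Callable, Optional

def solve(
    state: str,
    is_answer: Callable[[str], bool],
    next_steps: Callable[[str], list[str]],
    apply_step: Callable[[str, str], str],
    depth: int = 0,
    max_depth: int = 5,
) -> Optional[str]:
    """Depth-first planning with backtracking. In a real system each
    callback would be LLM-driven: propose candidate steps, judge progress."""
    if is_answer(state):
        return state
    if depth == max_depth:
        return None
    for step in next_steps(state):  # candidate next steps, best first
        result = solve(apply_step(state, step), is_answer,
                       next_steps, apply_step, depth + 1, max_depth)
        if result is not None:
            return result  # this branch reached an answer
        # otherwise: backtrack and try the next candidate
    return None
```

A group chat has no equivalent of the "return None and try the next candidate" branch, which is one reason it fails to converge.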
Workflow Agent Is Not Old-Fashioned
Despite the buzz around autonomous agent solutions, which often seem modern and flashy, the reality is that for most practical applications a Workflow Agent remains the superior choice. In discussions with industry professionals, including a friend who helps numerous clients implement AI systems, and in line with points Satya Nadella shared during the Ignite keynote, it became clear that over 80% of use cases are best served by Workflow Agents, while fewer than 20% might benefit from an autonomous approach.
When Does an Autonomous Agent Fit?
Autonomous agents, with their exploratory and less deterministic nature, are well suited to scenarios that call for novel solutions and do not demand precise, deterministic outcomes. For many business applications, however, the reliability and structured reasoning provided by Workflow Agents make them the preferred choice.
The Power of a Structured Workflow
From my personal experiments, I’ve found that incorporating a simple Plan/Execution/Review loop — what I refer to as “reasoning steps” — significantly enhances the performance and quality of the system. Although this approach might lack the glitz of a group chat–style autonomous agent, a multi-agent Workflow Agent architecture proves to be immensely powerful, automating processes in ways that were once unimaginable in the pre-AI era.
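Concretely, the loop looks something like this sketch; the three agents are stand-ins for single-responsibility LLM agents like those sketched earlier:

```python
# Plan/Execution/Review loop, i.e. the "reasoning steps" pattern.
# The three agents below are placeholders for single-responsibility
# LLM agents; each would wrap one prompt and one job.

def plan_agent(goal: str, findings: str) -> str:
    """Decide what to investigate next, given what we know so far."""
    raise NotImplementedError

def exec_agent(plan: str) -> str:
    """Run queries and gather data according to the plan."""
    raise NotImplementedError

def review_agent(goal: str, findings: str) -> str:
    """Return 'ACCEPT' if the findings answer the goal, else 'RETRY'."""
    raise NotImplementedError

def reasoning_loop(goal: str, max_iterations: int = 3) -> str:
    findings = ""
    for _ in range(max_iterations):
        plan = plan_agent(goal, findings)       # what to investigate next
        findings = exec_agent(plan)             # run queries, gather data
        if review_agent(goal, findings) == "ACCEPT":
            break
    return findings  # best effort once the iteration budget is spent
```

The bounded iteration count is deliberate: unlike an open-ended chat, the loop always terminates, and the review step gives every pass a clear exit condition.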
Real-World Success: Toyota’s Example
Consider how Toyota leverages Azure Cosmos DB to power its multi-agent AI system, boosting productivity in vehicle design and beyond. This real-world example underscores the practical benefits of a Workflow Agent approach in complex, distributed systems.
A Recommendation: Durable Functions
For those looking to implement a Workflow Agent, I highly recommend Durable Functions. This serverless workflow system allows you to write workflows in code — eschewing the need for visual UI editors — and aligns perfectly with AI workflow agent scenarios. Given that current LLM frameworks aren’t yet optimized for distributed systems, Durable Functions offer a robust and scalable solution, as evidenced by numerous successful case studies and my own experience with integrating them into the RCA Agent.
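As a sketch of how the Plan/Execution/Review loop might map onto the Durable Functions Python programming model (the activity names here are my own placeholders, not from an official sample):

```python
import azure.durable_functions as df

def orchestrator_function(context: df.DurableOrchestrationContext):
    """Durable orchestration of the Plan/Execution/Review loop.
    Each activity is a separate, independently retryable function, and
    the framework checkpoints state at every yield, so a long-running
    investigation survives process restarts."""
    goal = context.get_input()
    findings = ""
    for _ in range(3):
        plan = yield context.call_activity(
            "PlanStep", {"goal": goal, "findings": findings})
        findings = yield context.call_activity("ExecuteStep", plan)
        verdict = yield context.call_activity(
            "ReviewStep", {"goal": goal, "findings": findings})
        if verdict == "ACCEPT":
            break
    return findings

main = df.Orchestrator.create(orchestrator_function)
```

The checkpoint-at-yield behavior is the part I value most: an RCA run that waits minutes on a large log query costs nothing while idle and resumes exactly where it left off.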
In summary, while autonomous agents may capture the imagination, the structured and reliable Workflow Agent — with its proven track record in the majority of cases — remains a powerful and indispensable tool in the modern AI landscape.
Prediction
Current LLMs are not yet optimized for autonomous, agentic use. Today, a multi-agent workflow architecture covers roughly 80% of use cases, while autonomous agents suit only about 20%. However, if future LLM updates enable more robust agentic behavior, this balance could shift dramatically. Consider the following scenarios:
- Optimized Autonomous Agents: Future LLM iterations might support autonomous agent patterns flawlessly.
- Deep Research-Inspired Strategies: LLMs could be enhanced to support planning and backtracking strategies — similar to those employed in Deep Research — with new APIs capable of handling such workloads.
- Hybrid Approach for Determinism: Even with improved LLMs, a hybrid model that combines multi-agent workflow with reasoning steps may still be necessary to ensure deterministic outcomes.
As an end user, I remain cautious about predicting the future. The ultimate framework may depend on specific use cases, and the current 80/20 split could evolve significantly if, for example, the second scenario proves viable. In such a case, it’s uncertain which framework will ultimately dominate — or whether an entirely new API will redefine the landscape.
Investing in This Transient Era
Given the rapid pace of innovation, it may be risky to fully commit to either a multi-agent workflow or an autonomous agent approach at this time. Instead, one promising area for investment is Domain Knowledge Integration. Even in scenarios where autonomous agents might become more prevalent, incorporating deep domain knowledge remains essential for achieving reliable results.
Since critical insights often reside with subject matter experts, automating the extraction, vectorization, classification, and generation of business metadata can be transformative. For example, using embeddings to process and leverage this domain-specific information could significantly enhance system performance. This focus on domain knowledge is likely to be valuable across a wide range of business applications, regardless of the underlying agent architecture.
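As a sketch of what such a pipeline could look like, assuming hypothetical `classify` and `embed` calls backed by an LLM and an embeddings API:

```python
# Sketch of a domain-knowledge pipeline: take expert notes, classify
# them, and store embeddings for later retrieval. `embed` and `classify`
# are placeholders for embeddings/LLM API calls.
from dataclasses import dataclass

@dataclass
class MetadataRecord:
    text: str         # the expert note, verbatim
    category: str     # e.g. "schema", "runbook", "known-issue"
    vector: list[float]

def embed(text: str) -> list[float]:
    """Placeholder for an embeddings API call."""
    raise NotImplementedError

def classify(text: str) -> str:
    """Placeholder for an LLM call that labels the note."""
    raise NotImplementedError

def build_index(expert_notes: list[str]) -> list[MetadataRecord]:
    return [
        MetadataRecord(text=n, category=classify(n), vector=embed(n))
        for n in expert_notes
    ]
```

Whichever agent architecture wins out, an index like this feeds it; the investment is portable.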
Evaluation
Evaluation represents a critical area where investing time and effort can yield significant benefits. Currently, our focus is on three key components:
- Input
- Workflow (a.k.a. Orchestration, Controller)
- Output
At present, the workflow is central to the system's effectiveness. As we move forward, however, we anticipate that LLMs will increasingly handle exploration autonomously. In that future, mechanisms for exploration and reinforcement learning will become even more important. Consequently, the evaluation of outputs will be crucial, not only for providing meaningful feedback but also for guiding the transition from traditional workflow-based architectures to systems that support self-exploration.
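For output evaluation specifically, even a small regression harness over past incidents with known root causes pays off. A sketch follows, where `run_agent` and `judge` are placeholders (the judge could itself be an LLM-as-judge call) and the example incidents are invented:

```python
# Sketch of an output-evaluation harness: replay past incidents with
# known root causes and score the agent's answers.

def run_agent(incident: str) -> str:
    """Placeholder: invoke the RCA agent on an incident description."""
    raise NotImplementedError

def judge(expected: str, actual: str) -> bool:
    """Placeholder: True if `actual` names the same root cause as
    `expected` (could be exact match, a rubric, or an LLM-as-judge)."""
    raise NotImplementedError

# Invented examples; in practice these come from resolved incidents.
LABELED_INCIDENTS = [
    ("Login errors spiked at 09:00", "token service certificate expired"),
    ("Checkout latency doubled", "cache eviction storm"),
]

def evaluate() -> float:
    hits = sum(judge(expected, run_agent(incident))
               for incident, expected in LABELED_INCIDENTS)
    return hits / len(LABELED_INCIDENTS)
```

A score like this is what would let us compare a workflow-based run against a self-exploring one on equal terms.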
Conclusion
In our evaluation, autonomous agent solutions — like group chat–style interfaces — are currently less effective than workflow agent architectures for the majority of use cases. Workflow agents deliver the reliability and structure needed for deterministic outcomes in most scenarios.
Conversely, specialized systems such as Deep Research, which leverage finely tuned LLMs along with planning and backtracking strategies, have demonstrated success in agent-based approaches. However, as new LLMs, APIs, and frameworks emerge, the landscape is likely to evolve significantly.
Regardless of the approach, investing in domain knowledge extraction remains critical. Enhancing output evaluation methods will also be essential for ensuring system performance in the near future.
Embrace these changes and enjoy the evolving journey in AI technology!