Agentic flow // let’s see what the agents can do

One day, an autonomous agent operating in an infinite loop (agentic flow) will emerge as a new digital creature

sbagency
https://x.com/MetaGPT_/status/1846044033820312016
https://arxiv.org/pdf/2410.10762

Large language models (LLMs) have demonstrated remarkable potential in solving complex tasks across diverse domains, typically by employing agentic workflows that follow detailed instructions and operational sequences. However, constructing these workflows requires significant human effort, limiting scalability and generalizability. Recent research has sought to automate the generation and optimization of these workflows, but existing methods still rely on initial manual setup and fall short of achieving fully automated and effective workflow generation. To address this challenge, we reformulate workflow optimization as a search problem over code-represented workflows, where LLM-invoking nodes are connected by edges. We introduce AFLOW, an automated framework that efficiently explores this space using Monte Carlo Tree Search, iteratively refining workflows through code modification, tree-structured experience, and execution feedback. Empirical evaluations across six benchmark datasets demonstrate AFLOW’s efficacy, yielding a 5.7% average improvement over state-of-the-art baselines. Furthermore, AFLOW enables smaller models to outperform GPT-4o on specific tasks at 4.55% of its inference cost in dollars. The code will be available at https://github.com/geekan/MetaGPT.

Agentic Workflow. We define an agentic workflow W as a sequence of LLM-invoking nodes, denoted as N = {N1, N2, …, Ni, …}. Each node Ni represents a specific operation performed by an LLM and is characterized by its own set of parameters.
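To make that definition concrete, here is a minimal sketch of a code-represented workflow of LLM-invoking nodes. The names (LLMNode, Workflow, call_llm) and the node parameters are illustrative stand-ins, not AFLOW's actual API:

```python
# A toy code-represented agentic workflow: LLM-invoking nodes chained by edges.
from dataclasses import dataclass, field

@dataclass
class LLMNode:
    name: str
    prompt_template: str          # instruction this node gives the LLM
    model: str = "gpt-4o-mini"    # which model this node invokes (assumed)
    temperature: float = 0.0      # per-node sampling temperature

    def run(self, input_text: str) -> str:
        prompt = self.prompt_template.format(input=input_text)
        return call_llm(self.model, prompt, self.temperature)

@dataclass
class Workflow:
    nodes: list[LLMNode] = field(default_factory=list)  # N = {N1, N2, ...}

    def run(self, task: str) -> str:
        out = task
        for node in self.nodes:   # edges here form a simple linear chain
            out = node.run(out)
        return out

def call_llm(model: str, prompt: str, temperature: float) -> str:
    """Placeholder for a real LLM API call (e.g., an OpenAI client)."""
    raise NotImplementedError

# Example: generate a solution, then critique-and-revise it.
flow = Workflow([
    LLMNode("solve", "Solve the problem step by step:\n{input}"),
    LLMNode("review", "Check this solution and fix any errors:\n{input}"),
])
```

Representing the workflow as code rather than a fixed config is what lets an optimizer rewrite its structure programmatically.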

This paper has introduced AFLOW, a novel framework for automated workflow optimization. We have comprehensively formulated the automated workflow optimization problem, establishing a foundational structure for future research. AFLOW has leveraged Monte Carlo Tree Search and code-represented workflows to navigate the vast search space of possible workflows efficiently. Our experiments across six benchmarks demonstrate the effectiveness of AFLOW, which has outperformed manually designed methods and existing automated optimization approaches. Ablation studies have shown that AFLOW can autonomously discover effective structures, even without predefined operators. Importantly, AFLOW has enabled weaker models to outperform stronger ones on the Pareto front of cost-effectiveness, potentially revolutionizing the adoption of agentic workflows across various domains. These results have highlighted AFLOW’s potential for enhancing LLMs’ problem-solving capabilities while optimizing computational costs.
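As an illustration of the search loop the paper describes (selection, expansion via code modification, evaluation, backpropagation of execution feedback), a toy MCTS over workflow variants might look like the following. This is a sketch of the idea, not AFLOW's implementation; modify and evaluate stand in for the LLM-driven code edit and the benchmark run:

```python
# MCTS-style search over workflow variants: select, modify, evaluate, backprop.
import math

class TreeNode:
    def __init__(self, workflow, parent=None):
        self.workflow = workflow      # a code-represented workflow
        self.parent = parent
        self.children = []
        self.visits = 0
        self.total_score = 0.0

    def uct(self, c=1.4):
        if self.visits == 0:
            return float("inf")       # always try unvisited candidates first
        exploit = self.total_score / self.visits
        explore = c * math.sqrt(math.log(self.parent.visits) / self.visits)
        return exploit + explore

def search(root_workflow, modify, evaluate, iterations=50):
    root = TreeNode(root_workflow)
    for _ in range(iterations):
        # 1. Selection: walk down by UCT until reaching a leaf.
        node = root
        while node.children:
            node = max(node.children, key=TreeNode.uct)
        # 2. Expansion: produce a modified workflow (an LLM code edit in AFLOW).
        child = TreeNode(modify(node.workflow), parent=node)
        node.children.append(child)
        # 3. Evaluation: execute the candidate on validation tasks.
        score = evaluate(child.workflow)
        # 4. Backpropagation: push execution feedback up the tree.
        while child:
            child.visits += 1
            child.total_score += score
            child = child.parent
    best = max(root.children, key=lambda n: n.total_score / max(n.visits, 1))
    return best.workflow
```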

https://x.com/alxcnwy/status/1844686921903038499

An agentic flow is a sequence of nodes (operations) in a generation pipeline.

Agentforce // Men in suits discuss agents

https://arxiv.org/pdf/2410.07869

Large Language Models (LLMs), with their exceptional ability to handle a wide range of tasks, have driven significant advancements in tackling reasoning and planning tasks, wherein decomposing complex problems into executable workflows is a crucial step. Existing workflow evaluation frameworks either focus solely on holistic performance or suffer from limitations such as restricted scenario coverage, simplistic workflow structures, and lax evaluation standards. To this end, we introduce WORFBENCH, a unified workflow generation benchmark with multi-faceted scenarios and intricate graph workflow structures. Additionally, we present WORFEVAL, a systemic evaluation protocol utilizing subsequence and subgraph matching algorithms to accurately quantify the LLM agent’s workflow generation capabilities. Through comprehensive evaluations across different types of LLMs, we discover distinct gaps between the sequence planning capabilities and graph planning capabilities of LLM agents, with even GPT-4 exhibiting a gap of around 15%. We also train two open-source models and evaluate their generalization abilities on held-out tasks. Furthermore, we observe that the generated workflows can enhance downstream tasks, enabling them to achieve superior performance with less time during inference.
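The subsequence-matching idea behind WORFEVAL can be sketched as a longest-common-subsequence score between a predicted node sequence and a gold one. The exact metric in the paper may differ, so treat this as an approximation of the concept:

```python
# Score how much of a gold workflow a predicted plan recovers, in order.
def lcs_length(pred: list[str], gold: list[str]) -> int:
    """Classic dynamic-programming longest common subsequence."""
    m, n = len(pred), len(gold)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if pred[i - 1] == gold[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[m][n]

def sequence_score(pred: list[str], gold: list[str]) -> float:
    """Fraction of the gold node sequence matched in order by the prediction."""
    return lcs_length(pred, gold) / len(gold) if gold else 0.0

# Example: a plan that skips one gold step scores 0.75.
print(sequence_score(["search", "read", "summarize"],
                     ["search", "read", "verify", "summarize"]))
```

Graph workflows would additionally need subgraph matching over node dependencies, which is what separates the sequence and graph planning scores the paper compares.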

https://www.observeinc.com/blog/o11y-investigator-agentic-ai-to-drive-faster-resolution/

The article discusses the development of O11y Investigator, an AI system designed to help Site Reliability Engineers (SREs) investigate and resolve incident alerts more efficiently. It addresses the repetitive, time-consuming nature of triaging alerts, many of which don’t yield actionable insights. The goal of O11y Investigator is to automate crucial but routine tasks during incident investigations, such as identifying relevant runbooks and performing the steps required for troubleshooting.

The AI system functions by collaborating with human engineers, using multiple AI agents to reason through potential issues in distributed systems. Instead of relying on a single AI agent, O11y Investigator uses specialized sub-agents for specific tasks (e.g., reading code commits, working with Kubernetes infrastructure) and an orchestration agent that decides which sub-agent should handle each task. This modular approach optimizes AI performance and reduces unnecessary complexity.
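A stripped-down version of that orchestration pattern might look like the following; the agent names are hypothetical, and a trivial keyword heuristic stands in for the LLM routing decision a real system would make:

```python
# Orchestrator dispatching to specialized sub-agents (names are illustrative).
SUB_AGENTS = {
    "code": lambda task: f"[code agent] inspecting recent commits for: {task}",
    "kubernetes": lambda task: f"[k8s agent] checking cluster state for: {task}",
    "logs": lambda task: f"[logs agent] querying error logs for: {task}",
}

def orchestrate(task: str) -> str:
    # A real orchestration agent would call an LLM to pick the sub-agent;
    # this keyword check is a cheap stand-in for that routing decision.
    if "pod" in task or "node" in task:
        route = "kubernetes"
    elif "deploy" in task or "commit" in task:
        route = "code"
    else:
        route = "logs"
    return SUB_AGENTS[route](task)

print(orchestrate("pod restarts spiking after rollout"))
```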

O11y Investigator also emphasizes the need for AI agents to plan and react dynamically as new information is discovered during investigations. The AI system re-plans its actions after every tool call, ensuring it adapts to changes in real time. This agentic workflow enhances reasoning and decision-making, leading to faster resolutions.
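The "re-plan after every tool call" loop can be sketched as follows, with plan_next_step and run_tool as hypothetical stand-ins for the LLM planner and the tool invocations:

```python
# Re-planning loop: each tool result feeds back into the planner before the
# next step is chosen, instead of executing a fixed up-front plan.
def plan_next_step(alert: str, findings: list[str]):
    """Stand-in for an LLM planning call; returns a tool name or None (done)."""
    steps = ["fetch_runbook", "check_recent_deploys", "query_error_logs"]
    return steps[len(findings)] if len(findings) < len(steps) else None

def run_tool(step: str) -> str:
    """Stand-in for a real tool invocation."""
    return f"output of {step}"

def investigate(alert: str, max_steps: int = 10) -> list[str]:
    findings: list[str] = []
    for _ in range(max_steps):
        step = plan_next_step(alert, findings)  # re-plan with all evidence so far
        if step is None:                        # planner decides the case is closed
            break
        findings.append(f"{step}: {run_tool(step)}")
    return findings

print(investigate("latency alert on checkout service"))
```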

O11y Investigator is designed to reduce Mean Time to Resolution (MTTR) and improve customer experiences by automating and streamlining incident investigations. It is currently available in preview.

Here’s a summary of the key points from the conversation:

1. Agents in AI:
— Agents are entities that can act autonomously with decision-making capabilities.
— The history of agents in AI goes back to the 1950s, rooted in reinforcement learning and dynamic programming.

2. Developing Agent Systems:
— Start by defining goals and the environment on paper before coding.
— Constrain the environment initially to test and refine agent behavior (see the allowlist sketch after this item).
— Avoid giving agents unrestricted web access without proper safeguards.
— Use off-the-shelf tools and APIs before building custom solutions.
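A minimal sketch of constraining the environment, assuming a simple tool-allowlist design; the tool names and dispatch table are made up for illustration:

```python
# Agents may only call tools on an explicit allowlist; every call is logged.
ALLOWED_TOOLS = {"read_docs", "query_db"}   # deliberately no open web access

TOOLS = {
    "read_docs": lambda q: f"docs result for {q}",
    "query_db": lambda q: f"db rows for {q}",
}

def call_tool(name: str, arg: str) -> str:
    if name not in ALLOWED_TOOLS:
        raise PermissionError(f"tool {name!r} is not on the allowlist")
    print(f"audit: {name}({arg!r})")        # cheap safeguard: log every call
    return TOOLS[name](arg)

call_tool("read_docs", "deployment runbook")   # ok
# call_tool("browse_web", "anything")          # raises PermissionError
```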

3. Evaluating Agent Frameworks:
— Choose simpler, more established tools before moving to bleeding-edge options.
— Define clear success criteria before starting development.
— Consider no-code or low-code options for initial prototyping.

4. Use Cases for Agents:
— Customer support automation
— Fraud detection in finance
— Supply chain optimization
— Patient monitoring in healthcare
— Agricultural management (e.g., farm sensors and automation)
— Personal shopping assistants

5. Challenges and Future Developments:
— Real-time processing is a challenge due to the latency of large language models.
— Small language models and parameter-efficient fine-tuning may address speed and cost issues.
— Combining traditional machine learning with language models in workflows can optimize performance (see the cascade sketch after this item).
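One common shape for that last point is a cascade: a cheap traditional model handles the confident cases and escalates only the uncertain ones to a slower, costlier LLM. A minimal sketch, with classify and ask_llm as hypothetical stand-ins:

```python
# Cascade: fast classical model first, LLM only for low-confidence cases.
def classify(text: str) -> tuple[str, float]:
    """Stand-in for e.g. a logistic-regression classifier returning (label, confidence)."""
    return ("billing", 0.62)

def ask_llm(prompt: str) -> str:
    """Stand-in for a language-model API call."""
    return "billing (escalated)"

def triage(ticket: str) -> str:
    label, confidence = classify(ticket)   # fast, cheap traditional model
    if confidence >= 0.9:
        return label                        # no LLM call needed
    return ask_llm(f"Categorize this support ticket:\n{ticket}")

print(triage("I was charged twice for my subscription"))
```

This addresses both the latency and the cost concerns above: the LLM sits behind a confidence gate instead of on the hot path.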

6. General Advice:
— Keep systems simple and focused on solving specific problems.
— Be open to new technologies while critically evaluating their practical applications.
— Remember that many brilliant minds are working on solving current limitations in AI and agent systems.

The conversation emphasizes the potential of agent-based AI systems while highlighting the importance of careful planning, clear goal-setting, and pragmatic implementation strategies.


sbagency

Tech/biz consulting, analytics, research for founders, startups, corps and govs.