FlowBank: Query-Adaptive Agentic Workflows Optimization through Precompute-and-Reuse explained

Brief context

Week: Jun 8, 2026

Published: Jun 9, 2026, 5:58 PM

Original paper

The executive brief below is grounded in the source paper and linked back to the arXiv abstract.

Large Language Model (LLM)-based multi-agent systems are increasingly powerful, but current agentic workflow optimization paradigms make an unsatisfying trade-off. Task-level methods spend substantial offline compute yet deploy only a single workflow, leaving complementary candidates unused, while query-level methods synthesize a new workflow per query at substantial inference cost. Our motivating analysis shows these paradigms are more complementary than competing: workflows discovered during offline search often solve different subsets of queries, and many queries handled by expensive query-level generation can already be solved by cheaper precomputed workflows. This suggests a different objective: rather than searching for one universally best workflow or regenerating one per instance, we should build a compact bank of reusable, complementary workflows and select among them adaptively at inference time. Doing so requires solving three coupled problems: generating complementary rather than redundant candidates, compressing them into a small deployable portfolio, and assigning each query to the right workflow under a performance-cost trade-off. To this end, we present FlowBank, a three-stage framework for portfolio-based agentic workflow optimization. Diversifying proposes DiverseFlow to steer search toward under-covered queries and produce a high-coverage candidate pool. Curating proposes CuraFlow to compress this pool into a compact portfolio with minimal redundancy. Matching casts deployment as edge-value prediction on a query-workflow bipartite graph and routes each incoming query to the portfolio member with the best predicted utility. Across five benchmarks, FlowBank achieves the highest average score among the evaluated methods while remaining cost-competitive, improving over the strongest automated and handcrafted baselines by 4.26% and 14.92% relative, respectively.
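The curation idea in the abstract (compress a candidate pool into a small portfolio with little redundancy) can be illustrated with a greedy coverage heuristic. This is a minimal sketch under stated assumptions, not the paper's CuraFlow procedure: the function name `greedy_portfolio`, the workflow names, and the solved-query sets are all hypothetical, and greedy set cover is a stand-in technique chosen only because it is simple to read.

```python
from typing import Dict, List, Set


def greedy_portfolio(solved: Dict[str, Set[int]], k: int) -> List[str]:
    """Pick up to k workflows, each time adding the one that solves
    the most queries not yet covered by the portfolio.

    Assumption: `solved` maps a workflow name to the set of query ids
    it solved during offline search. This is an illustrative stand-in,
    not the curation method described in the paper.
    """
    covered: Set[int] = set()
    chosen: List[str] = []
    for _ in range(k):
        # Iterate in sorted order so ties are broken deterministically.
        best = max(
            (w for w in sorted(solved) if w not in chosen),
            key=lambda w: len(solved[w] - covered),
            default=None,
        )
        # Stop early when no remaining workflow adds new coverage.
        if best is None or not (solved[best] - covered):
            break
        chosen.append(best)
        covered |= solved[best]
    return chosen


# Hypothetical offline results: which query ids each candidate solved.
solved = {
    "wf_a": {0, 1, 2, 3},
    "wf_b": {0, 1, 2},  # redundant with wf_a
    "wf_c": {4, 5},     # complementary to wf_a
    "wf_d": {3, 4},
}
portfolio = greedy_portfolio(solved, k=2)
print(portfolio)  # ['wf_a', 'wf_c']
```

In this toy data, `wf_b` has the second-highest standalone score but adds nothing once `wf_a` is in the portfolio, which is the sense in which a complementary candidate can be worth more than a strong but redundant one.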

Score: 78 | Full-paper brief | agents, inference, training, infra

Executive brief

A short business-reader brief that explains why the paper matters now and what to watch or do next.

Why this is worth your attention

Agent workflows are starting to look less like one-off prompt chains and more like an operations problem: build a small library of tested procedures, then route each request to the cheapest one likely to work. FlowBank reports that this “precompute and reuse” approach beats both handcrafted workflows and strong automated baselines across five benchmarks while keeping inference cost competitive. If the pattern survives real workloads, teams building agent systems may get much of the benefit of per-query adaptation without paying to synthesize a new workflow every time; the unresolved question is whether the routing model stays reliable outside clean benchmark distributions.
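The routing step described above, sending each request to the workflow with the best cost-performance balance, can be sketched as a simple utility comparison. This is an illustration under assumptions, not the paper's matching model: the predicted success scores are taken as given (the paper predicts them with a learned model), and the linear utility `success - lam * cost`, the penalty weight `lam`, and all workflow names and numbers are hypothetical.

```python
from typing import Dict


def route(pred_success: Dict[str, float], cost: Dict[str, float], lam: float) -> str:
    """Return the workflow with the highest utility for one query.

    Assumed utility: predicted success minus lam times cost.
    `lam` is a hypothetical knob for how much cost is penalized.
    """
    # Sorted iteration keeps tie-breaking deterministic.
    return max(sorted(pred_success), key=lambda w: pred_success[w] - lam * cost[w])


# Hypothetical per-query predictions and per-workflow costs
# (cost in thousands of tokens).
pred = {"cheap_chain": 0.70, "debate": 0.85, "full_pipeline": 0.90}
cost = {"cheap_chain": 1.0, "debate": 4.0, "full_pipeline": 9.0}

print(route(pred, cost, lam=0.0))   # full_pipeline: cost ignored
print(route(pred, cost, lam=0.02))  # debate: moderate cost penalty
print(route(pred, cost, lam=0.1))   # cheap_chain: heavy cost penalty
```

Raising `lam` shifts traffic toward cheaper workflows, so the same portfolio can be tuned toward accuracy or toward spend without rebuilding it.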

The paper’s core claim is that discarded agent workflows are often not waste; they may solve different query slices. If that holds in your domain, optimizing for a small portfolio could beat endlessly tuning a single default workflow.
A useful agent platform should be able to explain whether it generates a new workflow per request, reuses tested workflows, and makes routing decisions with explicit cost-performance trade-offs. The buying question becomes: can the system show per-workflow success, token cost, and routing rationale?
The operational case is strongest where requests fall into recurring patterns: support triage, document QA, code repair, math-like reasoning, or compliance review. In those settings, spending offline compute to build a reusable workflow bank may be cheaper than paying an orchestration tax on every query.
The evidence is benchmark-based, with modest training splits, one executor model, and a small workflow pool. The result is promising for agent infrastructure design, but not yet proof that a workflow bank will generalize under messy enterprise data, changing query mix, or stricter latency controls.
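One way to check whether the complementarity claim in the points above holds on your own workload is to compare the best single workflow against an oracle that always picks a workflow that succeeds. The sketch below assumes you already have pass/fail results for each workflow on a shared set of queries; the function name `audit`, the workflow names, and the data are hypothetical. The oracle figure is an upper bound, since a real router will make mistakes.

```python
from typing import Dict, List


def audit(results: Dict[str, List[bool]]) -> Dict[str, float]:
    """Compare the best single workflow with an oracle portfolio.

    Assumption: every list in `results` has one pass/fail entry per
    query, in the same query order.
    """
    n = len(next(iter(results.values())))
    per_workflow = {w: sum(r) / n for w, r in results.items()}
    # A query counts as covered if at least one workflow solved it.
    oracle = sum(any(r[i] for r in results.values()) for i in range(n)) / n
    best = max(per_workflow.values())
    return {"best_single": best, "oracle_portfolio": oracle, "headroom": oracle - best}


# Hypothetical pass/fail results on eight queries.
results = {
    "wf_a": [True, True, True, False, False, False, True, True],
    "wf_b": [False, False, True, True, True, False, True, False],
    "wf_c": [True, False, False, False, True, True, False, False],
}
report = audit(results)
print(report)
# {'best_single': 0.625, 'oracle_portfolio': 1.0, 'headroom': 0.375}
```

A large `headroom` value suggests a portfolio plus routing could help; a value near zero suggests the workflows are redundant and a single default may be enough.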

Evidence ledger

The strongest claims in the brief, along with the confidence and citation depth behind them.

capability | high | p.1

FlowBank builds a portfolio of complementary agentic workflows offline and selects among them at inference time.

capability | high | p.10, p.10

Across five benchmarks, FlowBank reports the highest average performance and improves over the strongest automated baseline.

inference | medium | p.8, p.10

The paper claims much of the adaptivity of per-query workflow generation can be recovered with lower online routing overhead.

caveat | high | p.18, p.20

The results are based on controlled public benchmarks and relatively small workflow portfolios, so production transfer is not established.
