Large Language Model (LLM) based multi-agent systems have shown remarkable performance on various tasks, especially when enhanced through collaborative communication. However, current methods often rely on a fixed number of agents and static communication structures, limiting their ability to adapt to varying task complexities. In this paper, we propose Adaptive Graph Pruning (AGP), a novel task-adaptive multi-agent collaboration framework that jointly optimizes agent quantity (hard-pruning) and communication topology (soft-pruning). Specifically, our method employs a two-stage training strategy: first, independently training soft-pruning networks for different agent quantities to determine optimal agent-quantity-specific complete graphs and positional masks across specific tasks; then, jointly optimizing hard-pruning and soft-pruning within a maximum complete graph to dynamically configure the number of agents and their communication topologies per task. Extensive experiments demonstrate that our approach is: (1) High-performing, achieving state-of-the-art results across six benchmarks and generalizing consistently across multiple mainstream LLM architectures, with an increase in performance of \(2.58\%\sim 9.84\%\); (2) Task-adaptive, dynamically constructing optimized communication topologies tailored to specific tasks and performing strongly in all three task categories (general reasoning, mathematical reasoning, and code generation); (3) Token-economical, cutting token consumption by more than \(90\%\) while also requiring fewer training steps; and (4) Training-efficient, reaching high performance in far fewer training steps than competing methods, surpassing existing baselines after roughly ten training steps on all six benchmarks.
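To make the two-stage idea concrete, below is a minimal PyTorch sketch of the Stage-II joint pruning step: a learned edge-scoring head soft-prunes the maximum complete graph while a node-mask head hard-prunes agents. All names here (`AGPPruner`, `query_dim`, the 0.5 threshold) are illustrative assumptions, not the released implementation.

```python
import torch
import torch.nn as nn

class AGPPruner(nn.Module):
    """Conceptual sketch of joint hard-/soft-pruning over a maximum
    complete graph of N agents (hypothetical names, not the paper's code)."""

    def __init__(self, num_agents: int, query_dim: int, hidden: int = 64):
        super().__init__()
        self.num_agents = num_agents
        # Soft-pruning head: scores every directed edge of the complete graph.
        self.edge_head = nn.Sequential(
            nn.Linear(query_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, num_agents * num_agents),
        )
        # Hard-pruning head: per-agent keep/drop logits (the node mask).
        self.node_head = nn.Sequential(
            nn.Linear(query_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, num_agents),
        )

    def forward(self, query_emb: torch.Tensor):
        n = self.num_agents
        # Soft pruning: continuous edge weights in [0, 1].
        edge_w = torch.sigmoid(self.edge_head(query_emb)).view(n, n)
        # Hard pruning: keep probabilities thresholded at 0.5. (Training this
        # end-to-end would need a straight-through or Gumbel relaxation,
        # omitted here for brevity.)
        keep_p = torch.sigmoid(self.node_head(query_emb))
        node_mask = (keep_p > 0.5).float()
        # Dropping an agent removes all of its incident edges.
        adj = edge_w * node_mask.unsqueeze(0) * node_mask.unsqueeze(1)
        adj.fill_diagonal_(0.0)  # no self-communication
        return adj, node_mask
```

Under this reading, Stage I would train one such soft-pruning network per candidate agent quantity, and Stage II would optimize both heads jointly so that the node mask selects the agent count per query.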
Adaptive Graph Pruning (AGP) first mines high-utility sub-graphs from a fixed pool of heterogeneous LLM agents and preserves their edge labels and node masks as supervision. The next section sets up notation, casts these ideas in graph-topological terms, and details the multi-agent communication protocol that AGP learns to instantiate for any new query.
Our supervision corpus from Stage I captures both micro and macro structure. Each labeled example is a triple: the query, the edge labels of its mined sub-graph, and the corresponding node mask.
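The sketch below shows one way such triples could be assembled, assuming a simple random-search miner and a `utility_fn` that executes a candidate agent team and returns its task score. `StageISample`, `mine_supervision`, and the sampling choices are illustrative assumptions, not the paper's actual mining procedure.

```python
import itertools
import random
from dataclasses import dataclass

import numpy as np

@dataclass
class StageISample:
    """One Stage-I supervision record (illustrative names):
    the query, the edge labels of the mined sub-graph, and the
    node mask saying which agents of the fixed pool participate."""
    query: str
    edge_labels: np.ndarray  # (N, N) 0/1 adjacency over the full agent pool
    node_mask: np.ndarray    # (N,)  1 = agent kept, 0 = hard-pruned

def mine_supervision(query, pool_size, utility_fn, num_candidates=50, rng=None):
    """Score randomly sampled sub-graphs and keep the highest-utility one.
    `utility_fn(query, edge_labels, node_mask) -> float` is assumed to run
    the agent team and return its task performance."""
    rng = rng or random.Random(0)
    best, best_u = None, float("-inf")
    for _ in range(num_candidates):
        # Sample which agents survive hard pruning (keep at least two).
        k = rng.randint(2, pool_size)
        kept = rng.sample(range(pool_size), k)
        node_mask = np.zeros(pool_size)
        node_mask[kept] = 1
        # Sample directed edges among the kept agents (the soft topology).
        edges = np.zeros((pool_size, pool_size))
        for i, j in itertools.permutations(kept, 2):
            edges[i, j] = rng.random() < 0.5
        u = utility_fn(query, edges, node_mask)
        if u > best_u:
            best_u, best = u, StageISample(query, edges, node_mask)
    return best
```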
AGP delivers the strongest overall accuracy.
AGP not only attains higher final accuracy but also achieves baseline-beating performance in fewer than ten optimization steps, evidencing markedly better sample- and compute-efficiency during training.
Performance curves of AGP and G-Designer on the MMLU, GSM8K, and HumanEval benchmarks as the number of training steps increases. Starting from the fifth step, an evaluation is run after each trained batch.
AGP can provide more accurate and economical solutions in complex settings without the iterative overhead of full architecture searches.
Visualization of the performance and prompt-token consumption of different multi-agent communication frameworks across the MMLU, GSM8K, SVAMP, and HumanEval benchmarks.
If you find AGP helpful in your research, please consider citing:
@article{li2025adaptive,
  title={Adaptive Graph Pruning for Multi-Agent Communication},
  author={Li, Boyi and Zhao, Zhonghan and Lee, Der-Horng and Wang, Gaoang},
  journal={arXiv preprint arXiv:2506.02951},
  year={2025}
}