Enabling Fairness Across Multi-modal and Multi-agent Applications

EasyChair Preprint 15978
3 pages • Date: July 3, 2025

Abstract

Modern multi-agent systems leverage a diverse set of AI models, including Large Language Models (LLMs), Vision-Language Models (VLMs), etc., to perform complex multi-modal tasks. However, fair model serving in such heterogeneous environments remains a significant challenge. Existing scheduling methods primarily focus on single-modality fairness, failing to account for varying computational costs across different models and the hierarchical structure of multi-agent applications. In this work, we introduce Hierarchical Multi-Modality Fair Scheduling (HMFS), a novel approach that ensures fairness across applications, agents, and tasks while maintaining high resource utilization. To enable cross-modality fairness, we propose a Unified Token Representation, which normalizes token costs across different transformer-based models by leveraging latent space embedding dimensions and computational intensity factors. Using this unified metric, we design a Hierarchical Multi-Modality Fair Scheduling algorithm that dynamically prioritizes requests at both application and agent levels, ensuring equitable access to compute resources.

Keyphrases: Edge-Cloud Communication, Scheduling, fairness, multi-agent, multi-modality, transformer
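
The abstract names two ideas, a unified token cost and a two-level (application, then agent) prioritization, without giving formulas. The sketch below is one possible reading, not the authors' HMFS algorithm: the normalization formula, the reference dimension `REFERENCE_DIM`, the `ModelProfile` fields, and the least-attained-service selection rule are all assumptions made for illustration, and the task level and resource-utilization goals mentioned in the abstract are not modeled.

```python
from collections import deque
from dataclasses import dataclass

REFERENCE_DIM = 4096  # hypothetical reference embedding dimension


@dataclass(frozen=True)
class ModelProfile:
    embed_dim: int     # latent-space embedding dimension of the model
    intensity: float   # hypothetical computational-intensity factor


def unified_tokens(raw_tokens: int, profile: ModelProfile) -> float:
    """Assumed normalization: scale raw tokens by relative width and intensity."""
    return raw_tokens * (profile.embed_dim / REFERENCE_DIM) * profile.intensity


class HierarchicalFairScheduler:
    """Least-attained-service selection, first across apps, then across agents."""

    def __init__(self) -> None:
        self.app_service: dict[str, float] = {}
        self.agent_service: dict[tuple[str, str], float] = {}
        self.queues: dict[tuple[str, str], deque] = {}

    def submit(self, app: str, agent: str, raw_tokens: int,
               profile: ModelProfile) -> None:
        key = (app, agent)
        self.app_service.setdefault(app, 0.0)
        self.agent_service.setdefault(key, 0.0)
        self.queues.setdefault(key, deque()).append(
            unified_tokens(raw_tokens, profile))

    def next_request(self):
        backlogged = [k for k, q in self.queues.items() if q]
        if not backlogged:
            return None
        # Level 1: the backlogged application that has received the least service.
        apps = {app for app, _ in backlogged}
        app = min(apps, key=lambda a: (self.app_service[a], a))
        # Level 2: within that application, the least-served backlogged agent.
        key = min((k for k in backlogged if k[0] == app),
                  key=lambda k: (self.agent_service[k], k[1]))
        cost = self.queues[key].popleft()
        self.app_service[app] += cost
        self.agent_service[key] += cost
        return key[0], key[1], cost
```

Under these assumptions, a request to a narrower but more compute-intensive model can cost more unified tokens than a same-length request to a wider one, so an application that issues such requests is charged more service and waits longer before its next turn.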