MELD · Multi-Dimensional Collaborative Deployment Mechanism of MoE-based Edge LLMs for 6G Ubiquitous Intelligence

HORIZON · Status: SIGNED · 1 January 2027 to 31 December 2028 · EU funding €252,180 · Call HORIZON-MSCA-2025-PF

Deploying Large Language Models (LLMs) at the edge helps ensure low latency, reduce communication overhead, and enhance privacy, filling a gap that neither cloud nor on-device LLMs address. However, edge LLMs face daunting challenges due to resource constraints and highly dynamic, heterogeneous environments. The Mixture-of-Experts (MoE) architecture, as seen in models like DeepSeek, has emerged as a promising solution for edge deployment. MoE enables sparse activation, which dramatically lowers computational load and supports collaborative, distributed deployment. This adaptability makes MoE-based LLMs well suited to challenging edge scenarios.

Still, several barriers persist. MoE-based LLMs typically have larger parameter sizes than dense models and require substantial cache memory, which strains edge resources. In addition, frequent and voluminous inter-server data transfers, combined with the limited bandwidth of edge networks compared to cloud data centers, form a critical performance bottleneck. The diverse and fluctuating demands of edge resources and applications compound the complexity, making collaborative resource allocation and efficient scheduling particularly difficult.

To address these issues, this project proposes three strategies. Inter-server and intra-server collaborative deployment methods partition models based on expert activation paths and similarities, ensuring efficient distribution across edge servers and optimal expert scheduling within each server. Mixed-precision quantization dynamically adapts expert bit-widths, balancing resource constraints, application requirements, and expert popularity. Token pruning and fusion mechanisms reduce data transfer frequency and volume, enhancing overall inference efficiency. This project establishes the theoretical foundations and practical methodologies for realizing high-performance, ubiquitous edge LLMs.
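The sparse activation mentioned in the abstract can be illustrated with a minimal sketch of generic top-k expert routing. This is not the project's method: the function names, the choice of k = 2, and the gate logits are all hypothetical. It shows only that each token activates a small subset of experts, and how per-expert selection counts (one possible notion of "expert popularity") could be tallied.

```python
import math

def top_k_route(gate_logits, k=2):
    """Return [(expert_index, weight), ...] for the k highest-scoring experts."""
    # Select the k experts with the largest gate logits; all others stay inactive.
    top = sorted(range(len(gate_logits)), key=lambda i: gate_logits[i], reverse=True)[:k]
    # Softmax over the selected logits only, so the k weights sum to 1.
    peak = max(gate_logits[i] for i in top)
    exps = [math.exp(gate_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

def expert_popularity(batch_logits, num_experts, k=2):
    """Count how often each expert is selected across a batch of tokens."""
    counts = [0] * num_experts
    for logits in batch_logits:
        for idx, _ in top_k_route(logits, k):
            counts[idx] += 1
    return counts

# Hypothetical gate logits for three tokens over four experts (assumed values).
batch = [
    [2.0, 0.1, 1.5, -1.0],
    [0.3, 0.2, 2.2, 1.9],
    [1.1, -0.5, 3.0, 0.0],
]
routes = [top_k_route(token_logits) for token_logits in batch]
popularity = expert_popularity(batch, num_experts=4)
print(popularity)  # [2, 0, 3, 1]
```

In this toy batch, expert 1 is never selected and expert 2 is selected for every token. Skew of this kind is what a popularity-aware scheme, such as assigning different bit-widths to different experts, could in principle exploit.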

Consortium · 2 organisations

Coordinator: KUNGLIGA TEKNISKA HOEGSKOLAN · SE · €252,180

Associated partner: ERICSSON AB · SE

Source: CORDIS, Publications Office of the European Union. Global Research Partnerships surfaces open EU research data to help you find collaborators; we are not affiliated with the European Union.