Deep Mixture of Experts Network for Resource Optimization in Aerial-Terrestrial CF-mMIMO Systems under URLLC

Source: arXiv:2605.15135 · Published 2026-05-14 · By Donggen Li, Chong Huang, Jingfu Li, Pei Xiao, Wenjiang Feng, Dusit Niyato et al.

TL;DR

This paper addresses the challenge of enabling ultra-reliable low-latency communication (URLLC) for low-altitude unmanned aerial mobility (UAM) scenarios in 6G networks. Conventional cell-free massive MIMO (CF-mMIMO) networks struggle with channel aging and resource overhead when supporting high-mobility aerial and ground user equipment (UEs) under URLLC constraints, leading to high latency and inefficiency. The authors propose a hybrid aerial-terrestrial CF-mMIMO architecture combined with a deep learning framework for uplink optimization. This framework consists of a Transformer-based channel prediction network (CP-Net) that mitigates channel aging effects on three types of links (air-to-air, ground-to-ground, air-ground), and a mixture of experts (MoE) power allocation network (MoE-Net) with three expert subnetworks specialized to maximize spectral efficiency (SE), energy efficiency (EE), or their trade-off. An adaptive weighting network (WT-Net) learns to fuse the expert outputs based on heterogeneous UE requirements.

The result is a joint channel prediction and power allocation scheme that dynamically adapts to heterogeneous objectives and rapidly changing CSI conditions, achieving better reliability and efficiency than state-of-the-art iterative and deep learning baselines. Simulations demonstrate marked NMSE reductions in channel prediction (51.7% over Kalman filter and 39.5% over standard DNN), as well as improved SE, EE, and URLLC reliability satisfaction probabilities compared to benchmarks. The approach reconciles conflicting multi-objective power control goals while efficiently meeting short-packet latency and reliability constraints.

Key findings

CP-Net reduces normalized mean square error (NMSE) by 51.7% over Kalman filter and 39.5% over standard DNN baselines in channel prediction under aging conditions.
The proposed hybrid aerial-terrestrial CF-mMIMO network jointly supports aerial and ground UEs with centralized signal combining at the CPU, improving uplink (UL) reliability in low-altitude UAM environments.
MoE-Net implements three expert subnetworks specializing in SE maximization, EE maximization, and their trade-off, allowing heterogeneous UE objectives to be handled more effectively than a single unified model.
WT-Net adaptively fuses expert outputs using a weighting network, enabling objective-aware power allocation decisions.
The joint framework improves URLLC packet error probability satisfaction compared to successive convex approximation (SCA) and deep learning power control baselines.
Short-packet transmission performance is analyzed with finite blocklength theory incorporating latency and reliability constraints, providing more practical metrics than classical Shannon capacity.
The proposed model maintains robust generalization across varying channel aging conditions and heterogeneous UE groups.
Orthogonal pilot assignment combined with least squares channel estimation supports scalable multi-user detection (MUD) with reasonable computational complexity.
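The NMSE figures above compare prediction error energy against true channel energy. A minimal sketch of that metric follows, assuming the common definition (summed squared error divided by summed squared magnitude of the true channel); the paper's exact normalization is not given in this summary, and the sample values are hypothetical.

```python
def nmse(h_true, h_pred):
    """NMSE = sum |h - h_hat|^2 / sum |h|^2 over all channel coefficients."""
    if len(h_true) != len(h_pred):
        raise ValueError("h_true and h_pred must have the same length")
    error_energy = sum(abs(t - p) ** 2 for t, p in zip(h_true, h_pred))
    reference_energy = sum(abs(t) ** 2 for t in h_true)
    if reference_energy == 0:
        raise ValueError("reference channel has zero energy")
    return error_energy / reference_energy


# Hypothetical two-coefficient channel and an imperfect prediction of it.
h_true = [1 + 0j, 0 + 1j]
h_pred = [1 + 0j, 0 + 0.5j]
print(nmse(h_true, h_pred))  # 0.125
```

A value of 0 means perfect prediction, and a value of 1 means the predictor does no better than outputting zeros.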

Threat model

The paper does not explicitly model an active adversary or attacker; rather, the focus is on mitigating channel aging and interference effects inherent to highly dynamic and heterogeneous wireless aerial-terrestrial URLLC environments. The adversary in scope is effectively the wireless channel aging and interference dynamics that degrade CSI accuracy and optimization reliability. Threats such as malicious jamming, spoofing, or data injection are out of scope.
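To make the channel aging effect concrete, the sketch below uses a first-order autoregressive model with a Jakes-type temporal correlation, rho = J0(2*pi*f_D*T_s). This is a widely used aging model and is an assumption here, not necessarily the paper's exact formulation; the function names, the unit-variance channel, and the series-based Bessel routine are all illustrative.

```python
import math
import random


def bessel_j0(x, terms=40):
    """Power-series J0(x); adequate for the small arguments used here."""
    total, term = 0.0, 1.0
    for m in range(terms):
        total += term
        term *= -((x / 2) ** 2) / ((m + 1) ** 2)
    return total


def aging_correlation(doppler_hz, sample_time_s, lag=1):
    """Temporal correlation between channel samples `lag` intervals apart."""
    return bessel_j0(2 * math.pi * doppler_hz * sample_time_s * lag)


def age_channel(h_prev, rho, rng):
    """One AR(1) step: h[n] = rho*h[n-1] + sqrt(1-rho^2)*e[n], e ~ CN(0, 1)."""
    sigma = math.sqrt(0.5)
    innovation = complex(rng.gauss(0, sigma), rng.gauss(0, sigma))
    return rho * h_prev + math.sqrt(1 - rho ** 2) * innovation


# Hypothetical numbers: 100 Hz Doppler shift, 1 ms between CSI samples.
rho = aging_correlation(100.0, 1e-3)
rng = random.Random(0)
h = 1 + 0j
for _ in range(5):
    h = age_channel(h, rho, rng)
```

A higher Doppler frequency or a longer sampling interval lowers rho, so the stored CSI decorrelates from the true channel faster, which is the degradation a predictor such as CP-Net is meant to counter.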

Methodology — deep read

Threat model & assumptions: No adversary is modeled. The system instead addresses the wireless channel imperfections that high-mobility UEs cause: channel aging, inter-user interference, and fast fading. The focus is on mitigating channel aging and optimizing uplink power under strict URLLC latency and error probability constraints for both aerial and ground terminals.
Data: Simulated channel state information (CSI) sequences are generated for a hybrid aerial-terrestrial CF-mMIMO topology with M APs and K UEs, each set containing both aerial and ground nodes. Three link types, air-to-air (ATA), ground-to-ground (GTG), and air-to-ground (AG), are modeled with Rician or Rayleigh fading combined with a time-varying channel aging model based on Doppler frequency and temporal correlation. The dataset pairs labeled channel realizations (actual CSI) with aged, noisy CSI estimates for training and evaluation.
Architecture/algorithm:

CP-Net: A Transformer-based encoder-decoder network. The encoder uses multi-head self-attention with 4 heads and a feature dimension scaled to the antenna count (2L) to capture spatio-temporal correlations across the three link types. The decoder is a fully connected network that reconstructs the predicted CSI from the encoded temporal features. A channel quality-aware loss function weights errors more heavily on weak links (those with low average channel power) to improve prediction accuracy on statistically challenging channels.
MoE-Net: A mixture of experts model with three lightweight sub-networks (experts), each specialized to optimize one of three objectives: SE maximization, EE maximization, or balancing the EE-SE trade-off. Each expert outputs a candidate uplink power allocation vector.
WT-Net: A gating network that takes the predicted CSI and input features and learns adaptive weights for fusing the three expert outputs, producing the final power allocation.
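The fusion step performed by WT-Net can be sketched as a weighted sum of the three expert proposals. The sketch assumes softmax-normalized gating scores and one scalar weight per expert; the summary does not state the exact gating form, so `fuse_experts`, the clipping to `p_max`, and the sample values are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of raw gating scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_experts(expert_powers, gate_scores, p_max):
    """Blend per-UE power proposals from each expert into one allocation.

    expert_powers: one list of per-UE powers per expert (SE, EE, trade-off).
    gate_scores:   raw gating scores, one per expert (assumed WT-Net output).
    p_max:         per-UE maximum transmit power used to clip the result.
    """
    weights = softmax(gate_scores)
    num_ues = len(expert_powers[0])
    fused = []
    for k in range(num_ues):
        blended = sum(w * powers[k] for w, powers in zip(weights, expert_powers))
        fused.append(min(max(blended, 0.0), p_max))
    return weights, fused


# Hypothetical proposals (watts) for two UEs from the SE, EE, and trade-off experts.
proposals = [[0.20, 0.10], [0.05, 0.05], [0.10, 0.08]]
weights, powers = fuse_experts(proposals, gate_scores=[0.0, 0.0, 0.0], p_max=0.2)
```

With equal gating scores each expert contributes one third of the result; a strongly dominant score makes the output approach that expert's proposal, which is how a gate can steer the allocation toward SE or EE as requirements change.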

Training regime: The networks are trained on simulated channels with a supervised loss that combines the weighted objectives with the channel quality-aware loss for CP-Net. Training hyperparameters such as epochs, batch size, optimizer, learning rate schedule, and random seeds are not specified in detail. Performance is evaluated on held-out simulated channel realizations with channel aging effects.
Evaluation protocol: Metrics include normalized mean square error (NMSE) for channel prediction accuracy, packet error probability derived from finite blocklength (FBL) theory for URLLC reliability, spectral efficiency (SE in bps/Hz), and energy efficiency (EE in bits/Joule). Baselines include Kalman filter-based channel prediction, standard DNN-based prediction, successive convex approximation (SCA) optimization, and unified DNN power control models. Ablations test performance without MoE structure or WT-Net fusion. Cross-validation details are not explicitly described, but generalization across different UE types and channel aging conditions is analyzed.
Reproducibility: The dataset is simulation-based and not public. The paper does not mention releasing code or trained model weights. The channel and system models are described in enough detail for specialized researchers to replicate them.
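The FBL-based packet error probability used in the evaluation protocol can be illustrated with the standard normal approximation for short packets: the error probability is Q((C - b/n) / sqrt(V/n)), where C = log2(1 + SINR) is capacity, V = (1 - (1 + SINR)^-2) * (log2 e)^2 is channel dispersion, n is the blocklength in channel uses, and b is the number of information bits. This is a sketch under that textbook approximation; the paper's exact expression may include further terms, and the function names are illustrative.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2))


def fbl_error_probability(sinr, blocklength, info_bits):
    """Normal-approximation packet error probability for a short packet."""
    if sinr <= 0:
        return 1.0  # no usable signal: treat the packet as lost
    capacity = math.log2(1 + sinr)
    dispersion = (1 - (1 + sinr) ** -2) * math.log2(math.e) ** 2
    rate = info_bits / blocklength
    return q_function((capacity - rate) / math.sqrt(dispersion / blocklength))


# Transmitting exactly at capacity (SINR 3 gives 2 bits per channel use).
print(fbl_error_probability(3.0, blocklength=100, info_bits=200))  # 0.5
```

Transmitting exactly at the Shannon rate yields a 50% error probability at finite blocklength, which is why URLLC designs must back off from capacity and why Shannon-based metrics are optimistic for short packets.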

Concrete example: Given a time series of past noisy CSI from heterogeneous aerial and ground UEs affected by channel aging, CP-Net predicts the CSI for the current interval, with its quality-aware loss emphasizing weak links. The MoE-Net experts then independently propose power allocations that maximize SE, maximize EE, or balance the two. WT-Net adaptively weights these expert outputs and generates the final uplink power allocation vector, respecting the maximum power, URLLC latency, and error constraints. The predicted CSI and power allocation are then evaluated in the system simulation to quantify SINR, SE, EE, and URLLC reliability metrics. Training iterations optimize this end-to-end pipeline.
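The SE and EE metrics computed at the end of this pipeline can pull in opposite directions, which is what motivates separate experts. The sketch below assumes the Shannon rate as a stand-in for SE (the paper's FBL analysis would subtract a dispersion penalty) and defines EE as sum rate over total consumed power; `circuit_power_w`, the bandwidth, and the toy SINR model are hypothetical.

```python
import math


def spectral_efficiency(sinrs):
    """Per-UE spectral efficiency in bps/Hz using the Shannon formula."""
    return [math.log2(1 + s) for s in sinrs]


def energy_efficiency(sinrs, powers_w, bandwidth_hz, circuit_power_w):
    """Sum rate (bit/s) divided by total power (W), giving bits per joule."""
    sum_rate = bandwidth_hz * sum(spectral_efficiency(sinrs))
    total_power = sum(powers_w) + circuit_power_w
    return sum_rate / total_power


# Toy single-UE case: SINR is assumed to scale linearly with transmit power.
gain_over_noise = 100.0  # hypothetical SINR per watt
for power in (0.1, 0.2):
    sinr = gain_over_noise * power
    se = spectral_efficiency([sinr])[0]
    ee = energy_efficiency([sinr], [power], bandwidth_hz=1e6, circuit_power_w=0.1)
    print(f"p={power} W  SE={se:.2f} bps/Hz  EE={ee:.3e} bit/J")
```

In this toy setting, doubling the transmit power raises SE but lowers EE, because the rate grows only logarithmically while consumed power grows linearly.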

Technical innovations

A link-quality-aware loss function in the Transformer-based channel prediction network (CP-Net) that biases training toward improving weak channel link accuracy.
Design of a multi-expert mixture of experts (MoE) power allocation network where each expert specializes in a distinct objective (SE, EE, or trade-off), addressing heterogeneous UE requirements simultaneously.
Adaptive weighting gating network (WT-Net) that dynamically fuses expert outputs based on input CSI and service demands to improve robustness and flexibility under varying URLLC constraints.
Hybrid aerial-terrestrial CF-mMIMO network model supporting coexisting aerial and ground UEs with centralized processing tailored for URLLC and aged CSI mitigation.
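The link-quality-aware loss can be illustrated as a per-link weighted MSE in which weaker links receive larger weights. The inverse-power weighting below is an assumption chosen for illustration; the summary only states that low-power links are weighted more heavily, not the exact weighting function.

```python
def quality_aware_loss(h_true, h_pred, eps=1e-12):
    """Weighted MSE over links, with weights inversely proportional to power.

    h_true, h_pred: one list of complex channel samples per link.
    Returns (loss, weights); weights are normalized to average 1 across links.
    """
    avg_power = [sum(abs(x) ** 2 for x in link) / len(link) for link in h_true]
    raw = [1.0 / (p + eps) for p in avg_power]  # assumed weighting form
    scale = len(raw) / sum(raw)
    weights = [w * scale for w in raw]
    loss = 0.0
    for w, true_link, pred_link in zip(weights, h_true, h_pred):
        mse = sum(abs(t - p) ** 2 for t, p in zip(true_link, pred_link)) / len(true_link)
        loss += w * mse
    return loss / len(weights), weights


# Hypothetical strong link (power 4) and weak link (power 1).
h_true = [[2 + 0j, 2 + 0j], [1 + 0j, 1 + 0j]]
err_on_weak = [[2 + 0j, 2 + 0j], [1.1 + 0j, 1.1 + 0j]]
err_on_strong = [[2.1 + 0j, 2.1 + 0j], [1 + 0j, 1 + 0j]]
loss_weak, weights = quality_aware_loss(h_true, err_on_weak)
loss_strong, _ = quality_aware_loss(h_true, err_on_strong)
```

The same absolute error costs more on the weak link than on the strong one, so gradient updates are biased toward improving the links that a plain MSE would under-serve.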

Datasets

Simulated hybrid aerial–terrestrial CF-mMIMO channel data — size unspecified — simulation-based, not publicly released

Baselines vs proposed

Kalman filter channel prediction: CP-Net reduces NMSE by 51.7% relative to this baseline
Standard DNN channel prediction: CP-Net reduces NMSE by 39.5% relative to this baseline
Successive convex approximation (SCA) optimization: lower performance on URLLC reliability and SE/EE trade-off compared to proposed MoE-Net
Unified single-objective DNN power control: underperforms MoE-Net across heterogeneous UE objectives
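The reported NMSE gains are reductions relative to each baseline, which is not the same as the baseline being that many percent higher than CP-Net. A short calculation, using only the two percentages reported in the key findings, converts a relative reduction into a baseline-to-CP-Net ratio; the helper name is illustrative.

```python
def baseline_to_proposed_ratio(reduction):
    """If proposed = (1 - reduction) * baseline, return baseline / proposed."""
    if not 0 <= reduction < 1:
        raise ValueError("reduction must be in [0, 1)")
    return 1.0 / (1.0 - reduction)


print(f"{baseline_to_proposed_ratio(0.517):.2f}")  # 2.07 (Kalman filter vs CP-Net)
print(f"{baseline_to_proposed_ratio(0.395):.2f}")  # 1.65 (standard DNN vs CP-Net)
```

So a 51.7% reduction means the Kalman filter's NMSE is roughly twice that of CP-Net, and a 39.5% reduction means the standard DNN's NMSE is roughly 1.65 times that of CP-Net.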

Figures from the paper

Figures are reproduced from the source paper for academic discussion. Original copyright: the paper authors. See arXiv:2605.15135.

Fig 1: The architecture of a hybrid aerial–terrestrial CF-mMIMO network

Fig 2: The entire process of the proposed joint networks for channel prediction with CP-Net and power allocation via MoE-Net.

Figs 3 to 8 (page 7).

Limitations

The channel data and experiments are simulation-based without real-world measurement validation, potentially limiting environmental realism.
Training and evaluation focus mainly on normal channel aging effects; adversarial attacks or sudden channel state changes are not considered.
Latency, error probability, and power constraints rely on model parameters that may differ in real implementations, affecting practical deployment.
The paper does not detail full hyperparameter tuning or random seed controls, limiting precise reproducibility.
The computational complexity of the MoE-Net and WT-Net at very large scales (many UEs/APs) is not fully benchmarked.
The system assumes perfect synchronization for orthogonal pilot allocation which could be challenging in dense scenarios.

Open questions / follow-ons

How does the proposed framework perform under real-world measured channel datasets with more complex fading and blockage scenarios?
Can the MoE approach be extended to incorporate online learning or continual adaptation for non-stationary environments?
What are the robustness properties of the channel prediction and power allocation networks under adversarial or anomalous channel state perturbations?
How scalable is the combined CP-Net and MoE-Net framework to very large CF-mMIMO deployments with hundreds of APs and UEs?

Why it matters for bot defense

For bot-defense and CAPTCHA practitioners, this work provides insight into how advanced deep learning architectures can optimize dynamic wireless system parameters under strict latency and reliability constraints. While not directly related to CAPTCHA per se, the use of mixture-of-experts networks to handle heterogeneous objectives in real-time communication optimization could inspire analogous multi-expert approaches for adaptive challenge presentation and resource-aware bot detection. Moreover, the emphasis on mitigating imperfections in channel state information and achieving rapid optimized responses aligns with the needs in bot-defense systems where quick, resource-efficient, and precise decision-making is critical to meet user experience constraints. Understanding mechanisms to fuse multiple expert controllers dynamically may help design more resilient CAPTCHA or bot detection frameworks that tailor challenges based on variable attacker behaviors or network conditions.

Cite

bibtex

@article{arxiv2605_15135,
  title={Deep Mixture of Experts Network for Resource Optimization in Aerial-Terrestrial CF-mMIMO Systems under URLLC},
  author={Donggen Li and Chong Huang and Jingfu Li and Pei Xiao and Wenjiang Feng and Dusit Niyato and Zhu Han},
  journal={arXiv preprint arXiv:2605.15135},
  year={2026},
  url={https://arxiv.org/abs/2605.15135}
}
