now publishers - Fairness-Oriented User Scheduling for Bursty Downlink Transmission Using Multi-Agent Reinforcement Learning

APSIPA Transactions on Signal and Information Processing > Vol 11 > Issue 1

Mingqi Yuan, School of Science and Engineering, The Chinese University of Hong Kong, China; Qi Cao, School of Science and Engineering, The Chinese University of Hong Kong, China; Man-On Pun, School of Science and Engineering, The Chinese University of Hong Kong, China, and Shenzhen Research Institute of Big Data, China, SimonPun@cuhk.edu.cn; Yi Chen, School of Science and Engineering, The Chinese University of Hong Kong, and Shenzhen Research Institute of Big Data, China

Suggested Citation

Mingqi Yuan, Qi Cao, Man-On Pun and Yi Chen (2022), "Fairness-Oriented User Scheduling for Bursty Downlink Transmission Using Multi-Agent Reinforcement Learning", APSIPA Transactions on Signal and Information Processing: Vol. 11: No. 1, e32. http://dx.doi.org/10.1561/116.00000028

Publication Date: 31 Oct 2022

Keywords

User scheduling, RBG allocation, fairness-oriented, multi-agent reinforcement learning (MARL)

Open Access

This is published under the terms of CC BY-NC.

Abstract

In this work, we develop practical user scheduling algorithms for bursty downlink traffic with an emphasis on user fairness. In contrast to conventional scheduling algorithms that either divide the transmission time slots equally among users or maximize ratios that lack a practical physical interpretation, we propose the 5%-tile user data rate (5TUDR) as the metric for evaluating user fairness. Since 5TUDR is difficult to optimize directly, we first cast the problem into the stochastic game framework and then propose a Multi-Agent Reinforcement Learning (MARL)-based algorithm that optimizes the resource block group (RBG) allocation in a computationally efficient manner. Each MARL agent takes information measured by network counters from multiple network layers (e.g., Channel Quality Indicator and buffer size) as its input state and the RBG allocation as its action, with a reward function designed to maximize 5TUDR. Extensive simulations show that the proposed MARL-based scheduler achieves fair scheduling while maintaining good average network throughput compared with conventional schedulers.
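
As a rough illustration of why a low-percentile rate can serve as a fairness metric, the following sketch computes a 5%-tile rate from a list of per-user data rates. It is not taken from the article: the nearest-rank percentile rule, the function name `five_percentile_rate`, and the sample rates are all assumptions made for this example, and the article's exact definition of 5TUDR may differ.

```python
import math


def five_percentile_rate(rates, q=5.0):
    """Return the q-th percentile of per-user rates (nearest-rank rule).

    Assumption: nearest-rank is used only for illustration; the article
    may define the 5%-tile user data rate differently.
    """
    if not rates:
        raise ValueError("rates must be non-empty")
    ordered = sorted(rates)
    # Nearest-rank: smallest rank whose share of users is at least q percent.
    rank = max(1, math.ceil(q * len(ordered) / 100.0))
    return ordered[rank - 1]


# Hypothetical per-user average rates (Mbit/s) for 20 users.
rates_a = [10.0] * 19 + [0.5]  # higher mean, but one user is starved
rates_b = [9.0] * 20           # lower mean, every user served evenly

mean_a = sum(rates_a) / len(rates_a)
mean_b = sum(rates_b) / len(rates_b)

print("A: mean =", mean_a, "5%-tile =", five_percentile_rate(rates_a))
print("B: mean =", mean_b, "5%-tile =", five_percentile_rate(rates_b))
```

In this made-up example, allocation A has the higher mean rate but the lower 5%-tile rate, so a scheduler tuned for the 5%-tile rate would prefer allocation B.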

DOI:10.1561/116.00000028

Introduction
System Model and Problem Formulation
Stochastic Game Framework for RBG Allocation
MARL-Based Algorithm
Simulation and Analysis
Conclusion
Appendix: Network Mechanism
References
