Abstract
—This paper develops a cell-free massive multiple-input multiple-output (MIMO) enabled multi-access edge computing (MEC) system consisting of multiple users and one central processing unit (CPU) connected to multiple access points (APs), aiming to achieve seamless task offloading and computation. To minimize the average energy consumption, the joint optimization of AP clustering, task-splitting coefficients, transmit power, and computation resources of the users is formulated as a mixed-integer nonlinear programming problem. The formulated problem is highly non-convex due to the coupled discrete and continuous variables, making it computationally prohibitive to obtain the optimal solution directly in real time. To tackle this problem, we propose a proximal policy optimization (PPO)-based hierarchical deep reinforcement learning (HDRL) algorithm, in which the discrete and continuous variables are iteratively solved by the designed PPO-based high-level and low-level agents, respectively. Simulation results demonstrate the superiority of the proposed algorithm in terms of average energy consumption.