A multi-agent deep deterministic policy gradient method with hybrid action space for energy-efficient HVAC control

Shrestha, Samir; Sapkota, Sobit; Kong, Minjin; 최진우; Hong, Taehoon; Choi, Jun-Ki

doi:10.1016/j.buildenv.2025.114151

Citations (Web of Science): 0

Abstract

Heating, ventilation, and air-conditioning (HVAC) systems are among the most energy-intensive components in buildings, yet existing control strategies often fail to jointly optimize thermal comfort, indoor air quality (IAQ), and energy efficiency. This study presents a hybrid-action multi-agent deep deterministic policy gradient (MADDPG) framework for intelligent HVAC control that integrates discrete heater actuation and continuous airflow regulation within a unified reinforcement learning environment. The proposed architecture enables cooperative decision-making between specialized agents through centralized training and decentralized execution, capturing the mixed discrete-continuous nature of real HVAC operations. The agents interact with a physics-based single-zone office model that captures coupled temperature-CO2 dynamics at 5-minute timesteps, driven by Ohio 2024-2025 cold-season outdoor temperature profiles and realistic occupancy schedules. MADDPG is trained for 200 episodes using experience replay, with actor-critic learning rates and target-update coefficients selected via a validation study over 12 hyperparameter configurations; the selected configuration is the one that maximized average episodic reward on a separate dataset. Final performance is evaluated on unseen November-February 2024-2025 weather and compared against a single-agent deep Q-network (DQN) baseline that controls only the heater under fixed airflow. Relative to DQN, MADDPG reduces total energy consumption by 7-10% and discomfort hours by 38% on average across all months, while keeping IAQ close to the threshold at the cost of modestly more CO2 violations during extreme cold conditions. These results indicate that hybrid-action multi-agent reinforcement learning is a promising pathway for energy-efficient, comfort-aware HVAC control in intelligent buildings.
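
To make the hybrid action space concrete, the following is a minimal sketch of a single-zone model with coupled temperature-CO2 dynamics, stepped at the 5-minute interval stated in the abstract. It is not the authors' model: the lumped heat and CO2 mass balances, the forward-Euler update, the names `ZoneParams` and `step`, and every parameter value are illustrative assumptions. The action is split the way the abstract describes, with a discrete heater command (`heater_on`) and a continuous airflow command (`airflow_frac`).

```python
from dataclasses import dataclass

DT_S = 300.0  # 5-minute timestep (from the abstract), in seconds


@dataclass(frozen=True)
class ZoneParams:
    # All values are illustrative assumptions, not taken from the paper.
    volume_m3: float = 150.0                 # zone air volume
    heat_capacity_j_per_k: float = 2.0e6     # lumped thermal capacitance
    ua_w_per_k: float = 100.0                # envelope conductance
    heater_w: float = 3000.0                 # heater power when switched on
    max_airflow_m3_s: float = 0.1            # maximum outdoor-air flow rate
    co2_outdoor_ppm: float = 420.0           # outdoor CO2 concentration
    co2_gen_m3_s_per_person: float = 5.0e-6  # CO2 generated per occupant
    rho_cp_j_per_m3k: float = 1200.0         # volumetric heat capacity of air


def step(temp_c, co2_ppm, heater_on, airflow_frac, outdoor_c, occupants,
         p=ZoneParams()):
    """Advance the zone by one timestep using forward Euler.

    heater_on    -- discrete action, 0 or 1 (hypothetical heater agent)
    airflow_frac -- continuous action in [0, 1] (hypothetical airflow agent)
    Returns (new_temp_c, new_co2_ppm).
    """
    if heater_on not in (0, 1):
        raise ValueError("heater_on must be 0 or 1")
    airflow_frac = min(1.0, max(0.0, airflow_frac))  # clip continuous action
    q = airflow_frac * p.max_airflow_m3_s            # ventilation rate, m3/s

    # Heat balance: heater gain, envelope loss, and ventilation loss.
    heat_w = (heater_on * p.heater_w
              + p.ua_w_per_k * (outdoor_c - temp_c)
              + p.rho_cp_j_per_m3k * q * (outdoor_c - temp_c))
    new_temp = temp_c + DT_S * heat_w / p.heat_capacity_j_per_k

    # CO2 mass balance in ppm: dilution by outdoor air plus occupant source.
    source_ppm_m3_s = occupants * p.co2_gen_m3_s_per_person * 1e6
    dco2 = (q * (p.co2_outdoor_ppm - co2_ppm) + source_ppm_m3_s) / p.volume_m3
    new_co2 = co2_ppm + DT_S * dco2
    return new_temp, new_co2
```

The sketch shows why the two actions are coupled: raising `airflow_frac` dilutes CO2 but also pulls in cold outdoor air, so the heater has to compensate. That trade-off is the kind of interaction the cooperative agents in the abstract are meant to handle.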

Publication date: 2026-02

Journal: Building and Environment

Volume: 290