Stackelberg Game Optimization Scheduling in Building Integrated Energy Systems Based on Deep Reinforcement Learning

Xiaoning SHEN; Xinghui CHEN; Wenyan CHEN; Xinsu XU


South Power Sys Technol, 2026, Vol. 20, Issue (3): 74-88. DOI: 10.13648/j.cnki.issn1674-0629.2026.03.008

Integrated Energy System Planning and Low Carbon Operation



Abstract

As global attention to the transition toward sustainable energy grows, optimizing building integrated energy systems is important for making energy consumption both low-carbon and economical. This paper therefore studies the scheduling and pricing strategies of building energy operators. First, taking into account the information exchanged between the supply side and the demand side, a two-sided optimization model of the building integrated energy system is established within a Stackelberg game framework, with the supply side as the leader and the demand side as the follower. Second, because the two sides of the Stackelberg game interact repeatedly, a deep deterministic policy gradient (DDPG) algorithm with an adaptive action exploration mechanism is proposed to solve the model efficiently. The mechanism computes an adaptive exploration coefficient from the variance of the cumulative rewards and the average loss of the critic network, and uses that coefficient to improve the algorithm's action selection strategy, ensuring the algorithm's accuracy and stability. Finally, the effectiveness of the proposed algorithm is verified on case studies. The experimental results show that, compared with other deep reinforcement learning algorithms, the proposed algorithm improves convergence accuracy and stability as well as the total revenue of the energy operator, thereby helping the energy supply side make better decisions.
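The adaptive action exploration idea can be illustrated with a short Python sketch. The abstract names only the two inputs (the variance of the cumulative rewards and the average critic loss), not the formula that combines them, so everything else here is an assumption made for illustration: the class name `AdaptiveExplorer`, the squashing function x / (1 + x), the equal weighting of the two signals, the clamping bounds, the Gaussian noise model, and the direction of the mapping (more variance or higher loss gives more exploration). It is not the authors' method.

```python
import random
import statistics
from collections import deque


class AdaptiveExplorer:
    """Hypothetical helper: scales Gaussian action noise by an adaptive coefficient."""

    def __init__(self, base_sigma=0.2, window=20, min_coef=0.05, max_coef=1.0):
        self.base_sigma = base_sigma          # assumed maximum noise scale
        self.min_coef = min_coef              # assumed lower bound on the coefficient
        self.max_coef = max_coef              # assumed upper bound on the coefficient
        self.returns = deque(maxlen=window)   # recent cumulative rewards
        self.critic_losses = deque(maxlen=window)  # recent critic losses (non-negative)

    def record(self, episode_return, critic_loss):
        """Store one episode's cumulative reward and its critic loss."""
        self.returns.append(float(episode_return))
        self.critic_losses.append(float(critic_loss))

    def coefficient(self):
        """Map reward variance and mean critic loss to a clamped coefficient."""
        if len(self.returns) < 2:
            return self.max_coef              # explore fully until data exists
        var = statistics.pvariance(self.returns)
        mean_loss = statistics.fmean(self.critic_losses)
        # Assumed squashing of each signal into [0, 1).
        v = var / (1.0 + var)
        c = mean_loss / (1.0 + mean_loss)
        raw = 0.5 * (v + c)                   # assumed equal weighting
        return min(self.max_coef, max(self.min_coef, raw))

    def explore(self, action, low=-1.0, high=1.0, rng=random):
        """Add scaled Gaussian noise to a deterministic action and clip it."""
        sigma = self.base_sigma * self.coefficient()
        return [min(high, max(low, a + rng.gauss(0.0, sigma))) for a in action]
```

In a DDPG training loop, `record` would be called once per episode and `explore` would wrap the actor's output before the action is sent to the environment. Under these assumptions the noise shrinks as returns stabilize and the critic loss falls.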

Key words

deep reinforcement learning / building integrated energy system / intelligent scheduling / Stackelberg game / energy pricing

Cite this article


Xiaoning SHEN, Xinghui CHEN, Wenyan CHEN, et al. Stackelberg Game Optimization Scheduling in Building Integrated Energy Systems Based on Deep Reinforcement Learning[J]. Southern Power System Technology, 2026, 20(3): 74-88. https://doi.org/10.13648/j.cnki.issn1674-0629.2026.03.008
