Abstract

We consider the optimal control of district cooling energy plants (DCEPs) consisting of multiple chillers, a cooling tower, and a thermal energy storage (TES) system, in the presence of time-varying electricity prices. A straightforward application of model predictive control (MPC) requires solving a challenging mixed-integer nonlinear program (MINLP) because of the on/off switching of the chillers and the complexity of the DCEP model. Reinforcement learning (RL) is an attractive alternative since its real-time control computation is much simpler. However, designing an RL controller is challenging because of the myriad design choices and the computationally intensive training. In this paper, we propose an RL controller and an MPC controller for minimizing the electricity cost of a DCEP and compare them via simulations. The two controllers are designed to be comparable in terms of objective and information requirements. The RL controller uses a novel Q-learning algorithm that is based on least-squares policy iteration. We describe the design choices for the RL controller, including the choice of state space and basis functions, that were found to be effective. The proposed MPC controller does not need a mixed-integer solver for implementation, only a nonlinear program (NLP) solver. A rule-based baseline controller is also proposed to aid in the comparison. Simulation results show that the proposed RL and MPC controllers achieve similar savings over the baseline controller, about 17%.
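To make the idea of least-squares policy iteration concrete, the following is a minimal sketch of the textbook batch scheme on a toy storage problem. It is not the Q-learning algorithm proposed in the paper and not the DCEP model: the three-level tank, the alternating two-level price, the three actions, the one-hot basis functions, and every name in the code are assumptions made purely for illustration.

```python
# Toy least-squares policy iteration (LSPI) sketch.
# Hypothetical plant and basis; NOT the paper's DCEP model or algorithm.
GAMMA = 0.9
PRICE = {0: 1.0, 1: 3.0}            # period 0 = low price, 1 = high price
LEVELS, PERIODS, ACTIONS = 3, 2, 3  # tank level 0..2, two price periods
DISCHARGE, CHILLER, CHARGE = 0, 1, 2

def step(level, period, action):
    """Serve one unit of load; return (cost, next_level, next_period)."""
    if action == DISCHARGE and level > 0:
        cost, nxt = 0.0, level - 1                   # tank serves the load
    elif action == CHARGE and level < LEVELS - 1:
        cost, nxt = 2.0 * PRICE[period], level + 1   # serve load and store one unit
    else:
        cost, nxt = PRICE[period], level             # chiller serves the load (fallback)
    return cost, nxt, 1 - period                     # price alternates low/high

def index(level, period, action):
    """Position of the one-hot basis function for a (state, action) pair."""
    return (level * PERIODS + period) * ACTIONS + action

def solve(A, b):
    """Gaussian elimination with partial pivoting for A w = b."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    w = [0.0] * n
    for r in reversed(range(n)):
        w[r] = (M[r][n] - sum(M[r][c] * w[c] for c in range(r + 1, n))) / M[r][r]
    return w

def lspi(samples, iterations=20):
    """Alternate least-squares Q evaluation and greedy (cost-minimizing) improvement."""
    n = LEVELS * PERIODS * ACTIONS
    policy = {(l, p): CHILLER for l in range(LEVELS) for p in range(PERIODS)}
    w = [0.0] * n
    for _ in range(iterations):
        A = [[0.0] * n for _ in range(n)]
        b = [0.0] * n
        for (l, p, a, cost, l2, p2) in samples:
            i = index(l, p, a)
            j = index(l2, p2, policy[(l2, p2)])
            A[i][i] += 1.0      # phi(s, a) phi(s, a)^T
            A[i][j] -= GAMMA    # -gamma * phi(s, a) phi(s', pi(s'))^T
            b[i] += cost        # phi(s, a) * stage cost
        w = solve(A, b)
        policy = {s: min(range(ACTIONS), key=lambda a: w[index(s[0], s[1], a)])
                  for s in policy}
    return policy, w

# Batch of transitions: every (state, action) pair visited once.
samples = [(l, p, a, *step(l, p, a))
           for l in range(LEVELS) for p in range(PERIODS) for a in range(ACTIONS)]
policy, w = lspi(samples)
print(policy[(0, 0)], policy[(1, 1)])
```

On this toy problem the learned policy charges the empty tank when the price is low and discharges it when the price is high, which is the qualitative behavior one would expect from storage under time-varying prices. The online computation is a table lookup over the weights, which illustrates why the real-time cost of an RL controller is small compared with solving an optimization problem at every step.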
