Abstract

Reinforcement learning algorithms can autonomously learn to search a design space for high-performance solutions. However, modern engineering often relies on computationally intensive simulations, which can slow design timelines when paired with highly iterative approaches such as reinforcement learning. This work provides a reinforcement learning framework that uses models of varying fidelity while training the agent, progressing iteratively from low to high fidelity, to enable an effective solution search while reducing overall computational needs. To demonstrate the effectiveness of the proposed framework, we apply it to two multimodal, multi-objective, constrained, mixed-integer nonlinear design problems involving the components of a ground vehicle and an aerial vehicle. For each problem, we use a high-fidelity and a low-fidelity deep neural network surrogate model, each trained on performance data generated from underlying ground truth models. A tradeoff between solution quality and the proportion of low-fidelity surrogate model usage is observed. In particular, high-quality solutions are achieved with substantial reductions in computational expense, showcasing the effectiveness of the framework for design problems where using only a high-fidelity model is infeasible. This tradeoff between solution quality and computational efficiency is contextualized by visualizing the exploration behavior of the design agents.
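As a rough illustration of the low-to-high fidelity progression described above, the sketch below selects which surrogate scores a design at each training step. It assumes a single hard switch after a fixed fraction of training; the abstract does not specify the schedule, so this is only one possible reading. The names `fidelity_for_step`, `evaluate`, `low_fraction`, and `toy_models` are hypothetical, and the toy functions merely stand in for the deep neural network surrogates.

```python
from typing import Callable, Dict, Sequence

Design = Sequence[float]
Model = Callable[[Design], float]


def fidelity_for_step(step: int, total_steps: int, low_fraction: float) -> str:
    """Return "low" for the first `low_fraction` of training steps, else "high".

    Assumption: one hard switch from low to high fidelity.
    """
    if not 0.0 <= low_fraction <= 1.0:
        raise ValueError("low_fraction must be in [0, 1]")
    switch_step = int(low_fraction * total_steps)
    return "low" if step < switch_step else "high"


def evaluate(design: Design, step: int, total_steps: int,
             low_fraction: float, models: Dict[str, Model]) -> float:
    """Score a design with whichever model the schedule selects at this step."""
    return models[fidelity_for_step(step, total_steps, low_fraction)](design)


# Toy stand-ins for the two surrogates (hypothetical, for illustration only).
toy_models: Dict[str, Model] = {
    "low": lambda x: sum(x),                  # stands in for the cheap model
    "high": lambda x: sum(v * v for v in x),  # stands in for the costly model
}

# Which fidelity is used at each of 8 training steps when 25% are low fidelity.
schedule = [fidelity_for_step(s, 8, 0.25) for s in range(8)]
```

Raising `low_fraction` shifts more evaluations to the cheaper model, which is the knob behind the tradeoff between solution quality and computational expense noted in the abstract.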
