Abstract

Anomalies are samples that deviate significantly from the rest of the data, and detecting them plays a major role in building machine learning models that can be used reliably in applications such as data-driven design and novelty detection. Most existing anomaly detection methods are either developed exclusively for (semi-)supervised settings or perform poorly in unsupervised applications, where no training data with labeled anomalous samples are available. To bridge this research gap, we introduce a robust, efficient, and interpretable methodology based on nonlinear manifold learning to detect anomalies in unsupervised settings. The essence of our approach is to learn a low-dimensional and interpretable latent representation (i.e., a manifold) for all the data points such that normal samples are automatically clustered together and hence can be easily and robustly identified. We learn this low-dimensional manifold by designing a learning algorithm that leverages either a latent map Gaussian process (LMGP) or a deep autoencoder (AE). Our LMGP-based approach, in particular, provides a probabilistic perspective on the learning task and is ideal for high-dimensional applications with scarce data. We demonstrate the superior performance of our approach over existing technologies via multiple analytic examples and real-world datasets.
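The latent-space idea can be illustrated with a minimal sketch. This is not the LMGP or deep-AE algorithm of the paper: a two-weight linear autoencoder stands in for the deep AE, and the toy data, the fixed initial weights, and the median/MAD scoring rule with a 3.5 cutoff are all assumptions made for illustration. No labels are used during training; the known positions of the anomalies (the last five rows) serve only to inspect the result.

```python
import random
import statistics

random.seed(0)

# Toy data (assumed for illustration): 100 normal points near the origin
# and 5 anomalies near (4, 4). The model never sees which is which.
normal = [[random.gauss(0, 0.3), random.gauss(0, 0.3)] for _ in range(100)]
anomalies = [[4 + random.gauss(0, 0.1), 4 + random.gauss(0, 0.1)] for _ in range(5)]
data = normal + anomalies

# Center the data so the autoencoder needs no bias terms.
mean = [sum(x[j] for x in data) / len(data) for j in range(2)]
X = [[x[0] - mean[0], x[1] - mean[1]] for x in data]


def train_linear_autoencoder(X, epochs=2000, lr=0.02):
    """Fit z = enc . x and x_hat = z * dec by batch gradient descent on the MSE."""
    enc, dec = [0.1, 0.2], [0.2, 0.1]  # small fixed initial weights (hypothetical choice)
    n = len(X)
    for _ in range(epochs):
        g_enc, g_dec = [0.0, 0.0], [0.0, 0.0]
        for x in X:
            z = enc[0] * x[0] + enc[1] * x[1]            # 1-D latent coordinate
            r = [x[0] - z * dec[0], x[1] - z * dec[1]]   # reconstruction residual
            rd = r[0] * dec[0] + r[1] * dec[1]
            for j in range(2):
                g_dec[j] += -2.0 * z * r[j] / n          # dLoss/d dec[j]
                g_enc[j] += -2.0 * rd * x[j] / n         # dLoss/d enc[j]
        enc = [enc[j] - lr * g_enc[j] for j in range(2)]
        dec = [dec[j] - lr * g_dec[j] for j in range(2)]
    return enc, dec


enc, dec = train_linear_autoencoder(X)

# Encode every sample; normal samples should cluster in the latent space.
latent = [enc[0] * x[0] + enc[1] * x[1] for x in X]

# Score each sample by its robust distance from the bulk of the latent points.
center = statistics.median(latent)
mad = statistics.median([abs(z - center) for z in latent])
scores = [abs(z - center) / (1.4826 * mad) for z in latent]
flags = [s > 3.5 for s in scores]  # 3.5 is a conventional cutoff, assumed here

print("flagged indices:", [i for i, f in enumerate(flags) if f])
```

The median and MAD are used instead of the mean and standard deviation so that the anomalies themselves do not inflate the reference scale. A nonlinear encoder or a different clustering rule would slot into the same two steps: encode, then measure distance from the dominant cluster.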
