Abstract
A whole building fault (WBF) refers to a fault that occurs in one component but may impact other components or subsystems, or cause significant impacts on energy consumption and thermal comfort. Conventional methods (such as component-level rule-based or physical-model-based methods), which target component-level fault detection, cannot reliably detect a WBF because of fault propagation among the closely coupled equipment and subsystems. Therefore, a novel data-driven method, named the weather and schedule-based pattern matching (WPM) and feature-based principal component analysis (FPCA) method, is developed for WBF detection. Three processes are established in the WPM-FPCA method to address the three main issues in WBF detection. First, a feature selection process pre-selects data measurements which represent a whole building’s operational performance under a satisfactory status, namely, the baseline status. Second, a WPM process locates weather and schedule patterns in the historical baseline database which are similar to those in the current/incoming operation data, and generates a WPM baseline. Lastly, real-time PCA models are generated for both the WPM baseline data and the current operation data. Statistical thresholds used to differentiate normal and abnormal (faulty) operations are automatically generated in this PCA modeling process. The PCA models and thresholds are then used to detect the WBF. This paper is the first of a two-part study. Performance evaluation of the developed method is conducted using data collected from a real campus building and will be described in the second part of this paper.
1 Introduction
Today, residential and commercial buildings are responsible for more than 40% of the primary energy consumption in the U.S. [1]. Energy waste is estimated to reach 15–30% of total building energy consumption due to poorly maintained, degraded, and improperly controlled equipment or systems in commercial buildings in the U.S. [2]. According to the U.S. Department of Energy, faults can increase primary energy consumption in commercial buildings by approximately one quad (11%) of the energy consumed by heating, ventilation, and air conditioning (HVAC), lighting, and large refrigeration systems [3]. HVAC component or system faults, i.e., malfunctioning controls, operation, sensors, and equipment, are considered the top cause of “deficient” building systems [4] and may trigger significantly negative impacts not only on the indoor environment and occupants, but also on operational cost, system lifespan, and energy consumption. Automated fault detection and diagnosis (AFDD) tools, which use an automated procedure to discover operational abnormalities and locate root causes in HVAC, lighting, or other building systems [5], can help achieve energy savings of 5–20% in a building [3]. At the same time, the deployment of AFDD tools can also improve indoor environment quality and reduce peak demand and pollution [6].
Numerous investigations have been conducted on the development of AFDD tools in buildings during the past several decades. However, they mainly focus on component or equipment level fault detection and diagnosis such as chiller plants [7–9], air handling units (AHU) [10–12], variable air volume (VAV) terminal units [13], heat pump systems [14], and so on. Reviews for the development of component level AFDD methods and tools can be found in Refs. [2,5,15,16].
Compared with component level AFDD tools, whole building level AFDD methods, which can detect and diagnose a whole building fault (WBF), have seldom been studied. In the context of this paper, a WBF is defined as a fault that occurs in one piece of equipment or one component but impacts (causes abnormalities in) more than one component or subsystem, or causes significant impacts on the energy consumption or indoor thermal comfort of a building. In a building HVAC system, various components or equipment are closely linked together. The coupled components or subsystems often cause fault propagation [17], i.e., a fault occurring in one component may impact other components or subsystems, triggering additional faults. For example, when an outdoor air damper of an AHU is stuck at a much higher-than-normal position in the cooling season, this fault would not only cause the AHU’s cooling coil valve to open at a higher-than-normal position, but could also cause abnormalities in the primary cooling system, such as a higher chilled water pump speed and/or a higher return chilled water temperature. Furthermore, operators or owners of small- and medium-sized buildings often only have the time and budget to pay attention to faults that have a whole building level impact, which cannot be fully captured by a component level AFDD tool. Therefore, it is necessary to develop methods for whole building level diagnostics, which provide a system level view of a building’s operation [18].
In recent years, the development of system level AFDD methods for HVAC systems has been gaining attention. These methods can be divided into two categories: detailed energy model-based methods and data-driven methods. In the detailed energy model-based methods, a system or equipment energy consumption model is first developed. Energy consumption abnormalities can then be identified by comparing the real-time energy consumption with the baseline energy model. For example, Bynum et al. [19] developed an Automated Building Commissioning Analysis Tool (ABCAT), a simulation tool combined with an expert knowledge-based (i.e., rules summarized from domain knowledge) system to detect energy abnormalities. The ABCAT used a baseline building energy model which was calibrated using real energy consumption data collected from the building automation system (BAS), as well as local weather data. Forecasted energy performance from this calibrated simulation model was then compared with the measured data to detect faults. O'Neill et al. [20] developed another energy model-based fault detection and diagnosis (FDD) method. In this approach, energy consumption data from different subsystems, including the HVAC system, lighting system, and plug equipment usage, were compared with a reference EnergyPlus model, which was used to calculate the expected annual energy consumption and determine whether a system had faults. In addition, a top-down strategy incorporating temporal and spatial partitions was developed to detect HVAC faults across different levels in a building [18]. In that study, building energy consumption flow features were first extracted to detect system faults. Then, energy consumption data were partitioned and grouped to locate fault sources according to system temporal and spatial characteristics.
The advantage of energy model-based AFDD methods is that detailed building energy simulation results, which can provide a relatively accurate energy baseline (if the simulation models are well calibrated), are used to analyze the building’s energy performance. If the developed model accurately represents the subsystems and whole building level operation, fault diagnosis is not too difficult, because comparisons of models versus measurements at both the subsystem level and the whole building level can be used to isolate faults. The main disadvantage of this approach is that developing such a detailed and accurate building energy simulation model is very time-consuming and expensive, and calibrating such models to achieve subsystem level accuracy for fault isolation remains very challenging.
The second category of WBF AFDD methods is the data-driven methods, which use BAS interval data and statistical approaches to identify system operating abnormalities and locate fault root causes. Currently, BASs are increasingly deployed in commercial buildings, and a significant amount of operating information from a HVAC system can be efficiently gathered through sensors and BASs. The added data sources bring tremendous benefits for closely monitoring building operating performance. Furthermore, more and more data-driven methods, which offer cost-effectiveness, high scalability, and easy integration with existing BASs, have been developed in recent years. Some representative data-driven WBF AFDD methods are reviewed below. For example, a statistical process control and Kalman filter-based method was developed for system level fault detection in HVAC systems [21]. Georgescu et al. used spectral methods to decompose diagnostic rule signals, generated from trended building management system data, into components which captured device behavior at different temporal and spatial scales [22]. In this way, fault classifiers had an increased capability to observe abnormal system behaviors. A recursive deterministic perceptron (RDP) neural network was developed to detect and diagnose faults at the whole building system level [23]. In that study, four RDP neural network models were trained to represent four major components (i.e., chiller, coil, fan, and pump) in a HVAC system.
One data-driven strategy is the principal component analysis (PCA) method, a multivariate statistical method which has been widely used in process monitoring and fault detection [24]. In the building sector, many studies have used PCA-based methods for HVAC AFDD. For example, the PCA method was used to detect faults in variable refrigerant flow systems [25], chiller sensors [26], AHUs [27], and VAV terminal units [28]. Furthermore, in order to adapt to the varying operating characteristics of HVAC systems, a variety of modifications have been developed to improve PCA detection accuracy and robustness. For instance, it has been found that detection accuracy and efficiency can be significantly improved when data preprocessing techniques are used. Hu et al. developed a self-adaptive PCA which removes error samples from the original dataset and improves the method’s sensitivity [29]. The method was successfully used for chiller sensor fault detection. Li and Wen applied a wavelet transform to preprocess data so that impacts on fault detection performance due to changing HVAC operations under various weather conditions could be minimized [30]. Li et al. developed a PCA-R and support vector data description method to obtain a better fault data distribution and a tighter monitoring statistic, so that fault detection sensitivity could be improved [31].
In this study, two preprocessing techniques, namely, weather and schedule-based pattern matching (WPM) and feature-based PCA (FPCA), have been developed to address three common fault detection challenges in complex HVAC systems, especially at a building level.
The first challenge, faced by all HVAC fault detection methods, is that internal and external conditions, such as weather, occupancy, and other internal loads, strongly influence HVAC system operation. Differentiating abnormalities triggered by weather or occupancy from those triggered by faults is difficult; more specifically, it is difficult to generate a baseline that represents the same HVAC system operation as the incoming data. Second, a large amount of data is generated by various sensors and collected by a typical BAS. The high volume of measured data may impose a heavy computational burden on data-driven approaches and may reduce detection accuracy. Last, a majority of existing methods have been evaluated only through simulation studies; data generated by simulation platforms can hardly reflect a real system’s operational behaviors, especially when the system contains faults. Therefore, it is worthwhile to fully evaluate the developed method using real BAS data from a building. Moreover, using real BAS data for FDD method evaluation also enables the method to be more easily integrated with an existing BAS.
This paper is the first part of a two-part study. In this paper, the theory of the WPM-FPCA method is introduced in detail. This paper is organized as follows. Section 2 describes the outline of the WPM-FPCA method. Section 3 presents the feature selection process. Section 4 introduces the development of the WPM method. Section 5 illustrates the PCA method for automated fault detection. Finally, Sec. 6 concludes this paper. The performance evaluation of the proposed method is carried out using BAS data collected from a real campus building and will be described in the second part of this paper.
2 Outline of the Method
The architecture of the WPM-FPCA method consists of three procedures, executed in the detection sequence shown in Fig. 1.
The first procedure (Sec. A in Fig. 1) is to perform a feature selection process by using the Partial Least Square Regression and Genetic Algorithm (PLSR-GA) method. A historical baseline dataset would need to be collected first. Here, a baseline is defined as a status at which the building’s operation is considered to be satisfactory, such as when the building has just gone through a commissioning process. For a real building, a true fault-free status is hardly achievable. Hence, a more realistic fault detection process is to detect when a building’s status is significantly different from its baseline status. As discussed in Sec. 1, such a feature selection process, which pre-selects key features from all BAS measurements, is necessary to reduce the dimension of the original BAS dataset. This process is implemented in an offline manner, so that the time spent on real-time fault detection can be reduced. The key features selected from this process are later used to develop the PCA models.
The second procedure (Sec. B in Fig. 1) is to perform a WPM (weather and schedule-based pattern matching). In this study, a symbolic aggregate approximation (SAX, further detailed in Sec. 4.2) algorithm, which is a highly efficient pattern matching (PM) algorithm designed for time series data, is adopted for the WPM process. The SAX algorithm searches the historical baseline dataset by tagging the weather data. Through this tagging process, a WPM baseline dataset, which has similar weather conditions and building use (operation and occupancy) as the incoming snapshot data (current status data collected by the BAS), is generated for each snapshot window.
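To make the tagging idea concrete, the sketch below converts a daily weather profile into a SAX word (z-normalization, piecewise aggregate approximation, then symbol mapping with equiprobable Gaussian breakpoints). This is a generic SAX illustration, not the paper's implementation: the example data, segment count, and 4-letter alphabet are arbitrary choices for demonstration.

```python
import numpy as np

def sax_word(series, n_segments=6, alphabet="abcd"):
    """Convert a time series to a SAX word: z-normalize, reduce with
    piecewise aggregate approximation (PAA), then map each segment mean
    to a symbol using breakpoints that split N(0,1) into equal-probability
    bins (the hard-coded values are the standard breakpoints for 4 symbols)."""
    x = np.asarray(series, dtype=float)
    x = (x - x.mean()) / (x.std() + 1e-12)           # z-normalization
    paa = np.array([seg.mean() for seg in np.array_split(x, n_segments)])
    breakpoints = np.array([-0.6745, 0.0, 0.6745])   # quartiles of N(0,1)
    return "".join(alphabet[i] for i in np.searchsorted(breakpoints, paa))

# Two days with similar outdoor-air temperature shapes (values are made up)
# map to the same word, so the second day would count as a pattern match
# for the first when searching the historical baseline.
day_a = [55, 56, 60, 66, 72, 75, 74, 70, 64, 58, 56, 55]
day_b = [50, 51, 55, 61, 68, 71, 70, 66, 59, 53, 51, 50]
print(sax_word(day_a), sax_word(day_b))
```

Because SAX z-normalizes before symbolizing, two days with the same shape but different absolute temperatures receive the same word, which is what makes it suitable for matching weather patterns rather than raw values.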
The third procedure (Sec. C in Fig. 1) is to determine whether a whole building level abnormality exists. The PCA modeling process is used here to evaluate the similarity (T2 value used in this study) between the incoming real-time building operation data and the historical WPM baseline data. If the similarity (i.e., T2 value) between the PCA model from the incoming data and that from the historical WPM baseline data is higher than the threshold (automatically generated when baseline PCA model is developed), the system operation is considered abnormal (faulty). This process can be carried out as an online process.
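The detection logic of this third procedure can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the threshold here is a simple mean-plus-three-sigma rule on the baseline T2 values, whereas statistical process monitoring usually derives the T2 limit from an F-distribution, and the data are synthetic.

```python
import numpy as np

def fit_pca_monitor(baseline, var_explained=0.95, k_sigma=3.0):
    """Fit a PCA monitoring model on (WPM) baseline data and derive a
    T2 threshold. Returns a T2 scoring function and the threshold."""
    mu, sigma = baseline.mean(0), baseline.std(0) + 1e-12
    z = (baseline - mu) / sigma                       # standardize
    eigvals, eigvecs = np.linalg.eigh(np.cov(z, rowvar=False))
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # keep enough principal components to explain var_explained of variance
    k = np.searchsorted(np.cumsum(eigvals) / eigvals.sum(), var_explained) + 1
    P, lam = eigvecs[:, :k], eigvals[:k]
    def t2(x):
        scores = ((x - mu) / sigma) @ P
        return np.sum(scores**2 / lam, axis=1)        # Hotelling T2
    base_t2 = t2(baseline)
    return t2, base_t2.mean() + k_sigma * base_t2.std()

rng = np.random.default_rng(0)
baseline = rng.normal(size=(500, 8))                  # 8 selected features
t2, thr = fit_pca_monitor(baseline)
faulty = rng.normal(size=(50, 8))
faulty[:, 0] += 6.0                                   # large shift in one sensor
print((t2(faulty) > thr).mean())                      # fraction flagged as faulty
```

Snapshot samples whose T2 exceeds the baseline-derived threshold are declared abnormal; in the WPM-FPCA workflow the baseline matrix would be the weather- and schedule-matched historical data for the current snapshot window.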
3 Development of Feature Selection Process
3.1 Introduction of Feature Selection.
While a large amount of HVAC system data (sensor measurements, control signals, etc.) can be easily collected and stored in an advanced BAS, not all of the data are equally important for monitoring system operation. Usually, only a small amount of useful information can be deciphered from this large amount of data. This leads to a “data rich, but information poor” syndrome in current BASs [30]. Furthermore, the enormous amount of data in building systems may also lower the accuracy of a fault detection method [32]. Therefore, a smaller set of key variables, which reflect the system’s overall operation and carry the relevant information, should be determined and pre-selected to improve detection efficiency and accuracy when implementing fault detection [33].
These key variables can be regarded as features of the system in the context of data analytics. Consequently, determining these key variables is equivalent to reducing the variable dimensionality. This can be achieved by optimally designing the monitoring system using expert knowledge, i.e., an expert determines and pre-selects a specific subset of the variable dataset for monitoring system operation [33]. However, in a large-scale system, manually selecting proper variable subsets and reducing the variable dimensionality become challenging for three reasons. First, the number of data measurements has increased rapidly as a result of the increasing complexity of HVAC systems in commercial buildings; a manual identification process would take too much time and would not suit whole building AFDD solutions, which require less human intervention. Second, a system with multiple operation modes may generate various combinations of sub-datasets, which makes expert knowledge-based variable reduction impractical in real applications. Lastly, key variables often vary from building to building due to the diversity of building designs, envelope types, internal loads, and system configurations, making a manual variable selection process even more difficult and time-consuming. Hence, an automated dimension reduction technique is adopted to implement the variable selection process.
Although the PCA method itself partly performs dimension reduction when used for fault detection, PCA is applied iteratively to detect abnormalities in the dataset of each snapshot window. In such a detection scenario, a very high data dimensionality causes a very heavy computational burden. Therefore, the PLSR-GA method (explained in Sec. 3.2) is used to pre-select, in an offline manner, features which represent the whole building’s operational performance.
3.2 Partial Least Square Regression and Genetic Algorithm Method.
In the PLSR-GA method, PLSR is used to develop a system performance model. Either the building electrical power consumption or the building virtual cooling consumption from the historical baseline dataset can be used as the target output variable (indicator).
There are many PLSR algorithms for feature selection, most of which are based on orthogonal score PLSR. The “wrapper” method [34] is one of the most popular among them. The PLSR-based “wrapper” method is implemented iteratively and is based on supervised learning, in which a PLSR model refitting is wrapped within the variable searching algorithm. Various variable searching algorithms can be used to obtain subsets of candidate variables; each subset is evaluated by fitting a system performance model to it, so that the model performance can be evaluated and optimized.
Although evaluating all possible variable subsets would be ideal, doing so is very time-consuming because the number of subsets grows dramatically with the number of variables. Therefore, a randomized searching method, i.e., the GA [35], is used within the PLSR for feature selection in this study.
3.2.1 PLSR for the Development of a Building Energy Consumption Model.
Partial least square regression is a technique that generalizes and combines features from PCA and multiple regression. Although this technique was not originally designed for classifying and discriminating datasets, it has been found to be efficient at distinguishing different datasets and determining key variables [36].
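As an illustration of fitting such a performance model, the sketch below implements single-response PLS regression with the standard NIPALS deflation scheme and fits it to synthetic data in which whole-building power depends on a few of 20 candidate variables. This is a generic textbook PLS1 algorithm, not the paper's implementation, and the data are made up.

```python
import numpy as np

def pls1(X, y, n_components):
    """Minimal single-response PLS regression (NIPALS deflation).
    Returns coefficients B and intercept b0 so that y ~= X @ B + b0."""
    Xm, ym = X.mean(0), y.mean()
    Xc, yc = X - Xm, y - ym
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xc.T @ yc
        w /= np.linalg.norm(w)              # weight vector
        t = Xc @ w                          # latent score
        tt = t @ t
        p = Xc.T @ t / tt                   # X loading
        qk = (yc @ t) / tt                  # y loading
        Xc = Xc - np.outer(t, p)            # deflate X
        yc = yc - qk * t                    # deflate y
        W.append(w); P.append(p); q.append(qk)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    B = W @ np.linalg.solve(P.T @ W, q)     # overall regression coefficients
    return B, ym - Xm @ B

# Synthetic stand-in for BAS data: power driven by variables 0, 3, and 7.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 20))
y = 3 * X[:, 0] - 2 * X[:, 3] + 1.5 * X[:, 7] + 0.1 * rng.normal(size=200)
B, b0 = pls1(X, y, n_components=3)
r2 = 1 - np.var(y - (X @ B + b0)) / np.var(y)
print(round(r2, 3))
```

In the wrapper scheme described above, a model like this would be refit on each candidate variable subset, and its fit quality would serve as the subset's score.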
3.2.2 Genetic Algorithm for Subset Searching.
The GA searching process is used to facilitate the process of searching candidate variable subsets. The GA searching process acts in an iterative way to extract the subset of relevant variables and evaluate each subset by fitting a model with the variable subset. When the system performance criteria are met, the variable subset can be determined. By this means, the key informative variables which represent a system overall performance can be selected, and the variable dimensionality can be reduced accordingly.
The GA is a stochastic searching methodology based on an analogy to natural selection and genetics in biological systems [40]. Compared with other algorithms, the GA has several distinguishing features: (1) the GA works with a set of candidate solutions known as a population and obtains the optimal solution after a series of iterative computations. This process is similar to natural selection, in which high-quality solutions to optimization and search problems are generated through operators such as mutation, crossover, and selection, terms borrowed from biology [41]; (2) the GA uses objective function information rather than explicit derivative information of the problem set; and (3) the GA can handle large search spaces efficiently and therefore has a lower risk of converging to a local optimum [42]. Based on these characteristics, the GA was originally proposed to solve various single-objective or multi-objective optimization problems in which a set of solutions is superior to the rest of the solutions in the search space [43].
In the GA searching algorithm, a given variable subset is represented as a binary string of length n. Here, n is the total number of available variables, and the string can be regarded as a “chromosome” in the biological analogy [44]. At position i of the string, a zero or one denotes the absence or presence of variable i in the variable set. In this way, a population of variable subsets (chromosomes) is maintained. A “fitness” value determines how likely a variable set is to survive and breed into the next generation [45]. A new variable set is created from old variable sets by processes analogous to crossover and mutation. In crossover, two different parent variable sets are mixed to create an offspring variable subset. In mutation, randomly selected indices of a single parent variable set are flipped to create an offspring variable subset. The iteration continues through these two processes until the ending criterion is met; this criterion can be either a finite number of iterations or a certain percentage of the individuals in the population using identical variable subsets.
The procedure of using the GA searching method in PLSR for feature selection is illustrated in Fig. 2. The first step is to generate random variable subsets. In this step, an initial population of variable sets is established by setting bits for each variable randomly, where bit “1” represents selection of the corresponding variable and “0” represents nonselection. The approximate size of the variable sets is defined in advance. Then, each individual subset of selected variables is evaluated: a PLSR model is fit to each variable set, and the model performance is computed.
In the third step, a collection of variable sets with higher performance are selected to survive until the next “generation.” In the GA method, crossover and mutation can be used to form new variable sets by (1) crossover of selected variables between the surviving variable sets and (2) changing (mutating) the bit value for each variable by a small probability [48,49].
Finally, a selected variable subset is determined. This subset will be used in the fault detection procedure.
3.3 PLSR-GA Method for Feature Pre-Selection.
In the variable selection process for this study, whole building electricity consumption is used as the predicted variable. All data points collected from a BAS are input variables.
In this study, a medium-sized mixed-use commercial building in Philadelphia, PA, is used for method development and evaluation. A more detailed description of this building is provided in Part II of this two-part paper series. In the BAS of this building, 537 data points (measurements, control signals, etc.) are collected at a 5 min interval. Historical baseline datasets have been collected for the summer, winter, and transitional seasons. For this building, there are three operation modes: Mode #1 and Mode #2 for the summer season, and Mode #3 for the transitional and winter seasons. Each mode here refers to a different operation strategy/schedule. More details are again provided in Part II of the study.
For each operation mode, the feature selection procedure is divided into several selection scenarios. This is because one of the risks in GA variable selection is overfitting; repeating the GA several times and observing the general trends can help lower this risk [47]. If a variable is repeatedly selected across different scenarios, it is more likely to represent the system operational performance for the entire operation mode.
In this research, every five consecutive days are grouped into one selection scenario within each operation mode. If a variable is selected in at least three scenarios, it is included in the final feature list used in the development of the PCA model.
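The cross-scenario voting rule can be expressed in a few lines. The variable names below are hypothetical placeholders, not the building's actual point names; only the "selected in at least three scenarios" rule comes from the text.

```python
from collections import Counter

# Hypothetical per-scenario GA selection results (names are examples only).
scenario_selections = [
    {"AHU1_SAT", "CHW_PumpSpeed", "AHU2_FanSpeed"},
    {"AHU1_SAT", "CHW_PumpSpeed", "Zone12_Damper"},
    {"AHU1_SAT", "AHU2_FanSpeed", "CHW_PumpSpeed"},
    {"AHU1_SAT", "Zone7_Damper"},
    {"CHW_PumpSpeed", "AHU2_FanSpeed"},
]

# Count how many scenarios selected each variable, keep those with >= 3 votes.
counts = Counter(v for sel in scenario_selections for v in sel)
final_features = sorted(v for v, n in counts.items() if n >= 3)
print(final_features)
```

Variables picked only once or twice (here the zone dampers) are treated as scenario-specific noise and dropped, which is how repetition mitigates the GA's overfitting risk.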
3.4 Evaluation of PLSR-GA Feature Selection.
Although the Akaike Information Criterion [50] and the Bayesian Information Criterion [51] can be used to evaluate feature selection efficacy by analyzing the model developed from the selected variables, the ultimate purpose here is not to evaluate the performance of the regression model developed from the selected variable set. Therefore, the results are evaluated by manually examining the selected variables.
In this study, the evaluation of the PLSR-GA feature selection is divided into two processes. In the first process, the criterion is empirical: the selection ratio of key measurements identified by expert knowledge is examined. This type of evaluation is common in image processing and the process control industry. In the second process, the criterion is the fault detection result, which demonstrates the effectiveness of the feature selection.
For a building system with many coupled subsystems, it is quite challenging to choose one or two measurements or control signals as benchmarks to assess overall system operation. In this study, a total of 30 measurements from the HVAC system are manually selected as key measurements to evaluate the feature selection result. Among these, measurements No. 1 to No. 9 are from the chiller plant and are only used during the summer season (i.e., operation modes #1 and #2). Measurements No. 10 to No. 30 are from AHU-1 to AHU-3. The outdoor air damper position of AHU-3 is not included as a key measurement because this damper is kept at the 100% open position during all seasons, as AHU-3 serves lab areas. The steam valve positions for AHU-1 to AHU-3 (measurements Nos. 28, 29, and 30) are only used in the operation mode for the transitional and winter seasons. The cooling coil valve positions for AHU-1 to AHU-3 (measurements Nos. 13, 19, and 25) are only used in evaluating the variable selection result in the summer season (i.e., operation modes #1 and #2). Weather conditions (outdoor air temperature, outdoor air humidity, and outdoor air enthalpy) are not included here because those measurements are not used to develop the PCA model.
Therefore, a total of 27 measurements are used in the evaluation for operation modes #1 and #2, and a total of 18 measurements are used for operation mode #3. All key measurements in each operation mode are provided in Table 1. The word “variable” instead of “measurement” is used in the following discussion of the feature selection results.
Key measurements (variables) from the BAS
No. | Name of measurement | Mode#1 | Mode#2 | Mode#3 | No. | Name of measurement | Mode#1 | Mode#2 | Mode#3 |
---|---|---|---|---|---|---|---|---|---|
1 | CHWS temperature | √ | √ | × | 16 | AHU-2 Supply air temperature | √ | √ | √ |
2 | CHWR temperature | √ | √ | × | 17 | AHU-2 Supply air static pressure | √ | √ | √ |
3 | Cooling water supply temperature | √ | √ | × | 18 | AHU-2 Supply air fan speed | √ | √ | √ |
4 | Cooling water return temperature | √ | √ | × | 19 | AHU-2 Cooling coil valve position | √ | √ | × |
5 | Chilled water differential pressure | √ | √ | × | 20 | AHU-2 Outdoor air damper position | √ | √ | √ |
6 | Chilled water pump speed | √ | √ | × | 21 | AHU-2 Mixed air temperature | √ | √ | √ |
7 | Chilled water flowrate | √ | √ | × | 22 | AHU-3 Supply air temperature | √ | √ | √ |
8 | Cooling water differential pressure | √ | √ | × | 23 | AHU-3 Supply air static pressure | √ | √ | √ |
9 | Cooling water pump speed | √ | √ | × | 24 | AHU-3 Supply air fan speed | √ | √ | √ |
10 | AHU-1 Supply air temperature | √ | √ | √ | 25 | AHU-3 Cooling coil valve position | √ | √ | × |
11 | AHU-1 Supply air static pressure | √ | √ | √ | 26 | AHU-3 Supply air humidity | √ | √ | √ |
12 | AHU-1 Supply air fan speed | √ | √ | √ | 27 | AHU-3 Between-coil air temperature | √ | √ | √ |
13 | AHU-1 Cooling coil valve position | √ | √ | × | 28 | AHU-1 Steam valve position | × | × | √ |
14 | AHU-1 Outdoor air damper position | √ | √ | √ | 29 | AHU-2 Steam valve position | × | × | √ |
15 | AHU-1 Mixed air temperature | √ | √ | √ | 30 | AHU-3 Steam valve position | × | × | √ |
The key variable selection efficiency ratio is defined as KVSER = NSKV/NSV, where NSV is the number of selected variables and NSKV is the number of selected key variables (consistent with the values reported in Tables 2 and 4). From this definition, a higher NSKV or a lower NSV leads to a higher KVSER, meaning that the variable selection method better identifies the key variables.
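The three ratios reported in the tables can be reproduced from the counts given in the text. The definitions used below (WVSR = selected variables over all BAS points, KVSR = selected key variables over all key measurements, KVSER = selected key variables over selected variables) are inferred from the fact that they reproduce the mode #1 values exactly.

```python
# Reproducing the mode #1 ratios from the counts reported in the text.
total_vars, total_key = 537, 27      # BAS data points; key measurements (mode #1)
nsv, nskv = 101, 12                  # selected variables; selected key variables

wvsr = nsv / total_vars              # whole-variable selection ratio
kvsr = nskv / total_key              # key-variable selection ratio
kvser = nskv / nsv                   # key-variable selection efficiency ratio
print(f"{wvsr:.1%} {kvsr:.1%} {kvser:.1%}")   # 18.8% 44.4% 11.9%
```

These match the WVSR, KVSR, and KVSER values in Table 2, which supports the inferred definitions.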
3.4.1 Feature Selection Result Under the Operation Mode #1.
In mode #1, 40 historical baseline days are included. Eight variable selection scenarios are implemented.
A total of 101 variables are selected in the end. Among them, 12 are key variables listed in Table 1. In operation mode #1, the chiller provides a relatively stable CHWS temperature, and the chilled water flowrate is also stable throughout the day; therefore, fewer variables from the chiller plant are selected. A total of 80 variables are from VAV terminals in different zones. Table 2 shows the selection analysis, and Table 3 lists the selected key variables.
Variable selection analysis (mode #1)
Number of selected variables | Number of selected key variables | WVSR | KVSR | KVSER |
---|---|---|---|---|
101 | 12 | 18.8% | 44.4% | 11.9% |
Selected key variable list (mode #1)
No. | Selected key variable name | No. | Selected key variable name |
---|---|---|---|
1 | Cooling water return temperature | 7 | AHU-2 Supply air temperature |
2 | CHWS delta pressure | 8 | AHU-2 Supply air fan speed |
3 | Cooling water delta pressure | 9 | AHU-2 Cooling coil valve position |
4 | AHU-1 Supply air fan speed | 10 | AHU-2 Supply air static pressure |
5 | AHU-1 Cooling coil valve position | 11 | AHU-3 Supply air temperature |
6 | AHU-1 Supply air static pressure | 12 | AHU-3 Between-coil temperature |
3.4.2 Feature Selection Result Under the Operation Mode #2.
In mode #2, 50 historical baseline days are included. Ten variable selection scenarios are implemented.
A total of 214 variables are selected in the end. Among them, there are 11 key variables. In operation mode #2, two weather variables, outdoor air temperature and outdoor air enthalpy, are selected. A total of 119 variables are from VAV terminals in different zones. Table 4 shows the selection analysis, and Table 5 shows the selected key variables.
Variable selection analysis (mode #2)
Number of selected variables | Number of selected key variables | WVSR | KVSR | KVSER |
---|---|---|---|---|
214 | 11 | 39.9% | 40.7% | 5.1% |
Selected key variable list (mode #2)
No. | Selected key variable names | No. | Selected key variable names |
---|---|---|---|
1 | Chilled water flowrate | 7 | AHU-2 Supply air static pressure |
2 | Cooling water supply temperature | 8 | AHU-3 Supply air temperature |
3 | Cooling water delta pressure | 9 | AHU-3 Supply air humidity |
4 | AHU-1 Cooling coil valve position | 10 | AHU-3 Supply air fan speed |
5 | AHU-1 Supply air static pressure | 11 | AHU-3 Between-coil temperature |
6 | AHU-2 Cooling coil valve position |
3.4.3 Feature Selection Result Under the Operation Mode #3.
In mode #3, 50 historical baseline days are included. Ten variable selection scenarios are implemented.
A total of 171 variables are selected. Among them, there are six key variables. A total of 145 variables are from VAV terminals in different zones. Table 6 shows the selection analysis, and Table 7 shows the selected key variables.
Variable selection analysis (mode #3)
Number of selected variables | Number of selected key variables | WVSR | KVSR | KVSER |
---|---|---|---|---|
171 | 6 | 31.8% | 33.3% | 3.5% |
Selected key variable list (mode #3)
No. | Selected key variable names | No. | Selected key variable names |
---|---|---|---|
1 | AHU-1 Supply air fan speed | 4 | AHU-2 Steam valve position |
2 | AHU-1 Steam valve position | 5 | AHU-3 Supply air humidity |
3 | AHU-2 Supply air static pressure | 6 | AHU-3 Between-coil temperature |
3.4.4 Discussion.
The feature selection process successfully lowers the data dimensionality in all three operation modes. The highest key variable selection rate and key variable selection efficiency are achieved in operation mode #1. Although higher WVSRs are achieved in operation modes #2 and #3, their KVSERs are lower than that of mode #1. Instead, more variables from the VAV terminals are selected in these two operation modes. Therefore, evaluating the overall performance of an HVAC system may require including more downstream subsystems when the chiller's operation is not stable or the chiller is not operating.
Although it is recommended that the first step in implementing feature selection is to apply domain knowledge [52], expert knowledge was not used to modify the selection results in this research. Furthermore, during the feature selection process, only the PLSR-GA method was used, and it selects features that reflect a satisfactory system operation. This may cause limitations: key measurements that do not vary with the system's performance during normal operation may not be selected. For example, the CHWS temperature is maintained at a relatively stable value (i.e., 6.7 °C in modes #1 and #2) under normal operation. Therefore, although the CHWS temperature measurement can be very useful for judging a chiller's operation status, it is not selected and hence not used in the fault detection process.
Given the limitations above, the feature selection result described here is recommended only for the fault detection process (when one needs to distinguish abnormal from normal operation), not for the fault diagnosis process (when one needs to isolate the root cause of an abnormality and hence needs features that reflect system operation under faults).
4 Development of Weather and Schedule-Based Pattern Matching Method
4.1 Introduction of Pattern Matching.
In an HVAC system, different internal and external conditions, such as weather, occupancy schedule, and usage of electrical appliances, have direct influences on the system's operation. For example, as the outdoor air warms up during the morning hours, an air handling unit may change its operation from a heating mode to an economizer mode and then to a cooling mode. In such cases, directly using data from a long span of the historical dataset may generate an inaccurate baseline and make the fault detection process difficult, as an abnormality could be caused either by faults or by other internal/external conditions.
To address this issue, a PM method was developed to locate periods of system operation that are similar to the current operating conditions. This method was successfully used for AHU fault detection [10]. When developing a PM method, two aspects that affect the thermal load of a building, and hence the HVAC system operation, need to be considered. The first is the set of external factors with significant impacts on the thermal load, such as weather conditions. The second is the set of factors that affect a building's internal thermal load, such as building occupancy and the usage of electrical appliances. Under the same operation mode, the operation behavior of a fault-free HVAC system is similar to that in the historical baseline.
Finding and matching similar patterns of internal and external factors, in order to generate baseline data whose operation is similar to that of the incoming data, is hence important for differentiating abnormalities caused by faults from those caused by other factors. This section discusses the methods developed for matching weather patterns. Note that similar methods can be used to match factors related to the internal load if such measurements (e.g., occupancy and appliance usage) are available. For the building considered in this study, measurements related to internal loads are not available. However, these internal loads are highly correlated with the time of day. Hence, the time of day (schedule) is used directly to represent the internal load factors in this study. This situation, i.e., lacking measurements related to the internal load, is common in commercial buildings.
4.2 Symbolic Aggregate Approximation Method.
The symbolic aggregate approximation (SAX) method first reduces a time series through piecewise aggregate approximation (PAA): the time series is divided into N equal-sized segments, so that data with n dimensions are reduced to N dimensions, and the mean value of the data in each segment is calculated. This transformation produces a piecewise constant approximation of the original time series. Figure 3 illustrates this transformation based on the PAA method: the x axis represents the time series and the y axis represents the data range.
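As a minimal illustration of the PAA step described above (Python with NumPy; the segment count and the synthetic series are arbitrary, not from the paper):

```python
import numpy as np

def paa(series: np.ndarray, n_segments: int) -> np.ndarray:
    """Piecewise aggregate approximation: split the series into
    n_segments equal-sized segments and keep each segment's mean."""
    # Assumes len(series) is divisible by n_segments for simplicity.
    return series.reshape(n_segments, -1).mean(axis=1)

# A day of 5-min readings (288 points) reduced to 48 half-hour segments.
x = np.sin(np.linspace(0, 2 * np.pi, 288))
print(paa(x, 48).shape)  # (48,)
```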
4.3 Development of Weather and Schedule-Based Pattern Matching Method.
When using the SAX method, the historical dataset is first generated. The weather information from the historical dataset and from the new incoming snapshot data are then combined into one time series dataset. The weather information of the incoming snapshot data is extracted from this time series to generate X(t), which is normalized using z-score normalization.
After normalization, the weather information data X(t) are divided into N equal-sized, nonoverlapping segments. One data segment is defined as one window. In this study, the same window size is applied to both the snapshot data and the historical baseline dataset. The window size is determined by two factors: how fast weather conditions change, and computation efficiency. If the window size is too large, the weather may change substantially within a data window, leading to a significant change in system operation; this reduces the PM accuracy. If the window size is too small, each data window contains only a small number of data samples, which may not be sufficient for an accurate comparison; moreover, the computational burden for fault detection and diagnosis increases. How this window size is selected is discussed in more detail in Sec. 4.4.
An alphabet is used to convert each window into a SAX letter or string, with each letter corresponding to a subsequence of the data range. In this study, the alphabet size is set to 10 following the SAX literature [55]. The outdoor air enthalpy is used as the tagging target because air enthalpy incorporates both temperature and humidity information.
Once the weather time series data are tagged, the system's operation data segments with the same symbolic letters/strings are clustered to generate a WPM baseline dataset. To control the sample size, a sample pool size S is used in this research to define how many samples are used to generate a WPM baseline dataset. As mentioned above, although a smaller window size can more accurately capture weather changes, it provides less information for generating a valid baseline. Therefore, windows adjacent to the snapshot window (referred to as the data sample searching pool) can be used as the sample pool because the building occupancy may remain steady over a relatively long period. Figure 4 summarizes the above-discussed WPM procedure using the SAX method. Details on selecting both the window size and the sample pool size are provided in Sec. 4.4.
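The tagging-and-clustering step above can be sketched as follows (a simplified illustration; the 10-letter alphabet follows the text, while the function names and the dictionary-based pool are our own assumptions):

```python
from bisect import bisect_left
from statistics import NormalDist

ALPHABET = "abcdefghij"  # 10 letters, as set in this study

# Equiprobable breakpoints of the standard normal distribution,
# so each letter is equally likely for z-normalized data.
BREAKPOINTS = [NormalDist().inv_cdf(i / len(ALPHABET))
               for i in range(1, len(ALPHABET))]

def sax_letter(window_mean: float) -> str:
    """Map a z-normalized window (PAA) mean to a SAX letter."""
    return ALPHABET[bisect_left(BREAKPOINTS, window_mean)]

def match_baseline(snapshot_letter: str, tagged_pool: dict) -> list:
    """Collect the historical windows in the searching pool whose SAX
    letter matches the snapshot window's letter (the WPM baseline)."""
    return tagged_pool.get(snapshot_letter, [])
```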
4.4 Selection of Weather and Schedule-Based Pattern Matching Parameters
4.4.1 Introduction.
When implementing the SAX method for the WPM, different parameter settings, such as the number of baseline days, the snapshot window size, and the searching pool size, may affect the number of samples in each snapshot window as well as the WPM accuracy. Therefore, it is necessary to determine a proper parameter setting to ensure (1) a sufficient number of samples in each snapshot window, so that the subsequent data-driven method can be implemented, and (2) the PM accuracy, so that better detection results can be achieved. Sensitivity analyses are carried out over several test cases under different parameter settings, as illustrated in Table 8. Three scenarios are considered:
Number of WBF-free days: 20 days, 30 days, and 40 days. Note that WBF-free days refer to baseline days on which the building is considered to operate without any WBF. The baseline is required for a data-driven fault detection process; however, the more baseline days a fault detection strategy requires, the more challenging it is to collect the required data.
Snapshot window size: 15 min, 30 min, and 45 min.
Searching pool size: 120 min, 180 min, and 240 min.
Test description
Test name | Test description | Test cases |
---|---|---|
Number of WBF-free days | Test the minimum number of WBF-free days needed to achieve a satisfactory performance | 20-day, 30-day, and 40-day |
Snapshot window size | Test the impacts of the snapshot window size on the samples selected in each snapshot window | 15-min, 30-min, 45-min |
Searching pool size | Test the impacts of the searching pool size on the samples selected in each snapshot window | 120-min, 180-min, 240-min |
The testing matrix would become very complicated if all parameter combinations were tested. Therefore, a baseline case is first defined in this study, with the number of WBF-free days equal to 30 days, a snapshot window size of 30 min, and a sample pool size of 180 min.
4.4.2 Historical Datasets and Evaluation Criterion.
The WPM method assumes that a building operates under similar operating modes when it is subject to similar weather conditions (the typical situation). If a building has multiple operating modes under similar weather conditions, each operating mode requires a corresponding historical baseline. This is not typical but did occur in the demonstration building of this study: the facility modified the building's operation strategies between 2016 and 2017. Hence, a historical dataset is constructed for each season of each year. In this research, the WPM method is tested within the same season.
Seven historical datasets from 2016 to 2017, i.e., spring 2017, summer 2016 and 2017, fall 2016 and 2017, and winter 2016 and 2017, are used to analyze the impact of different parameter settings on the WPM performance. In each dataset, 10 days are randomly selected as one test case to obtain the WPM baseline dataset.
Four quantitative criteria are used to evaluate the WPM performance: (1) the maximum number of samples (MaxNS) in each snapshot window, (2) the minimum number of samples (MinNS) in each snapshot window, (3) the average number of samples (AveNS) in each snapshot window, and (4) the percentage of snapshot windows (PctWin) with a sample size of less than 100.
It is noted that the existing literature offers no recommendation on the minimum number of data samples that should be included in each snapshot window. In a BAS, the sampling rate usually ranges from 1 min to 15 min; the data sampling rate of the tested building's BAS is 5 min. The time range of data collection for a building in one season can be from 20 days to 60 days, depending on the actual operation. Therefore, for a 20-day to 60-day historical baseline, the number of data samples in a 30-min snapshot window ranges from 120 to 360 under a 100% selection rate. Here, an AveNS of 100 is set as the threshold, i.e., the parameters that ensure an AveNS of more than 100 are used in the WPM method.
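The four criteria and the sample-count arithmetic above can be summarized in a short sketch (Python; the function name and the example counts are hypothetical):

```python
def wpm_window_stats(counts, min_required=100):
    """Summarize the WPM baseline sample counts over all snapshot
    windows: MaxNS, MinNS, AveNS, and PctWin (the share of windows
    with fewer than `min_required` samples)."""
    n = len(counts)
    return {
        "MaxNS": max(counts),
        "MinNS": min(counts),
        "AveNS": sum(counts) / n,
        "PctWin": sum(c < min_required for c in counts) / n,
    }

# With 5-min BAS sampling, a 30-min window holds 6 samples per baseline
# day, so 20 to 60 fully matching days yield 120 to 360 samples per window.
stats = wpm_window_stats([120, 80, 150, 95])
```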
Based on the existing literature and some preliminary tests, 20 days, 30 days, and 40 days are selected for testing the minimum number of historical baseline days. When this parameter is tested, the other parameters are set to their default values, i.e., a snapshot window size of 30 min and a sample pool size of 180 min.
4.4.3 Selection of Number of Historical Baseline Days.
Obviously, including more historical baseline days, i.e., WBF-free days, will increase the accuracy and robustness of the fault detection method. However, in real practice, especially for building re-commissioning, it may not always be possible to obtain a large number of historical WBF-free days. Therefore, it is meaningful to investigate a suitable number of WBF-free days for generating a baseline.
In this research, 20-day, 30-day, and 40-day historical baselines are used in the sensitivity analysis. Seven test cases corresponding to different seasons are collected from the test building. The available historical baseline for each test case ranges from 61 days to 63 days, as given in Table 9.
Selection of minimum number of historical baseline days
Test cases | Size of historical baseline (days) | MaxNS | MinNS | AveNS | PctWin | ||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
20 | 30 | 40 | 20 | 30 | 40 | 20 | 30 | 40 | 20 | 30 | 40 | ||
Fall 2016 | 61 | 180 | 216 | 348 | 1 | 12 | 30 | 82 | 126 | 170 | 62% | 22% | 13% |
Fall 2017 | 62 | 204 | 312 | 432 | 1 | 1 | 1 | 84 | 133 | 230 | 66% | 33% | 14% |
Spring 2017 | 62 | 282 | 318 | 450 | 1 | 1 | 1 | 95 | 145 | 193 | 59% | 32% | 27% |
Summer 2016 | 62 | 288 | 342 | 438 | 1 | 1 | 30 | 122 | 170 | 218 | 40% | 14% | 9% |
Summer 2017 | 61 | 210 | 264 | 354 | 1 | 30 | 30 | 72 | 122 | 175 | 75% | 44% | 28% |
Winter 2016 | 63 | 288 | 414 | 474 | 1 | 6 | 6 | 104 | 159 | 195 | 43% | 31% | 27% |
Winter 2017 | 62 | 258 | 324 | 348 | 1 | 30 | 30 | 91 | 145 | 187 | 53% | 26% | 17% |
The function of a historical baseline is to provide baseline data for the incoming snapshot data. In this study, whenever snapshot data come in (for example, 10:00–10:30, with an outdoor temperature around 26.7 °C and a relative humidity around 70%), data from the baseline dataset within the same time window (10:00–10:30) and with similar weather (26.7 °C and 70% relative humidity) are collected to form the WPM baseline for that specific snapshot. Hence, the minimum amount of historical baseline data required really depends on how repeatable the weather is at a specific location.
In this study, for each test case, 10 days are randomly selected and treated as the incoming snapshot data. The sensitivity test then randomly picks from the remaining baseline days to form historical baseline datasets of varying sizes.
Table 9 shows the test results. The four indices defined earlier are used to evaluate the impact of the historical baseline size on the number of samples selected to form the WPM baseline. The results show that only in the summer 2016 and winter 2016 test cases does the AveNS surpass 100 when 20 historical days are used to generate the baseline datasets. At the same time, the PctWin for all test cases is at least 40%, meaning that 40% or more of the windows in the baseline contain fewer than 100 data samples. Therefore, 20 days are not used for the WPM method to generate the WPM baseline dataset.
When 30 historical days are used to generate a baseline dataset, the AveNS ranges from 122 to 170 over the seven test cases, and the PctWin ranges from 14% to 44%. When 40 historical days are used, the AveNS ranges from 170 to 230, and the PctWin ranges from 9% to 28%. Although the 40-day historical baseline generates better results in both AveNS and PctWin, collecting more historical days becomes increasingly difficult in real practice; therefore, 30 days are selected as the minimum number of historical baseline days in this study.
4.4.4 Selection of Snapshot Window Size.
To identify the effects of the snapshot window size on the WPM method and the fault detection result, sensitivity tests are performed. A dataset with a smaller window size helps the SAX method to better capture dynamic weather changes and therefore generates more accurate baseline data. This further helps to detect faults with higher dynamic responses under different weather and indoor conditions.
In this study, snapshot window sizes of 15 min, 30 min, and 45 min are used to evaluate their impact on the number of data samples in each snapshot window. Data from the seven datasets are used to implement the sensitivity test, with historical datasets covering 30 WBF-free days in each test. The AveNS and PctWin criteria are again used to quantify the results.
Although a smaller window size (15 min or 30 min) can generate a more accurate WPM baseline from the historical baseline dataset than a larger (45 min) one, too small a snapshot window size may cause a high computational burden. Moreover, the test results in Table 10 show no significant difference in the number of data samples between the 15-min and 30-min snapshot window sizes. Based on these analyses, a window size of 30 min is adopted in the WPM method.
Sensitivity test on snapshot window size
Snapshot window size | 15-min | 30-min | 45-min | |||
---|---|---|---|---|---|---|
AveNS | PctWin | AveNS | PctWin | AveNS | PctWin | |
Summer 2016 | 172 | 14% | 170 | 14% | 170 | 17% |
Fall 2016 | 125 | 20% | 126 | 22% | 129 | 21% |
Winter 2016 | 159 | 31% | 159 | 31% | 161 | 31% |
Spring 2017 | 146 | 31% | 145 | 32% | 147 | 33% |
Summer 2017 | 122 | 44% | 122 | 44% | 122 | 46% |
Fall 2017 | 133 | 33% | 133 | 33% | 127 | 40% |
Winter 2017 | 144 | 27% | 145 | 26% | 146 | 28% |
4.4.5 Sensitivity Test on Data Sample Searching Pool Size.
When forming the "similar baseline data," neighboring windows of the snapshot window are typically included in the searching pool for the WPM baseline. For example, when analyzing a snapshot window of 10:00–10:30, all collected historical baseline data between 10:00 and 10:30, as well as those in neighboring windows such as 9:30–10:00 and 10:30–11:00, can be used to form the searching pool; among them, the data with similar weather conditions are selected as the "similar baseline data," as shown in Fig. 5. The searching pool size (S) affects the WPM accuracy. If S is too small, not enough samples are included in the WPM baseline dataset. If S is too large, the time frame of the neighboring windows becomes too long and the assumption of an invariant internal load may no longer hold. In the above example, extending the neighboring windows from 9:00–11:00 to 8:00–12:00 could include data with very different internal loads in the searching pool, and thus reduce the WPM accuracy.
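The neighboring-window indexing described above can be sketched as follows (a minimal illustration; the function name and the 48-half-hour-windows-per-day convention are our own assumptions):

```python
def searching_pool(snapshot_idx: int, n_side: int, n_windows: int):
    """Window indices in the data sample searching pool: the snapshot
    window plus n_side windows on each side (n_side = 3 with 30-min
    windows gives the 180-min pool adopted here), clipped to the day."""
    lo = max(0, snapshot_idx - n_side)
    hi = min(n_windows - 1, snapshot_idx + n_side)
    return list(range(lo, hi + 1))

# The 10:00-10:30 snapshot is window 20 of a day's 48 half-hour windows;
# a 180-min pool then covers windows 17 through 23.
pool = searching_pool(20, 3, 48)
```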
To test the data sample searching pool size, three pool sizes are considered: 120 min (2 windows before and 2 windows after), 180 min (3 windows before and 3 windows after), and 240 min (4 windows before and 4 windows after). The snapshot window size is set to 30 min based on the previous sensitivity test. The testing data are similar to those used in the evaluation of the historical baseline dataset size. Again, AveNS and PctWin are used to evaluate the WPM result under the different searching pool sizes. To fully evaluate the effects, three numbers of searching days, i.e., 20 days, 30 days, and 40 days, are used.
Table 11 summarizes the test results for the various data sample searching pool sizes in the different datasets. Under a 120-min searching pool, the AveNS meets the requirement of 100 minimum data samples only when the number of searching days reaches 40. For the 180-min searching pool, both 30 and 40 searching days meet the requirement. Although the requirement is met for most test cases under a 240-min searching pool, such a large pool may lower the WPM accuracy. Therefore, considering the trade-off among the minimum number of data samples in each snapshot window, the number of searching days, and the searching pool size, the 180-min searching pool is used in the WPM method.
Sensitivity test on data searching pool size
Searching pool size | 120-min Searching pool | 180-min Searching pool | 240-min Searching pool | |||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
MaxNS | MinNS | AveNS | PctWin | MaxNS | MinNS | AveNS | PctWin | MaxNS | MinNS | AveNS | PctWin | |
20 searching days | ||||||||||||
Su 2016 | 201 | 1 | 84 | 63% | 288 | 1 | 122 | 40% | 366 | 1 | 159 | 28% |
F 2016 | 126 | 1 | 55 | 91% | 180 | 1 | 82 | 62% | 240 | 1 | 109 | 43% |
W2016 | 168 | 1 | 69 | 82% | 288 | 1 | 104 | 43% | 378 | 1 | 137 | 32% |
Sp 2017 | 204 | 1 | 70 | 84% | 282 | 1 | 95 | 59% | 378 | 1 | 122 | 46% |
Su 2017 | 150 | 1 | 51 | 90% | 210 | 1 | 72 | 75% | 294 | 1 | 93 | 65% |
F 2017 | 144 | 1 | 56 | 79% | 204 | 1 | 84 | 66% | 264 | 1 | 106 | 52% |
W 2017 | 174 | 1 | 59 | 84% | 258 | 1 | 91 | 53% | 330 | 1 | 121 | 41% |
30 searching days | ||||||||||||
Su 2016 | 240 | 1 | 115 | 41% | 342 | 1 | 170 | 14% | 438 | 1 | 227 | 11% |
F 2016 | 174 | 6 | 84 | 66% | 216 | 12 | 126 | 22% | 282 | 24 | 168 | 14% |
W2016 | 276 | 1 | 106 | 47% | 414 | 6 | 159 | 31% | 540 | 12 | 211 | 24% |
Sp 2017 | 228 | 1 | 100 | 56% | 318 | 1 | 145 | 32% | 426 | 1 | 192 | 24% |
Su 2017 | 180 | 12 | 82 | 64% | 264 | 30 | 122 | 44% | 360 | 24 | 160 | 34% |
F 2017 | 210 | 1 | 89 | 56% | 312 | 1 | 133 | 33% | 414 | 1 | 163 | 36% |
W 2017 | 228 | 24 | 98 | 58% | 324 | 30 | 145 | 26% | 426 | 30 | 193 | 19% |
40 searching days | ||||||||||||
Su 2016 | 294 | 30 | 147 | 30% | 438 | 30 | 218 | 9% | 558 | 42 | 291 | 6% |
F 2016 | 240 | 18 | 114 | 34% | 348 | 30 | 170 | 13% | 456 | 30 | 227 | 9% |
W2016 | 312 | 1 | 131 | 34% | 474 | 6 | 195 | 24% | 618 | 12 | 259 | 20% |
Sp 2017 | 318 | 1 | 130 | 35% | 450 | 1 | 193 | 27% | 576 | 1 | 257 | 17% |
Su 2017 | 246 | 18 | 116 | 44% | 354 | 30 | 175 | 28% | 438 | 30 | 228 | 18% |
F 2017 | 288 | 1 | 153 | 19% | 432 | 1 | 230 | 14% | 582 | 1 | 306 | 13% |
W 2017 | 288 | 1 | 153 | 19% | 348 | 30 | 187 | 17% | 450 | 30 | 248 | 16% |
Note: Su: summer, F: fall, W: winter, and Sp: spring.
5 Principal Component Analysis for Fault Detection
5.1 Introduction of Principal Component Analysis.
The last procedure of the developed method is to use the WPM baseline dataset to develop a statistical model and perform fault detection on the incoming snapshot data.
5.2 Workflow of Fault Detection.
Figure 7 shows the workflow of PCA fault detection for a single snapshot window. When implementing the PCA method for fault detection, the following procedure is followed. First, a "WPM baseline dataset," i.e., data that have a similar weather condition and time frame as the snapshot data, is identified from the historical dataset using the WPM method. Second, a local PCA model is developed from the "WPM baseline dataset" of each snapshot window, and a statistical threshold (on T2) is generated for the developed local PCA model. Third, the statistic of the incoming snapshot data is calculated and compared with the threshold. Lastly, the snapshot data are flagged as abnormal if their statistic exceeds the threshold. The same procedure is repeated whenever new incoming snapshot data are collected and fed into the fault detection process. Through this procedure, the dynamic characteristics of the system operation can be modeled more accurately, and the detection accuracy is therefore expected to increase.
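A minimal sketch of the per-window PCA fault detection step is given below (Python with NumPy/SciPy). This is an illustration only: the paper does not state its exact threshold formula here, so a chi-square control limit is used as a simple stand-in for the F-distribution-based limit commonly paired with T2, and all function names are hypothetical.

```python
import numpy as np
from scipy.stats import chi2

def fit_local_pca(baseline: np.ndarray, n_components: int, alpha: float = 0.99):
    """Fit a local PCA model on a WPM baseline (samples x variables)
    and return its parameters plus a T2 threshold."""
    mean = baseline.mean(axis=0)
    std = baseline.std(axis=0) + 1e-12           # avoid division by zero
    z = (baseline - mean) / std
    eigval, eigvec = np.linalg.eigh(np.cov(z, rowvar=False))
    order = np.argsort(eigval)[::-1][:n_components]
    P, lam = eigvec[:, order], eigval[order]     # loadings and variances
    threshold = chi2.ppf(alpha, df=n_components) # stand-in control limit
    return mean, std, P, lam, threshold

def t2_score(x: np.ndarray, mean, std, P, lam) -> float:
    """Hotelling's T2 statistic of one snapshot sample in the
    retained principal-component subspace."""
    t = P.T @ ((x - mean) / std)
    return float(np.sum(t ** 2 / lam))
```

A snapshot sample would be flagged as abnormal when `t2_score(x, ...)` exceeds the returned `threshold`, mirroring the comparison described in the workflow above.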
6 Conclusions
In this first paper of a two-paper series, the principles and methodology of a novel data-driven fault detection method, i.e., the WPM-FPCA method, are presented for detecting whole building faults. In this method, a feature pre-selection process is first implemented using the PLSR-GA method to select candidate variables, which are later used to develop PCA models. The HVAC system operation schedule and weather patterns, characterized by the outdoor air enthalpy, are used to dynamically generate baseline data (i.e., WPM baseline data) that represent a building's normal operation. A real-time PCA model is continuously generated from the selected features for the WPM baseline of each incoming data window. When new incoming BAS data are collected and fed into the fault detection solution, a system operational abnormality is flagged if the calculated T2 statistic exceeds the threshold that is automatically generated from the baseline PCA model. The selection of several parameters, such as the number of baseline days, the snapshot window size, and the data sample searching pool size, is also discussed. The implementation of the WPM-FPCA method to detect WBFs in a real building, and the evaluation of the method using real BAS data, will be presented in the second part of this study.
Acknowledgment
Financial support provided by the U.S. Department of Energy for the research of VOLTTRON Compatible Whole Building Root-Fault Detection and Diagnosis (Grant No. DE-FOA-0001167) is greatly appreciated. We also want to thank Mr. William Taylor from Drexel University for his significant support of this project.
Conflict of Interest
There are no conflicts of interest.