DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-5, 9-10, 13, and 15-21 are rejected under 35 U.S.C. 103 as being unpatentable over
Zhanpan ZHANG et al. (hereinafter ZHANG) US 2022/0292666 A1,
in view of Marcello Tedesco et al. (hereinafter Ted) US 2021/0125428 A1,
and further in view of Arnab Chowdhury et al. (hereinafter Chow) US 2020/0380336 A1.
In regard to claim 1:
ZHANG discloses:
- receiving machine historical sensor data and their failure log and generating a failure labeling model to generate training data from a failure prediction window, a history window and a failure infected interval settings;
In [0026]:
As an overview, embodying systems and methods provide an AI (Artificial Intelligence) anomaly pattern recognition model that leverages a diagnostic expert domain knowledge base and deep learning technique to automatically detect an industrial asset (e.g., wind turbine) operational anomaly and identify root cause(s) corresponding to the detected anomaly. In some embodiments, a large set of training cases can be established based on historical diagnostic records that include multiple root causes. For each training case, several pairs of time series of sensor measurements may be configured and represented as scatter plots, where a combination of data patterns in or derived from the scatter plots indicates a specific root cause of an anomaly reflected in the sensor measurements (i.e., data).
(BRI: a time series of sensor measurements configured and represented as a historical record is a “history window”)
In [0028]:
FIG. 1 is a schematic block diagram of an example system 100 that may be associated with some embodiments herein. The system includes an industrial asset 105 that may generally operate normally for substantial periods of time but occasionally experience an anomaly that results in a malfunction or other abnormal operation of the asset
(BRI: malfunction or abnormal operation is a “failure”)
In [0028]:
a set of sensors 110 S1 through SN may monitor one or more characteristics of the asset 105 (e.g., acceleration, vibration, noise, speed, energy consumed, output power, etc.). The information from the sensors may, according to some embodiments described herein, be collected and used to facilitate detection and/or prediction of abnormal operation (i.e., an anomaly) of operating asset 105 and the root cause corresponding to the detected anomaly.
In [0045]:
As such, each scatter plot captures a specific pair of time series data derived from the sensor measurements for a wind turbine (or other asset). In FIG. 5A, the high tower acceleration measurements are due to wind turbine blade misalignment and in FIG. 5B the high tower acceleration measurements captured in the scatter plot are due to an incorrect setting of a specific control parameter for the wind turbine.
(BRI: collecting asset characteristics to predict abnormal operations provides the data for setting failure intervals; the claimed interval settings are therefore captured by this disclosure)
In [0026]:
a large set of training cases can be established based on historical diagnostic records that include multiple root causes.
(BRI: A diagnostic record detailing an anomaly that stems from multiple underlying issues can be considered a type of failure log)
In [0037]:
The training data establishment component 320 or functionality of deep learning model system 310 may operate to establish a set of training cases based on the historical diagnostic records of the wind turbine operational data 305 that includes multiple root causes embedded within the data. The set of training cases may be used in training the deep learning model generated by component 325.
In [0038]:
deep learning model building and validation component 325 may operate to develop (i.e., generate) a deep learning classification model that builds connections (e.g., transfer functions, algorithms, etc.) between the scatter plots based on the operational data and root causes for anomalies in the operational data by processing an input of high-dimensional images including data pixels corresponding to the scatter plots to generate an output including root cause labels associated with one or more anomalies derived from data patterns in the images.
In [0064]:
The machine learning engine processes the combination of images to recognize patterns therein that correspond to one of a plurality of defined anomalies
(BRI: a machine learning engine that uses a combination of images to recognize patterns corresponding to a plurality of defined anomalies can be, and often is, an ensemble classifier)
In [0053]:
In some embodiments, at least a portion of the received historical time series sensor data may be transformed to a format, configuration, level, resolution, etc. from its raw configuration as obtained by the wind turbine (or other asset) sensors
In [0055]:
At operation 615, a root cause label is assigned to each visual image including the scatter plots representing an operational anomaly based on a reference
In [0055]:
In some aspects, a standardized ground truth label is assigned to each generated image. In some regards, abnormal sensor measurements (i.e., anomalies) may be caused by different root causes. In particular, each root cause requires a specific type of maintenance and repair practice. As such, identification of the correct root cause can provide actionable insights with respect to on-going operations, preventative maintenance, and corrective maintenance aspects of a wind turbine (and/or other assets).
In [0057]:
Continuing to operation 620, a deep learning model and more particularly a convolutional neural network (CNN) model is trained using a first subset of the labeled images and tested based on a second subset of the labeled images applied to the trained model to evaluate the performance of the trained model
- providing the failure labeling model's output data to a failure classification model or pipeline that is generated automatically to learn failure signal behavior and also providing the failure labeling model's output to an anomaly detection model or pipeline to detect an abnormal behavior in real time;
In [0011]:
FIG. 8 is an illustrative example representation of data associated with labeling images in accordance with some embodiments;
In [0038]:
The deep learning model building and validation component 325 or functionality of deep learning model system 310 may operate to convert or transform the scatter plots (or other representations of wind turbine operational data 305) into visual representation images of the scatter plots (or other representations of the operational data). For example, deep learning model building and validation component 325 may operate to develop (i.e., generate) a deep learning classification model that builds connections (e.g., transfer functions, algorithms, etc.) between the scatter plots based on the operational data and root causes for anomalies in the operational data by processing an input of high-dimensional images including data pixels corresponding to the scatter plots to generate an output including root cause labels associated with one or more anomalies derived from data patterns in the images. The deep learning model herein is a deep learning classification model developed to build a connection between scatter plots including data representations of wind turbine anomalies and the corresponding root causes thereof. In some aspects, a convolutional neural network (CNN) model is developed to capture and process pixel data to recognize the complex data patterns in images of the scatter plots and to further classify anomaly cases in the training set as being associated with a particular root cause for the determined anomaly classification.
In [0047]:
In some aspects, there might generally be a large variation in wind turbine operation data due to a plurality or combination of sensor, turbine control, and environment factors. The combination and complexity of factors presents a challenge to accurately distinguishing between normal wind turbine operation and abnormal wind turbine operation
In [0028]:
the information from the sensors may, according to some embodiments described herein, be collected and used to facilitate detection and/or prediction of abnormal operation (i.e., an anomaly) of operating asset 105 and the root cause corresponding to the detected anomaly.
(BRI: abnormal prediction is a “failure prediction”)
ZHANG does not explicitly disclose:
- and applying an ensemble classifier to the outputs of the data failure classification model and the anomaly detection model to predict a machine failure.
However, Ted discloses:
- and applying an ensemble classifier to the outputs of the data failure classification model and the anomaly detection model to predict a machine failure.
In [0005]:
with principles of inventive concepts a vehicle monitoring system may monitor a vehicle characteristic and, from the monitoring, may determine the state of a vehicle component. The system may set an alert and may communicate that alert to a user or supervisory authority. The state of the vehicle component may relate to a vehicle tire and to the potential delamination of a tire.
In [0058]:
In example embodiments one or more classifiers may be trained using tire characteristic data from one or more sensors. If multiple classifiers are trained, they may be trained to provide an indication of the degree to which tire delamination has taken place and “live” signals from an active vehicle may be compared against the one or more trained classifiers to determine the probability of failure (for example, delamination) within a given period (the “period” may be expressed as time, or distance, for example). The probability may take into account various driving conditions, such as velocity, load, or road surface quality, for example, in addition to sensor data such as pressure or temperature data, for example,
In [0067]:
A system and method may employ machine learning to recognize a tire fault and to determine the severity of the fault. Machine learning may be used constantly or may be engaged after an initial indication of a fault (for example, a periodic signal anomaly) is detected.
In [0056]:
principles of inventive concepts may assess the possibility of the onset and/or propagation of a delamination by detecting and analyzing the variation of movement and other sensed characteristics of a tire. These sensed characteristics may be used to determine the degree of failure (for example, delamination) and the time of failure migration. In example embodiments data from triaxial accelerometers, (and/or, possibly, other sensors which may disclose the time/acceleration signature associated with an angle of delamination, for example) may be used to develop a learning process (to train a classifier, for example) to refine the process of recognizing the onset of tire failures.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine ZHANG and Ted.
ZHANG teaches failure labeling, windows for historical failures and detection of abnormality in operation.
Ted teaches ensemble (multiple classifiers) for failure classification and anomaly detection.
One of ordinary skill would have been motivated to combine ZHANG and Ted in order to improve the operational life of the tire system and avoid the costly catastrophes that may be associated with it (Ted [0054]).
ZHANG and Ted do not explicitly disclose:
- A method to maintain a machine, comprising:
However, Chow discloses:
- A method to maintain a machine, comprising:
In [0031]:
A system, method, and computer-readable medium are disclosed for a hardware component failure prediction system that can incorporate a time-series dimension as an input
In [0038]:
Data is provided to the system by a plurality of internet of things (IoT) devices 130 and 135 that are connected to information handling system 100 by network 140.
(BRI: the machine is an IoT device subject to maintenance)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine ZHANG, Ted and Chow.
ZHANG teaches failure labeling, windows for historical failures and detection of abnormality in operation.
Ted teaches ensemble (multiple classifiers) for failure classification and anomaly detection.
Chow teaches IoT maintenance.
One of ordinary skill would have been motivated to combine ZHANG, Ted and Chow in order to improve prediction accuracy (Chow [0061]).
In regard to claim 2:
ZHANG discloses:
- comprising automatically identifying failure instances from a historical data stream by the failure labeling model
In [0028]:
FIG. 1 is a schematic block diagram of an example system 100 that may be associated with some embodiments herein. The system includes an industrial asset 105 that may generally operate normally for substantial periods of time but occasionally experience an anomaly that results in a malfunction or other abnormal operation of the asset.
In [0028]:
the information from the sensors may, according to some embodiments described herein, be collected and used to facilitate detection and/or prediction of abnormal operation (i.e., an anomaly) of operating asset 105 and the root cause corresponding to the detected anomaly.
In [0034]:
FIG. 3 is a schematic block diagram depicting an overall system 300, in accordance with some embodiments. System 300 illustrates wind turbine operational data 305 being provided as input(s) to a deep learning model development and implementation system, device, service, or apparatus (also referred to herein simply as a “system” or “service”) 310 that outputs, at least, data 330 indicative of wind turbine anomalies detected by deep learning model system 310 and the root cause(s) corresponding to the detected anomalies.
In [0036]:
some scenarios, operational data 305 might include historical operational data associated with one or more wind turbines.
In [0039]:
output of deep learning model system 310 including an indication of the detected one or more anomalies derived from data patterns in the images and the corresponding root cause labels
In regard to claim 3:
ZHANG discloses:
- comprising using time series similarities to relabel a failure and normal signals
In [0055]:
At operation 615, a root cause label is assigned to each visual image including the scatter plots representing an operational anomaly based on a reference to and leveraging of, at least in part, a digitized knowledge domain data structure or system associated with the industrial asset(s) in combination with the data patterns in each image. In some aspects, a standardized ground truth label is assigned to each generated image. In some regards, abnormal sensor measurements (i.e., anomalies) may be caused by different root causes. In particular, each root cause requires a specific type of maintenance and repair practice. As such, identification of the correct root cause can provide actionable insights with respect to on-going operations, preventative maintenance, and corrective maintenance aspects of a wind turbine (and/or other assets).
In [0064]:
the machine learning engine processes the combination of images to recognize patterns therein that correspond to one of a plurality of defined anomalies (e.g., 8 anomalies in the example of FIG. 12). The output 1215 of the machine learning engine includes an indication of the specific root cause (e.g., anomaly 2=blade calibration and anomaly 4=incorrect ramp rate) in response to the specific inputs 1210.
(BRI: Using time series similarities to relabel failure and normal signals is a process where unlabeled or ambiguously labeled data points are assigned a definitive label (either "failure" or "normal") based on how closely their patterns or shapes match known, pre-established examples of each class)
ZHANG and Ted do not explicitly disclose:
- and increasing the quality of training data for the failure classification model or pipeline.
However, Chow discloses:
- and increasing the quality of training data for the failure classification model or pipeline.
In [0042]:
to allow for accurate and efficient results to be provided by the deep neural network, the data needs to be preprocessed to better enable the deep neural networks to converge rapidly to a solution that can accurately predict device failure.
(BRI: preprocessing enhances the quality of the data used to train the NN)
In regard to claim 4:
ZHANG and Ted do not explicitly disclose:
- comprising real-time general streaming that allows businesses to link machines and assets.
However, Chow discloses:
- comprising real-time general streaming that allows businesses to link machines and assets.
In [0043]:
Once training and validation datasets are formed that include information relevant to continuous and categorical features, that information can be used to determine a failure prediction model for the hardware device type. Modeling stage 240 utilizes the sample sets to first train the double-stacked long-short term memory deep neural network, and then validate the trained solution to perform additional tuning. Once the solution has been satisfactorily tuned, the solution can be used to help enable failure prediction for devices not included in the sample sets. This information can be provided during deployment stage 250 to business units that can utilize the information in support of customers.
In [0044]:
FIG. 3 is a simplified flow diagram illustrating a set of steps involved in data processing stage 240, in accord with embodiments of the present invention. As discussed above, information collected from a set of devices falling in an IoT device type of interest
In regard to claim 5:
ZHANG and Ted do not explicitly disclose:
- comprising providing the output of the failure labeling model to generate quality labeled training data.
However, Chow discloses:
- comprising providing the output of the failure labeling model to generate quality labeled training data.
In [0042]:
to allow for accurate and efficient results to be provided by the deep neural network, the data needs to be preprocessed to better enable the deep neural networks to converge rapidly to a solution that can accurately predict device failure.
In [0063]:
The failure prediction system discussed above is designed such that it is generic and can be used for any IoT hardware components that are connected to provide telemetry data. While the above discussion has focused on an example of hard disk drives, embodiments are not limited to HDDs, but can be applied to any IoT device.
In regard to claim 9:
ZHANG and Ted do not explicitly disclose:
- comprising representing machine sensor data as two dimensional (2D) time series data with timestamps and features.
However, Chow discloses:
- comprising representing machine sensor data as two dimensional (2D) time series data with timestamps and features.
In [0034]:
Embodiments of the present invention utilize a deep-learning based architecture for component failure prediction and address a variety of issues inherent in traditional systems. Such issues include: (1) incorporating a time-series dimension is an input; (2) incorporating a combination of multi-dimensional continuous and categorical parameters with only the continuous parameters having a time-series component; (3) addressing a class imbalance problem between devices that have failed and those that have not failed; (4) ensuring that device observation sequences are weighted based on their importance in their ability to predict a next failure; (5) predicting component failure in any day in a certain window of a future time period; and, (6) providing self-learning for the prediction model.
In [0048]:
FIG. 4 is a table 400 illustrating observation ranking for each passing HDD. A primary object of the solution model is to predict whether an IoT device will fail within the next “d” days. To this end, “a” days of observations are selected for each of the passing device samples in the passing device data frames of both the training and validation datasets based on the ranking performed in step 330 (350). As illustrated in FIG. 4, the range of the ranking is [d+1, d+a] with d+a≤x, where x is the minimum threshold of event data occurrences used in 310.
[Image: Chow FIG. 4, table 400 (observation ranking)]
(BRI: the FIG. 4 representation of observations is a timestamp of observations)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine ZHANG, Ted and Chow.
ZHANG teaches failure labeling, windows for historical failures and detection of abnormality in operation.
Ted teaches ensemble (multiple classifiers) for failure classification and anomaly detection.
Chow teaches IoT maintenance.
One of ordinary skill would have been motivated to combine ZHANG, Ted and Chow in order to improve prediction accuracy (Chow [0061]).
In regard to claim 10:
ZHANG and Ted do not explicitly disclose:
- comprising representing machine sensor data as three-dimensional (3D) time series data sequences with timestamps, history window, and features to capture temporal context.
However, Chow discloses:
- comprising representing machine sensor data as three-dimensional (3D) time series data sequences with timestamps, history window, and features to capture temporal context.
In [0034]:
Embodiments of the present invention utilize a deep-learning based architecture for component failure prediction and address a variety of issues inherent in traditional systems. Such issues include: (1) incorporating a time-series dimension is an input; (2) incorporating a combination of multi-dimensional continuous and categorical parameters with only the continuous parameters having a time-series component; (3) addressing a class imbalance problem between devices that have failed and those that have not failed; (4) ensuring that device observation sequences are weighted based on their importance in their ability to predict a next failure; (5) predicting component failure in any day in a certain window of a future time period; and, (6) providing self-learning for the prediction model.
In [0048]:
FIG. 4 is a table 400 illustrating observation ranking for each passing HDD. A primary object of the solution model is to predict whether an IoT device will fail within the next “d” days. To this end, “a” days of observations are selected for each of the passing device samples in the passing device data frames of both the training and validation datasets based on the ranking performed in step 330 (350). As illustrated in FIG. 4, the range of the ranking is [d+1, d+a] with d+a≤x, where x is the minimum threshold of event data occurrences used in 310.
[Image: Chow FIG. 4, table 400 (observation ranking)]
(BRI: the FIG. 4 representation of observations is a timestamp of observations)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine ZHANG, Ted and Chow.
ZHANG teaches failure labeling, windows for historical failures and detection of abnormality in operation.
Ted teaches ensemble (multiple classifiers) for failure classification and anomaly detection.
Chow teaches IoT maintenance.
One of ordinary skill would have been motivated to combine ZHANG, Ted and Chow in order to improve prediction accuracy (Chow [0061]).
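For illustration only, the relationship between the 2D representation recited in claim 9 (timestamps by features) and the 3D representation recited in claim 10 (sequences by history window by features) can be sketched as below. This is a minimal sketch and is not taken from Chow, ZHANG, Ted, or the application: the function name `to_history_windows`, the array shapes, and the sample values are all assumptions.

```python
import numpy as np

def to_history_windows(series_2d, history_window):
    """Stack sliding history windows of a (timestamps, features) array.

    Returns an array of shape (num_windows, history_window, features).
    """
    series_2d = np.asarray(series_2d)
    num_steps, _ = series_2d.shape
    if history_window > num_steps:
        raise ValueError("history_window exceeds the number of timestamps")
    # One window per start index; consecutive windows overlap by all but one row.
    windows = [series_2d[i:i + history_window]
               for i in range(num_steps - history_window + 1)]
    return np.stack(windows)

# Hypothetical data: 5 timestamps, 2 sensor features.
data_2d = np.arange(10).reshape(5, 2)
data_3d = to_history_windows(data_2d, history_window=3)
print(data_2d.shape, data_3d.shape)
```

Under these assumptions each 3D sample carries the preceding rows of the 2D series, which is one common way to give a sequence model temporal context.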
In regard to claim 13:
ZHANG and Ted do not explicitly disclose:
- augmenting failure data;
- balancing the failure data;
- extracting features from the data;
- if features are extracted, selecting a 2D deep learning model and otherwise selecting a 3D deep learning model;
- and performing failure prediction.
However, Chow discloses:
- augmenting failure data;
In [0008]:
generating the oversampled set of observations from the set of records associated with failed devices in the training dataset further includes synthetically creating repetitive samples using a moving time window. In still a further aspect, synthetically creating repetitive samples using a moving time window further includes generating an oversampled set of observations “d” from “a” actual observations such that for observation “n” in the set of observations, the observation is in a date range characterized by [d+2−n, d+a+1−n].
(BRI: synthetically creating samples associated with the training set of failed devices represents augmenting the failure data)
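For illustration only, the date ranges named in the quoted passage, [d+2−n, d+a+1−n], can be enumerated as below. This is a minimal sketch under stated assumptions: the function name `moving_window_ranges` is hypothetical, and it is assumed that n runs from 1 to d, which the quoted passage does not state explicitly.

```python
def moving_window_ranges(d, a):
    """Return the date ranges [d+2-n, d+a+1-n] for n = 1..d (assumed range of n)."""
    if d < 1 or a < 1:
        raise ValueError("d and a must be positive")
    # Each range spans exactly `a` days and shifts back by one day as n grows.
    return [(d + 2 - n, d + a + 1 - n) for n in range(1, d + 1)]

ranges = moving_window_ranges(d=3, a=5)
print(ranges)  # [(4, 8), (3, 7), (2, 6)]
```

Under this reading, n = 1 yields [d+1, d+a], and each later window slides by one day, so one failed device contributes several overlapping samples.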
- balancing the failure data;
In [0098]:
One of the challenges of the feeder ranking application is that of imbalanced data/scarcity of data characterizing the failure class can cause problems with generalization. Specifically, primary distribution feeders are susceptible to different kinds of failures, and one can have very few training examples for each kind of event, making it difficult to reliably extract statistical regularities or determine the features that affect reliability.
In [0099]:
In one particular embodiment, the focus is on most serious failure type, where the entire feeder is automatically taken offline by emergency substation relays, due to some type of fault being detected by sensors. The presently disclosed system for generating data sets can address the challenge of learning with rare positive examples (feeder failures). An actual feeder failure incident is instantaneous: a snapshot of the system at that moment will have only one failure example. To better balance the data, one can employ the rare event prediction setup shown in FIG. 6, labeling any example that had experienced a failure over some time window as positive
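For illustration only, the labeling rule in the quoted passage (marking an example positive when a failure falls within some time window) can be sketched as below. This is a minimal sketch, not the cited reference's implementation: the function name `label_examples`, the forward-looking direction of the window, and the sample times are assumptions.

```python
def label_examples(snapshot_times, failure_times, window):
    """Label a snapshot 1 if any failure falls in (t, t + window], else 0.

    Assumption: the window looks forward from each snapshot time t.
    """
    labels = []
    for t in snapshot_times:
        positive = any(t < f <= t + window for f in failure_times)
        labels.append(1 if positive else 0)
    return labels

# Hypothetical timeline: snapshots every 5 time units, one failure at t = 12.
labels = label_examples([0, 5, 10, 15], failure_times=[12], window=7)
print(labels)  # [0, 1, 1, 0]
```

A single failure event thus yields more than one positive example, which is how a windowed labeling scheme can ease class imbalance.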
- extracting features from the data;
In [0042]:
Data processing steps can include data transformation, such as filtering, ordering, normalization, oversampling, and selecting sample sets. Feature engineering techniques can include defining continuous and categorical features, normalization of continuous features, determining those features of greatest impact to device failure, and the like.
In [0053]:
Continuous feature data is normalized (815). In one embodiment, the data is normalized using a min-max normalization, such that (a) each feature contributes approximately proportionately while predicting the target feature; and (b) gradient descent converges faster with features scaling than without features scaling. Min−max normalization is a normalization strategy that linearly transforms x to y=(x−min)/(max−min), wherein min and max are minimum and maximum values in X, where X is a set of observed values of x.
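For illustration only, the min-max transformation quoted above, y=(x−min)/(max−min), can be sketched as below. The function name `min_max_normalize` and the handling of a constant feature (mapped to 0.0) are assumptions; the quoted passage does not address the case where max equals min.

```python
def min_max_normalize(values):
    """Linearly rescale values to [0, 1] using y = (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant feature carries no signal, so map it to 0.0.
        return [0.0 for _ in values]
    return [(x - lo) / (hi - lo) for x in values]

normalized = min_max_normalize([2.0, 4.0, 10.0])
print(normalized)  # [0.0, 0.25, 1.0]
```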
- if features are extracted, selecting a 2D deep learning model and otherwise selecting a 3D deep learning model;
In [0054]:
After processing the categorical and continuous features, the train, validation, and hold-out datasets are separated out using each dataset identifier
(BRI: a DNN-based failure prediction system that incorporates a time-series dimension can utilize a 2D deep learning model, specifically by transforming the time-series data into a 2D format)
In [0031]:
A system, method, and computer-readable medium are disclosed for a hardware component failure prediction system that can incorporate a time-series dimension as an input while also addressing issues related to a class imbalance problem associated with failure data. Embodiments provide this capability through the use of a deep learning-based artificial intelligence binary classification method. Embodiments utilize a double-stacked long short-term memory (DS-LSTM) deep neural network with a first layer of the LSTM passing hidden cell states learned from a sequence of multi-dimensional parameter time steps to a second layer of the LSTM that is configured to capture a next sequential prediction output. Output from the second layer of the LSTM is concatenated with a set of categorical variables to an input layer of a fully-connected dense neural network layer. Information generated by the dense neural network provides prediction of whether a hardware component will fail in a given future time interval. In addition, in some embodiments, a lagged feedback component from the output is added back to the input layer of the DNN and concatenated to the set of categorical parameters and next sequential higher-dimension parameter set. This enables the system to self-learn and increases robustness.
In [0053]:
Continuous feature data is normalized (815). In one embodiment, the data is normalized using a min-max normalization, such that (a) each feature contributes approximately proportionately while predicting the target feature; and (b) gradient descent converges faster with features scaling than without features scaling. Min−max normalization is a normalization strategy that linearly transforms x to y=(x−min)/(max−min), wherein min and max are minimum and maximum values in X, where X is a set of observed values of x.
- and performing failure prediction
In [0031]:
A system, method, and computer-readable medium are disclosed for a hardware component failure prediction system that can incorporate a time-series dimension as an input while also addressing issues related to a class imbalance problem associated with failure data. Embodiments provide this capability through the use of a deep learning-based artificial intelligence binary classification method.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine ZHANG, Ted and Chow.
ZHANG teaches failure labeling, windows for historical failures and detection of abnormality in operation.
Ted teaches ensemble (multiple classifiers) for failure classification and anomaly detection.
Chow teaches IoT maintenance.
One of ordinary skill would have been motivated to combine ZHANG, Ted and Chow in order to improve prediction accuracy (Chow [0061]).
In regard to claim 15:
ZHANG and Ted do not explicitly disclose:
- comprising applying time series augmentation methods to artificially generate failure sequences when small number of failure events occurred in training data.
However, Chow discloses:
- comprising applying time series augmentation methods to artificially generate failure sequences when small number of failure events occurred in training data.
In [0008]:
generating the oversampled set of observations from the set of records associated with failed devices in the training dataset further includes synthetically creating repetitive samples using a moving time window.
(BRI: synthetically creating samples associated with the set of failed devices is artificially generating failure sequences, i.e., augmenting the failure data)
In [0062]:
As discussed above, embodiments introduce a unique way of handling class imbalance by synthetically creating repetitive samples of the lower proportion class using a moving time window method. The manner in which the model architecture is designed uniquely provides an initial layer of LSTM that consumes time series specific multi-dimensional input parameters to output a hidden cell state at each time step
In [0062]:
embodiments introduce a unique way of handling class imbalance by synthetically creating repetitive samples of the lower proportion class using a moving time window method.
(BRI: A small number of failure events occurring in training data is commonly referred to as an imbalanced dataset or class imbalance)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine ZHANG, Ted and Chow.
ZHANG teaches failure labeling, windows for historical failures and detection of abnormality in operation.
Ted teaches ensemble (multiple classifiers) for failure classification and anomaly detection.
Chow teaches IoT maintenance.
One of ordinary skill would have been motivated to combine ZHANG, Ted and Chow in order to improve prediction accuracy (Chow [0061]).
In regard to claim 16:
ZHANG discloses:
- receiving machine historical sensor data and their failure log and generating a failure labeling model to generate training data from a failure prediction window, a history window and a failure infected interval settings;
In [0026]:
As an overview, embodying systems and methods provide an AI (Artificial Intelligence) anomaly pattern recognition model that leverages a diagnostic expert domain knowledge base and deep learning technique to automatically detect an industrial asset (e.g., wind turbine) operational anomaly and identify root cause(s) corresponding to the detected anomaly. In some embodiments, a large set of training cases can be established based on historical diagnostic records that include multiple root causes. For each training case, several pairs of time series of sensor measurements may be configured and represented as scatter plots, where a combination of data patterns in or derived from the scatter plots indicates a specific root cause of an anomaly reflected in the sensor measurements (i.e., data).
(BRI: a time series of sensor measurements configured and represented as a historical record is a “history window”)
In [0028]:
FIG. 1 is a schematic block diagram of an example system 100 that may be associated with some embodiments herein. The system includes an industrial asset 105 that may generally operate normally for substantial periods of time but occasionally experience an anomaly that results in a malfunction or other abnormal operation of the asset
(BRI: malfunction or abnormal operation is a “failure”)
In [0028]:
a set of sensors 110 S1 through SN may monitor one or more characteristics of the asset 105 (e.g., acceleration, vibration, noise, speed, energy consumed, output power, etc.). The information from the sensors may, according to some embodiments described herein, be collected and used to facilitate detection and/or prediction of abnormal operation (i.e., an anomaly) of operating asset 105 and the root cause corresponding to the detected anomaly.
In [0045]:
As such, each scatter plot captures a specific pair of time series data derived from the sensor measurements for a wind turbine (or other asset). In FIG. 5A, the high tower acceleration measurements are due to wind turbine blade misalignment and in FIG. 5B the high tower acceleration measurements captured in the scatter plot are due to an incorrect setting of a specific control parameter for the wind turbine.
(BRI: collecting asset characteristics to predict abnormal operation provides the data needed for setting failure intervals; the claimed interval settings are therefore captured by this disclosure)
In [0026]:
a large set of training cases can be established based on historical diagnostic records that include multiple root causes.
(BRI: A diagnostic record detailing an anomaly that stems from multiple underlying issues can be considered a type of failure log)
In [0037]:
The training data establishment component 320 or functionality of deep learning model system 310 may operate to establish a set of training cases based on the historical diagnostic records of the wind turbine operational data 305 that includes multiple root causes embedded within the data. The set of training cases may be used in training the deep learning model generated by component 325.
In [0038]:
deep learning model building and validation component 325 may operate to develop (i.e., generate) a deep learning classification model that builds connections (e.g., transfer functions, algorithms, etc.) between the scatter plots based on the operational data and root causes for anomalies in the operational data by processing an input of high-dimensional images including data pixels corresponding to the scatter plots to generate an output including root cause labels associated with one or more anomalies derived from data patterns in the images.
In [0064]:
The machine learning engine processes the combination of images to recognize patterns therein that correspond to one of a plurality of defined anomalies
(BRI: a machine learning engine that uses a combination of images to recognize patterns corresponding to a plurality of defined anomalies can be, and often is, an ensemble classifier)
In [0053]:
In some embodiments, at least a portion of the received historical time series sensor data may be transformed to a format, configuration, level, resolution, etc. from its raw configuration as obtained by the wind turbine (or other asset) sensors
In [0055]:
At operation 615, a root cause label is assigned to each visual image including the scatter plots representing an operational anomaly based on a reference
In [0055]:
In some aspects, a standardized ground truth label is assigned to each generated image. In some regards, abnormal sensor measurements (i.e., anomalies) may be caused by different root causes. In particular, each root cause requires a specific type of maintenance and repair practice. As such, identification of the correct root cause can provide actionable insights with respect to on-going operations, preventative maintenance, and corrective maintenance aspects of a wind turbine (and/or other assets).
In [0057]:
Continuing to operation 620, a deep learning model and more particularly a convolutional neural network (CNN) model is trained using a first subset of the labeled images and tested based on a second subset of the labeled images applied to the trained model to evaluate the performance of the trained model
- providing the failure labeling model's output data to a failure classification model or pipeline that is generated automatically to learn failure signal behavior and also providing the failure labeling model's output to an anomaly detection model or pipeline to detect an abnormal behavior in real time;
In [0011]:
FIG. 8 is an illustrative example representation of data associated with labeling images in accordance with some embodiments;
In [0038]:
The deep learning model building and validation component 325 or functionality of deep learning model system 310 may operate to convert or transform the scatter plots (or other representations of wind turbine operational data 305) into visual representation images of the scatter plots (or other representations of the operational data). For example, deep learning model building and validation component 325 may operate to develop (i.e., generate) a deep learning classification model that builds connections (e.g., transfer functions, algorithms, etc.) between the scatter plots based on the operational data and root causes for anomalies in the operational data by processing an input of high-dimensional images including data pixels corresponding to the scatter plots to generate an output including root cause labels associated with one or more anomalies derived from data patterns in the images. The deep learning model herein is a deep learning classification model developed to build a connection between scatter plots including data representations of wind turbine anomalies and the corresponding root causes thereof. In some aspects, a convolutional neural network (CNN) model is developed to capture and process pixel data to recognize the complex data patterns in images of the scatter plots and to further classify anomaly cases in the training set as being associated with a particular root cause for the determined anomaly classification.
In [0047]:
In some aspects, there might generally be a large variation in wind turbine operation data due to a plurality or combination of sensor, turbine control, and environment factors. The combination and complexity of factors presents a challenge to accurately distinguishing between normal wind turbine operation and abnormal wind turbine operation
In [0028]:
the information from the sensors may, according to some embodiments described herein, be collected and used to facilitate detection and/or prediction of abnormal operation (i.e., an anomaly) of operating asset 105 and the root cause corresponding to the detected anomaly.
(BRI: abnormal prediction is a “failure prediction”)
ZHANG does not explicitly disclose:
- and applying an ensemble classifier to the outputs of the data failure classification model and the anomaly detection model to predict a machine failure.
However, Ted discloses:
- and applying an ensemble classifier to the outputs of the data failure classification model and the anomaly detection model to predict a machine failure.
In [0058]:
In example embodiments one or more classifiers may be trained using tire characteristic data from one or more sensors. If multiple classifiers are trained, they may be trained to provide an indication of the degree to which tire delamination has taken place and “live” signals from an active vehicle may be compared against the one or more trained classifiers to determine the probability of failure (for example, delamination) within a given period (the “period” may be expressed as time, or distance, for example). The probability may take into account various driving conditions, such as velocity, load, or road surface quality, for example, in addition to sensor data such as pressure or temperature data, for example,
In [0005]:
with principles of inventive concepts a vehicle monitoring system may monitor a vehicle characteristic and, from the monitoring, may determine the state of a vehicle component. The system may set an alert and may communicate that alert to a user or supervisory authority. The state of the vehicle component may relate to a vehicle tire and to the potential delamination of a tire.
In [0067]:
A system and method may employ machine learning to recognize a tire fault and to determine the severity of the fault. Machine learning may be used constantly or may be engaged after an initial indication of a fault (for example, a periodic signal anomaly) is detected.
In [0056]:
principles of inventive concepts may assess the possibility of the onset and/or propagation of a delamination by detecting and analyzing the variation of movement and other sensed characteristics of a tire. These sensed characteristics may be used to determine the degree of failure (for example, delamination) and the time of failure migration. In example embodiments data from triaxial accelerometers, (and/or, possibly, other sensors which may disclose the time/acceleration signature associated with an angle of delamination, for example) may be used to develop a learning process (to train a classifier, for example) to refine the process of recognizing the onset of tire failures.
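For illustration only, the claimed step of applying an ensemble classifier to the outputs of a failure classification model and an anomaly detection model can be sketched as a weighted soft vote. The function name, the weights, and the threshold below are hypothetical and are not taken from ZHANG, Ted, or the present application; the sketch only shows one common way that two model outputs may be combined into a single failure prediction.

```python
def ensemble_failure_prediction(p_failure, anomaly_score,
                                w_classifier=0.6, w_anomaly=0.4,
                                threshold=0.5):
    """Combine two model outputs (each in [0, 1]) by weighted soft voting.

    All default values are illustrative assumptions.
    Returns the combined score and a boolean failure prediction.
    """
    for name, value in (("p_failure", p_failure),
                        ("anomaly_score", anomaly_score)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    total_weight = w_classifier + w_anomaly
    combined = (w_classifier * p_failure + w_anomaly * anomaly_score) / total_weight
    return combined, combined >= threshold


# Hypothetical outputs of the two upstream models for one machine.
score, will_fail = ensemble_failure_prediction(0.9, 0.8)
print(round(score, 2), will_fail)
```

A weighted vote is only one possible ensemble rule; stacking or majority voting would fit the same claim language under a broadest reasonable interpretation.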
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine ZHANG and Ted.
ZHANG teaches failure labeling, windows for historical failures and detection of abnormality in operation.
Ted teaches ensemble (multiple classifiers) for failure classification and anomaly detection.
One of ordinary skill would have been motivated to combine ZHANG and Ted in order to improve the operational life of the tire system and avoid the costly catastrophes that may be associated with it (Ted [0054]).
ZHANG and Ted do not explicitly disclose:
- A system, comprising: at least a machine to be maintained; a maintenance server coupled to the machine using an internet of things (IoT) protocol in real time as a stream of data, the maintenance server running computer code for:
However, Chow discloses:
- A system, comprising: at least a machine to be maintained; a maintenance server coupled to the machine using an internet of things (IoT) protocol in real time as a stream of data, the maintenance server running computer code for:
In [0038]:
Data is provided to the system by a plurality of internet of things (IoT) devices 130 and 135 that are connected to information handling system 100 by network 140. IoT devices 130 are coupled to the information handling system via edge network server 142, which can act as an intermediary in gathering data from the IoT devices and providing a desired subset of the data to the information handling system 100 via network port 110.
In [0041]:
data acquisition stage 210 is an initial stage of the process in which IoT devices coupled to a network (e.g., network 140) provide information about the state of those devices to servers (e.g., information handling system 100 or edge network server 142) that can store the information in one or more databases.
In [0006]:
A system, method, and computer-readable medium are disclosed for predicting failure of a hardware device, where the system, method, and computer-readable medium can incorporate a time-series dimension as an input
In [0032]:
early component failure detection coupled with preventative replacement and automatic monitoring facilitates total productive maintenance in real time.
In [0039]:
the implementation of the predictive maintenance system on information handling system 100 provides a useful and concrete result of accurate estimation of when an IoT device is about to fail.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine ZHANG, Ted and Chow.
ZHANG teaches failure labeling, windows for historical failures and detection of abnormality in operation.
Ted teaches ensemble (multiple classifiers) for failure classification and anomaly detection.
Chow teaches IoT.
One of ordinary skill would have been motivated to combine ZHANG, Ted and Chow in order to improve the accuracy of the failure prediction (Chow [0061]).
In regard to claim 17:
ZHANG discloses:
- comprising automatically identifying failure instances from a historical data stream by the failure labeling model
In [0028]:
FIG. 1 is a schematic block diagram of an example system 100 that may be associated with some embodiments herein. The system includes an industrial asset 105 that may generally operate normally for substantial periods of time but occasionally experience an anomaly that results in a malfunction or other abnormal operation of the asset.
In [0028]:
the information from the sensors may, according to some embodiments described herein, be collected and used to facilitate detection and/or prediction of abnormal operation (i.e., an anomaly) of operating asset 105 and the root cause corresponding to the detected anomaly.
In [0034]:
FIG. 3 is a schematic block diagram depicting an overall system 300, in accordance with some embodiments. System 300 illustrates wind turbine operational data 305 being provided as input(s) to a deep learning model development and implementation system, device, service, or apparatus (also referred to herein simply as a “system” or “service”) 310 that outputs, at least, data 330 indicative of wind turbine anomalies detected by deep learning model system 310 and the root cause(s) corresponding to the detected anomalies.
In [0036]:
In some scenarios, operational data 305 might include historical operational data associated with one or more wind turbines.
In [0039]:
output of deep learning model system 310 including an indication of the detected one or more anomalies derived from data patterns in the images and the corresponding root cause labels
In regard to claim 18:
ZHANG discloses:
- comprising using time series similarities to relabel a failure and normal signals and increasing the quality of training data for the failure classification model or pipeline.
In [0055]:
At operation 615, a root cause label is assigned to each visual image including the scatter plots representing an operational anomaly based on a reference to and leveraging of, at least in part, a digitized knowledge domain data structure or system associated with the industrial asset(s) in combination with the data patterns in each image. In some aspects, a standardized ground truth label is assigned to each generated image. In some regards, abnormal sensor measurements (i.e., anomalies) may be caused by different root causes. In particular, each root cause requires a specific type of maintenance and repair practice. As such, identification of the correct root cause can provide actionable insights with respect to on-going operations, preventative maintenance, and corrective maintenance aspects of a wind turbine (and/or other assets).
In [0064]:
the machine learning engine processes the combination of images to recognize patterns therein that correspond to one of a plurality of defined anomalies (e.g., 8 anomalies in the example of FIG. 12). The output 1215 of the machine learning engine includes an indication of the specific root cause (e.g., anomaly 2=blade calibration and anomaly 4=incorrect ramp rate) in response to the specific inputs 1210.
(BRI: Using time series similarities to relabel failure and normal signals is a process where unlabeled or ambiguously labeled data points are assigned a definitive label (either "failure" or "normal") based on how closely their patterns or shapes match known, pre-established examples of each class)
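The interpretation above can be illustrated with a minimal sketch in which an ambiguously labeled signal takes the label of its nearest pre-labeled example. Euclidean distance and a single nearest neighbor are assumptions chosen for brevity; neither ZHANG nor the claim specifies a particular similarity measure, and the function names are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length signals."""
    if len(a) != len(b):
        raise ValueError("signals must have equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def relabel_by_similarity(signal, references):
    """Return the label of the reference signal closest to `signal`.

    `references` is a list of (series, label) pairs used as ground truth.
    """
    if not references:
        raise ValueError("at least one labeled reference is required")
    _, label = min(references, key=lambda ref: euclidean(signal, ref[0]))
    return label


# Hypothetical ground-truth examples of each class.
refs = [([0, 0, 0, 0], "normal"), ([0, 5, 9, 5], "failure")]
print(relabel_by_similarity([0, 4, 8, 4], refs))
```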
ZHANG and Ted do not explicitly disclose:
- and increasing the quality of training data for the failure classification model or pipeline.
However, Chow discloses:
- and increasing the quality of training data for the failure classification model or pipeline.
In [0042]:
to allow for accurate and efficient results to be provided by the deep neural network, the data needs to be preprocessed to better enable the deep neural networks to converge rapidly to a solution that can accurately predict device failure.
In regard to claim 19:
ZHANG and Ted do not explicitly disclose:
- comprising real-time general streaming that allows businesses to link machines and assets.
However, Chow discloses:
- comprising real-time general streaming that allows businesses to link machines and assets.
In [0043]:
Once training and validation datasets are formed that include information relevant to continuous and categorical features, that information can be used to determine a failure prediction model for the hardware device type. Modeling stage 240 utilizes the sample sets to first train the double-stacked long-short term memory deep neural network, and then validate the trained solution to perform additional tuning. Once the solution has been satisfactorily tuned, the solution can be used to help enable failure prediction for devices not included in the sample sets. This information can be provided during deployment stage 250 to business units that can utilize the information in support of customers.
In [0044]:
FIG. 3 is a simplified flow diagram illustrating a set of steps involved in data processing stage 240, in accord with embodiments of the present invention. As discussed above, information collected from a set of devices falling in an IoT device type of interest
In regard to claim 20:
ZHANG and Ted do not explicitly disclose:
- comprising providing the output of the failure labeling model to generate quality labeled training data.
However, Chow discloses:
- comprising providing the output of the failure labeling model to generate quality labeled training data.
In [0042]:
to allow for accurate and efficient results to be provided by the deep neural network, the data needs to be preprocessed to better enable the deep neural networks to converge rapidly to a solution that can accurately predict device failure.
In [0063]:
The failure prediction system discussed above is designed such that it is generic and can be used for any IoT hardware components that are connected to provide telemetry data. While the above discussion has focused on an example of hard disk drives, embodiments are not limited to HDDs, but can be applied to any IoT device.
In regard to claim 21:
ZHANG and Ted do not explicitly disclose:
- augmenting failure data;
- balancing the failure data;
- extracting features from the data;
- if features are extracted, selecting a 2D deep learning model and otherwise selecting a 3D deep learning model;
- and performing failure prediction.
However, Chow discloses:
- augmenting failure data;
In [0008]:
generating the oversampled set of observations from the set of records associated with failed devices in the training dataset further includes synthetically creating repetitive samples using a moving time window. In still a further aspect, synthetically creating repetitive samples using a moving time window further includes generating and over sampled set of observations “d” from “a” actual observations such that for observation “n” in the set of observations, the observation is in a date range characterized by [d+2−n, d+a+1−n].
(BRI: synthetically creating samples associated with the training set of failed devices represents augmenting the failure data)
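One possible reading of the date range quoted from Chow [0008] is sketched below. It assumes that days are numbered from 1, that both bounds of [d+2−n, d+a+1−n] are inclusive, and that n runs from 1 to d; these assumptions and the function names are not stated in the reference. Under that reading, each of the d synthetic observations is a window of a consecutive days, shifted back by one day per observation.

```python
def moving_window_ranges(a, d):
    """Inclusive 1-indexed (start, end) day ranges for n = 1..d."""
    if a < 1 or d < 1:
        raise ValueError("a and d must be positive")
    return [(d + 2 - n, d + a + 1 - n) for n in range(1, d + 1)]


def oversample(series, a, d):
    """Cut d overlapping windows of length a from a failed device's record.

    `series[0]` is treated as day 1 (an assumption of this sketch).
    """
    if len(series) < a + d:
        raise ValueError("need at least a + d daily readings")
    return [series[start - 1:end] for start, end in moving_window_ranges(a, d)]


# Five hypothetical daily readings from one failed device.
print(moving_window_ranges(3, 2))
print(oversample([10, 20, 30, 40, 50], a=3, d=2))
```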
- balancing the failure data;
In [0098]:
One of the challenges of the feeder ranking application is that of imbalanced data/scarcity of data characterizing the failure class can cause problems with generalization. Specifically, primary distribution feeders are susceptible to different kinds of failures, and one can have very few training examples for each kind of event, making it difficult to reliably extract statistical regularities or determine the features that affect reliability.
In [0099]:
In one particular embodiment, the focus is on most serious failure type, where the entire feeder is automatically taken offline by emergency substation relays, due to some type of fault being detected by sensors. The presently disclosed system for generating data sets can address the challenge of learning with rare positive examples (feeder failures). An actual feeder failure incident is instantaneous: a snapshot of the system at that moment will have only one failure example. To better balance the data, one can employ the rare event prediction setup shown in FIG. 6, labeling any example that had experienced a failure over some time window as positive
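The rare event labeling described in the passage above can be sketched as follows. The sketch assumes a forward-looking window, so that a snapshot taken at time t is positive when a failure occurs in [t, t + window]; the quoted passage does not fix the direction or length of the window, and the function name is hypothetical.

```python
def label_rare_events(snapshot_times, failure_times, window):
    """Label each snapshot 1 if a failure falls within [t, t + window], else 0."""
    if window < 0:
        raise ValueError("window must be non-negative")
    labels = []
    for t in snapshot_times:
        positive = any(t <= f <= t + window for f in failure_times)
        labels.append(1 if positive else 0)
    return labels


# Hypothetical snapshot times and a single failure at time 12.
print(label_rare_events([0, 5, 10, 15], [12], window=3))
```

Widening the window turns more snapshots into positive examples, which is how this setup reduces the imbalance between failure and normal classes.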
- extracting features from the data;
In [0042]:
Data processing steps can include data transformation, such as filtering, ordering, normalization, oversampling, and selecting sample sets. Feature engineering techniques can include defining continuous and categorical features, normalization of continuous features, determining those features of greatest impact to device failure, and the like.
In [0053]:
Continuous feature data is normalized (815). In one embodiment, the data is normalized using a min-max normalization, such that (a) each feature contributes approximately proportionately while predicting the target feature; and (b) gradient descent converges faster with features scaling than without features scaling. Min−max normalization is a normalization strategy that linearly transforms x to y=(x−min)/(max−min), wherein min and max are minimum and maximum values in X, where X is a set of observed values of x.
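The min-max transformation y=(x−min)/(max−min) quoted above can be written out as a short sketch. Mapping a constant feature to 0.0 is an assumption made here to avoid division by zero; the reference does not address that case.

```python
def min_max_normalize(values):
    """Linearly rescale values to [0, 1] using y = (x - min) / (max - min)."""
    if not values:
        raise ValueError("values must be non-empty")
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant feature carries no information, so map it to 0.0.
        return [0.0 for _ in values]
    return [(x - lo) / (hi - lo) for x in values]


print(min_max_normalize([2, 4, 6]))
```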
- if features are extracted, selecting a 2D deep learning model and otherwise selecting a 3D deep learning model;
In [0054]:
After processing the categorical and continuous features, the train, validation, and hold-out datasets are separated out using each dataset identifier
(BRI: a DNN-based failure prediction system that incorporates a time-series dimension can utilize a 2D deep learning model, specifically by transforming the time-series data into a 2D format)
In [0031]:
A system, method, and computer-readable medium are disclosed for a hardware component failure prediction system that can incorporate a time-series dimension as an input while also addressing issues related to a class imbalance problem associated with failure data. Embodiments provide this capability through the use of a deep learning-based artificial intelligence binary classification method. Embodiments utilize a double-stacked long short-term memory (DS-LSTM) deep neural network with a first layer of the LSTM passing hidden cell states learned from a sequence of multi-dimensional parameter time steps to a second layer of the LSTM that is configured to capture a next sequential prediction output. Output from the second layer of the LSTM is concatenated with a set of categorical variables to an input layer of a fully-connected dense neural network layer. Information generated by the dense neural network provides prediction of whether a hardware component will fail in a given future time interval. In addition, in some embodiments, a lagged feedback component from the output is added back to the input layer of the DNN and concatenated to the set of categorical parameters and next sequential higher-dimension parameter set. This enables the system to self-learn and increases robustness.
In [0053]:
Continuous feature data is normalized (815). In one embodiment, the data is normalized using a min-max normalization, such that (a) each feature contributes approximately proportionately while predicting the target feature; and (b) gradient descent converges faster with features scaling than without features scaling. Min−max normalization is a normalization strategy that linearly transforms x to y=(x−min)/(max−min), wherein min and max are minimum and maximum values in X, where X is a set of observed values of x.
- and performing failure prediction
In [0031]:
A system, method, and computer-readable medium are disclosed for a hardware component failure prediction system that can incorporate a time-series dimension as an input while also addressing issues related to a class imbalance problem associated with failure data. Embodiments provide this capability through the use of a deep learning-based artificial intelligence binary classification method. [...] This enables the system to self-learn and increases robustness.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine ZHANG, Ted and Chow.
ZHANG teaches failure labeling, windows for historical failures and detection of abnormality in operation.
Ted teaches ensemble (multiple classifiers) for failure classification and anomaly detection.
Chow teaches IoT maintenance.
One of ordinary skill would have been motivated to combine ZHANG, Ted and Chow in order to improve prediction accuracy (Chow [0061]).
Claims 6-8 are rejected under 35 U.S.C. 103 as being unpatentable over
Zhanpan ZHANG et al. (hereinafter ZHANG) US 2022/0292666 A1,
in view of Marcello Tedesco et al. (hereinafter Ted) US 2021/0125428 A1,
in view of Arnab Chowdhury et al. (hereinafter Chow) US 2020/0380336 A1,
further in view of Surendra Reddy et al. (hereinafter Reddy) US 2019/0259033 A1,
further in view of Honggang Wang et al. (hereinafter Wang) US 2020/0293032 A1,
further in view of Roger Anderson et al. (hereinafter Anderson) US 2013/0232094 A1.
In regard to claim 6:
ZHANG discloses:
- and performing failure labeling refurbishment.
In [0037]:
In some embodiments, the diagnostic data records 305 and the corresponding scatter plots may be reviewed by domain experts and/or automated processing systems that can, for example, reference digitized or other machine readable data structures and systems, devices, and services that embody a domain expert knowledge base to ensure correct labeling of training cases.
(BRI: a service to ensure the correct labeling of training cases can be considered a form of failure label refurbishment which may also be known as label correction or label cleaning)
ZHANG, Ted and Chow do not explicitly disclose:
- wherein the failure labeling model comprises labeling failure log intervals;
- for each failure, saving data from a predetermined time frame;
- deleting intersecting signals between failures and normal labels;
However, Reddy discloses:
- wherein the failure labeling model comprises labeling failure log intervals;
In [0047]:
a cloud based semantic data store is part of a database management system
(BRI: a log is a fundamental component of a database management system (DBMS))
In [0051]:
enterprise data genome system 100 can include pulse contextual data related to business transactions i.e. identify and gather relevant data gathering and local learning component 200 and plurality of data sources 210,
In [0053]:
In FIG. 1, data sources 211, 212, 213, and 214 are computer accessible components that provide and/or stores data from banks or other institutions involved in transactions or that are sources of data that may be relevant in the classification of transactions as suspicious or representing illegal or otherwise bad behavior.
In [0053]:
data sources 210 are currently used by many banking businesses to run their business operations effectively.
In [0061]:
FIG. 5 is a flowchart of an autonomous method for data source selection, extraction, processing, classification, enrichment, and labeling of entities
- for each failure, saving data from a predetermined time frame;
In [0070]:
in one embodiment, the determination automatically identifies in a provided financial activity streams (FACTs), without requiring an input of a priori models for normal or abnormal behavior. Thus, complex aspects of suspicious activity patterns identified within the data set are converted into threat vector
(BRI: “saving data from a predetermined time frame” refers to the practice of systematically recording and preserving a specific window of data leading up to a system or component failure, as is normally represented in a failure analysis)
In [0093]:
(f) the scenario “Surge in Inflow and Outflow of Funds Through Account” indicates those accounts which are potentially being used for fraudulent activity.
In [0093]:
Primarily, such accounts have bursts of activity within a predetermined time period (e.g., a short time period) and then remain quiet for some time
In [0070]:
complex aspects of suspicious activity patterns identified within the data set are converted into threat vectors,
In [0076]:
encoder 1101 (a) stores the computed threat vectors;
- deleting intersecting signals between failures and normal labels;
In [0014]:
FIG. 5 is a flowchart of an autonomous method for data source selection, extraction, processing, classification, enrichment, and labeling of entities, relationships, rules, associations, attributes, and scores according to an implementation consistent with the embodiments of the invention;
In [0085]:
FIG. 14 is a data flow diagram of one embodiment of a process for predicting the next steps in a transaction. Referring to FIG. 14, past behavior (e.g., fingerprints) 1401 are input to a processor 1403 for behavior-based reasoning to identify the behavior associated with an individual as being suspicious or indicative of bad behavior.
In [0085]:
In one embodiment, processor 1404 uses a temporal statistical model to determine such patterns based on the customer and their accounts such that the patterns are determined at the customer level and the account level. By identifying such patterns and determining that they overlap with patterns associated with suspicious activity or bad behavior, a determination can be made that the activities of the individual should be brought to the attention of a case analyst.
In [0117]:
if the time-based behavior correlates to financially-specific patterns of suspicious behavior being monitored by determining an extent of overlap between a sequence of events related to the new financially-related transaction and one or more of the financially-specific patterns of suspicious behavior being monitored (processing block 1504). In on embodiment, determining if the time-based behavior correlates to financially-specific patterns of suspicious behavior being monitored by overlapping feature sets.
In [0077]:
After the transformation of raw data into feature vectors, the next operation is to understand the distribution of the feature vectors. This is an important step to check for any biases in the data. In one embodiment, certain validation checks are: (a) If the data favors any particular groups or certain individuals; (b) if there is an imbalance in the dataset with respect to gender or race or an occupation, the chances of the model learning these biases are high; (c) if the sample selected to train the models, does not represent the entire population of the dataset, sample biases could be introduced. If the model has been trained on data where women have not laundered any money then it is likely that the model can draw wrong inferences creating stereotypes or prejudices; and (d) If there are strong correlations amongst variables in the dataset provided by the bank to train the models, this could affect the results too. In one embodiment, auto encoder 1111 removes any such data from the sampled signals using clustering technique.
(BRI: removing overlapping features (signals) between 'failure' and 'normal' classes can help balance data which is a form of feature engineering)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine ZHANG, Ted, Chow and Reddy.
ZHANG teaches failure labeling, windows for historical failures and detection of abnormality in operation.
Ted teaches ensemble (multiple classifiers) for failure classification and anomaly detection.
Chow teaches IoT maintenance.
Reddy teaches similarity modeling for normal and abnormal activity.
One of ordinary skill in the art would have been motivated to combine ZHANG, Ted, Chow and Reddy because the combination may reduce the time and resources for banking transactions (Reddy [0028]).
In regard to claim 7:
ZHANG, Ted and Chow do not explicitly disclose:
- wherein the performing the failure labeling refurbishment is based on time series similarity which automatically distinguishes between normal and failure signals by using failure labeled cases as ground truth and labeling similar signals as failures.
However, Reddy discloses:
- wherein the performing the failure labeling refurbishment is based on time series similarity which automatically distinguishes between normal and failure signals by using failure labeled cases as ground truth and labeling similar signals as failures.
In [0081]:
Referring back to FIG. 11B, in one embodiment, output matrix 1112 is a consolidated feature set. In one embodiment, a large number of observable quantities from the multi-dimensional input data are organized as signals. In some embodiments, each signal comprises a plurality of threat vectors measured simultaneously in a time unit. The collection of signals is organized as a financial genome in which various threat vectors are linked by their similarity. The similarity is a measure imposed by the user. A threat diffusion similarity measure imposes a similarity relationship between any two data points by computing all combinations among pairs of data points. In one embodiment, these threat vectors 1112 are clustered using similarity measures that characterize different behavioral patterns, such that all the normal activities are inside “safe” clusters and all anomalies are outside the safe clusters.
In [0048]:
particular instances of a data genome can serve as a model for a banking industry and serve as a reference to represent one or more relationships, interactions, and transactions among and between such entities and individuals
(BRI: a reference is a ground truth)
In [0081]:
In these financial genomes, the user can redefine relevance via a similarity measure, and in this way filter away unrelated information. In one embodiment, self-organization of threat vectors is achieved through local similarity modeling.
In regard to claim 8:
ZHANG, Ted and Chow do not explicitly disclose:
wherein the failure labeling model marks all instances from [xn-pw : xn] as failures before each failure, then the failure labeling model performs label refurbishment by:
- a. selecting XF = [x1, x2,..., xf], XF are failure cases from all the failure-labeled predictive windows;
- b. selecting XNO = [x1, x2,..., xno], where XN are Normal instances with no failure in the time span -pw, +pw for each x as normal instances unrelated to failures;
- c. deleting any instances that contain sequences intersecting from the two sets XF and XNO; and
- d. labeling remaining examples from training data that do not belong to XF and XNO as failure or normal using a DTW (Dynamic temporal warping) similarity measurement.
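For illustration only, and not as a characterization of the claim scope or of any cited reference, steps a through c above can be sketched as follows. The sketch assumes integer timestamps 0..n-1, that each instance is the history window ending at a timestamp, and that two instances "intersect" when their history windows share a timestamp. The function name `select_label_sets` and the parameters `pw` (prediction window) and `hw` (history window) are hypothetical. Step d (similarity-based labeling of the remaining instances) is not covered here.

```python
def select_label_sets(n, failures, pw, hw):
    """Sketch of steps a-c for a series with timestamps 0..n-1.

    failures: timestamps at which a failure was logged
    pw: prediction-window length; hw: history-window length (both assumed)
    """
    def span(i):
        # Timestamps covered by the history window that ends at instance i.
        return set(range(i - hw + 1, i + 1))

    instances = range(hw - 1, n)  # only instances with a full history window

    # Step a: failure cases lie inside a prediction window [f - pw, f].
    xf = [i for i in instances if any(f - pw <= i <= f for f in failures)]
    # Step b: normal cases have no failure anywhere within [i - pw, i + pw].
    xno = [i for i in instances if all(abs(f - i) > pw for f in failures)]

    # Step c: drop instances whose sequences share timestamps across the sets.
    covered_f = set().union(*(span(i) for i in xf))
    covered_no = set().union(*(span(i) for i in xno))
    shared = covered_f & covered_no
    xf_clean = [i for i in xf if not (span(i) & shared)]
    xno_clean = [i for i in xno if not (span(i) & shared)]
    return xf_clean, xno_clean


xf, xno = select_label_sets(n=20, failures=[10], pw=3, hw=2)
print(xf)
print(xno)
```

In this toy run the instances ending at timestamps 6 and 7 are removed because their history windows both cover timestamp 6; timestamps 11 through 13 fall in neither set and would be left for step d.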
However, Reddy discloses:
- c. deleting any instances that contain sequences intersecting from the two sets XF and XNO;
In [0014]:
FIG. 5 is a flowchart of an autonomous method for data source selection, extraction, processing, classification, enrichment, and labeling of entities, relationships, rules, associations, attributes, and scores according to an implementation consistent with the embodiments of the invention;
In [0085]:
FIG. 14 is a data flow diagram of one embodiment of a process for predicting the next steps in a transaction. Referring to FIG. 14, past behavior (e.g., fingerprints) 1401 are input to a processor 1403 for behavior-based reasoning to identify the behavior associated with an individual as being suspicious or indicative of bad behavior.
In [0085]:
In one embodiment, processor 1404 uses a temporal statistical model to determine such patterns based on the customer and their accounts such that the patterns are determined at the customer level and the account level. By identifying such patterns and determining that they overlap with patterns associated with suspicious activity or bad behavior, a determination can be made that the activities of the individual should be brought to the attention of a case analyst.
In [0117]:
if the time-based behavior correlates to financially-specific patterns of suspicious behavior being monitored by determining an extent of overlap between a sequence of events related to the new financially-related transaction and one or more of the financially-specific patterns of suspicious behavior being monitored (processing block 1504). In on embodiment, determining if the time-based behavior correlates to financially-specific patterns of suspicious behavior being monitored by overlapping feature sets.
In [0077]:
After the transformation of raw data into feature vectors, the next operation is to understand the distribution of the feature vectors. This is an important step to check for any biases in the data. In one embodiment, certain validation checks are: (a) If the data favors any particular groups or certain individuals; (b) if there is an imbalance in the dataset with respect to gender or race or an occupation, the chances of the model learning these biases are high; (c) if the sample selected to train the models, does not represent the entire population of the dataset, sample biases could be introduced. If the model has been trained on data where women have not laundered any money then it is likely that the model can draw wrong inferences creating stereotypes or prejudices; and (d) If there are strong correlations amongst variables in the dataset provided by the bank to train the models, this could affect the results too. In one embodiment, auto encoder 1111 removes any such data from the sampled signals using clustering technique.
(BRI: removing overlapping features (signals) between 'failure' and 'normal' classes can help balance data which is a form of feature engineering)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine ZHANG, Ted, Chow and Reddy.
ZHANG teaches failure labeling, windows for historical failures and detection of abnormality in operation.
Ted teaches ensemble (multiple classifiers) for failure classification and anomaly detection.
Chow teaches IoT maintenance.
Reddy teaches similarity modeling for normal and abnormal activity.
One of ordinary skill in the art would have been motivated to combine ZHANG, Ted, Chow and Reddy because the combination may reduce the time and resources for banking transactions (Reddy [0028]).
ZHANG, Ted, Chow and Reddy do not explicitly disclose:
- a. selecting XF = [x1, x2,..., xf], XF are failure cases from all the failure-labeled predictive windows;
- b. selecting XNO = [x1, x2,..., xno], where XN are Normal instances with no failure in the time span -pw, +pw for each x as normal instances unrelated to failures;
- d. labeling remaining examples from training data that do not belong to XF and XNO as failure or normal using a DTW (Dynamic temporal warping) similarity measurement.
However, Wang discloses:
wherein the failure labeling model marks all instances from [xn-pw : xn] as failures before each failure, then the failure labeling model performs label refurbishment by:
In [0007]:
a method may include monitoring a power substation asset. During an offline analysis mode, training data may be acquired and processing, and one or more classifiers may be generated for an online anomaly detection and localization mode
In [0033]:
collection of ensembles may be combined with data augmentation for enhanced classification accuracy under a small sample size and unbalanced data challenge.
- a. selecting XF = [x1, x2,..., xf], XF are failure cases from all the failure-labeled predictive windows;
In [0081]:
In one particular embodiment, a Nearest Neighbor classifier may be implemented using dynamic time warping as a similarity metric, for example. In a Nearest Neighbor classifier embodiment, each hidden layer node may store a time sequence which may comprise a representative sequence from a cluster. The cluster may represent a specific system state such as normal, transformer pre-failure, potential transformer (PT) pre-failure, voltage transformer (VT) pre-failure, arrestor pre-failure, circuit breaker mis-operation, loose connection, or instrument drifting, to name just a few examples among many.
(BRI: the sequence of instances is provided within the context of abnormal states (e.g., circuit breaker mis-operation))
In [0035]:
Once a number of unclassified instances reach to a certain threshold number, such as 20, for example, the system may trigger a low-level alarm once to allow for an operator to analyze stored time series snapshots and confirm a particular data label. Subsequent labeled data may be sent to a classifier database for model training use.
- b. selecting XNO = [x1, x2,..., xno], where XN are Normal instances
In [0081]:
In one particular embodiment, a Nearest Neighbor classifier may be implemented using dynamic time warping as a similarity metric, for example. In a Nearest Neighbor classifier embodiment, each hidden layer node may store a time sequence which may comprise a representative sequence from a cluster. The cluster may represent a specific system state such as normal, transformer pre-failure, potential transformer (PT) pre-failure, voltage transformer (VT) pre-failure, arrestor pre-failure, circuit breaker mis-operation, loose connection, or instrument drifting, to name just a few examples among many.
(BRI: the sequence of instances is provided within the context of normal states (normal))
In [0035]:
Once a number of unclassified instances reach to a certain threshold number, such as 20, for example, the system may trigger a low-level alarm once to allow for an operator to analyze stored time series snapshots and confirm a particular data label. Subsequent labeled data may be sent to a classifier database for model training use.
- d. labeling remaining examples from training data that do not belong to XF and XNO as failure or normal using a DTW (Dynamic temporal warping) similarity measurement.
In [0036]:
Another way to update a model is actively search for PMU related asset condition data from publicly available resources, such as from industry literature, event logs, and/or outage reports, etc. Once new available data reaches a certain value, a similarity between a new instances and an existing training instance may be conducted. If the highest similarity index value goes below a predefined threshold, then this new instance may be added to the training instance and a new model can be initiated.
(BRI: the new instance is added as "failure" or "normal" based on similarity, such that a similarity above the threshold indicates "normal" and a similarity below the threshold indicates "failure")
In [0081]:
In one particular embodiment, a Nearest Neighbor classifier may be implemented using dynamic time warping as a similarity metric,
In [0082]:
[Relation 1: equation image (media_image2.png) not reproduced in this text]
In [0083]:
In Relation 1, W_k may comprise a distance which corresponds to the kth element of warping path W.
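For illustration only, the nearest-neighbor classification with dynamic time warping as a similarity metric described in the quoted Wang passages can be sketched as follows. This is the textbook dynamic-programming form of DTW with an absolute-difference local cost, and it is an assumption that this corresponds to Wang's Relation 1, which is not reproduced in this text. The names `dtw_distance` and `label_by_nearest` and the sample sequences are hypothetical.

```python
def dtw_distance(a, b):
    """Textbook DTW: minimum summed |a_i - b_j| over all warping paths."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # repeat b[j-1]
                                     cost[i][j - 1],      # repeat a[i-1]
                                     cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def label_by_nearest(seq, failure_refs, normal_refs):
    """1-nearest-neighbor label using DTW distance to reference sequences."""
    d_fail = min(dtw_distance(seq, r) for r in failure_refs)
    d_norm = min(dtw_distance(seq, r) for r in normal_refs)
    return "failure" if d_fail < d_norm else "normal"


# Hypothetical reference sequences standing in for labeled cases.
failure_refs = [[0, 5, 9, 9]]
normal_refs = [[0, 0, 1, 0]]
print(label_by_nearest([0, 1, 5, 9], failure_refs, normal_refs))
```

Because DTW allows elements to be repeated along the warping path, two sequences with the same shape but different timing (for example [1, 2, 3] and [1, 2, 2, 3]) have a distance of zero.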
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine ZHANG, Ted, Chow, Reddy and Wang.
ZHANG teaches failure labeling, windows for historical failures and detection of abnormality in operation.
Ted teaches ensemble (multiple classifiers) for failure classification and anomaly detection.
Chow teaches IoT maintenance.
Reddy teaches similarity modeling for normal and abnormal activity.
Wang discloses similarity measurement.
One of ordinary skill in the art would have been motivated to combine ZHANG, Ted, Chow, Reddy and Wang because the combination can provide improved classification performance (Wang [0094]).
ZHANG, Ted, Chow, Reddy and Wang do not explicitly disclose:
- with no failure in the time span -pw, +pw for each x as normal instances unrelated to failures;
However, Anderson discloses:
- with no failure in the time span -pw, +pw for each x as normal instances unrelated to failures;
In [0140]:
the challenges in mining historical power grid data of high complexity in an unprecedented fashion. The present disclosure contrasts entirely with a subset of work in power engineering where data is generated using Monte Carlo simulations, and simulated failures are predicted using machine learning algorithms.
In [0038]:
FIG. 23 demonstrates that overtreatment in the High Potential Preventive Maintenance program was identified using statistical comparisons to performance of Control Groups and remediation in the form of Modified and A/C Hipot tests was instigated by the utility.
In [0082]:
In this case, the examples are electrical components, and the attribute one wants to predict is whether a failure will occur within a given time interval.
In [0063]:
In general the failure rate of a component or a composite system like a feeder will have a varying MTBF over its lifetime. Something that is new or has just had maintenance may have early failures also known as "infant mortality." Then systems settle down into their mid-life with a lower failure rate and finally the failure rate increases at the end of their lifetimes. (See FIG. 4.)
In [0126]:
The improvement in Mean Time Between Failure was tracked for each network as preventive maintenance work has been done to improve performance since 2002.
In [0126]:
to determine if the end point of a linear regression in yearly MTBF per network in 2009 was significantly improved from the beginning point of the regression in 2002.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine ZHANG, Ted, Chow, Reddy, Wang and Anderson.
ZHANG teaches failure labeling, windows for historical failures and detection of abnormality in operation.
Ted teaches ensemble (multiple classifiers) for failure classification and anomaly detection.
Chow teaches IoT maintenance.
Reddy teaches similarity modeling for normal and abnormal activity.
Wang discloses similarity measurement.
One of ordinary skill in the art would have been motivated to combine ZHANG, Ted, Chow, Reddy, Wang and Anderson because the combination improves the reliability index by lowering the MTBF (Anderson [0127]).
Claims 11-12 are rejected under 35 U.S.C. 103 as being unpatentable over
Zhanpan ZHANG et al. (hereinafter ZHANG) US 2022/0292666 A1,
in view of Marcello Tedesco et.al. (hereinafter Ted) US 2021/0125428 A1,
in view of Arnab Chowdhury et.al. (hereinafter Chow) US 2020/0380336 A1,
further in view of Surendra Reddy et.al. (hereinafter Reddy) US 2019/0259033 A1.
In regard to claim 11:
ZHANG discloses:
- decrease number of features with a feature selection model;
In [0027]:
the model may be used for real-time anomaly prediction of operational assets. Some embodiments might include a feedback loop to, for example, track model accuracy, facilitate the continuous updating of the training data, model improvement, and combinations thereof.
In [0071]:
FIG. 19 illustrates model improvement based on the retraining of an existing model using a new image and updated ground truth data. In FIG. 19, image 1905 is an earlier image used to, for example, initially train a model and image 1910 is a new image that can be used to retrain the model to enhance a performance thereof.
(BRI: model improvement may include decrease of number of features for learning)
- normalizing the features;
In [0055]:
At operation 615, a root cause label is assigned to each visual image including the scatter plots
In [0054]:
drawing a scatter plot for each pair of time series sensor measurements; using, for each scatter plot, a binary scale for each pixel value, or using a continuous scale that incorporates additional information (e.g., data density and/or other normalized sensor measurements) in the scatter plot; adjusting the vertical and horizontal axis scale across scatter plots in the image to, for example, present/magnify certain image features; and adding, to each scatter plot, a comparative scatterplot as a reference/baseline plot, thereby generating a multi-layer image.
ZHANG, Ted and Chow do not explicitly disclose:
- comprising: generating two-dimensional (2D) time series data with timestamps and features;
- and generating three-dimensional (3D) time series data with timestamps, history window, and features.
However, Reddy discloses:
- comprising: generating two-dimensional (2D) time series data with timestamps and features;
In [0119]:
processing logic generates, via the encoder, a matrix having a consolidated set of one or more features that is converted into an explanation of features of the new financially-related transaction that fit at least one pattern of financially-specific patterns of suspicious behavior being monitored (processing block 1506).
In [0081]:
Referring back to FIG. 11B, in one embodiment, output matrix 1112 is a consolidated feature set. In one embodiment, a large number of observable quantities from the multi-dimensional input data are organized as signals. In some embodiments, each signal comprises a plurality of threat vectors measured simultaneously in a time unit.
(BRI: generating a matrix of features from multidimensional input data with a time unit can result in a form of 2D time series data)
In [0085]:
FIG. 14 is a data flow diagram of one embodiment of a process for predicting the next steps in a transaction. Referring to FIG. 14, past behavior (e.g., fingerprints) 1401 are input to a processor 1403 for behavior-based reasoning to identify the behavior associated with an individual as being suspicious or indicative of bad behavior. The steps and/or actions in a transaction 1402 are input into a processor 1404 for sequence-based reasoning to identify time-based behavior over a period of time to determine if there are correlations to patterns of financial activities that are being monitored.
- and generating three-dimensional (3D) time series data with timestamps, history window, and features.
In [0119]:
processing logic generates, via the encoder, a matrix having a consolidated set of one or more features that is converted into an explanation of features of the new financially-related transaction that fit at least one pattern of financially-specific patterns of suspicious behavior being monitored (processing block 1506).
In [0081]:
Referring back to FIG. 11B, in one embodiment, output matrix 1112 is a consolidated feature set. In one embodiment, a large number of observable quantities from the multi-dimensional input data are organized as signals. In some embodiments, each signal comprises a plurality of threat vectors measured simultaneously in a time unit.
(BRI: generating a matrix of features from multidimensional input data with a time unit can result in a form of 2D time series data)
In [0085]:
FIG. 14 is a data flow diagram of one embodiment of a process for predicting the next steps in a transaction. Referring to FIG. 14, past behavior (e.g., fingerprints) 1401 are input to a processor 1403 for behavior-based reasoning to identify the behavior associated with an individual as being suspicious or indicative of bad behavior. The steps and/or actions in a transaction 1402 are input into a processor 1404 for sequence-based reasoning to identify time-based behavior over a period of time to determine if there are correlations to patterns of financial activities that are being monitored.
(BRI: 3D within the context of a multi-dimensional data generation)
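For illustration only, and not as a characterization of Reddy or of the claim scope, the relationship between 2D time series data (timestamps by features) and 3D time series data (timestamps by history window by features) can be sketched with a sliding window. The function name `to_history_windows`, the window length, and the sample values are hypothetical.

```python
def to_history_windows(rows, history_window):
    """Turn 2D data (timestamps x features) into 3D data.

    rows: list of per-timestamp feature lists, ordered by time
    Returns a list of windows; each window holds the `history_window`
    most recent rows ending at one timestamp, so the result has shape
    (timestamps with full history) x history_window x features.
    """
    return [rows[i - history_window + 1 : i + 1]
            for i in range(history_window - 1, len(rows))]


# 2D input: four timestamps, two features each (hypothetical sensor values).
table = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]
cube = to_history_windows(table, history_window=3)
print(len(cube), len(cube[0]), len(cube[0][0]))
```

Each 3D entry carries the recent history for one timestamp, which is the form typically expected by sequence models; the first `history_window - 1` timestamps are dropped because they lack a full history.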
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine ZHANG, Ted, Chow and Reddy.
ZHANG teaches failure labeling, windows for historical failures and detection of abnormality in operation.
Ted teaches ensemble (multiple classifiers) for failure classification and anomaly detection.
Chow teaches IoT maintenance.
Reddy teaches similarity modeling for normal and abnormal activity.
One of ordinary skill in the art would have been motivated to combine ZHANG, Ted, Chow and Reddy because the combination may reduce the time and resources for banking transactions (Reddy [0028]).
In regard to claim 12:
ZHANG and Ted do not explicitly disclose:
- comprising providing the 3D time series data to the failure classification model and the anomaly detection model.
However, Chow discloses:
- comprising providing the 3D time series data to the failure classification model and the anomaly detection model.
In [0034]:
Embodiments of the present invention utilize a deep-learning based architecture for component failure prediction and address a variety of issues inherent in traditional systems.
In [0034]:
(4) ensuring that device observation sequences are weighted based on their importance in their ability to predict a next failure; (5) predicting component failure in any day in a certain window of a future time period; and, (6) providing self-learning for the prediction model.
In [0031]:
A system, method, and computer-readable medium are disclosed for a hardware component failure prediction system that can incorporate a time-series dimension as an input while also addressing issues related to a class imbalance problem associated with failure data. Embodiments provide this capability through the use of a deep learning-based artificial intelligence binary classification method.
In [0032]:
use of the present mechanism for determining hardware component failure prediction can have significant monetary and competitive advantages.
In [0032]:
early component failure detection coupled with preventative replacement and automatic monitoring facilitates total productive maintenance in real time.
(BRI: early failure detection is generally considered as a specific application of anomaly detection, where the "anomalies" are subtle deviations from normal equipment behavior that signal impending failure)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine ZHANG, Ted and Chow.
ZHANG teaches failure labeling, windows for historical failures and detection of abnormality in operation.
Ted teaches ensemble (multiple classifiers) for failure classification and anomaly detection.
Chow teaches IoT maintenance.
One of ordinary skill in the art would have been motivated to combine ZHANG, Ted and Chow because the combination can provide accuracy improvement (Chow [0061]).
Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over
Zhanpan ZHANG et al. (hereinafter ZHANG) US 2022/0292666 A1,
in view of Marcello Tedesco et.al. (hereinafter Ted) US 2021/0125428 A1,
in view of Arnab Chowdhury et.al. (hereinafter Chow) US 2020/0380336 A1,
further in view of Honggang Wang et.al. (hereinafter Wang) US 2020/0293032 A1.
In regard to claim 14:
ZHANG, Ted and Chow do not explicitly disclose:
- comprising generating anomaly detection model from failure labeling model output selecting normal instances training data to learn how to auto encode normal signals and performing abnormality detection to the sensor data stream.
However, Wang discloses:
- comprising generating anomaly detection model from failure labeling model output selecting normal instances training data to learn how to auto encode normal signals and performing abnormality detection to the sensor data stream.
In [0032]:
early component failure detection coupled with preventative replacement and automatic monitoring facilitates total productive maintenance in real time.
In [0052]:
During modeling, the dataset identifiers are used to bifurcate the dataset appropriately. As an initial step to better enable the model to utilize categorical features, categorical features containing two categories are label encoded and the remaining categorical features are one-hot encoded in the combined set (810).
(BRI: an early component failure detection is an anomaly detection)
In [0081]:
In one particular embodiment, a Nearest Neighbor classifier may be implemented using dynamic time warping as a similarity metric, for example. In a Nearest Neighbor classifier embodiment, each hidden layer node may store a time sequence which may comprise a representative sequence from a cluster. The cluster may represent a specific system state such as normal, transformer pre-failure, potential transformer (PT) pre-failure, voltage transformer (VT) pre-failure, arrestor pre-failure, circuit breaker mis-operation, loose connection, or instrument drifting, to name just a few examples among many.
(BRI: the sequence of instances is provided within the context of normal states (normal))
In [0007]:
during the online anomaly detection and localization mode, power system related data may be received from field devices, a state of a substation system and of the power substation asset component and an unclassified state of one or instances may be generated based on the one or more classifiers. An alert may be generated to indicate the state of the substation system and of the power substation asset.
In [0035]:
Once a number of unclassified instances reach to a certain threshold number, such as 20, for example, the system may trigger a low-level alarm once to allow for an operator to analyze stored time series snapshots and confirm a particular data label. Subsequent labeled data may be sent to a classifier database for model training use.
In [0103]:
The data generated or obtained by the measurement device 1120 may comprise coded data (e.g., encoded data) associated with the power grid system that may input (or be fed into) a traditional SCADA system
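For illustration only, the general idea of learning to auto-encode normal signals and flagging stream samples that reconstruct poorly can be sketched as follows. This is not taken from any cited reference. As a simplifying assumption, the sketch uses a linear, tied-weight autoencoder with a single latent unit, fitted by power iteration (equivalent to projecting onto the first principal component), rather than a neural network. The function names, the threshold rule (three times the largest training error), and the sample values are all hypothetical.

```python
import math


def fit_linear_autoencoder(normal, iters=100):
    """Fit mean and one unit-length weight vector w on normal instances only."""
    dim = len(normal[0])
    mean = [sum(r[k] for r in normal) / len(normal) for k in range(dim)]
    centered = [[r[k] - mean[k] for k in range(dim)] for r in normal]
    w = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iters):
        # Power iteration: w <- C w, with C the scatter matrix of normal data.
        proj = [sum(x[k] * w[k] for k in range(dim)) for x in centered]
        new_w = [sum(p * x[k] for p, x in zip(proj, centered))
                 for k in range(dim)]
        norm = math.sqrt(sum(v * v for v in new_w))
        if norm == 0.0:
            break
        w = [v / norm for v in new_w]
    return mean, w


def reconstruction_error(x, mean, w):
    c = [xi - m for xi, m in zip(x, mean)]
    code = sum(ci * wi for ci, wi in zip(c, w))  # encode to one number
    recon = [code * wi for wi in w]              # decode with tied weights
    return math.sqrt(sum((ci - ri) ** 2 for ci, ri in zip(c, recon)))


# Hypothetical normal instances (two correlated sensor features).
normal = [[1.0, 2.1], [2.0, 3.9], [3.0, 6.2], [4.0, 7.8], [5.0, 10.1]]
mean, w = fit_linear_autoencoder(normal)
threshold = 3 * max(reconstruction_error(x, mean, w) for x in normal)


def is_anomaly(x):
    return reconstruction_error(x, mean, w) > threshold


for sample in ([2.5, 5.0], [3.0, 0.0]):  # stand-in for a sensor data stream
    print(sample, is_anomaly(sample))
```

The model sees only normal data, so a sample that breaks the learned relationship between features reconstructs poorly and exceeds the threshold; a deeper nonlinear autoencoder would follow the same train-on-normal, threshold-on-error pattern.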
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine ZHANG, Ted, Chow, Reddy and Wang.
ZHANG teaches failure labeling, windows for historical failures and detection of abnormality in operation.
Ted teaches ensemble (multiple classifiers) for failure classification and anomaly detection.
Chow teaches IoT maintenance.
Reddy teaches similarity modeling for normal and abnormal activity.
Wang discloses similarity measurement.
One of ordinary skill in the art would have been motivated to combine ZHANG, Ted, Chow, Reddy and Wang because the combination can provide improved classification performance (Wang [0094]).
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to TIRUMALE KRISHNASWAMY RAMESH whose telephone number is (571)272-4605. The examiner can normally be reached by phone.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Li B Zhen can be reached on phone (571-272-3768). The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/TIRUMALE K RAMESH/Examiner, Art Unit 2121
/Li B. Zhen/Supervisory Patent Examiner, Art Unit 2121