DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This action is in response to the Application filed on 07/14/2022. Claims 1-20 are pending in the case. All claims are examined and rejected accordingly.
Information Disclosure Statement
3. As required by MPEP 609(c), the Applicants’ Information Disclosure Statements filed on 07/14/2022, 11/02/2022, 01/06/2023, 02/15/2023, 03/10/2023, and 11/28/2023 are acknowledged by the examiner, and the cited references have been considered in the examination of the claims now pending.
Claim Rejections - 35 USC § 101
4. 35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
5. Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed towards an abstract idea, without significantly more.
Step 1
According to the first part of the analysis, in the instant case, claim 1 is directed to a computer-implemented method, which is a process and falls within one of the four statutory categories (i.e., process, machine, manufacture, or composition of matter).
Regarding Claims 1, 9 and 20,
At step 2A, prong 1: Does the claim recite a judicial exception?
Claim 1 further recites the steps of:
… training data record includes a plurality of operational features of the communication network and one or more observed performance characteristics of the communication network … (This step relies on performing a mathematical relationship operation, which falls into the “Mathematical Concepts” grouping of abstract ideas.),
… the ML model is configured for computing mappings of given input feature-value pairs to output predicted performance characteristics, and wherein, for each input training data record, the mappings represent relationships and/or interactions between one or more combinations among the plurality of operational features and one or more predicted performance characteristics (This step relies on mathematical calculations and mathematical modeling, which fall into the “Mathematical Concepts” grouping of abstract ideas.),
for each input data record of a first subset of the set of training data records, computing a fair distribution of first respective quantitative contributions of each of the plurality of operational features to the one or more predicted performance characteristics of the trained ML model, wherein the first subset includes at least those training data records sufficient to represent a baseline of observed performance characteristics (This step relies on mathematical calculations and mathematical modeling, which fall into the “Mathematical Concepts” grouping of abstract ideas.),
for each input data record of a second subset of the set of training data records, computing a fair distribution of second respective quantitative contributions of each of the plurality of operational features to the one or more predicted performance characteristics of the trained ML model, … (This step relies on generating output from collected data using a neural network, which is a mathematical operation and falls into the “Mathematical Concepts” grouping of abstract ideas.),
comparing the first and second respective quantitative contributions to determine a respective degradation metric for associating each of the plurality of operational features of the second subset with the at least one problematic observed performance characteristic of the second subset (This step relies on SHAP value computation and comparison, which is a mathematical operation and falls into the “Mathematical Concepts” grouping of abstract ideas.),
The claim recites mathematical modeling (ML training), mathematical attribution (SHAP-type feature contribution), and mathematical comparison, which collectively fall under the Mathematical Concepts and Mental Processes (evaluation/comparison steps) groupings. Accordingly, the claims recite an abstract idea.
Step 2A prong 2: Does the claim recite additional elements? Do those additional elements, individually and in combination, integrate the judicial exception into a practical application?
Further, the claim does not recite any additional element that could integrate this abstract idea into a practical application, because the additional elements recited consist of:
“… a computer implemented method …” (claim 1), “a system, one or more processors and memory configured for storing instructions …” (claim 20) (generic computer components on which to implement the mathematical abstract idea; see MPEP 2106.05(f));
“… training data records …” (data collection);
“obtaining a set of computer-readable training data records that each characterize operation of a communication network” (data gathering, insignificant extra-solution activity).
The additional elements are recited at a high level of generality and do not amount to significantly more than the abstract idea (MPEP 2106.05(f)). The claim uses a computer to perform mathematical operations and does not improve the functioning of the computer or other technology. Accordingly, the claim does not integrate the abstract idea into a practical application.
Thus, the claim is directed towards the abstract idea.
Step 2B: Do the additional elements, considered individually and in combination, amount to significantly more than the judicial exception?
No. As shown above with respect to integration of the abstract idea into a practical application, the additional elements consist of:
“… a computer implemented method …” (claim 1), “a system, one or more processors and memory configured for storing instructions …” (claim 20) (generic computer components on which to implement the mathematical abstract idea; see MPEP 2106.05(f));
“… training data records …” (data collection);
“obtaining a set of computer-readable training data records that each characterize operation of a communication network” (data gathering, insignificant extra-solution activity).
The additional elements, alone and in combination, fail to integrate the abstract idea into a practical application or add “significantly more.” Thus, the claims are not patent eligible. Mere instructions to apply an exception using generic computer components cannot provide an inventive concept. Neither can insignificant extra-solution activity. All of these additional elements as generically claimed are thus considered well-understood, routine, and conventional. Therefore, these limitations, taken alone or in combination, do not integrate the abstract idea into a practical application or recite significantly more than the abstract idea.
Thus, these independent claims are not patent eligible.
The dependent claims respectively recite a judicial exception in the following limitations:
“for each respective operational feature of the second subset, computing a respective severity metric based on the second respective aggregation of SHAP values across the second subset of the respective operational feature; and for each respective operational feature of the second subset, scaling the respective severity metric by a fraction of the total number of data records in the second subset having feature-value pairs associated with the respective operational feature.” (claims 2/10);
“for each respective operational feature of the first subset, computing a respective first statistical distribution of respective first SHAP values across the first subset; for each respective operational feature of the second subset, computing a respective second statistical distribution of respective second SHAP values across the second subset; and for each respective operational feature in common in both the first and second subsets, comparing the respective second statistical distribution with the respective first statistical distribution.” (claims 3/11 and 4/12);
“determining respective clusters of operational features within records of the second subset; determining a respective frequency among the records of each respective cluster; identifying respective operational clusters as all respective clusters having respective frequencies above a threshold; for each respective operational cluster of the second subset, computing a respective severity metric based on the second respective aggregation of SHAP values across the second subset for operational features of the respective operational cluster; and for each respective operational feature of the second subset, scaling the respective severity metric by a fraction of the total number of training data records in the second subset having the feature-value pair combinations associated with the respective operational cluster.” (claims 5/15);
“for each respective operational event of the second subset, computing a respective severity metric for each respective operational feature based on the second respective aggregation of SHAP values across the second subset during the respective operational event; and for each respective operational feature of the second subset, scaling the respective severity metric by the total number of timepoints of the respective operational event.” (claims 6/16);
“identifying problematic case baselines according to the determined respective degradation metrics of specific operational features of the second subset as measured by their association with one or more observed performance characteristics; creating templates of operational features” (claims 7/17);
“computing a model prediction error in the second subset and using the prediction error to adjust an attributed importance of respective operational features; and qualifying an accuracy of representation based on computed model prediction error.” (claims 8/18);
“wherein the communication network is at least one of a telecommunications network, or a data communications network, wherein each training data record comprises a communication history record or system telemetry from one or more network layers of the communication network, the one or more network layers being at least one of: a 5G Core, a RAN, a User Plane, a Control Plan, a virtualization layer, or a physical infrastructure layer, of the communication network, and wherein the operations further include: monitoring one or more performance characteristics observed during runtime operations of the communication network; and localizing a fault to the operational features of one or more network layers.” (claim 19).
These additional limitations (in claims 2-7, 10-19) also recite mathematical concepts or mathematical operations, which fall within the Mathematical Concepts grouping of abstract ideas.
This judicial exception is not integrated into a practical application. The additional elements “computer readable medium comprising: computer program code” (in claims 2-7, 10-19) all amount to no more than adding insignificant extra-solution activity/specifications related to data gathering, data input, or data transmittal. These additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The dependent claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements of “non-transitory computer readable medium comprising: computer program code” are again insignificant extra-solution activity steps that cannot provide an inventive concept. All of these additional elements as generically claimed are considered well-understood, routine, and conventional.
Therefore, these limitations, taken alone or in combination, do not integrate the abstract idea into a practical application or recite significantly more than the abstract idea. Thus, all of the dependent claims are also not patent eligible.
Examiner Comments
8. In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
Claim Rejections - 35 USC § 103
9. The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
6. Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Muddu (Pat. No.: US 9516053 B1, Pub. Date 2016-12-06) in view of Lundberg (NPL: Title: A Unified Approach to Interpreting Model Predictions, Published: May 22, 2017).
Regarding independent claim 1,
Muddu teaches a computer-implemented method comprising:
obtaining a set of computer-readable training data records that each characterize operation of a communication network (see Muddu: Fig.7A and 7B, Col.19, Line 40-48, “machine data may contain a record (e.g., a log) of an event that takes place in the network environment, such as an activity of a customer, a user, an transaction, an application, a server, a network or a mobile device. However, in many instances, machine data can be more than mere logs—it can include configurations, data from APIs, message queues, change events, the output of diagnostic commands, call detail records, sensor data from industrial systems, and so forth.”), wherein each given training data record includes a plurality of operational features of the communication network and one or more observed performance characteristics of the communication network (see Muddu: Fig.7A and 7B, Col.19, Line 31-39, “machine data can include performance data, diagnostic information and/or any of various other types of data indicative of performance or operation of equipment (e.g., an action such as upload, delete, or log-in) in a computing system. Such data can be analyzed to diagnose equipment performance problems, monitor user actions and interactions, and to derive other insights like user behavior baseline, anomalies and threats.”), and wherein each operational feature is associated with one or more feature-value pairs specific to the given training record, and each of the one or more observed performance characteristics corresponds to an observation specific to the given training record (see Muddu: Fig.7A and 7B, Col.19, Line 35-39, “data can be analyzed to diagnose equipment performance problems, monitor user actions and interactions, and to derive other insights like user behavior baseline, anomalies and threats.”)
using at least a portion of the set of training data records to train a machine learning (ML) model of network performance to predict expected performance characteristics given the plurality of operational features in the training data records as input and the one or more observed performance characteristics as ground truths (see Muddu: Fig.7A and 7B, Col.19, Line 31-39, “”), wherein the ML model is configured for computing mappings of given input feature-value pairs to output predicted performance characteristics (see Muddu: Fig.7A and 7B, Col.19, Line 31-39, “”) and wherein, for each input training data record, the mappings represent relationships and/or interactions between one or more combinations among the plurality of operational features and one or more predicted performance characteristics (see Muddu: Fig.24, Col.56, Line 63-66, and Col. 57, Line 1-8, “identifying threat indicators, and identifying threats to network security. The process begins by detecting anomalies in activity on a computer network, based on received event data. As shown in FIG. 24 at step 2402, incoming event data 2302 is processed through a plurality of anomaly models 1 through N, which may be machine learning models as discussed above, and which at step 2404 may output anomaly data 2304 indicative of a plurality of anomalies 1 through M. As shown in FIG. 24, an anomaly is not necessarily detected for a given set of event data 2302. For example, as shown at step 2406, when the event data 2302 is processed by anomaly model N, no anomaly is detected.”)
for each input data record of a first subset of the set of training data records, computing a fair distribution of first respective quantitative contributions of each of the plurality of operational features to the one or more predicted performance characteristics of the trained ML model (see Muddu: Fig.25, Col.559, Line 37-47, “Process 2500 continues at step 2508 with outputting an indicator of a particular anomaly if the anomaly score satisfies a specified criterion (e.g., exceeds a threshold). Continuing with the given example, the specified criterion may be set such that an anomaly is detected if the anomaly score is 6 or above, for example. The specified criterion need not be static, however. In some embodiments, the criterion (e.g., threshold) is dynamic and changes based on situational factors. The situational factors may include volume of event data, presence or absence of pre-conditional events, user configurations, and volume of detected anomalies.”) , wherein the first subset includes at least those training data records sufficient to represent a baseline of observed performance characteristics (see Muddu: Fig.24, Col.55, Line 36-45, “The behavioral baseline establishment technique described above (see discussion of UBA/UEBA) can also be integrated with the model state sharing technique here. That is, in addition or as an alternative to sharing model states, a behavioral baseline established by one engine (e.g., the real-time event processing engine) by using a particular machine learning model can be shared along with the model state with another engine (e.g., the batch event processing engine).”)
for each input data record of a second subset of the set of training data records, computing a fair distribution of second respective quantitative contributions of each of the plurality of operational features to the one or more predicted performance characteristics of the trained ML model (see Muddu: Fig.24, Col.54, Line 8-19, “a second engine uses the same particular machine learning model to process a second set of data for producing a score for detecting a network security-related issue. With the model state sharing, the second engine can use the version of the model that has been trained by the first engine to process the second set of data, thereby leveraging the knowledge gained by the first engine to discover a security-related issue in the second set of data.”), wherein the second subset includes only those training data records representing at least one problematic observed performance characteristic (see Muddu: Fig.24, Col.54, Line 50-60, “the security platform can be configured to enable sharing of model states between the real-time processing engine and the batch processing engine for network security anomaly and threat detection. As described above with respect to the CEP engine and the machine learning models, a particular machine learning model can be configured to process a time slice of data to produce a score for detecting a network security-related issue,”); and
Muddu does not explicitly teach the method wherein:
comparing the first and second respective quantitative contributions to determine a respective degradation metric for associating each of the plurality of operational features of the second subset with the at least one problematic observed performance characteristic of the second subset.
However, Lundberg teaches the method wherein:
comparing the first and second respective quantitative contributions to determine a respective degradation metric for associating each of the plurality of operational features of the second subset with the at least one problematic observed performance characteristic of the second subset (see Lundberg: Section 5, Page 8, “We evaluated the benefits of SHAP values using the Kernel SHAP and Deep SHAP approximation methods. First, we compared the computational efficiency and accuracy of Kernel SHAP vs. LIME and Shapley sampling values. Second, we designed user studies to compare SHAP values with alternative feature importance allocations represented by DeepLIFT and LIME. As might be expected, SHAP values prove more consistent with human intuition than other methods that fail to meet Properties 1-3 (Section 2). Finally, we use MNIST digit image classification to compare SHAP with DeepLIFT and LIME.”)
Because Muddu and Lundberg are in the same or a similar field of endeavor, namely analyzing the output of machine learning models, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to modify Muddu's anomaly detection system to include the SHAP-based feature contribution analysis that compares the first and second respective quantitative contributions to determine a respective degradation metric for associating each of the plurality of operational features, as taught by Lundberg. One would have been motivated to make such a combination in order to improve the interpretability of model output and to provide a fair, mathematically determined and proven feature impact, enabling more accurate identification of the root cause of performance degradation in a network environment.
Regarding Claim 2,
Muddu and Lundberg teach all the limitations of Claim 1. Lundberg further teaches the computer-implemented method wherein:
computing respective first Shapley Additive Explanations (SHAP) values for each of the plurality of operational features in each input data record of the first subset (see Lundberg: Fig.1, Page 2, Section 2.4, “Shapley regression values are feature importances for linear models in the presence of multicollinearity. This method requires retraining the model on all feature subsets S ⊆ F, where F is the set of all features. It assigns an importance value to each feature that represents the effect on the model prediction of including that feature. To compute this effect, a model fS∪{i} is trained with that feature present, and another model fS is trained with the feature withheld.”), wherein each given SHAP value indicates a quantitative contribution of a given operational feature to a given predicted performance characteristic, wherein, for each input data record of the second subset of the set of training data records, computing the fair distribution of second respective quantitative contributions of each of the plurality of operational features to the one or more predicted performance characteristics of the trained ML model (see Lundberg: Fig.1, Abstract: “we present a unified framework for interpreting predictions, SHAP (SHapley Additive exPlanations). SHAP assigns each feature an importance value for a particular prediction. Its novel components include: (1) the identification of a new class of additive feature importance measures, and (2) theoretical results showing there is a unique solution in this class with a set of desirable properties.”), comprises:
computing respective second SHAP values for each of the plurality of operational features in each input data record of the second subset (see Lundberg: Page 2, Section 2.4, “Shapley regression values are feature importances for linear models in the presence of multicollinearity. This method requires retraining the model on all feature subsets S ⊆ F, where F is the set of all features. It assigns an importance value to each feature that represents the effect on the model prediction of including that feature. To compute this effect, a model fS∪{i} is trained with that feature present, and another model fS is trained with the feature withheld.”)
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to modify Muddu's anomaly detection system to include the SHAP-based feature contribution analysis that compares the first and second respective quantitative contributions to determine a respective degradation metric for associating each of the plurality of operational features, as taught by Lundberg. One would have been motivated to make such a combination in order to improve the interpretability of model output and to provide a fair, mathematically determined and proven feature impact, enabling more accurate identification of the root cause of performance degradation in a network environment.
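For illustration only, the per-record "fair distribution" of feature contributions discussed for Claim 2 can be sketched as an exact Shapley computation over all feature subsets. This is a minimal sketch and not the method of either reference or of the application: the function `shapley_values`, the `toy_model` predictor, the feature values, and the choice to represent a withheld feature by a fixed baseline value (instead of retraining the model, as in the quoted Shapley regression passage) are all assumptions introduced here.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values for one record by enumerating feature subsets.

    Assumption: a feature that is "withheld" is replaced by its baseline
    value; the quoted Shapley regression approach retrains the model instead.
    """
    n = len(x)
    features = range(n)

    def value(subset):
        # Evaluate the model with only the features in `subset` taken from x.
        z = [x[i] if i in subset else baseline[i] for i in features]
        return model(z)

    phi = [0.0] * n
    for i in features:
        others = [j for j in features if j != i]
        for k in range(len(others) + 1):
            # Classic Shapley weight |S|! (n - |S| - 1)! / n!
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(z):
    # Hypothetical latency predictor with one interaction term.
    load, loss, hops = z
    return 2.0 * load + 5.0 * loss + 0.5 * load * hops


record = [3.0, 1.0, 4.0]      # hypothetical feature values for one record
baseline = [1.0, 0.0, 2.0]    # hypothetical reference values
phi = shapley_values(toy_model, record, baseline)
print(phi)
```

The contributions sum to the difference between the prediction for the record and the prediction for the baseline, which is the additivity property that makes the allocation "fair" in the Shapley sense. Exhaustive enumeration is only practical for a handful of features, which is why sampling approximations are used in practice.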
Regarding Claim 3,
Muddu and Lundberg teach all the limitations of Claim 2. Lundberg further teaches the computer-implemented method wherein:
comparing the first and second respective quantitative contributions to determine the respective degradation metric for associating each of the plurality of operational features of the second subset with the at least one problematic observed performance characteristic (see Lundberg: Page 8, section 5, “We evaluated the benefits of SHAP values using the Kernel SHAP and Deep SHAP approximation methods. First, we compared the computational efficiency and accuracy of Kernel SHAP vs. LIME and Shapley sampling values. Second, we designed user studies to compare SHAP values with alternative feature importance allocations represented by DeepLIFT and LIME. As might be expected, SHAP values prove more consistent with human intuition than other methods that fail to meet Properties 1-3 (Section 2). Finally, we use MNIST digit image classification to compare SHAP with DeepLIFT and LIME.”), comprises:
for each respective operational feature of the second subset, computing a respective severity metric based on the second respective aggregation of SHAP values across the second subset of the respective operational feature (see Lundberg: Page 3, Section 2.4, “Shapley sampling values are meant to explain any model by: (1) applying sampling approximations to Equation 4, and (2) approximating the effect of removing a variable from the model by integrating over samples from the training dataset. This eliminates the need to retrain the model and allows fewer than 2^|F| differences to be computed. Since the explanation model form of Shapley sampling values is the same as that for Shapley regression values, it is also an additive feature attribution method.”); and
for each respective operational feature of the second subset, scaling the respective severity metric by a fraction of the total number of data records in the second subset having feature-value pairs associated with the respective operational feature (see Lundberg: Fig.1, Page 5, “SHAP (SHapley Additive exPlanation) values attribute to each feature the change in the expected model prediction when conditioning on that feature. They explain how to get from the base value E[f(z)] that would be predicted if we did not know any features to the current output f(x). This diagram shows a single ordering. When the model is non-linear or the input features are not independent, however, the order in which features are added to the expectation matters, and the SHAP values arise from averaging the φi values across all possible orderings.”)
It would have been obvious to a person of ordinary skill in the art, before the effective filing date of the invention, to modify Muddu's anomaly detection system to include the SHAP-based feature contribution analysis that compares the first and second respective quantitative contributions to determine a respective degradation metric for associating each of the plurality of operational features, as taught by Lundberg. One would have been motivated to make such a combination in order to improve the interpretability of model output and to provide a fair, mathematically determined and proven feature impact, enabling more accurate identification of the root cause of performance degradation in a network environment.
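For illustration only, the two Claim 3 limitations (aggregating SHAP values per feature across the second subset, then scaling by the fraction of records in which the feature appears) can be sketched as follows. The aggregation chosen here (mean absolute SHAP value), the function name `scaled_severity`, and the sample records are assumptions made for this sketch; neither the claim language nor the cited references fix a particular aggregation.

```python
def scaled_severity(shap_records):
    """Per-feature severity over the problematic subset.

    shap_records: list of dicts mapping feature name -> SHAP value for one
    record. A feature may be absent from a record.
    Assumption: aggregation is the mean absolute SHAP value over the records
    that contain the feature.
    """
    total = len(shap_records)
    sums, counts = {}, {}
    for rec in shap_records:
        for feat, val in rec.items():
            sums[feat] = sums.get(feat, 0.0) + abs(val)
            counts[feat] = counts.get(feat, 0) + 1

    severity = {}
    for feat in sums:
        aggregate = sums[feat] / counts[feat]   # aggregation step
        fraction = counts[feat] / total         # share of records with feature
        severity[feat] = aggregate * fraction   # scaling step
    return severity


# Hypothetical SHAP values for four problematic records.
records = [
    {"cell_load": 4.0, "handover": -1.0},
    {"cell_load": 2.0},
    {"cell_load": 3.0, "handover": -3.0},
    {"cell_load": 3.0},
]
sev = scaled_severity(records)
print(sev)
```

In this toy data "handover" has a mean absolute contribution of 2.0 but appears in only half of the records, so its scaled severity is halved, while "cell_load" appears in every record and keeps its full aggregate.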
Regarding Claim 4,
Muddu and Lundberg teach all the limitations of Claim 2. Lundberg further teaches the computer-implemented method wherein:
comparing the first and second respective quantitative contributions to determine the respective degradation metric for associating each of the plurality of operational features of the second subset with the at least one problematic observed performance characteristic (see Lundberg: Page 8, section 5, “We evaluated the benefits of SHAP values using the Kernel SHAP and Deep SHAP approximation methods. First, we compared the computational efficiency and accuracy of Kernel SHAP vs. LIME and Shapley sampling values. Second, we designed user studies to compare SHAP values with alternative feature importance allocations represented by DeepLIFT and LIME. As might be expected, SHAP values prove more consistent with human intuition than other methods that fail to meet Properties 1-3 (Section 2). Finally, we use MNIST digit image classification to compare SHAP with DeepLIFT and LIME.”), comprises:
for each respective operational feature of the first subset, computing a respective first statistical distribution of respective first SHAP values across the first subset (see Lundberg: Page 2, section 2.4, “Shapley regression values are feature importances for linear models in the presence of multicollinearity. This method requires retraining the model on all feature subsets S ⊆ F, where F is the set of all features.”)
for each respective operational feature of the second subset, computing a respective second statistical distribution of respective second SHAP values across the second subset (see Lundberg: Page 3, Section 2.4, “Shapley sampling values are meant to explain any model by: (1) applying sampling approximations to Equation 4, and (2) approximating the effect of removing a variable from the model by integrating over samples from the training dataset. This eliminates the need to retrain the model and allows fewer than 2^|F| differences to be computed. Since the explanation model form of Shapley sampling values is the same as that for Shapley regression values, it is also an additive feature attribution method.”); and
for each respective operational feature in common in both the first and second subsets, comparing the respective second statistical distribution with the respective first statistical distribution (see Lundberg: Fig.1, Page 5, “SHAP (SHapley Additive exPlanation) values attribute to each feature the change in the expected model prediction when conditioning on that feature. They explain how to get from the base value E[f(z)] that would be predicted if we did not know any features to the current output f(x). This diagram shows a single ordering. When the model is non-linear or the input features are not independent, however, the order in which features are added to the expectation matters, and the SHAP values arise from averaging the φi values across all possible orderings.”)
See motivation to combine Muddu and Lundberg in claim 1.
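For illustration only, the three steps recited above (a per-feature distribution over each subset, then a comparison for features common to both) can be sketched as follows. This is a minimal sketch and not a characterization of the claimed method or of Lundberg: the SHAP values are assumed to be precomputed, the function names and sample values are hypothetical, and the use of the mean and standard deviation as the distribution summary, with the shift in means as the comparison, is an assumption.

```python
from statistics import mean, pstdev


def summarize(shap_by_feature):
    """Return {feature: (mean, std)} over the per-record SHAP values."""
    return {f: (mean(v), pstdev(v)) for f, v in shap_by_feature.items()}


def compare_distributions(first, second):
    """Shift in mean SHAP value for features present in both subsets.

    Assumption: the 'comparison' is the difference of means
    (second subset minus first subset).
    """
    s1, s2 = summarize(first), summarize(second)
    common = s1.keys() & s2.keys()
    return {f: s2[f][0] - s1[f][0] for f in common}


# Hypothetical precomputed SHAP values, one list entry per record.
first = {"latency_ms": [0.1, 0.2, 0.3], "cell_load": [0.0, 0.1, -0.1]}
second = {"latency_ms": [0.6, 0.8, 0.7], "packet_loss": [0.4, 0.5]}

shift = compare_distributions(first, second)
```

Only "latency_ms" appears in both hypothetical subsets, so it is the only feature for which a comparison is produced.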
Regarding Claim 5,
Muddu and Lundberg teach all the limitations of Claim 2. Muddu further teaches the computer-implemented method wherein:
determining respective clusters of operational features within records of the second subset (see Muddu: Fig.58, Col.90, Line 50-53, “In a network security context it may be advantageous to identify clusters of nodes (“node clusters” or “clusters”) in a graph,”);
determining a respective frequency among the records of each respective cluster (see Muddu: Fig.73 Col.110, Line 35-45, “The set of parameters 7300 can include a number of connection requests generated at a device in a predefined period, periodicity of the connection requests, e.g., a period or frequency between the connections, number of different destinations contacted, e.g., a diversity of the Internet Protocol (IP) addresses, a number of web objects downloaded to the device, a number of ports at which the destinations are contacted and a Uniform Resource Identifier (URI) of the destinations.”);
identifying respective operational clusters as all respective clusters having respective frequencies above a threshold (see Muddu: Fig.75, Col.114, Line 45-51, “the anomaly detection module 7435 determines if the groups occurred at least a second threshold number of times in which the second threshold number is greater than the first threshold number. If the groups occurred at least a second threshold number of times, the anomaly detection module 7435 determines the groups as anomalous. If neither of the periodic thresholds is satisfied, the group is determined as likely to be benign traffic.”);
for each respective operational cluster of the second subset, computing a respective severity metric based on the second respective aggregation of SHAP values across the second subset for operational features of the respective operational cluster (see Muddu: Fig.58, Col.90, Line 50-53, “In a network security context it may be advantageous to identify clusters of nodes (“node clusters” or “clusters”) in a graph,”); and
for each respective operational feature of the second subset, scaling the respective severity metric by a fraction of the total number of training data records in the second subset having the feature-value pair combinations associated with the respective operational cluster (see Muddu: Fig.27, Col.62, Line 35-43, “Each anomaly 1 through M shown in FIG. 28 is shown as a single anomaly for clarity purposes. However, each anomaly shown in FIG. 28 may also represent a cluster of anomalies that are somehow related to one another. For example, anomaly 1 may represent a single instance of an anomaly, multiple anomalies of the same category, or multiple anomalies with substantially matching profiles or footprints.”).
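For illustration only, the sequence recited in Claim 5 (cluster, count, threshold, severity, scale) can be sketched as follows. This is a minimal sketch and not a characterization of the claimed method or of Muddu: treating each distinct feature-value pair combination as a cluster, using the mean of summed SHAP values as the severity metric, and all names and sample values are assumptions made for the example.

```python
from collections import Counter


def cluster_key(record):
    """Assumption: a cluster is one distinct feature-value pair combination."""
    return frozenset(record.items())


def operational_clusters(records, threshold):
    """Count each combination and keep those with frequency above threshold."""
    counts = Counter(cluster_key(r) for r in records)
    return {c: n for c, n in counts.items() if n > threshold}


def scaled_severity(records, shap_rows, threshold):
    """Mean summed SHAP value per cluster, scaled by the cluster's record share."""
    total = len(records)
    result = {}
    for cluster, n in operational_clusters(records, threshold).items():
        summed = sum(
            sum(shap.values())
            for rec, shap in zip(records, shap_rows)
            if cluster_key(rec) == cluster
        )
        severity = summed / n                   # aggregation over the cluster
        result[cluster] = severity * (n / total)  # scale by fraction of records
    return result


# Hypothetical records of the second subset and their precomputed SHAP values.
records = [
    {"band": "n78", "slice": "A"},
    {"band": "n78", "slice": "A"},
    {"band": "n78", "slice": "A"},
    {"band": "n41", "slice": "B"},
]
shap_rows = [
    {"band": 0.2, "slice": 0.1},
    {"band": 0.4, "slice": 0.1},
    {"band": 0.3, "slice": 0.1},
    {"band": 0.9, "slice": 0.5},
]

result = scaled_severity(records, shap_rows, threshold=1)
```

With a threshold of 1, only the combination that occurs three times qualifies as an operational cluster; the single-occurrence combination is dropped before any severity is computed.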
Regarding Claim 6,
Muddu and Lundberg teach all the limitations of Claim 2. Muddu further teaches the computer-implemented method wherein:
identifying respective operational events of the second subset as time windows during which a performance characteristic is observed as being problematic (see Muddu: Fig.21, Col.50, Line 1-8, “generate a user interface element to solicit an action command to activate a threat response. In one example, the user interface element triggers the action command for sending a message to the target-side computer system to demand termination of a problematic application, blocking of specific network traffic, or removal of a user account”), and wherein comparing the first and second respective quantitative contributions comprises:
for each respective operational event of the second subset, computing a respective severity metric for each respective operational feature based on the second respective aggregation of SHAP values across the second subset during the respective operational event (see Muddu: Fig.40A, Col.75, Line 43-48, “each “Threat Review” view 4000 can identify a particular threat by its type and provides a summary description 4002 along with a threat score 4003. The threat score, determined based on machine learning from the event data, provides an indication of the severity of the risk for network compromise associated with the threat.”); and
for each respective operational feature of the second subset, scaling the respective severity metric by the total number of timepoints of the respective operational event (see Muddu: Fig.26, Col.62, Line 1-7, “indicator score can be assigned based on the processing of the anomaly data with a threat indicator being identified if the threat indicator score satisfies a specified criterion. For example, the 20 entities associated with a particular anomaly may lead to assigning an threat indicator score of 6 on a scale of 1 to 10. Accordingly, a threat indicator is identified because the assigned threat indicator score is at least 6.”).
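For illustration only, the event-based computation recited in Claim 6 can be sketched as follows. This is a minimal sketch and not a characterization of the claimed method or of Muddu: an operational event is assumed to be a contiguous run of timepoints flagged as problematic, the severity metric is assumed to be the sum of absolute SHAP values within the event, and "scaling by the total number of timepoints" is assumed to mean division by the event length. All names and sample values are hypothetical.

```python
def find_events(flags):
    """Return (start, end) index pairs, end exclusive, for runs of True flags."""
    events, start = [], None
    for i, bad in enumerate(flags):
        if bad and start is None:
            start = i
        elif not bad and start is not None:
            events.append((start, i))
            start = None
    if start is not None:
        events.append((start, len(flags)))
    return events


def event_severity(shap_series, flags):
    """Per event and feature: summed |SHAP| divided by the event's timepoints."""
    out = []
    for start, end in find_events(flags):
        n = end - start
        out.append(
            {f: sum(abs(v) for v in vals[start:end]) / n
             for f, vals in shap_series.items()}
        )
    return out


# Hypothetical per-timepoint problem flags and precomputed SHAP values.
flags = [False, True, True, False, True]
shap_series = {"cpu": [0.0, 0.4, 0.2, 0.0, 0.5]}

severity = event_severity(shap_series, flags)
```

The hypothetical flags yield two events, one spanning two timepoints and one spanning a single timepoint, so each event's severity is normalized by its own length.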
Regarding Claim 7,
Muddu and Lundberg teach all the limitations of Claim 1. Muddu further teaches the computer-implemented method wherein:
identifying problematic case baselines according to the determined respective degradation metrics of specific operational features of the second subset as measured by their association with one or more observed performance characteristics; creating templates of operational features (see Muddu: Fig.24, Col.5, Line 36-45, “The behavioral baseline establishment technique described above (see discussion of UBA/UEBA) can also be integrated with the model state sharing technique here. That is, in addition or as an alternative to sharing model states, a behavioral baseline established by one engine (e.g., the real-time event processing engine) by using a particular machine learning model can be shared along with the model state with another engine (e.g., the batch event processing engine). With both the model state and the behavioral baseline established, one engine can take fuller advantage of the knowledge gained by another engine.”), according to at least one of:
(i) a magnitude of the measured associations of operational features with the one or more observed performance characteristics, or (ii) a relative magnitude of the measured associations between operational features with the one or more observed performance characteristics, or (iii) the positive or negative relationship of the measured associations of operational features with the one or more observed performance characteristics; and categorizing problematic performance by comparing the templates (see Muddu: Fig.24, Col.54, Line 48-56, “behavioral baseline is established for a specific entity, also by the real-time event processing engine. Utilizing the techniques introduced here, the batch event processing engine can locate, in the batch of historic event data, data representing a plurality of events that are associated with the specific entity. Then, the batch event processing engine can perform a behavioral analysis of the entity to detect a behavioral anomaly using the same version of machine learning model that has been trained by the real-time event processing engine to compute a degree of behavioral deviation, as compared to the behavioral baseline specific to the entity.”).
Regarding Claim 8,
Muddu and Lundberg teach all the limitations of Claim 2. Muddu further teaches the computer-implemented method wherein:
computing a model prediction error in the second subset and using the prediction error to adjust an attributed importance of respective operational features (see Muddu: Fig.24, Col.69, Line 12-64, “The graph library component 3550 can dynamically adjust the granularity. For example, in one embodiment, for network activities that occurred during the last two months, the graph library component 3550 may break down the projection data into files corresponding to each hour of the last two months; whereas, for network activities that occurred prior to the last two months, the graph library component 3550 may break down the projection data into files corresponding to each week or each month”); and
qualifying an accuracy of representation based on computed model prediction error (see Muddu: Fig.24, Col.85, Line 44-53, “the PST model is to be used in a way that, given an observation window with a number of previous symbols, the PST model can predict what the next symbol may be, to identify whether a target window is anomalous (e.g., by having an anomaly count beyond a baseline). Before the PST model is ready to do so, the PST model needs to receive training so that it can more accurately anticipate or predict the next symbol. For example, the PST model can be trained by a certain set of historical symbols.”).
Regarding independent Claim 9,
Claim 9 is a system claim that recites limitations similar to or the same as those of claim 1 and is rejected under the same rationale.
Regarding Claims 10-18,
Claims 10-18 are system claims that recite limitations similar to or the same as those of claims 2-8, respectively, and are rejected under the same rationale.
Regarding Claim 19,
As shown above, Muddu and Lundberg teach all the limitations of Claim 9, from which Claim 19 depends. Claim 19 recites:
19. The system of claim 9, wherein the communication network is at least one of a telecommunications network, or a data communications network, wherein each training data record comprises a communication history record or system telemetry from one or more network layers of the communication network, the one or more network layers being at least one of: a 5G Core, a RAN, a User Plane, a Control Plane, a virtualization layer, or a physical infrastructure layer, of the communication network, and wherein the operations further include: monitoring one or more performance characteristics observed during runtime operations of the communication network; and localizing a fault to the operational features of one or more network layers.
Regarding independent Claim 20,
Claim 20 is directed to a non-transitory computer readable medium that recites limitations similar to or the same as those of claim 1 and is rejected under the same rationale.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
PGPUB NUMBER: US 20240135235 A1
INVENTOR-INFORMATION: Kennel, Matthew
TITLE / DESCRIPTION:
Title: Explanatory dropout for machine learning models
Description: The disclosed subject matter relates generally to the field of artificial intelligence (AI) and to technical improvements that promote the efficiency and explainability of complex machine learning models (ML Models).
PGPUB NUMBER: US 20240248783 A1
INVENTOR-INFORMATION: Kersch, Péter
TITLE / DESCRIPTION:
Title: Root cause analysis via causality-aware machine learning
Description: The present disclosure is related to root cause analysis via causality-aware machine learning and more particularly to automated root cause analysis for closed-loop control of mobile networks via causality-aware machine learning explanations.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to ZELALEM W SHALU whose telephone number is (571) 272-3003. The examiner can normally be reached Monday through Friday, 8:00 am to 5:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Cesar Paula, can be reached at (571) 272-4128. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/Zelalem Shalu/Examiner, Art Unit 2145
/CESAR B PAULA/Supervisory Patent Examiner, Art Unit 2145