Last updated: May 29, 2026
Application No. 17/476,401
FRAUD SUSPECTS DETECTION AND VISUALIZATION

Non-Final OA §103
Filed
Sep 15, 2021
Examiner
TRAN, TAN H
Art Unit
2141
Tech Center
2100 — Computer Architecture & Software
Assignee
International Business Machines Corporation
OA Round
4 (Non-Final)
Interview Optional

— +32.1% interview lift. Interview already conducted in this application's prosecution history. This examiner has a 60% grant rate with +32.1% interview lift. Since an interview has already been tried, recommend written response with narrowed claims based on precedent claim evolution patterns.
Based on 313 resolved cases, 2023–2026
Examiner Intelligence

TRAN, TAN H View full profile →
Grants 60% of resolved cases
Career Allowance Rate
189 granted / 313 resolved
+5.4% vs TC avg
Strong +32% interview lift
Without
With
+32.1%
Interview Lift
resolved cases with interview
Typical timeline
3y 6m
Avg Prosecution
32 currently pending
Career history
368
Total Applications
across all art units
Statute-Specific Performance

§101
2.6%
-37.4% vs TC avg
§103
92.1%
+52.1% vs TC avg
§102
4.8%
-35.2% vs TC avg
§112
0.2%
-39.8% vs TC avg
Black line = Tech Center average estimate • Based on career data from 313 resolved cases
Office Action

§103
Notice of Pre-AIA  or AIA  Status
1.	The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA .

DETAILED ACTION
2.	This Office Action is sent in response to Applicant’s Communication received on 01/07/2026 for application number 17/476,401. 

Response to Amendments
3.	The Amendment filed 01/07/2026 has been entered. Claims 1, 2, 5, 9, 10, 13, 17, 18, 21, and 25 have been amended. Claims 1-25 remain pending in the application. 

Response to Arguments
Applicant argues that the cited references fail to teach or suggest the features of independent claims, as amended. However, the argument is moot since this is a newly presented limitation, thus changes the scope of the claims. However, a newly found reference, Ypma, is applied.

Claim Rejections – 35 USC § 103
4.	The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


5.	Claims 1, 9, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Beauchesne et al. (U.S. Patent Application Pub. No. US 12088600 B1) in view of Freese et al. (U.S. Patent Application Pub. No. US 20170017760 A1) and further in view of Ypma et al. (U.S. Patent Application Pub. No. US 20160246185 A1).

	Claim 1: Beauchesne teaches a computer-implemented method comprising: 
generating a plurality of anomaly score variables (i.e. the anomaly detection pipeline may use more than one outlier detection model for each input dataset 1124. For example, each input feature vector may be examined by several machine learned models to obtain a multiple model results, and these results may be aggregated 1134 using an aggregation technique (e.g. a formula or another ML model) to obtain an overall outlier metric or score for the feature vector. In some embodiments, the formula for computing the aggregated score may be configurable by an administrator. In some embodiments, the formula may be automatically adjusted by the anomaly detection system. For example, the weights assigned to individual model results may be periodically tuned by the system based on the recent performance of the models, or based on how far the results of one model deviate from the others; col. 19, lines 34-49) using a plurality of unsupervised models (i.e. At operation 2328, the anomaly detection system determines an outlier metric for the process within the process's assigned category based on the feature vector and using an ensemble of outlier detection models trained using unsupervised machine learning techniques. In some embodiments, the individual results of the outlier detection models are aggregated using a weighted average formula, a voting scheme, an aggregator model, or some other aggregation technique; col. 28, lines 45-63) based on a set of data records (i.e. The process begins at operation 2310, where observation records are received. The observation records (e.g. observation record 130 of FIG. 1A) may include metadata about processes that were executed on hosts in a client network monitored by the anomaly detection system. In some embodiments, data collection agents may be deployed on hosts in the client network in order to gather the observation records; col. 27 line 63 to col. 28 line 3); 
normalizing the plurality of anomaly score variables into a plurality of normalized variables (i.e. At operation 2330, the processes are ranked based on their determined outlier metrics. In some embodiments, the outlier metrics of the processes may be normalized so that they can be compared across process categories. For example, a particular process's outlier metric may be normalized to indicate how extreme of an outlier it is within its own category. In some embodiments, all observed processes collected for a given observation period are ranked according to their outlier metrics, so that the highest ranked processes are the most extreme outliers within their respective categories; col. 28 line 64 to col. 29 line 7); 
constructing at least one interaction (i.e. For example, if a process P has two features A and B—whenever A takes value 1 and B takes value 0 and vice-verse—a process with both features to 0 would be an anomaly; however, both features would not be considered anomalies individually. Therefore, to explain anomalous interaction among variables, in one embodiment, an interpretability layer is implemented to explain such interactions (e.g., using a tree based approach); col. 26, lines 60-67) between multiple models (i.e. At operation 2328, the anomaly detection system determines an outlier metric for the process within the process's assigned category based on the feature vector and using an ensemble of outlier detection models trained using unsupervised machine learning techniques. In some embodiments, the individual results of the outlier detection models are aggregated using a weighted average formula, a voting scheme, an aggregator model, or some other aggregation technique; col. 28, lines 45-63) outputting a first normalized variable of the plurality of normalized variables and a second normalized variables of the plurality of normalized variables (i.e. the anomaly detection pipeline may use more than one outlier detection model for each input dataset 1124. For example, each input feature vector may be examined by several machine learned models to obtain a multiple model results, and these results may be aggregated 1134 using an aggregation technique (e.g. a formula or another ML model) to obtain an overall outlier metric or score for the feature vector; col. 19, lines 34-49), wherein the first normalized variable corresponds to a first anomaly score variable of the plurality of anomaly score variables and the second normalized variable corresponds to a second one of the plurality of anomaly score variables (i.e. the anomaly detection system may use the same models for all observation categories, but use statistical techniques to generate an outlier score that is specific to each category. For example, the system may use a single model to produce raw outlier scores for all observation categories. However, a raw outlier score may then be ranked against only the scores of observations in the same category, to obtain a percentile ranking of the observation within the category. In some embodiments, the outlier scores may be normalized (e.g. reduced to a value between 0 and 1) so that they can be compared across all categories. In some embodiments, detected outliers for each category may be ranked, and only a specified number of highest ranked outliers are outputted as detected anomalies (e.g. anomalous processes or hosts). In some embodiments, the manner that the aggregated score is calculated may be configurable using a configuration interface of the anomaly detection system; col. 20 lines 13-29); the constructing comprising: 
performing a principal component analysis on features of the plurality of anomaly score variables (i.e. the number of features in the feature vector are reduced using a dimensionality reduction technique. In some embodiments, the dimensionality reduction technique uses a machine learning model that was trained using an unsupervised machine learning technique. In some embodiments, the dimensionality reduction technique may be one or more of PCA, NMF, or an Auto Encoder. The result of the dimensionality reduction operation is an input dataset of smaller feature vectors that are ready for consumption by the outlier detection models; col. 28, lines 35-44) with field (i.e. FIG. 3A is a graph 300 that shows a distribution of the number of features 320 that were extracted for many observed leaves (e.g. process categories 120), according to one study. The x-axis 310 shows the frequency count of leaves that had a particular number of features. As shown, the anomaly detection system generated some leaves with large numbers of features (e.g. greater than 100 process features). This means that the matrix of feature vectors (e.g. matrix 200) is a fairly wide and sparse matrix. However, as shown in FIG. 3B, for most process categories, with their number of features shown in the x-axis 340, the “rank” of the matrix (e.g. the linearly independent features of the process category), shown in the y-axis 340, are much smaller. This result thus suggests that the feature vectors could benefit greatly from dimensionality reduction; col. 9, lines 12-26); 
creating a rule for an anomaly in the set of anomalies based on the principal component analysis (i.e. a raw outlier score may then be ranked against only the scores of observations in the same category, to obtain a percentile ranking of the observation within the category. In some embodiments, the outlier scores may be normalized (e.g. reduced to a value between 0 and 1) so that they can be compared across all categories. In some embodiments, detected outliers for each category may be ranked, and only a specified number of highest ranked outliers are outputted as detected anomalies (e.g. anomalous processes or hosts); col. 20, lines 18-27);
detecting a set of anomalies based on the created rule (i.e. a raw outlier score may then be ranked against only the scores of observations in the same category, to obtain a percentile ranking of the observation within the category. In some embodiments, the outlier scores may be normalized (e.g. reduced to a value between 0 and 1) so that they can be compared across all categories. In some embodiments, detected outliers for each category may be ranked, and only a specified number of highest ranked outliers are outputted as detected anomalies (e.g. anomalous processes or hosts); col. 20, lines 18-27);
creating a combined anomalies visualization based on the detected set of anomalies; and displaying the created combined anomalies visualization to a user (i.e. At operation 2340, a specified number of processes with the highest outlier metric rankings are selected and outputted as detected anomalous processes. For example, the anomaly detection system may identify the detected anomalous process on a GUI or in an alert. In some embodiments, such output may be used by security analysts at a SOC to assign priority to individual processes or hosts for investigative actions. By outputting only a specified number of anomalous processes, the anomaly detection system does not overwhelm security analysts with a huge volume of anomalies during the hunt process; col. 29, lines 8-19);
one or more anomalies in the displayed visualization (i.e. FIG. 13 is a graph 1300 that illustrates a projection of process datapoints analyzed by a machine learning anomaly detection system, according to some embodiments. In one study, when the anomaly detection pipeline 1100 was applied on a real dataset, scores are obtained on each host of a client network for a hunt. The datapoint for each process on each host are projected into a two-dimensional space for visualization purposes (e.g., as shown in FIG. 13). As shown in the figure, the datapoints are projected into dimensions 1310 and 1320 in the two-dimensional space, as determined using T-distributed Stochastic Neighbor Embedding (t-SNE). As shown in the graph 1300, there is high density in the center of the graph that decreases toward the edge of the graph, which indicates that most of the observed processes exhibit fairly common behavior, with only a few outliers. In some embodiments, the anomaly detection system may implement a graphical user interface that provides the results of graph 1300 as an aid to security analysts in the hunt process; col. 22, lines 32-48) based on the performed principal component analysis (i.e. Machine learning models can output suggestions for what a security analyst should prioritize for further investigation and/or security operations. In order to provide the security analyst with a starting point for where to start investigating, it would be beneficial to provide an explanation of why a process or piece of data was flagged as an anomaly. Accordingly, in some embodiments, an anomaly detection system may implement an anomaly detection model interpreter (e.g. model interpreter 2526 of FIG. 25) that quantifies the effect of each feature individually on the anomaly score; col. 34, lines 41-50).
Beauchesne does not explicitly teach the variables with oblong fields; wherein the combined anomalies visualization comprises a first axis of the first variable and a second axis of the second variable; highlighting one or more anomalies.
However, Freese teaches performing a principal component analysis on features of the anomaly score variables with oblong fields (i.e. The statistical techniques currently used in fraud detection outlier models cannot automatically or accurately approximate a monotonically increasing or decreasing score value in the presence of multiple variables when used by themselves without further ‘transformations”. Transformation is here defined as the act of changing or modifying a mathematical expression, such as a Z-Score or group of Z-Score values, Quartile Method results, Cluster Analysis Outcomes, or Principal Component Analysis Output, into another single, scalar expression, such as a “fraud detection score”. This transformed value, the fraud detection score, would be one value that represents the overall risk of fraud, according to a mathematical rule or formula. For example, a Z-Score transformation is the converting or transforming the value of the Z-Scores for 10 variables in a fraud detection outlier model into one single value, which represents overall fraud risk. Common transformations also include collapsing multiple Z-Scores (variables) into individual clusters or dimensions, utilizing Principal Components Analysis, in order to reduce false-positives. Both of these approaches further perpetuate the error caused by skewed or non-normal distributions; para. [0059]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filling date of the claimed invention to modify the invention of Beauchesne to include the feature of Freese. One would have been motivated to make this modification because it ensures that anomalies related to those elongated distributions are accurately detected and not misclassified as normal data.
However, Ypma teaches creating a rule for an anomaly in the set of anomalies based on the principal component analysis (i.e. applying a statistical threshold to this distribution allows outliers such as point 906 to be identified. For example, 910 in the drawing indicates a Gaussian distribution curve that has been fitted to the data, with its mean centered on the mean value of the coefficient c(PC1). Statistical significance thresholds can be established, as indicated at 912, 914. Point 906 and point 916 lie outside these thresholds, and are identified as being of interest; para. [0123]); detecting a set of anomalies based on the created rule (i.e. Point 906 and point 916 lie outside these thresholds, and are identified as being of interest; para. [0123]); creating a combined anomalies visualization based on the detected set of anomalies (i.e. In FIG. 10(c), additional information can be obtained by plotting the product units against two or more of the identified component vectors. In this illustration, a 2-dimensional plot is shown, with axes corresponding to the coefficients of component vectors PC1 and PC2, that were illustrated in FIGS. 10(a) and (b). This illustration, which may correspond to a printed or displayed report of the apparatus 250, effectively projects all the points in the multidimensional space onto a plane defined by the two component vectors PC1, PC2; para. [0126]), wherein the combined anomalies visualization comprises a first axis of the first variable and a second axis of the second variable (i.e. In FIG. 10(c), additional information can be obtained by plotting the product units against two or more of the identified component vectors. In this illustration, a 2-dimensional plot is shown, with axes corresponding to the coefficients of component vectors PC1 and PC2, that were illustrated in FIGS. 10(a) and (b). This illustration, which may correspond to a printed or displayed report of the apparatus 250, effectively projects all the points in the multidimensional space onto a plane defined by the two component vectors PC1, PC2; para. [0126]); displaying the created combined anomalies visualization to a user (i.e. FIG. 16, visualization module 1202 provides a display 1260 in which wafers are plotted against two of the identified component vectors PC1 and PC2; para. [0126, 0173]); and highlighting one or more anomalies in the displayed visualization based on the performed principal component analysis (i.e. Points identified as being of interest will be distinguished by their black color in this drawing and the following drawings, in contrast to the open circles used for other points. The open and closed circles used herein are merely to present a very simple example, and one that is compatible with the requirements of patent drawings. In a user interface of PCA apparatus 250 and RCA apparatus 252 in a practical embodiment, similar markings, and also flags, color coding, different shapes and the like can be used to distinguish many different subsets of the wafers; para. [0091, 0124]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filling date of the claimed invention to modify the combination of Beauchesne and Freese to include the feature of Ypma. One would have been motivated to make this modification because it enables a user to visually assess the anomaly landscape in a single view, which improves triage and investigation efficiency.

Claims 9 and 17 are similar in scope to Claim 1 and are rejected under a similar rationale.


6.	Claims 2, 10, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Beauchesne, Freese, Ypma, and further in view of Saeki et al. (U.S. Patent Application Pub. No. US 20230281197 A1).

Claim 2: Beauchesne, Freese, and Ypma teach the computer-implemented method of claim 1. Beauchesne further teaches comprising: constructing a plurality of interactions based on the plurality of normalized variables (i.e. the anomaly detection pipeline may use more than one outlier detection model for each input dataset 1124. For example, each input feature vector may be examined by several machine learned models to obtain a multiple model results, and these results may be aggregated 1134 using an aggregation technique (e.g. a formula or another ML model) to obtain an overall outlier metric or score for the feature vector; col. 19, lines 34-49), wherein the plurality of interactions comprises the at least one interaction (i.e. the anomaly detection pipeline may use more than one outlier detection model for each input dataset 1124. For example, each input feature vector may be examined by several machine learned models to obtain a multiple model results, and these results may be aggregated 1134 using an aggregation technique (e.g. a formula or another ML model) to obtain an overall outlier metric or score for the feature vector; col. 19, lines 34-49); selecting a set of top m interactions from the plurality of interactions based on of their corresponding interaction values (i.e. the anomaly detection system may use the same models for all observation categories, but use statistical techniques to generate an outlier score that is specific to each category. For example, the system may use a single model to produce raw outlier scores for all observation categories. However, a raw outlier score may then be ranked against only the scores of observations in the same category, to obtain a percentile ranking of the observation within the category. In some embodiments, the outlier scores may be normalized (e.g. reduced to a value between 0 and 1) so that they can be compared across all categories. In some embodiments, detected outliers for each category may be ranked, and only a specified number of highest ranked outliers are outputted as detected anomalies (e.g. anomalous processes or hosts). In some embodiments, the manner that the aggregated score is calculated may be configurable using a configuration interface of the anomaly detection system; col. 20 lines 13-29); and detecting the set of anomalies that correspond to the set of top m interactions (i.e. At operation 2340, a specified number of processes with the highest outlier metric rankings are selected and outputted as detected anomalous processes. For example, the anomaly detection system may identify the detected anomalous process on a GUI or in an alert. In some embodiments, such output may be used by security analysts at a SOC to assign priority to individual processes or hosts for investigative actions. By outputting only a specified number of anomalous processes, the anomaly detection system does not overwhelm security analysts with a huge volume of anomalies during the hunt process; col. 29, lines 8-19).
	Beauchesne does not explicitly teach a variance.
	However, Saeki teaches selecting a set of top m interactions from the plurality of interactions based on a variance of their corresponding interaction values (i.e. The record index generating unit 5313 determines whether or not the variance calculated in step S4004 satisfies a selecting condition 3. If it satisfies the selecting condition 3, the procedure advances to step S4006, or otherwise the procedure advances to step S4007. The selecting condition 3 is, for example, that the variance is greater than or equal to a threshold or is greater than the threshold. The selecting condition 3 may be, for example, that the variance is within the top N (N is a natural number of 1 or more), or the like; para. [0694]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filling date of the claimed invention to modify the combination of Beauchesne, Freese, and Ypma to include the feature of Saeki. One would have been motivated to make this modification because it improves the efficiency and relevance of the analysis.

Claims 10 and 18 are similar in scope to Claim 2 and are rejected under a similar rationale.

7.	Claims 3, 11, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Beauchesne, Freese, Ypma, and further in view of Davis et al. (U.S. Patent Application Pub. No. US 20230003721 A1).

Claim 3: Beauchesne, Freese, and Ypma teach the computer-implemented method of claim 1. Beauchesne further teaches comprising: in response to determining that the first anomaly score variable of the plurality of anomaly score variables fails to follow a normal distribution (i.e. the anomaly detection system may use the same models for all observation categories, but use statistical techniques to generate an outlier score that is specific to each category. For example, the system may use a single model to produce raw outlier scores for all observation categories. However, a raw outlier score may then be ranked against only the scores of observations in the same category, to obtain a percentile ranking of the observation within the category. In some embodiments, the outlier scores may be normalized (e.g. reduced to a value between 0 and 1) so that they can be compared across all categories; col. 20, lines 14-29), applying a transformation function to the first anomaly score variable to transform the first anomaly score variable into the first normalized variable (i.e. At operation 2330, the processes are ranked based on their determined outlier metrics. In some embodiments, the outlier metrics of the processes may be normalized so that they can be compared across process categories. For example, a particular process's outlier metric may be normalized to indicate how extreme of an outlier it is within its own category. In some embodiments, all observed processes collected for a given observation period are ranked according to their outlier metrics, so that the highest ranked processes are the most extreme outliers within their respective categories; col. 28 line 64 to col. 29 line 7); and generating a normalized plot using the first normalized variable and the second normalized variable (i.e. FIG. 19 illustrates the models' ability to identify outliers determined using this evaluation strategy (n=1000, k=87). FIG. 19 depicts a histogram of anomaly scores, where the x-axis 1910 is the normalized anomaly scores, and the y-axis indicates an observation count in each score bucket; col. 25, lines 45-60).
Beauchesne does not explicitly teach generating a normalized scatter plot using the first normalized variable and the second normalized variable.
However, Davis teaches generating a normalized scatter plot using the first normalized variable and the second normalized variable (fig. 4, scatter plots of normalized Z-scores; para. [0055]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filling date of the claimed invention to modify the combination of Beauchesne, Freese, and Ypma to include the feature of Davis. One would have been motivated to make this modification because it allows relationships between the two variables can be visually analyzed.

Claims 11 and 19 are similar in scope to Claim 3 and are rejected under a similar rationale.

8.	Claims 4-5, 12-13, and 20-21 are rejected under 35 U.S.C. 103 as being unpatentable over Beauchesne, Freese, Ypma, Davis, and further in view of Cao et al. (U.S. Patent Application Pub. No. US 20200372383 A1).

Claim 4: Beauchesne, Freese, Ypma, and Davis teach the computer-implemented method of claim 3. Beauchesne further teaches comprising: performing a principal component analysis (PCA) transformation (i.e. the dimensionality reduction stage 1120 may employ a number of dimensionality reduction techniques, such as PCA, NMF, or an Auto Encoder, etc. In some embodiments, other types of dimensionality reduction techniques may also be used; col. 19, lines 1-5); detecting a different set of anomalies; combining the different set of anomalies with the set of anomalies to create a combined set of anomalies (i.e. At operation 2328, the anomaly detection system determines an outlier metric for the process within the process's assigned category based on the feature vector and using an ensemble of outlier detection models trained using unsupervised machine learning techniques. In some embodiments, the individual results of the outlier detection models are aggregated using a weighted average formula, a voting scheme, an aggregator model, or some other aggregation technique. In some embodiments, the outlier detection models in the ensemble may include an Isolation Forest model or a One-Class SVM. In some embodiments, other types of models such as classifiers trained using automatically labeled data may also be used; col. 28, lines 45-60); and transmitting the combined set of anomalies to the user (i.e. At operation 2340, a specified number of processes with the highest outlier metric rankings are selected and outputted as detected anomalous processes. For example, the anomaly detection system may identify the detected anomalous process on a GUI or in an alert. In some embodiments, such output may be used by security analysts at a SOC to assign priority to individual processes or hosts for investigative actions. By outputting only a specified number of anomalous processes, the anomaly detection system does not overwhelm security analysts with a huge volume of anomalies during the hunt process; col. 29, lines 8-19).
Beauchesne does not explicitly teach performing a principal component analysis (PCA) transformation on the normalized scatter plot based on a set of top n components from the PCA transformation.
However, Cao teaches performing a principal component analysis (PCA) transformation on the normalized scatter plot based on a set of top n components from the PCA transformation (i.e. After the dataset is normalized, the top four principle components using singular value decomposition (SVD), i.e., principal components pca 1, pca 2, pca 3 and pca 4, are estimated and taken to re-project all the samples in the dataset and shown in FIG. 3. The positive samples 19 are marked with white stars and the negative samples 21 are marked with gray dots; para. [0033]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filling date of the claimed invention to modify the combination of Beauchesne, Freese, Ypma, and Davis to include the feature of Cao. One would have been motivated to make this modification because it ensures a more robust, accurate, and comprehensive anomaly process.

Claim 5: Beauchesne, Freese, Ypma, Davis, and Cao teach the computer-implemented method of claim 4. Beauchesne further teaches comprising: detecting a first subset of the different set of anomalies based on a first component from the set of top n components; detecting a second subset of the different set of anomalies based on a second component from the set of top n components; and combining the first subset of the different set of anomalies with the second subset of the different set of anomalies into the different set of anomalies (i.e. the anomaly detection pipeline may use more than one outlier detection model for each input dataset 1124. For example, each input feature vector may be examined by several machine learned models to obtain a multiple model results, and these results may be aggregated 1134 using an aggregation technique (e.g. a formula or another ML model) to obtain an overall outlier metric or score for the feature vector; col. 19, lines 33-41).

Claims 12-13 and 20-21 are similar in scope to Claims 4-5 and are rejected under a similar rationale.

9.	Claims 6-7, 14-15, and 22-23 are rejected under 35 U.S.C. 103 as being unpatentable over Beauchesne, Freese, Raghuramu, Davis, Cao, and further in view of Salunke et al. (U.S. Patent Application Pub. No. US 20190373007 A1).

Claim 6: Beauchesne, Freese, Ypma, Davis, and Cao teach the computer-implemented method of claim 4. Beauchesne further teaches comprising: creating a combined anomalies plot based on the combined set of anomalies, wherein the combined anomalies plot comprises a plurality of data points that graphically identifies the combined set of anomalies; transmitting the combined anomalies plot to the user (i.e. Regarding specificity of the different models in Scenario 1, FIG. 20 illustrates the results for Isolation Forest, FIG. 21 illustrates the results for Bootstrapped Random Forest, and FIG. 22 illustrates the results for Bootstrapped KNN to show the varying performances across models, as well as across the number of features. In these graphs, the x-axis 2010, 2110, and 2210 represent the dimensionality of the data, the y-axis 2020, 2120, and 2220 represent the anomaly scores of the from the models as determined during model evaluation; col. 25, lines 60-67).
Beauchesne does not explicitly teach displaying, to the user, a set of rules utilized in the determination that the selected data point is an anomaly.
However, Salunke teaches displaying, to the user, a set of rules utilized in the determination that the selected data point is an anomaly (i.e. an interactive visualization may allow a user to click or otherwise select temporal regions of a graph to view more details about an anomaly. For example, responsive to an anomaly being detected, an initial chart may be displayed with a temporal region being highlighted where an anomaly was detected. Additional details about the anomaly may be stored in data repository 140 without being initially displayed. Responsive to clicking on the temporal region, the system may access the additional details from data repository 140 and display them to the end user. The additional details may give more specifics about the cause of the anomaly. For instance, if CPU utilization on a target host crosses an upper limit, additional details about the demands (e.g., the number of executions, transactions, user calls, etc.) on the target host may be presented; para. [0094]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filling date of the claimed invention to modify the combination of Beauchesne, Freese, Davis, Ypma, and Cao to include the feature of Salunke. One would have been motivated to make this modification because it provides valuable insights into how anomalies are detected.

Claim 7: Beauchesne, Freese, Ypma, Davis, Cao, and Salunke teach the computer-implemented method of claim 6. Beauchesne further teaches wherein at least one of the set of rules comprises a PCA rule based on at least one of the top n components (i.e. the anomaly detection pipeline manager 2520 may be implemented to perform at least the following operational steps: (1) access or receive raw data for multiple distinct processes (e.g., software processes) executing or running across multiple distinct hosts, (2) perform pre-processing (e.g., binarization or TF-IDF encoding of the data), (3) perform dimensionality reduction (e.g., using PCA, NMF, or Auto Encoder), (4) invoke one or more machine learning models (e.g., as shown in FIG. 11) to produce respective anomaly scores or results, and (5) aggregate the models' scores or results (e.g., using median, mean, or some other type of aggregation technique) to generate an aggregated score for individual observations of processes or hosts; col. 27, lines 42-55).

Claims 14-15, and 22-23 are similar in scope to Claims 6-7 and are rejected under a similar rationale.

10.	Claims 8, 16, and 24 are rejected under 35 U.S.C. 103 as being unpatentable over Beauchesne, Freese, Ypma, Davis, Cao, Salunke, and further in view of Goldstein et al. (A Comparative Evaluation of Unsupervised Anomaly Detection Algorithms for Multivariate Data, Published: April 19, 2016, pages 1-31).

Claim 8: Beauchesne, Freese, Ypma, Davis, Cao, and Salunke teach the computer-implemented method of claim 6. Beauchesne further teaches adding the set of original anomalies into the combined anomalies plot (i.e. Regarding specificity of the different models in Scenario 1, FIG. 20 illustrates the results for Isolation Forest, FIG. 21 illustrates the results for Bootstrapped Random Forest, and FIG. 22 illustrates the results for Bootstrapped KNN to show the varying performances across models, as well as across the number of features. In these graphs, the x-axis 2010, 2110, and 2210 represent the dimensionality of the data, the y-axis 2020, 2120, and 2220 represent the anomaly scores of the from the models as determined during model evaluation; col. 25, lines 60-67).
Beauchesne does not explicitly teach identifying a set of original anomalies from the plurality of anomaly score variables.
However, Goldstein teaches identifying a set of original anomalies from the plurality of anomaly score variables (i.e. anomaly scores reduced by models before any dimensionality reduction; pages 1-17).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filling date of the claimed invention to modify the combination of Beauchesne, Freese, Ypma, Davis, Cao, and Salunke to include the feature of Goldstein. One would have been motivated to make this modification because it ensures the initial anomalies are captured and considered in the final analysis.

Claims 16 and 24 are similar in scope to Claim 8 and are rejected under a similar rationale.

11.	Claim 25 is rejected under 35 U.S.C. 103 as being unpatentable over Beauchesne in view of Cao, in view of Freese, and further in view of Ypma.

Claim 25: Beauchesne teaches a computer-implemented method comprising: 
generating a plurality of anomaly score variables (i.e. the anomaly detection pipeline may use more than one outlier detection model for each input dataset 1124. For example, each input feature vector may be examined by several machine learned models to obtain a multiple model results, and these results may be aggregated 1134 using an aggregation technique (e.g. a formula or another ML model) to obtain an overall outlier metric or score for the feature vector. In some embodiments, the formula for computing the aggregated score may be configurable by an administrator. In some embodiments, the formula may be automatically adjusted by the anomaly detection system. For example, the weights assigned to individual model results may be periodically tuned by the system based on the recent performance of the models, or based on how far the results of one model deviate from the others; col. 19, lines 34-49) using a plurality of unsupervised models (i.e. At operation 2328, the anomaly detection system determines an outlier metric for the process within the process's assigned category based on the feature vector and using an ensemble of outlier detection models trained using unsupervised machine learning techniques. In some embodiments, the individual results of the outlier detection models are aggregated using a weighted average formula, a voting scheme, an aggregator model, or some other aggregation technique; col. 28, lines 45-63) based on a set of data records (i.e. The process begins at operation 2310, where observation records are received. The observation records (e.g. observation record 130 of FIG. 1A) may include metadata about processes that were executed on hosts in a client network monitored by the anomaly detection system. In some embodiments, data collection agents may be deployed on hosts in the client network in order to gather the observation records; col. 27 line 63 to col. 28 line 3);
normalizing the plurality of anomaly score variables into a plurality of normalized variables (i.e. At operation 2330, the processes are ranked based on their determined outlier metrics. In some embodiments, the outlier metrics of the processes may be normalized so that they can be compared across process categories. For example, a particular process's outlier metric may be normalized to indicate how extreme of an outlier it is within its own category. In some embodiments, all observed processes collected for a given observation period are ranked according to their outlier metrics, so that the highest ranked processes are the most extreme outliers within their respective categories; col. 28 line 64 to col. 29 line 7); 
performing a principal component analysis (PCA) transformation between multiple models (i.e. At operation 2328, the anomaly detection system determines an outlier metric for the process within the process's assigned category based on the feature vector and using an ensemble of outlier detection models trained using unsupervised machine learning techniques. In some embodiments, the individual results of the outlier detection models are aggregated using a weighted average formula, a voting scheme, an aggregator model, or some other aggregation technique; col. 28, lines 45-63) outputting the plurality of variables (i.e. the dimensionality reduction stage 1120 may employ a number of dimensionality reduction techniques, such as PCA, NMF, or an Auto Encoder, etc. In some embodiments, other types of dimensionality reduction techniques may also be used; col. 19, lines 1-5), wherein the PCA transformation indicates a set of top n components (i.e. the anomaly detection system may use the same models for all observation categories, but use statistical techniques to generate an outlier score that is specific to each category. For example, the system may use a single model to produce raw outlier scores for all observation categories. However, a raw outlier score may then be ranked against only the scores of observations in the same category, to obtain a percentile ranking of the observation within the category. In some embodiments, the outlier scores may be normalized (e.g. reduced to a value between 0 and 1) so that they can be compared across all categories. In some embodiments, detected outliers for each category may be ranked, and only a specified number of highest ranked outliers are outputted as detected anomalies (e.g. anomalous processes or hosts). In some embodiments, the manner that the aggregated score is calculated may be configurable using a configuration interface of the anomaly detection system; col. 20 lines 13-29), further comprising:
performing a principal component analysis on features of the plurality of anomaly score variables (i.e. the number of features in the feature vector are reduced using a dimensionality reduction technique. In some embodiments, the dimensionality reduction technique uses a machine learning model that was trained using an unsupervised machine learning technique. In some embodiments, the dimensionality reduction technique may be one or more of PCA, NMF, or an Auto Encoder. The result of the dimensionality reduction operation is an input dataset of smaller feature vectors that are ready for consumption by the outlier detection models; col. 28, lines 35-44) with field (i.e. FIG. 3A is a graph 300 that shows a distribution of the number of features 320 that were extracted for many observed leaves (e.g. process categories 120), according to one study. The x-axis 310 shows the frequency count of leaves that had a particular number of features. As shown, the anomaly detection system generated some leaves with large numbers of features (e.g. greater than 100 process features). This means that the matrix of feature vectors (e.g. matrix 200) is a fairly wide and sparse matrix. However, as shown in FIG. 3B, for most process categories, with their number of features shown in the x-axis 340, the “rank” of the matrix (e.g. the linearly independent features of the process category), shown in the y-axis 340, are much smaller. This result thus suggests that the feature vectors could benefit greatly from dimensionality reduction; col. 9, lines 12-26); 
detecting a set of anomalies from the PCA transformation based on the set of top n components (i.e. the anomaly detection system may use the same models for all observation categories, but use statistical techniques to generate an outlier score that is specific to each category. For example, the system may use a single model to produce raw outlier scores for all observation categories. However, a raw outlier score may then be ranked against only the scores of observations in the same category, to obtain a percentile ranking of the observation within the category. In some embodiments, the outlier scores may be normalized (e.g. reduced to a value between 0 and 1) so that they can be compared across all categories. In some embodiments, detected outliers for each category may be ranked, and only a specified number of highest ranked outliers are outputted as detected anomalies (e.g. anomalous processes or hosts). In some embodiments, the manner that the aggregated score is calculated may be configurable using a configuration interface of the anomaly detection system; col. 20 lines 13-29);
creating a combined anomalies visualization based on the detected set of anomalies; displaying the created combined anomalies visualization to a user (i.e. At operation 2340, a specified number of processes with the highest outlier metric rankings are selected and outputted as detected anomalous processes. For example, the anomaly detection system may identify the detected anomalous process on a GUI or in an alert. In some embodiments, such output may be used by security analysts at a SOC to assign priority to individual processes or hosts for investigative actions. By outputting only a specified number of anomalous processes, the anomaly detection system does not overwhelm security analysts with a huge volume of anomalies during the hunt process; col. 29, lines 8-19); and 
one or more anomalies in the displayed visualization (i.e. FIG. 13 is a graph 1300 that illustrates a projection of process datapoints analyzed by a machine learning anomaly detection system, according to some embodiments. In one study, when the anomaly detection pipeline 1100 was applied on a real dataset, scores are obtained on each host of a client network for a hunt. The datapoint for each process on each host are projected into a two-dimensional space for visualization purposes (e.g., as shown in FIG. 13). As shown in the figure, the datapoints are projected into dimensions 1310 and 1320 in the two-dimensional space, as determined using T-distributed Stochastic Neighbor Embedding (t-SNE). As shown in the graph 1300, there is high density in the center of the graph that decreases toward the edge of the graph, which indicates that most of the observed processes exhibit fairly common behavior, with only a few outliers. In some embodiments, the anomaly detection system may implement a graphical user interface that provides the results of graph 1300 as an aid to security analysts in the hunt process; col. 22, lines 32-48) based on the performed principal component analysis (i.e. Machine learning models can output suggestions for what a security analyst should prioritize for further investigation and/or security operations. In order to provide the security analyst with a starting point for where to start investigating, it would be beneficial to provide an explanation of why a process or piece of data was flagged as an anomaly. Accordingly, in some embodiments, an anomaly detection system may implement an anomaly detection model interpreter (e.g. model interpreter 2526 of FIG. 25) that quantifies the effect of each feature individually on the anomaly score; col. 34, lines 41-50).
Beauchesne does not explicitly teach performing a principal component analysis (PCA) transformation on the plurality of normalized variables; the variables with oblong fields; wherein the combined anomalies visualization comprises a first axis of the first variable and a second axis of the second variable; highlighting one or more anomalies.
However, Cao teaches performing a principal component analysis (PCA) transformation on the plurality of normalized variables, wherein the PCA transformation indicates a set of top n components; detecting a set of anomalies from the PCA transformation based on the set of top n components (i.e. After the dataset is normalized, the top four principle components using singular value decomposition (SVD), i.e., principal components pca 1, pca 2, pca 3 and pca 4, are estimated and taken to re-project all the samples in the dataset and shown in FIG. 3. The positive samples 19 are marked with white stars and the negative samples 21 are marked with gray dots; para. [0033]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filling date of the claimed invention to modify the invention of Beauchesne to include the feature of Cao. One would have been motivated to make this modification because it ensures a more robust, accurate, and comprehensive anomaly process.
However, Freese teaches performing a principal component analysis on features of the anomaly score variables with oblong fields (i.e. The statistical techniques currently used in fraud detection outlier models cannot automatically or accurately approximate a monotonically increasing or decreasing score value in the presence of multiple variables when used by themselves without further ‘transformations”. Transformation is here defined as the act of changing or modifying a mathematical expression, such as a Z-Score or group of Z-Score values, Quartile Method results, Cluster Analysis Outcomes, or Principal Component Analysis Output, into another single, scalar expression, such as a “fraud detection score”. This transformed value, the fraud detection score, would be one value that represents the overall risk of fraud, according to a mathematical rule or formula. For example, a Z-Score transformation is the converting or transforming the value of the Z-Scores for 10 variables in a fraud detection outlier model into one single value, which represents overall fraud risk. Common transformations also include collapsing multiple Z-Scores (variables) into individual clusters or dimensions, utilizing Principal Components Analysis, in order to reduce false-positives. Both of these approaches further perpetuate the error caused by skewed or non-normal distributions; para. [0059]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filling date of the claimed invention to modify the combination of Beauchesne and Cao to include the feature of Freese. One would have been motivated to make this modification because it ensures that anomalies related to those elongated distributions are accurately detected and not misclassified as normal data.
However, Ypma teaches creating a combined anomalies visualization based on the detected set of anomalies (i.e. In FIG. 10(c), additional information can be obtained by plotting the product units against two or more of the identified component vectors. In this illustration, a 2-dimensional plot is shown, with axes corresponding to the coefficients of component vectors PC1 and PC2, that were illustrated in FIGS. 10(a) and (b). This illustration, which may correspond to a printed or displayed report of the apparatus 250, effectively projects all the points in the multidimensional space onto a plane defined by the two component vectors PC1, PC2; para. [0126]), wherein the combined anomalies visualization comprises a first axis of the first variable and a second axis of the second variable (i.e. In FIG. 10(c), additional information can be obtained by plotting the product units against two or more of the identified component vectors. In this illustration, a 2-dimensional plot is shown, with axes corresponding to the coefficients of component vectors PC1 and PC2, that were illustrated in FIGS. 10(a) and (b). This illustration, which may correspond to a printed or displayed report of the apparatus 250, effectively projects all the points in the multidimensional space onto a plane defined by the two component vectors PC1, PC2; para. [0126]); displaying the created combined anomalies visualization to a user (i.e. FIG. 16, visualization module 1202 provides a display 1260 in which wafers are plotted against two of the identified component vectors PC1 and PC2; para. [0126, 0173]); and highlighting one or more anomalies in the displayed visualization based on the performed principal component analysis (i.e. Points identified as being of interest will be distinguished by their black color in this drawing and the following drawings, in contrast to the open circles used for other points. The open and closed circles used herein are merely to present a very simple example, and one that is compatible with the requirements of patent drawings. In a user interface of PCA apparatus 250 and RCA apparatus 252 in a practical embodiment, similar markings, and also flags, color coding, different shapes and the like can be used to distinguish many different subsets of the wafers; para. [0091, 0124]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filling date of the claimed invention to modify the combination of Beauchesne, Cao, and Freese to include the feature of Ypma. One would have been motivated to make this modification because it enables a user to visually assess the anomaly landscape in a single view, which improves triage and investigation efficiency.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure. 
Chen et al. (Pub. No. US 20220288075 A1), to visualize the sample distribution of the multiple variate patterns, principal component analysis (PCA) was performed. Two principal components (PC1 and PC2) were calculated to build the unsupervised scatter plot.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action.  Accordingly, THIS ACTION IS MADE FINAL.  See MPEP § 706.07(a).  Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).  
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action.  In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.  In no event, however, will the statutory period for reply expire later than SIX MONTHS from the date of this final action. 
It is noted that any citation to specific pages, columns, lines, or figures in the prior art references and any interpretation of the references should not be considered to be limiting in any way.  A reference is relevant for all it contains and may be relied upon for all that it would have reasonably suggested to one having ordinary skill in the art.  In re Heck, 699 F.2d 1331, 1332-33, 216 U.S.P.Q. 1038, 1039 (Fed. Cir. 1983) (quoting In re Lemelson, 397 F.2d 1006, 1009, 158 U.S.P.Q. 275, 277 (C.C.P.A. 1968)).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to TAN TRAN whose telephone number is (303)297-4266.  The examiner can normally be reached on Monday - Thursday - 8:00 am - 5:00 pm MT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Matt Ell can be reached on 571-270-3264.  The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system.  Status information for published applications may be obtained from either Private PAIR or Public PAIR.  Status information for unpublished applications is available through Private PAIR only.  For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/TAN H TRAN/Primary Examiner, Art Unit 2141
Read full office action
Prosecution Timeline

Show 15 earlier events
Dec 19, 2025
Applicant Interview (Telephonic)
Jan 07, 2026
Response Filed
Jan 11, 2026
Examiner Interview Summary
Feb 24, 2026
Final Rejection mailed — §103
Mar 18, 2026
Interview Requested
Mar 31, 2026
Applicant Interview (Telephonic)
Apr 10, 2026
Examiner Interview Summary
Apr 22, 2026
Response after Non-Final Action
Precedent Cases

Applications granted by this same examiner with similar technology

18/048,279
Patent 12639613
AUTOMATION DESIGN FOR ACHIEVING LONG TERM STABLE OPERATION OF QUANTUM COMPUTERS
3y 7m to grant Granted May 26, 2026
17/979,488
Patent 12633211
TRAFFIC ACCIDENT PREDICTION SYSTEMS AND METHODS
3y 6m to grant Granted May 19, 2026
17/670,443
Patent 12594668
BRAIN-LIKE DECISION-MAKING AND MOTION CONTROL SYSTEM
4y 1m to grant Granted Apr 07, 2026
17/198,198
Patent 12579420
Analog Hardware Realization of Trained Neural Networks
5y 0m to grant Granted Mar 17, 2026
17/199,407
Patent 12579421
Analog Hardware Realization of Trained Neural Networks
5y 0m to grant Granted Mar 17, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.
Strategy Recommendation AI-generated — please review before filing

Get a prosecution strategy drawn from examiner precedents, rejection analysis, and claim mapping.
Typically takes 5-10 seconds — AI-generated, attorney review required before filing
Prosecution Projections

4-5
Expected OA Rounds
60%
Grant Probability
92%
With Interview (+32.1%)
3y 6m (~0m remaining)
Median Time to Grant
High
PTA Risk
Based on 313 resolved cases by this examiner. Grant probability derived from career allowance rate.