Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant’s arguments filed 12/08/2025 have been fully considered.
Regarding the rejection of claims under 35 U.S.C. 103, Applicant submits that amendments to claim 1 overcome the prior art of record, and that Examiner may have agreed to this during the interview of 11/24/2025:
As discussed during the interview, the cited references fail to teach or suggest at least "generating ... a plurality of feature tuple sets based on the plurality of features and a plurality of non-constant interaction levels, wherein (i) the plurality of feature tuple sets comprise different feature counts defined by a plurality of different non-constant interaction levels ranging from a feature count of zero or one to a feature count equivalent to a number of the plurality of datasets," as recited by claim 1. Applicant's representative understood the Examiner to agree to this distinction during the interview.
Applicant further submits that:
The present application teaches a system for anomaly detection using a Taylor expansion over features and their interactions. Each "order" of the Taylor expansion corresponds to a feature tuple size. For example 0th order considers no features (global baseline anomaly level), 1st order considers single-feature effects, 2nd order considers pairwise interactions, 3rd order considers triplet features, and N-th order considers interactions across N features/datasets.
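For illustration only, the order-by-order tuple enumeration the Applicant describes can be sketched in Python (the function name and feature labels are hypothetical and are not part of the claims or the cited art):

```python
from itertools import combinations

def feature_tuple_sets(features):
    """Enumerate feature tuples for every interaction level 0..N:
    level 0 is the empty tuple (global baseline), level 1 covers
    single-feature effects, level 2 pairwise interactions, and so
    on up to the N-th order across all N features."""
    n = len(features)
    return {level: list(combinations(features, level)) for level in range(n + 1)}

sets_by_level = feature_tuple_sets(["f1", "f2", "f3"])
# level 2 holds the three pairwise interactions ("f1","f2"), ("f1","f3"), ("f2","f3")
```

Under this reading, the claimed range of feature counts, from zero or one up to the number of datasets, corresponds to the keys of the returned mapping.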
Examiner’s summation of the interview is given in the document posted 12/01/2025, in which Examiner states that “based on an initial review, the previous combination of references did not appear to teach all the limitations of the amended claim, in particular, the use of a Taylor series in the generation of feature tuple sets, although further search and consideration would be necessary.” Examiner notes that the amendment proposed in the interview read “generating, by one or more processors, a plurality of feature tuple sets using Taylor expansion based on the plurality of features and a plurality of non-constant interaction levels” (with the phrase “using Taylor expansion” underlined in the original); however, the amended claim as submitted does not include the underlined phrase.
Examiner submits that the portion of the amended claim referenced by the Applicant, “the plurality of feature tuple sets comprise different feature counts defined by a plurality of different non-constant interaction levels ranging from a feature count of zero or one to a feature count equivalent to a number of the plurality of datasets,” does not limit the generation of feature tuple sets to those produced by a Taylor expansion, but describes more generally that a plurality of feature tuple sets is generated, including sets of different feature counts, where each count lies in the range from zero or one to the number of available datasets. Further, while the amended claim describes a range for each feature count that could begin at zero, the claim does not require the inclusion of a feature tuple set with a feature count of zero in the plurality of feature tuple sets. The amended claim is taught by the previously cited prior art of record, as explained below.
The argument is therefore found unpersuasive.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-21 are rejected under 35 U.S.C. 103 as being unpatentable over Peng et al., “A Deep Multi-View Framework for Anomaly Detection on Attributed Networks,” 2020, https://ieeexplore.ieee.org/document/9162509 (hereafter Peng) in view of Shekhar et al., US Pre-Grant Publication No. 2021/0233080 (hereafter Shekhar) and Raman et al., US Pre-Grant Publication No. 2020/0242611 (hereafter Raman).
Regarding claim 1 and analogous claims 8 and 15:
Peng teaches:
(bold only) “generating, by one or more processors, a plurality of feature tuple sets based on the plurality of features and a plurality of non-constant interaction levels, wherein (i) the plurality of feature tuple sets comprise different feature counts defined by a plurality of different non-constant interaction levels ranging from a feature count of zero or one to a feature count equivalent to a number of the plurality of datasets and (ii) a feature tuple of at least one feature tuple set of the plurality of feature tuple sets comprises a first feature value of a first dataset of the plurality of datasets and a second feature value of a second dataset of the plurality of datasets;”: Peng, Fig. 1,
[image: Peng, Fig. 1 (media_image1.png) reproduced]
[showing a set of features f1-10, and four selections of those features, shown as perspectives 1-4, hence, generating, …, a plurality of feature tuple sets, based on the plurality of features and a plurality of non-constant interaction levels. These perspectives have varying numbers of features, the number being in the range of one to the total number of available features, hence, wherein (i) the plurality of feature tuple sets comprise different feature counts defined by a plurality of different non-constant interaction levels ranging from a feature count of zero or one to a feature count equivalent to a number of the plurality of datasets, non-constant interaction level interpreted as an indicator of the number of feature values in a tuple].
(bold only) “generating, by one or more processors, a predicted anomaly score based on a plurality of interaction level anomaly scores respectively generated for the plurality of feature tuple sets”: Peng, Fig. 3, “
[image: Peng, Fig. 3 (media_image2.png) reproduced]
”; Peng, section 4.3, paragraph 3, “Weighted Aggregation - giving each view-based vector a different weight and then summing them up, which means that each view has a distinct proportion in the final representation. To be more specific, the combined results Z ∈ ℝ^(n×h2) is selected from the following set:
[image: weighted aggregation equation from Peng, section 4.3 (media_image3.png)]
[hence, Z is generated using a plurality of feature tuple sets]”; Peng, section 4.4, paragraph 3, “For another, the attribute decoder aims at approximating the original node attributes from the encoded embeddings. To be more specific, we leverage a simple fully-connected layer to reconstruct the attribute information as follows:
[image: attribute reconstruction equation from Peng, section 4.4 (media_image4.png)]
[the attribute decoder produces output based on the original feature tuple sets, each associated, as shown above, with varying number of feature values, hence the output is based on a plurality of interaction level anomaly scores] ”; Peng, section 4.5, “After the iterative optimization processs [sic], the abnormal score for the ith node can be computed by
[image: abnormal score equation from Peng, section 4.5 (media_image5.png)]
where the first term and the second term report the degree of deviation in structure and attribute, respectively, and λ is a trade-off parameter. Since nodes with high scores are more likely to be anomalous, we can rank anomalies according to abnormal scores [abnormality score based on a difference between original feature tuple sets and their reconstructions from a weighted aggregate, hence, generating … a predicted anomaly score based on a plurality of interaction level anomaly scores respectively generated for the plurality of feature tuple sets].”
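Purely as an illustrative sketch of the scheme quoted above (toy data; the exact (1 − λ)/λ split between the structure and attribute terms is an assumption, since Peng's equation is reproduced here only as an image):

```python
def weighted_aggregate(views, alphas):
    """Combine per-view embedding vectors for one node as
    Z = sum_i alpha_i * Z_i, mirroring Peng's weighted aggregation
    (section 4.3)."""
    dim = len(views[0])
    return [sum(a * v[j] for a, v in zip(alphas, views)) for j in range(dim)]

def anomaly_score(structure_dev, attribute_dev, lam=0.5):
    """Trade off structure- and attribute-reconstruction deviation with a
    lambda parameter, as Peng's abnormal score (section 4.5) describes;
    the precise weighting is assumed for illustration."""
    return (1 - lam) * structure_dev + lam * attribute_dev

z = weighted_aggregate([[1.0, 2.0], [3.0, 4.0]], [0.25, 0.75])
score = anomaly_score(2.0, 4.0, lam=0.5)
```

Nodes with larger scores would rank as more anomalous, matching the quoted ranking step.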
“and wherein generating an interaction level anomaly score for a non-constant interaction level, of the plurality of different non-constant interaction levels, associated with the at least one feature tuple set comprises: determining a feature tuple anomaly score that indicates a measure of observed anomalous behavior associated with the feature tuple”: Peng, section 4.5, “After the iterative optimization processs [sic], the abnormal score for the ith node can be computed by
[image: abnormal score equation from Peng, section 4.5 (media_image5.png)]
where the first term and the second term report the degree of deviation in structure and attribute, respectively, and λ is a trade-off parameter. Since nodes with high scores are more likely to be anomalous, we can rank anomalies according to abnormal scores [determining a feature tuple anomaly score that indicates a measure of observed anomalous behavior associated with the feature tuple].”
“determining a feature tuple weight for the feature tuple that indicates an estimated contribution of the feature tuple to the predicted anomaly score; determining, for the at least one feature tuple set associated with the non-constant interaction level and comprising the feature tuple, a plurality of weighted feature tuple anomaly scores, wherein determining the plurality of weighted feature tuple anomaly scores comprises determining a first weighted feature tuple anomaly score for the feature tuple based at least in part on the feature tuple anomaly score and the feature tuple weight; and determining the interaction level anomaly score for the non-constant interaction level based at least in part on the plurality of weighted feature tuple anomaly scores”: Peng, Fig. 3, “
[image: Peng, Fig. 3 (media_image2.png) reproduced]
”; Peng, section 4.3, paragraph 3, “Weighted Aggregation - giving each view-based vector a different weight and then summing them up, which means that each view has a distinct proportion in the final representation [the final representation is used to determine feature set anomaly, hence, determining a feature tuple weight for the feature tuple that indicates an estimated contribution of the feature tuple to the predicted anomaly score]. To be more specific, the combined results Z ∈ ℝ^(n×h2) is selected from the following set:
[image: weighted aggregation equation from Peng, section 4.3 (media_image3.png)]
”; Peng, section 4.4, paragraph 3, “For another, the attribute decoder aims at approximating the original node attributes from the encoded embeddings. To be more specific, we leverage a simple fully-connected layer to reconstruct the attribute information as follows:
[image: attribute reconstruction equation from Peng, section 4.4 (media_image4.png)]
”; Peng, section 4.5, “After the iterative optimization processs [sic], the abnormal score for the ith node can be computed by
[image: abnormal score equation from Peng, section 4.5 (media_image5.png)]
where the first term and the second term report the degree of deviation in structure and attribute, respectively, and λ is a trade-off parameter [feature set anomaly determined by comparison to set regenerated from weighted composite, hence, determining, for the at least one feature tuple set associated with the non-constant interaction level and comprising the feature tuple, a plurality of weighted feature tuple anomaly scores, wherein determining the plurality of weighted feature tuple anomaly scores comprises determining a first weighted feature tuple anomaly score for the feature tuple based at least in part on the feature tuple anomaly score and the feature tuple weight, non-constant interaction level interpreted as an indicator of the number of feature values in a tuple]. Since nodes with high scores are more likely to be anomalous, we can rank anomalies according to abnormal scores [and determining the interaction level anomaly score for the non-constant interaction level based at least in part on the plurality of weighted feature tuple anomaly scores, as each non-constant interaction level is an indicator of feature count associated with a particular feature set, this is interpreted as determining the anomalousness for the particular feature set using multiple weighted feature set scores].”
Peng does not explicitly teach:
“A computer-implemented method comprising”
“extracting, by one or more processors, a plurality of features from a plurality of datasets respectively associated with a plurality of external computing entities”
(bold only) “generating, by one or more processors, a plurality of feature tuple sets based on the plurality of features and a plurality of non-constant interaction levels, wherein (i) the plurality of feature tuple sets comprise different feature counts defined by a plurality of different non-constant interaction levels ranging from a feature count of zero or one to a feature count equivalent to a number of the plurality of datasets and (ii) a feature tuple of at least one feature tuple set of the plurality of feature tuple sets comprises a first feature value of a first dataset of the plurality of datasets and a second feature value of a second dataset of the plurality of datasets.”
(bold only) “generating, by one or more processors, a predicted anomaly score based on a plurality of interaction level anomaly scores respectively generated for the plurality of feature tuple sets”
“wherein the predicted anomaly score indicates a likelihood that a user is represented by multiple identities within the plurality of datasets, as a super entity”
“and providing, by the one or more processors and using a communication network, the generated predicted anomaly score to at least one of the plurality of external computing entities”
Shekhar teaches:
“A computer-implemented method comprising”: Shekhar, paragraph 0114, “Each of the components 802-816 of the fraudulent transaction detection system 106 can include software, hardware, or both. For example, the components 802-816 can include one or more instructions stored on a computer readable storage medium and executable by processors of one or more computing devices, such as a client device or server device. When executed by the one or more processors, the computer-executable instructions of the fraudulent transaction detection system 106 can cause the computing device(s) to perform the methods [computer-implemented method] described herein.”
“extracting, by one or more processors, a plurality of features from a plurality of datasets respectively associated with a plurality of external computing entities” and (bold only) “generating, by one or more processors, a predicted anomaly score based on a plurality of interaction level anomaly scores respectively generated for the plurality of feature tuple sets”: Shekhar, paragraph 0114, “Each of the components 802-816 of the fraudulent transaction detection system 106 can include software, hardware, or both. For example, the components 802-816 can include one or more instructions stored on a computer readable storage medium and executable by processors [by one or more processors] of one or more computing devices, such as a client device or server device”; Shekhar, paragraph 0044, “As mentioned above, the system 100 includes the server(s) 102. The server(s) can generate, store, receive, and/or transmit data, including data regarding digital transactions. For example, the server(s) 102 can receive payment information related to payment of a product or service from one or more client devices (e.g., one or more of the client devices 110a-110n) [a plurality of datasets respectively associated with a plurality of external computing entities]. The server(s) 102 can further receive information related to digital identities associated with digital transactions”; Shekhar, paragraph 0046, “Additionally, the server(s) 102 can include the fraudulent transaction detection system 106. In particular, in one or more embodiments, the fraudulent transaction detection system 106 utilizes the server(s) 102 to identify digital identities that correspond to fraudulent transactions”; Shekhar, paragraph 0047, “For example, in one or more embodiments, the fraudulent transaction detection system 106, via the server(s) 102, identifies a plurality of digital identities corresponding to a plurality of digital transactions. 
The fraudulent transaction detection system 106 can, via the server(s) 102, generate a transaction map that includes edge connections between a plurality of nodes that correspond to the plurality of digital identities [extracting, by one or more processors, a plurality of features].”
“wherein the predicted anomaly score indicates a likelihood that a user is represented by multiple identities within the plurality of datasets, as a super entity”: Shekhar, paragraph 0022, “As further mentioned, in one or more embodiments, the fraudulent transaction detection system determines whether a digital identity is associated with a fraudulent transaction. Indeed, using a similarity probability between a pair of digital identities generated by the time dependent graph convolutional neural network, the fraudulent transaction detection system can determine that a digital identity from the pair of digital identities is associated with a fraudulent entity (e.g., a fraudulent user). More particularly, knowing a given node is associated with a fraudulent transaction or entity, the fraudulent transaction detection system can compare other nodes to the given node using similarity scores to determine if the other nodes are the same entity as the given node [wherein the predicted anomaly score indicates a likelihood that a user is represented by multiple identities within the plurality of datasets, as a super entity]. Accordingly, the fraudulent transaction detection system can determine that a digital transaction associated with that digital identity is a fraudulent transaction.”
“providing, by the one or more processors and using a communication network, the generated predicted anomaly score to at least one of the plurality of external computing entities”: Shekhar, paragraph 0056, “In some embodiments, the fraudulent transaction detection system 106 transmits a notification indicating that the digital identity corresponds to a fraudulent transaction. For example, the fraudulent transaction detection system 106 can transmit a notification to an administrator or manager of the online retail website from which the fraudulent transaction originated [providing, by the one or more processors and using a communication network, the generated predicted anomaly score to at least one of the plurality of external computing entities].”
Shekhar and Peng are analogous arts as they are both related to anomaly detection. It would have been obvious to a person having ordinary skill in the art prior to the effective filing date of the claimed invention to have combined the detection of shared identity in Shekhar with the teachings of Peng to arrive at the present invention, in order to better detect fraudulent activity, as stated in Shekhar, paragraph 0022, “More particularly, knowing a given node is associated with a fraudulent transaction or entity, the fraudulent transaction detection system can compare other nodes to the given node using similarity scores to determine if the other nodes are the same entity as the given node. Accordingly, the fraudulent transaction detection system can determine that a digital transaction associated with that digital identity is a fraudulent transaction.”
Raman teaches (bold only) “generating, by one or more processors, a plurality of feature tuple sets based on the plurality of features and a plurality of non-constant interaction levels, wherein (i) the plurality of feature tuple sets comprise different feature counts defined by a plurality of different non-constant interaction levels ranging from a feature count of zero or one to a feature count equivalent to a number of the plurality of datasets and (ii) a feature tuple of at least one feature tuple set of the plurality of feature tuple sets comprises a first feature value of a first dataset of the plurality of datasets and a second feature value of a second dataset of the plurality of datasets”: Raman, paragraph 0077, “For example, DR engine [by one or more processors] 410 may select the top five features recommended by each algorithm, or a subset of features that make to the top 50 percent of features as ranked by each algorithm [a feature tuple of at least one feature tuple set of the plurality of feature tuple sets comprises a first feature value of a first dataset of the plurality of datasets and a second feature value of a second dataset of the plurality of datasets]. In some examples, a user selects the selection criteria from user interface 205 using an I/O device 203. In some examples, DR engine 410 selects a maximum number of features, such as the top few features from each algorithm.”
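As a minimal sketch of the top-N-per-algorithm selection Raman's paragraph 0077 describes (the function name and feature labels are illustrative, not taken from Raman):

```python
def select_features(rankings, top_n=5):
    """Union of the top-N features recommended by each ranking algorithm,
    echoing Raman's 'top five features recommended by each algorithm'
    example; each inner list is assumed to be ordered best-first."""
    selected = set()
    for ranked in rankings:
        selected.update(ranked[:top_n])
    return selected
```

A caller would pass one ranked feature list per algorithm and receive the pooled selection.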
Raman and Peng are analogous arts as they are both related to anomaly detection. It would have been obvious to a person having ordinary skill in the art prior to the effective filing date of the claimed invention to have combined the feature selection schemes of Raman with the teachings of Peng to arrive at the present invention, in order to base fraud detection on the most relevant features, as stated in Raman, paragraph 0077, “For example, DR engine 410 may select the top five features recommended by each algorithm, or a subset of features that make to the top 50 percent of features as ranked by each algorithm. In some examples, a user selects the selection criteria from user interface 205 using an I/O device 203. In some examples, DR engine 410 selects a maximum number of features, such as the top few features from each algorithm.”
Regarding claim 2 and analogous claims 9 and 16:
Peng as modified by Shekhar and Raman teaches “the computer-implemented method of claim 1.”
Raman further teaches “wherein determining the feature tuple anomaly score comprises: determining a partial derivative measure of an anomaly distribution measure with respect to the feature tuple, and determining the feature tuple anomaly score based at least in part on the partial derivative measure”: Raman, paragraph 0059, “Strategy expansion engine 406 obtains strategy data 416 from initial strategy engine 404, and generates a modified strategy, which is identified and characterized by modified strategy data 316. The modified strategy may be generated based on the same set of features used to generate the initial strategy as identified by strategy data or may be based on a different set of features, as identified by training data 420. In some examples, the modified strategy is based on the application of one or more discrete stochastic gradient descent (DSGD) algorithms by DSGD engine 408”; Raman, paragraph 0071, “The gain for each ith dimension indicates a change in average bad probability of the action space after the dimension value is changed from Θi to Θi-ai. DSGD engine 408 evaluates, for each iteration, the gains of all thresholds Θ1, …, Θk according to equation (8), and updates the threshold Θi whose partial derivative is the largest [determining a partial derivative measure of an anomaly distribution measure with respect to the feature tuple, and determining the feature tuple anomaly score based at least in part on the partial derivative measure].”
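For illustration, one discrete coordinate update of the kind the quoted passage describes can be sketched as follows (the gain callback and step size are hypothetical stand-ins, not Raman's equation (8)):

```python
def dsgd_step(thresholds, gain, step=1.0):
    """Evaluate the gain of perturbing each threshold and update only the
    one with the largest gain, mirroring the quoted 'updates the threshold
    whose partial derivative is the largest' (Raman, paragraph 0071)."""
    gains = [gain(i, thresholds) for i in range(len(thresholds))]
    best = max(range(len(thresholds)), key=gains.__getitem__)
    updated = list(thresholds)
    updated[best] -= step  # Theta_i -> Theta_i - a_i
    return updated
```

Iterating this step moves one threshold at a time along the steepest discrete direction.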
Raman and Peng are both related to the same field of endeavor (anomaly detection). It would have been obvious to a person having ordinary skill in the art prior to the effective filing date of the claimed invention to have combined the use of gradient descent in Raman with the teachings of Peng to arrive at the present invention, in order to optimize parameter values in the fraud detection system, as stated in Raman, paragraph 0062, “DSGD engine 408 may generate optimal threshold values for the action space expansion in new strategy S' as follows.”
Regarding claim 3 and analogous claims 10 and 17:
Peng as modified by Shekhar and Raman teaches “the computer-implemented method of claim 1.”
Peng further teaches “wherein determining the interaction level anomaly score comprises: combining the plurality of weighted feature tuple anomaly scores for the feature tuple using a summation operation to generate the interaction level anomaly score”: Peng, Fig. 3, “
[image: Peng, Fig. 3 (media_image2.png) reproduced]
”; Peng, section 4.3, paragraph 3, “Weighted Aggregation - giving each view-based vector a different weight and then summing them up, which means that each view has a distinct proportion in the final representation. To be more specific, the combined results Z ∈ ℝ^(n×h2) is selected from the following set:
[image: weighted aggregation equation from Peng, section 4.3 (media_image3.png)]
[combining the plurality of weighted feature tuple anomaly scores for the feature tuple using a summation operation to generate the interaction level anomaly score]”.
Regarding claim 4 and analogous claims 11 and 18:
Peng as modified by Shekhar and Raman teaches “the computer-implemented method of claim 1.”
Raman further teaches:
“wherein determining a weighted feature tuple anomaly score of the plurality of weighted feature tuple anomaly scores for the feature tuple comprises: determining a first feature weight associated with the first feature value and a second feature weight associated with the second feature value based at least in part on the feature tuple weight”: Raman, paragraph 0083, “At this step, the transformed features from the feature transformation step are weighted based on characteristics of the transactions (e.g., a type of transaction) the transformed features are associated with. For example, to create a wider separation effect between good (e.g., not fraudulent) and bad (e.g., fraudulent) transactions, DR engine 410 may weigh the transformed features based on whether they are associated with a good, or bad, transaction. DR engine 410 generates a Multiplication Factor or Index Mj for the normalized features. Mj is defined as the bad (e.g., fraudulent) rate of ith bin for any normalized feature Xj [determining a first feature weight associated with the first feature value and a second feature weight associated with the second feature value based at least in part on the feature tuple weight].”
“determining a first weight deviation measure associated with the first feature value and a second weight deviation measure associated with the second feature value based at least in part on the first feature value, the second feature value, the first feature weight, and the second feature weight”: Raman, paragraph 0069, “For example, taking the initial strategy defined in equation (3) above, DSGD engine 408, may generate learning rates α1 = σ(C(x)), α2 = σ(x1), α3 = σ(x5), α4 = σ(x4), where σ stands for the standard deviation of the acting variable (e.g., feature) calculated based on the training set (e.g., training data 420) [determining a first weight deviation measure associated with the first feature value and a second weight deviation measure associated with the second feature value based at least in part on the first feature value, the second feature value, the first feature weight, and the second feature weight].”
“and determining the weighted feature tuple anomaly score based at least in part on the feature tuple anomaly score for the feature tuple and the first weight deviation measure and the second weight deviation measure”: Raman, paragraph 0069, “For example, taking the initial strategy defined in equation (3) above, DSGD engine 408, may generate learning rates α1 = σ(C(x)), α2 = σ(x1), α3 = σ(x5), α4 = σ(x4), where σ stands for the standard deviation of the acting variable (e.g., feature) calculated based on the training set (e.g., training data 420)”; Raman, paragraphs 0096-0097, “Proceeding to step 608, an intermediate strategy is generated based on applying at least one discrete stochastic gradient descent (DSGD) algorithm to the output of the trained classifier and the initial strategy. For example, DSGD engine 408 may apply on one or more discrete stochastic gradient descent algorithms to strategy data 416. At step 610, a new strategy is generated based on applying at least one dimensionality reduction (DR) algorithm to the output of the trained classifier and the intermediate strategy. For example, DR engine 410 may apply one or more dimensionality reduction algorithms to a strategy generated by DSGD engine 408 to provide modified strategy data 316. At step 612, a determination is made as to whether all fraudulent transactions of the transaction data were identified as fraud by the new strategy. For example, each transaction of the training data may be identified as fraudulent or not. Fraud detection computing device 102 may compare the fraud identification of each transaction to a fraud determination based on the new strategy [and determining the weighted feature tuple anomaly score based at least in part on the feature tuple anomaly score for the feature tuple and the first weight deviation measure and the second weight deviation measure].”
Raman and Peng are analogous arts as they are both related to anomaly detection. It would have been obvious to a person having ordinary skill in the art prior to the effective filing date of the claimed invention to have combined the feature selection and weighting schemes of Raman with the teachings of Shekhar to arrive at the present invention, in order to base fraud detection on the most relevant features, as stated in Raman, paragraph 0085, “Thus, feature bins with better odds ratios (probability of good over probability of bad) than others will have higher Multiplication Factors. In other words, in those feature bins with high Multiplication Factors, the transactions labelled as bad (e.g., fraudulent transactions) are better separated from the transactions labelled as good (e.g., non fraudulent transactions).”
Regarding claim 5 and analogous claims 12 and 19:
Peng as modified by Shekhar and Raman teaches “the computer-implemented method of claim 1.”
Peng further teaches “for a pth-level non-constant defined interaction level of the feature tuple, the feature tuple may comprise p feature values, where p is a positive integer”: Peng, Fig. 1,
[image: Peng, Fig. 1 (media_image1.png) reproduced]
[showing different perspectives of the same global feature set F, each perspective having p feature values, where p is positive, hence, for a pth-level non-constant defined interaction level of the feature tuple, the feature tuple may comprise p feature values, where p is a positive integer].
Regarding claim 6 and analogous claims 13 and 20:
Peng as modified by Shekhar and Raman teaches “the computer-implemented method of claim 5.”
Peng further teaches “wherein the feature tuple is associated with at least one of an ensemble, an ensemble element, or an ensemble element enumeration”: Peng, Fig. 1,
[image: Peng, Fig. 1 (media_image1.png) reproduced]
[showing perspectives as feature tuples, associated with global feature set F, interpreted as an ensemble, hence, wherein the feature tuple is associated with at least one of an ensemble, an ensemble element, or an ensemble element enumeration].
Regarding claim 7 and analogous claims 14 and 21:
Peng as modified by Shekhar and Raman teaches “the computer-implemented method of claim 5.”
Peng further teaches (bold only) “wherein the feature tuple is associated with an assigned weight value that is determined based on an output of processing the first feature value and the second feature value using a trained regression-based machine learning model”: Peng, section 4.3, paragraph 3, “Weighted Aggregation - giving each view-based vector a different weight and then summing them up, which means that each view has a distinct proportion in the final representation [the feature tuple is associated with an assigned weight value]. To be more specific, the combined results Z ∈ ℝ^(n×h2) is selected from the following set:
[image: weighted aggregation equation from Peng, section 4.3 (media_image3.png)]
when αi = 0, the ith view will be ignored. The aggregation weights {αi} for i = 1, …, k play a role similar to the attention mechanism, here we show three ways to obtain this set of weights: Randomly generate a vector e ∈ ℝ^k and then create the aggregation weights using the softmax function
[image: softmax weight equation from Peng (media_image6.png)]
Utilizing the error backpropagation algorithm, αi can be continuously optimized and updated [hence, the assigned weight value is determined based on an output of processing the first feature value and the second feature value using a trained regression-based machine learning model]. The resulting weights represent the optimal view aggregation ratio considered by the proposed model.”
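As a minimal sketch of the quoted softmax generation of aggregation weights (illustration only; it shows the normalization step, not the backpropagation training that the mapping above relies on):

```python
import math

def softmax_weights(e):
    """alpha_i = exp(e_i) / sum_j exp(e_j), as in the quoted softmax
    construction of aggregation weights from a raw vector e; subtracting
    the max is a standard numerical-stability step."""
    m = max(e)
    exps = [math.exp(x - m) for x in e]
    total = sum(exps)
    return [x / total for x in exps]
```

The resulting weights are positive and sum to one, so each view receives a distinct proportion in the aggregate.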
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Noto et al., “Anomaly Detection Using an Ensemble of Feature Models,” 2010, doi:10.1109/ICDM.2010.140, discloses a method for generating ensembles of features for training a machine learning model to determine whether data samples are in or out of a single class.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to VINCENT SPRAUL whose telephone number is (703) 756-1511. The examiner can normally be reached M-F 9:00 am - 5:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, MICHAEL HUNTLEY can be reached on (303) 297-4307. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/VAS/Examiner, Art Unit 2129
/MICHAEL J HUNTLEY/Supervisory Patent Examiner, Art Unit 2129