Prosecution Insights
Last updated: May 29, 2026
Application No. 18/309,755

METHOD FOR OUTLIER ROBUST SUBGROUP INFERENCE VIA CLUSTERING IN THE GRADIENT SPACE

Non-Final OA §101 §103
Filed
Apr 28, 2023
Examiner
HOANG, MICHAEL H
Art Unit
2122
Tech Center
2100 — Computer Architecture & Software
Assignee
Massachusetts Institute of Technology
OA Round
1 (Non-Final)
52%
Grant Probability
Moderate
1-2
OA Rounds
1y 4m
Est. Remaining
76%
With Interview

Examiner Intelligence

Grants 52% of resolved cases
52%
Career Allowance Rate
74 granted / 142 resolved
-2.9% vs TC avg
Strong +24% interview lift
Interview Lift: +24.3% (resolved cases with interview vs. without)
Typical timeline
4y 5m
Avg Prosecution
20 currently pending
Career history
163
Total Applications
across all art units

Statute-Specific Performance

§101
7.0%
-33.0% vs TC avg
§103
82.1%
+42.1% vs TC avg
§102
3.2%
-36.8% vs TC avg
§112
2.5%
-37.5% vs TC avg
Based on career data from 142 resolved cases. Tech Center averages are estimates.

Office Action

§101 §103
DETAILED ACTION This action is in response to the claims filed 04/28/2023 for Application number 18/309,755. Claims 1-20 are currently pending. Notice of Pre-AIA or AIA Status The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. Information Disclosure Statement The information disclosure statements (IDS) submitted on 04/28/2023 and 01/21/2026 are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the examiner. Claim Rejections - 35 USC § 101 35 U.S.C. 101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title. Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. Regarding claim 1, Step 1 Analysis: Claim 1 is directed to a process, which falls within one of the four statutory categories. Step 2A Prong 1 Analysis: Claim 1 recites, in part, The limitations of: for each data point in the classification dataset, using gradient space partitioning (GraSP) to identify a gradient representation of each data point by extracting an associated gradient of a logistic regression classification loss with respect to weights of a logistic regression This limitation as drafted, is a process that, under broadest reasonable interpretation, covers the recitation of mathematical calculations which falls within the “Mathematical concepts” grouping of abstract ideas. 
The limitation of: clustering the gradient representations to provide estimated subgroup labels can be considered to be an evaluation in the human mind This limitation as drafted, is a process that, under broadest reasonable interpretation, covers performance of the limitation in the mind which falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea. Step 2A Prong 2 Analysis: This judicial exception is not integrated into a practical application. In particular, the claim recites the additional elements - “a machine learning model”. Thus, this element in the claim is recited at a high level of generality such that it amounts to no more than mere instructions to apply the exception using a generic computer component. Please see MPEP 2106.05(f). Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim further recites: receiving a classification dataset wherein subgroups are unlabeled and outputting cluster assignments as the estimated subgroup labels. These limitations are mere data gathering and outputting steps and thus are insignificant extra-solution activities. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim as a whole is directed to an abstract idea. Step 2B Analysis: The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements of utilizing a machine learning model to perform the steps of the claimed process amount to no more than mere instructions to apply the exception using a generic computer component. 
Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Furthermore, the limitations of receiving a classification dataset wherein subgroups are unlabeled and outputting cluster assignments as the estimated subgroup labels are mere data gathering and outputting steps, and thus insignificant extra-solution activities, which are well-understood, routine, and conventional, as evidenced by MPEP §2106.05(d)(II)(I), “receiving or transmitting data over a network”. These limitations therefore remain insignificant extra-solution activity even upon reconsideration, and do not amount to significantly more. Even when considered in combination, these additional elements amount to mere instructions to apply the exception using generic computer components and insignificant extra-solution activity, which cannot provide an inventive concept. The claim is not patent eligible. Regarding claim 2, the rejection of claim 1 is further incorporated, and further, the claim recites: using an outlier-robust clustering algorithm to perform the clustering of the gradient representations. This limitation amounts to mere instructions to apply the judicial exception using a generic computer component. Please see MPEP 2106.05(f). The claim does not include any additional elements that amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception. The claim is not patent eligible. Regarding claim 3, the rejection of claim 1 is further incorporated, and further, the claim recites: wherein classes are labeled in the classification dataset. This limitation amounts to generally linking the judicial exception to a field of use. Please see MPEP 2106.05(h). The claim does not include any additional elements that amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception. 
The claim is not patent eligible. Regarding claim 4, the rejection of claim 1 is further incorporated, and further, the claim recites: learning group annotations and identifying outliers of the classification dataset. This limitation amounts to additional mental steps in addition to the judicial exception recited in the rejection of claim 1. The claim does not include any additional elements that amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception. The claim is not patent eligible. Regarding claim 5, the rejection of claim 1 is further incorporated, and further, the claim recites: training a robust classifier using the estimated subgroup labels. This limitation amounts to mere instructions to apply the judicial exception using a generic computer component. Please see MPEP 2106.05(f). The claim does not include any additional elements that amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception. The claim is not patent eligible. Regarding claim 6, the rejection of claim 5 is further incorporated, and further, the claim recites: applying distributionally robust optimization (DRO) to train the robust classifier. This limitation amounts to additional mathematical concepts in addition to the judicial exception identified in the rejection of claim 1. The claim does not include any additional elements that amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception. The claim is not patent eligible. Regarding claim 7, the rejection of claim 1 is further incorporated, and further, the claim recites: in response to receiving the classification dataset, applying a non-robust neural network classifier, wherein a last layer representation of the non-robust neural network classifier is extracted as dimension-reduced features. 
This limitation amounts to mere instructions to apply the judicial exception using a generic computer component. Please see MPEP 2106.05(f). The claim does not include any additional elements that amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception. The claim is not patent eligible. Regarding claim 8, the rejection of claim 5 is further incorporated, and further, the claim recites: wherein the gradient space partitioning is performed on the last-layer representation. This limitation amounts to additional mathematical concepts in addition to the judicial exception identified in the rejection of claim 1. The claim does not include any additional elements that amount to an integration of the judicial exception into a practical application, nor to significantly more than the judicial exception. The claim is not patent eligible. Regarding claim 9, it is substantially similar to a combination of claims 1, 2, 4, and 5 respectively, and is rejected in the same manner, the same art, and reasoning applying. Regarding claims 10-13, they are substantially similar to claims 3 and 6-8 respectively, and are rejected in the same manner, the same art, and reasoning applying. Regarding claim 14, it is substantially similar to claim 1 respectively, and is rejected in the same manner, the same art, and reasoning applying. Claim 14 additionally requires analysis for “A non-transitory computer readable storage medium tangibly embodying a computer readable program code having computer readable instructions that, when executed, causes a computer device to carry out a method of…” however this is an additional element that amounts to mere instructions to apply the judicial exception using a generic computer component. Please see MPEP 2106.05(f). Regarding claims 15-20, they are substantially similar to claims 3, 4 and 6-8 respectively, and are rejected in the same manner, the same art, and reasoning applying. 
Claim Rejections - 35 USC § 103 In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status. The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made. The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows: 1. Determining the scope and contents of the prior art. 2. Ascertaining the differences between the prior art and the claims at issue. 3. Resolving the level of ordinary skill in the pertinent art. 4. Considering objective evidence present in the application indicating obviousness or nonobviousness. Claims 1, 3, 5-8, 14, and 17-20 are rejected under 35 U.S.C. 103 as being unpatentable over Sohoni et al. ("No Subclass Left Behind: Fine-Grained Robustness in Coarse-Grained Classification Problems" (2022), hereinafter "Sohoni") and further in view of Mu et al. ("GRADIENTS AS FEATURES FOR DEEP REPRESENTATION LEARNING", hereinafter "Mu"). 
Regarding claim 1, Sohoni teaches A computer-implemented method for identifying relevant subgroups in a training dataset associated with a machine learning model, comprising: receiving a classification dataset wherein subgroups are unlabeled (“We propose George, a method to both measure and mitigate hidden stratification even when subclass labels are unknown.” [Abstract]); clustering the [gradient] representations to provide estimated subgroup labels (“Our approach relies on estimating unknown subclass labels by clustering a feature representation of the data.” [pg. 3, top para; note: Sohoni teaches clustering a representation.]); and outputting cluster assignments as the estimated subgroup labels (“To obtain a surrogate for this feature space, we leverage the empirical observation that feature representations of deep neural networks trained on a superclass task can carry information about unlabeled subclasses [41]. Next, to improve performance on these estimated subclasses, we minimize the maximum per-cluster average loss, by using the clusters as groups in the GDRO objective [48].” [pg. 5, §4, ¶1]). 
However fails to explicitly teach for each data point in the classification dataset, using gradient space partitioning (GraSP) to identify a gradient representation of each data point by extracting an associated gradient of a logistic regression classification loss with respect to weights of a logistic regression Mu teaches for each data point in the classification dataset, using gradient space partitioning (GraSP) to identify a gradient representation of each data point by extracting an associated gradient of a logistic regression classification loss with respect to weights of a logistic regression (“These features are gradients of the model parameters with respect to a task-specific loss given an input sample” [Abstract] … “With trivial modifications, our method can easily extend beyond ConvNets and classification, e.g., for a recurrent network as the backbone and/or for a regression task.” [pg. 3, §3., ¶1; See also Eq(1); Mu teaches that features are gradients]) It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Sohoni’s teachings by using gradient representation of each data point as taught by Mu. One would have been motivated to make this modification as Mu notes that with trivial modifications, the method could extend beyond ConvNets and classification and work for regression tasks. [pg. 3, §3, ¶1, Mu] Regarding claim 3, Sohoni/Mu teaches The computer-implemented method of claim 1, Sohoni teaches wherein classes are labeled in the classification dataset (“In real datasets, individual datapoints are typically described by multiple different attributes, yet often only a subset of these are captured by the class labels. For example, a dataset might consist of images labeled “cat” or “dog.”” [pg. 4, §3.1, ¶1]). Regarding claim 5, Sohoni/Mu teaches The computer-implemented method of claim 1, Sohoni teaches further comprising training a robust classifier using the estimated subgroup labels. 
(“Formally, we train a deep neural network L◦fθ to predict the superclass labels, where fθ : X → Rd is a parametrized “featurizer” and L : Rd → ∆B…outputs classification logits. We then cluster the features output by fθ for the data of each superclass into k clusters, where k is chosen automatically” [pg. 5-6, §4.1, ¶1]) Regarding claim 6, Sohoni/Mu teaches The computer-implemented method of claim 5, Sohoni teaches further comprising applying distributionally robust optimization (DRO) to train the robust classifier. (“We then exploit these estimated subclasses by training a new model to optimize worst-case performance over all estimated subclasses using group distributionally robust optimization (GDRO)” [pg. 2, ¶2]) Regarding claim 7, Sohoni/Mu teaches The computer-implemented method of claim 1, Sohoni teaches further comprising, in response to receiving the classification dataset, applying a non-robust neural network classifier, wherein a last layer representation of the non-robust neural network classifier is extracted as dimension-reduced features. (“The inputs are the datapoints and superclass labels. First, a model is trained with ERM on the superclass classification task. The activations of the penultimate layer are then dimensionality reduced, and clustering is applied to the resulting features to obtain estimated subclasses. Finally, a new model is trained using these clusters as groups for GDRO” [pg. 5, Figure 4 Caption]) Regarding claim 8, Sohoni/Mu teaches The computer-implemented method of claim 7, Sohoni teaches wherein the gradient space partitioning is performed on the last-layer representation. (“The inputs are the datapoints and superclass labels. First, a model is trained with ERM on the superclass classification task. The activations of the penultimate layer are then dimensionality reduced, and clustering is applied to the resulting features to obtain estimated subclasses. 
Finally, a new model is trained using these clusters as groups for GDRO” [pg. 5, Figure 4 Caption; note: Although Sohoni teaches performing dimensionality reduction on the last layer, the reference doesn’t teach gradient space partitioning, however as noted in claim 1, Mu teaches this feature thus when combined with Sohoni would teach the limitation as recited.]) Same motivation to combine the teachings of Sohoni/Mu as claim 1. Claim 14 recites features similar to claim 1 and is rejected for at least the same reasons therein. Claim 14 additionally requires A non-transitory computer readable storage medium tangibly embodying a computer readable program code having computer readable instructions that, when executed, causes a computer device to carry out a method of…(“However, we train on 4 GPUs instead of 1” [pg. 22, §CelebA, ¶1]) Regarding claim 17, it is substantially similar to claim 5 respectively, and is rejected in the same manner, the same art, and reasoning applying. Regarding claim 18, it is substantially similar to claim 6 respectively, and is rejected in the same manner, the same art, and reasoning applying. Regarding claim 19, it is substantially similar to claim 7 respectively, and is rejected in the same manner, the same art, and reasoning applying. Regarding claim 20, it is substantially similar to claim 8 respectively, and is rejected in the same manner, the same art, and reasoning applying. Claims 2, 4, 9-13, and 15-16 are rejected under 35 U.S.C. 103 as being unpatentable over Sohoni in view of Mu and further in view of Zhai et al. ("DORO: Distributional and Outlier Robust Optimization", hereinafter "Zhai"). Regarding claim 2, Sohoni/Mu teaches The computer-implemented method of claim 1, however fails to explicitly teach further comprising using an outlier-robust clustering algorithm to perform the clustering of the gradient representations. 
Zhai teaches further comprising using an outlier-robust clustering algorithm to perform the clustering of the gradient representations. (“To resolve this issue, we propose the framework of DORO, for Distributional and Outlier Robust Optimization. At the core of this approach is a refined risk function which prevents DRO from overfitting to potential outliers.” [Abstract]) It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Sohoni’s/Mu’s teachings by implementing the outlier robust algorithm of Zhai. One would have been motivated to make this modification as it prevents DRO from overfitting to potential outliers. [Abstract, Zhai] Regarding claim 4, Sohoni/Mu teaches The computer-implemented method of claim 1, however fails to explicitly teach further comprising learning group annotations and identifying outliers of the classification dataset. Zhai teaches further comprising learning group annotations (“For example, in an algorithmic fairness task, domains are demographic groups defined by a number of protected features such as race and sex.” [pg. 2, 2.1, ¶1]) and identifying outliers of the classification dataset. (“After some examination, we pinpoint one direct cause of this phenomenon: the vulnerability of DRO to outliers that widely exist in modern datasets.” [pg. 4, 3, ¶1]) Same motivation to combine the teachings of Sohoni/Mu/Zhai as claim 2. 
Regarding claim 9, Sohoni teaches A computer-implemented method for identifying relevant subgroups, in a presence of outliers, for training a classifier to be robust to the identified subgroups, comprising: receiving a classification dataset wherein subgroups are unlabeled (“We propose George, a method to both measure and mitigate hidden stratification even when subclass labels are unknown.” [Abstract]); clustering the [gradient] representations to estimate subgroup labels (“Our approach relies on estimating unknown subclass labels by clustering a feature representation of the data.” [pg. 3, top para; note Mu teaches taking gradient representations of the input thus when combined with Sohoni would teach the recited limitation.]), outputting cluster assignments as the estimated subgroup labels (“To obtain a surrogate for this feature space, we leverage the empirical observation that feature representations of deep neural networks trained on a superclass task can carry information about unlabeled subclasses [41]. Next, to improve performance on these estimated subclasses, we minimize the maximum per-cluster average loss, by using the clusters as groups in the GDRO objective [48].” [pg. 5, §4, ¶1]); and training a robust classifier using the estimated subgroup labels (“Formally, we train a deep neural network L◦fθ to predict the superclass labels, where fθ : X → Rd is a parametrized “featurizer” and L : Rd → ∆B…outputs classification logits. We then cluster the features output by fθ for the data of each superclass into k clusters, where k is chosen automatically” [pg. 5-6, §4.1, ¶1]). 
However Sohoni fails to explicitly teach for each data point in the classification dataset, using gradient space partitioning (GraSP) to identify a gradient representation of each data point by extracting an associated gradient of a logistic regression classification loss with respect to weights of a logistic regression, Mu teaches for each data point in the classification dataset, using gradient space partitioning (GraSP) to identify a gradient representation of each data point by extracting an associated gradient of a logistic regression classification loss with respect to weights of a logistic regression (“These features are gradients of the model parameters with respect to a task-specific loss given an input sample” [Abstract] … “With trivial modifications, our method can easily extend beyond ConvNets and classification, e.g., for a recurrent network as the backbone and/or for a regression task.” [pg. 3, §3., ¶1; See also Eq(1)]) It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Sohoni’s teachings by using gradient representation of each data point as taught by Mu. One would have been motivated to make this modification as Mu notes that with trivial modifications, the method could extend beyond ConvNets and classification and work for regression tasks. [pg. 3, §3, ¶1, Mu] However Sohoni/Mu fails to explicitly teach wherein the GraSP further learns group annotations and identify outliers wherein clustering further comprises using an outlier-robust clustering algorithm to cluster the gradient representations Zhai teaches wherein the GraSP further learns group annotations (“For example, in an algorithmic fairness task, domains are demographic groups defined by a number of protected features such as race and sex.” [pg. 2, 2.1, ¶1]) and identify outliers (“After some examination, we pinpoint one direct cause of this phenomenon: the vulnerability of DRO to outliers that widely exist in modern datasets.” [pg. 
4, 3, ¶1]); wherein clustering further comprises using an outlier-robust clustering algorithm to cluster the gradient representations (“To resolve this issue, we propose the framework of DORO, for Distributional and Outlier Robust Optimization. At the core of this approach is a refined risk function which prevents DRO from overfitting to potential outliers.” [Abstract]) It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Sohoni’s/Mu’s teachings by implementing the outlier robust algorithm of Zhai. One would have been motivated to make this modification as it prevents DRO from overfitting to potential outliers. [Abstract, Zhai] Regarding claims 10-13, they are substantially similar to claims 3 and 6-8 respectively, and are rejected in the same manner, the same art, and reasoning applying. Regarding claim 15, it is substantially similar to claim 2 respectively, and is rejected in the same manner, the same art, and reasoning applying. Regarding claim 16, it is substantially similar to claim 4 respectively, and is rejected in the same manner, the same art, and reasoning applying. Conclusion Any inquiry concerning this communication or earlier communications from the examiner should be directed to MICHAEL H HOANG whose telephone number is (571)272-8491. The examiner can normally be reached Mon-Fri 8:30AM-4:30PM. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki can be reached at (571) 272-3719. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. 
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. /MICHAEL H HOANG/PRIMARY EXAMINER, Art Unit 2122
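For readers weighing the claim 1 limitation discussed in the office action above (extracting, for each data point, the gradient of a logistic regression loss with respect to the weights, then clustering those gradients), the following is a minimal illustrative sketch only. It is not the applicant's GraSP implementation and not the method of Mu or Sohoni. The toy data, the fixed weights `w` and `b`, and the plain k-means stand-in for the clustering step are all assumptions made for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def per_sample_gradients(X, y, w, b):
    """One gradient vector per data point for the binary logistic loss,
    taken with respect to the weights w and the bias b."""
    grads = []
    for x_i, y_i in zip(X, y):
        p = sigmoid(sum(wj * xj for wj, xj in zip(w, x_i)) + b)
        r = p - y_i  # derivative of the loss with respect to the logit
        grads.append([r * xj for xj in x_i] + [r])
    return grads

def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means, initialised from the first k points.
    Hypothetical stand-in: this is NOT an outlier-robust algorithm."""
    centers = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((u - v) ** 2 for u, v in zip(p, centers[c])),
            )
        # Update step: each center becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Toy data (assumed): a single labeled class with two unlabeled subgroups.
X = [[2.0, 0.1], [-2.0, 0.1], [2.1, -0.1], [-1.9, 0.0]]
y = [1, 1, 1, 1]
w, b = [1.0, 0.0], 0.0  # assumed, already-trained logistic regression weights

grads = per_sample_gradients(X, y, w, b)
labels = kmeans(grads, k=2)
print(labels)  # [0, 1, 0, 1]
```

In this toy setup the points the classifier gets wrong have much larger gradients than the points it gets right, so the two subgroups separate in gradient space even though they share a class label. A real pipeline would need a trained model and, for claim 2, a clustering method that tolerates outliers.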

Prosecution Timeline

Apr 28, 2023
Application Filed
Apr 01, 2026
Non-Final Rejection mailed — §101, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12639617
RISK ASSESSMENT OF A PROPOSED CHANGE IN A COMPUTING ENVIRONMENT
5y 4m to grant Granted May 26, 2026
Patent 12632793
CLASSIFICATION IN HIERARCHICAL PREDICTION DOMAINS
6y 9m to grant Granted May 19, 2026
Patent 12632175
STOCHASTIC RISK SCORING WITH COUNTERFACTUAL ANALYSIS FOR STORAGE CAPACITY
5y 3m to grant Granted May 19, 2026
Patent 12632512
ULTRASONIC SYSTEM AND METHOD FOR TUNING A MACHINE LEARNING CLASSIFIER USED WITHIN A MACHINE LEARNING ALGORITHM
4y 11m to grant Granted May 19, 2026
Patent 12626782
ARCHITECTURES FOR TRAINING NEURAL NETWORKS USING BIOLOGICAL SEQUENCES, CONSERVATION, AND MOLECULAR PHENOTYPES
8y 5m to grant Granted May 12, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

1-2
Expected OA Rounds
52%
Grant Probability
76%
With Interview (+24.3%)
4y 5m (~1y 4m remaining)
Median Time to Grant
Low
PTA Risk
Based on 142 resolved cases by this examiner. Grant probability derived from career allowance rate.
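As a quick consistency check on the figures above, the sketch below recomputes them from the counts and dates reported on this page. It assumes the interview lift is additive in percentage points and that remaining time is average prosecution time minus months elapsed since filing. The page does not state its exact method, so both are assumptions, and the variable names are illustrative.

```python
# Figures as reported on this page.
granted, resolved = 74, 142
interview_lift = 0.243          # +24.3 percentage points (assumed additive)
avg_prosecution_months = 4 * 12 + 5   # 4y 5m

# Career allowance rate and the with-interview estimate.
allowance_rate = granted / resolved
with_interview = allowance_rate + interview_lift

# Whole months elapsed from filing (Apr 2023) to last update (May 2026).
elapsed_months = (2026 - 2023) * 12 + (5 - 4)
remaining_months = avg_prosecution_months - elapsed_months
years, months = divmod(remaining_months, 12)

print(f"{allowance_rate:.0%} {with_interview:.0%} {years}y {months}m")
# 52% 76% 1y 4m
```

Under these assumptions the recomputed values match the displayed 52%, 76%, and roughly 1y 4m remaining.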
