Prosecution Insights
Last updated: April 19, 2026
Application No. 18/168,027

MULTI-STAGE TRAINING OF MACHINE LEARNING MODELS

Non-Final OA: rejections under §101, §102, §103, §112
Filed: Feb 13, 2023
Examiner: ZECHER, CORDELIA P K
Art Unit: 2100
Tech Center: 2100 — Computer Architecture & Software
Assignee: X Development LLC
OA Round: 1 (Non-Final)
Grant Probability: 50% (Moderate)
OA Rounds: 1-2
To Grant: 3y 8m
With Interview: 76%

Examiner Intelligence

Career Allow Rate: 50% (253 granted / 509 resolved; -5.3% vs TC avg)
Interview Lift: +25.8% (strong; measured over resolved cases with interview)
Avg Prosecution: 3y 8m (typical timeline; 287 currently pending)
Total Applications: 796 (career history, across all art units)

Statute-Specific Performance

§101: 19.0% (-21.0% vs TC avg)
§103: 46.8% (+6.8% vs TC avg)
§102: 13.1% (-26.9% vs TC avg)
§112: 16.0% (-24.0% vs TC avg)
Tech Center average values are estimates. Based on career data from 509 resolved cases.

Office Action

Rejections under §101, §102, §103, and §112
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Effective Filing Date

The effective filing date of 02/16/2022 is acknowledged.

Information Disclosure Statement

The information disclosure statements submitted on 02/13/2023, 06/05/2023, 10/04/2024, 08/08/2025, 09/16/2025, and 11/03/2025 are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the examiner.

Status of Claims

The present application is being examined under the claims filed on 02/13/2023. Claims 1-20 are pending. Claims 1-20 are rejected.

Berkheimer References

US 20230075894 A1 - METHODS AND SYSTEMS FOR DYNAMIC RE-CLUSTERING OF NODES IN COMPUTER NETWORKS USING MACHINE LEARNING MODELS (Hereafter, “Walters”).

Prior Art References

Paul, S. and Ganju, S., 2021. Flood segmentation on Sentinel-1 SAR imagery with semi-supervised learning. arXiv preprint arXiv:2107.08369. (Hereafter, “Paul”).
US 10535127 B1 - Apparatus, System and Method for Highlighting Anomalous Change in Multi-pass Synthetic Aperture Radar Imagery (Hereafter, “Simonson”).
Ronneberger, O., Fischer, P. and Brox, T., 2015. U-Net: Convolutional networks for biomedical image segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention (pp. 234-241). Cham: Springer International Publishing. (Hereafter, “Ronneberger”).
ZhiWei, Y., Yu, Y. and Xiao, X., 2010. Image segmentation based on ensemble learning. In 2010 International Conference on Computer and Communication Technologies in Agriculture Engineering (Vol. 2, pp. 423-427). IEEE. (Hereafter, “ZhiWei”).
Lee, S., Lee, S. and Song, B.C., 2022. Contextual gradient scaling for few-shot learning. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (pp. 834-843). (Hereafter, “Lee”).
Rashkovetsky, D., Mauracher, F., Langer, M. and Schmitt, M., 2021. Wildfire detection from multisensor satellite imagery using deep semantic segmentation. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 14, pp. 7001-7016. (Hereafter, “Rashkovetsky”).
US 20110313790 A1 - METHOD OF DELIVERING DECISION SUPPORT SYSTEMS (DSS) AND ELECTRONIC HEALTH RECORDS (EHR) FOR REPRODUCTIVE CARE, PRE-CONCEPTIVE CARE, FERTILITY TREATMENTS, AND OTHER HEALTH CONDITIONS (Hereafter, “Yao”).

Claim Rejections - 35 U.S.C. § 112(b)

The following is a quotation of 35 U.S.C. § 112(b): (b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

Claims 5, 6, and 7 are rejected as being indefinite under 35 U.S.C. § 112(b). Claim 5 recites “a maximum metadata label”. This is an indefinite term of degree. Claim 7 recites “a maximum metadata label” and “a minimum metadata label”. These are indefinite terms of degree. Per MPEP § 2173.05(b), Relative Terminology, (I) Terms of Degree: “Terms of degree are not necessarily indefinite. "Claim language employing terms of degree has long been found definite where it provided enough certainty to one of skill in the art when read in the context of the invention." […] Thus, when a term of degree is used in the claim, the examiner should determine whether the specification provides some standard for measuring that degree. […] If the specification does not provide some standard for measuring that degree, a determination must be made as to whether one of ordinary skill in the art could nevertheless ascertain the scope of the claim (e.g., a standard that is recognized in the art for measuring the meaning of the term of degree). […]” It is unclear what is maximal or minimal regarding the metadata label.
The specification is insufficient for ascertaining a definite meaning of “a maximum metadata label” and “a minimum metadata label”, and one of ordinary skill in the art would not be able to ascertain the scope of the claim. Thus, claims 5 and 7 are rejected under 35 U.S.C. § 112(b). For the purpose of compact prosecution, Examiner will proceed under the interpretation that the labels may have some sort of numeric value that can be compared, as suggested by the claim language (e.g., 0 = not flooded; 1 = flooded), and that the “maximum metadata label” may be the highest-valued label (i.e., 1 = flooded) whereas the “minimum metadata label” may be the lowest-valued label. Claim 6 is rejected under 35 U.S.C. § 112(b) by virtue of dependency on claim 5.

Claim Rejections - 35 U.S.C. § 101

35 U.S.C. 101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. This judicial exception is not integrated into a practical application, as outlined in the two-step analysis for each claim that follows.

Combined Step 1 (Statutory Category) - Is the claim to a process, machine, manufacture or composition of matter? Yes - Claims 1-18 recite methods. Claims 19 and 20 recite machines.

In reference to claim 1.

Step 2A Prong 1 (Recited Judicial Exception) - Does the claim recite an abstract idea, law of nature, or natural phenomenon?

“identifying a selection criterion corresponding to the current training stage that defines a criterion for selecting training examples based on the metadata labels of the training examples,” which, but for the inclusion of generic computing equipment, is an evaluation that may be performed mentally by a human with the aid of pen and paper.
Refer to MPEP 2106.04(a)(2)(III)(C) for more information about mental processes being performed on a computer.

“selecting a proper subset of the set training examples as training data for the current training stage in accordance with the selection criterion for the current training stage,” which, but for the inclusion of generic computing equipment, is an evaluation that may be performed mentally by a human with the aid of pen and paper. Refer to MPEP 2106.04(a)(2)(III)(C) for more information about mental processes being performed on a computer.

Step 2A Prong 2 (Integration into a Practical Application) - Does the claim recite additional elements that integrate the judicial exception into a practical application? & Step 2B (Significantly More or Amounting to an Inventive Concept) - Does the claim recite additional elements that amount to significantly more than the judicial exception?

“obtaining a set of training examples; obtaining, for each training example, a respective metadata label that characterizes the training example; and” which amounts to insignificant extra-solution activity per MPEP § 2106.05(g). This is well-understood, routine, conventional computer functionality as recognized by MPEP § 2106.05(d)(II)(iv), storing and retrieving information in memory.

“training the machine learning model over a sequence of training stages, comprising, at each training stage before a last training stage in the sequence of training stages:” which merely recites the words “apply it” (or an equivalent) with the judicial exception, merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, as discussed in MPEP § 2106.05(f). The claim recites generic application of a computer by generic machine learning training.

“updating the machine learning model by training the machine learning model on the training data for the current training stage, and” which merely recites the words “apply it” (or an equivalent) with the judicial exception, merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, as discussed in MPEP § 2106.05(f). The claim recites generic application of a computer by generic machine learning training.

“providing the updated machine learning model for further training at a next training stage in the sequence of training stages.” which amounts to insignificant extra-solution activity per MPEP § 2106.05(g). This is well-understood, routine, conventional computer functionality as recognized by MPEP § 2106.05(d)(II)(iv), storing and retrieving information in memory.

In reference to claim 2.

“2. The method of claim 1, wherein for each training example, the metadata label for the training example defines a timestamp corresponding to the training example.” which recites the same mental process of the parent claim and only provides further details on the training data that is gathered.

In reference to claim 3.

“3. The method of claim 1, wherein for each training example, the metadata label for the training example defines a geographic feature corresponding to the training example.” which recites the same mental process of the parent claim and only provides further details on the training data that is gathered.

In reference to claim 4.

“4.
The method of claim 1, wherein for each training stage in the sequence of training stages: the selection criterion corresponding to the training stage specifies a set of allowable metadata labels for the training stage; and each training example is eligible for selection at the training stage only if the metadata label of the training example is included in the set of allowable metadata labels for the training stage.” which recites the same mental process of the parent claim and only provides further details on the mental selection process.

In reference to claim 5.

“5. The method of claim 4, wherein for each training stage after a first training stage in the sequence of training stages: a maximum metadata label in the set of allowable metadata labels for the training stage is greater than a maximum metadata label in the set of allowable metadata labels for a preceding training stage.” which recites the same mental process of the parent claim and only provides further details on the label composition from one training stage to the next.

In reference to claim 6.

Step 2A Prong 1 (Recited Judicial Exception) - Does the claim recite an abstract idea, law of nature, or natural phenomenon?

“the selection criterion corresponding to the training stage specifies a respective selection weight for each metadata label in the set of allowable metadata labels; and selecting a proper subset of the set of training examples as training data for the current training stage comprises: determining a probability distribution, over training examples having metadata labels included in the set of allowable metadata labels for the training stage, using the selection weights for the allowable metadata labels; and” which, but for the inclusion of generic computing equipment, is an evaluation that may be performed mentally by a human with the aid of pen and paper. Refer to MPEP 2106.04(a)(2)(III)(C) for more information about mental processes being performed on a computer.
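The claim 6 element analyzed above (per-label selection weights inducing a probability distribution over eligible training examples) can be illustrated with a short sketch. This is a generic illustration for orientation only, with hypothetical names; it is not the applicant's disclosed implementation:

```python
import random

def select_stage_subset(examples, selection_weights, subset_size, seed=0):
    """Illustrative only: choose the training subset for one stage.

    examples: list of (training_input, metadata_label) pairs.
    selection_weights: dict mapping each *allowable* metadata label to its
        selection weight; examples with other labels are ineligible.
    Sampling each eligible example in proportion to its label's weight
    realizes a probability distribution over the eligible examples.
    """
    eligible = [ex for ex in examples if ex[1] in selection_weights]
    weights = [selection_weights[ex[1]] for ex in eligible]
    rng = random.Random(seed)
    # Weighted sampling (with replacement) according to the distribution.
    return rng.choices(eligible, weights=weights, k=subset_size)

# A stage whose criterion allows labels {0, 1} and favors label 1:
pool = [("x1", 0), ("x2", 0), ("x3", 1), ("x4", 2)]
stage_data = select_stage_subset(pool, {0: 1.0, 1: 3.0}, subset_size=4)
```

Under this reading, a stage's "allowable" set is the keys of `selection_weights`, so the example with label 2 can never be selected at this stage.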
Step 2A Prong 2 (Integration into a Practical Application) - Does the claim recite additional elements that integrate the judicial exception into a practical application? & Step 2B (Significantly More or Amounting to an Inventive Concept) - Does the claim recite additional elements that amount to significantly more than the judicial exception?

“sampling a plurality of training examples having metadata labels included in set of allowable metadata labels in accordance with the probability distribution.” which amounts to insignificant extra-solution activity per MPEP § 2106.05(g). This is well-understood, routine, conventional computer functionality as recognized by Walters (Walters [0045], “In some embodiments the input data may be designed using conventional sampling techniques, such as stratified sampling. For example, the topic modeling machine learning model may be provided with data training subsets that include a single topic. The data training subset may have a variety of subtopics within that topic such that the model is exposed to every subtopic that could be discoverable through topic modeling.”). Examiner notes that Walters shows stratified sampling to be a conventional sampling technique. Under a broadest reasonable interpretation, “sampling […] training examples” based on a “probability distribution” is well-understood, routine, conventional functionality wherein examples are selected in accordance with their stratum.

In reference to claim 7.

“7.
The method of claim 6, wherein for one or more training stages in the sequence of training stages: the set of allowable metadata labels for the training stage comprises a plurality of metadata labels; and the selection criterion corresponding to the training stage specifies a higher selection weight for a maximum metadata label in the set of allowable metadata labels than for a minimum metadata label in the set of allowable metadata labels.” which recites the same mental process of the parent claim and only provides further details on the composition of the set of allowable metadata labels and the selection criterion for the training stages.

In reference to claim 8.

Step 2A Prong 1 (Recited Judicial Exception) - Does the claim recite an abstract idea, law of nature, or natural phenomenon? Yes – the abstract idea of the parent claim.

Step 2A Prong 2 (Integration into a Practical Application) - Does the claim recite additional elements that integrate the judicial exception into a practical application? & Step 2B (Significantly More or Amounting to an Inventive Concept) - Does the claim recite additional elements that amount to significantly more than the judicial exception?

“determining, for each training example in the training data for the current training stage, an error in a prediction generated by the machine learning model for the training example; updating the machine learning model using the errors in the predictions generated by the machine learning model for the training examples in the training data for the current training stage.” which merely recites the words “apply it” (or an equivalent) with the judicial exception, merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, as discussed in MPEP § 2106.05(f). The claim recites generic application of a computer by generic machine learning training.

In reference to claim 9.

Step 2A Prong 1 (Recited Judicial Exception) - Does the claim recite an abstract idea, law of nature, or natural phenomenon?

“wherein the machine learning model is an ensemble model that comprises a plurality of base models,” which, but for the inclusion of generic computing equipment, is an evaluation that may be performed mentally by a human with the aid of pen and paper. Refer to MPEP 2106.04(a)(2)(III)(C) for more information about mental processes being performed on a computer. Examiner notes that the use of a computer and ensemble learning does not necessarily preclude mental procedure. An ensemble of machine learning models could be a collection of polynomial regression models which may be computed by a human mentally with the aid of pen and paper.

Step 2A Prong 2 (Integration into a Practical Application) - Does the claim recite additional elements that integrate the judicial exception into a practical application? & Step 2B (Significantly More or Amounting to an Inventive Concept) - Does the claim recite additional elements that amount to significantly more than the judicial exception?

“and wherein updating the machine learning model using the errors in the predictions generated by the machine learning model for training examples in the training data for the current training stage comprises: determining a prediction target for each training example in the training data for the current training stage based on the error in the prediction generated by the machine learning model for the training example; generating one or more new base models that are each trained to generate the prediction targets for the training examples in the training data for the current training stage; and adding the new base models to the ensemble model.” which merely recites the words “apply it” (or an equivalent) with the judicial exception, merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, as discussed in MPEP § 2106.05(f). The claim recites generic application of a computer by generic machine learning training.

In reference to claim 10.

“wherein the new base models are decision trees.” which recites the same mental process of the parent claim and only provides further details on the model type used for the ensemble.

In reference to claim 11.

Step 2A Prong 1 (Recited Judicial Exception) - Does the claim recite an abstract idea, law of nature, or natural phenomenon?

“determining a respective weight factor for each training example in the training data for the current training stage based on the error in the prediction generated by the machine learning model for the training example;” which, but for the inclusion of generic computing equipment, is an evaluation that may be performed mentally by a human with the aid of pen and paper. Refer to MPEP 2106.04(a)(2)(III)(C) for more information about mental processes being performed on a computer.

“training the machine learning model on the training data for the current training stage using the weight factors for the training examples, wherein the weight factor for a training example controls an impact of the training example on the training of the machine learning model.” which, but for the inclusion of generic computing equipment, is an evaluation that may be performed mentally by a human with the aid of pen and paper. Refer to MPEP 2106.04(a)(2)(III)(C) for more information about mental processes being performed on a computer. Examiner notes that the added specificity of weighting training examples does not preclude mental procedure. For example, consider a decision tree model that assigns more importance to specific training samples when building the decision tree.

Step 2A Prong 2 (Integration into a Practical Application) - Does the claim recite additional elements that integrate the judicial exception into a practical application? & Step 2B (Significantly More or Amounting to an Inventive Concept) - Does the claim recite additional elements that amount to significantly more than the judicial exception? No.

In reference to claim 12.

Step 2A Prong 1 (Recited Judicial Exception) - Does the claim recite an abstract idea, law of nature, or natural phenomenon? Yes – the abstract idea of the parent claim.

Step 2A Prong 2 (Integration into a Practical Application) - Does the claim recite additional elements that integrate the judicial exception into a practical application? & Step 2B (Significantly More or Amounting to an Inventive Concept) - Does the claim recite additional elements that amount to significantly more than the judicial exception?
“wherein the machine learning model comprises a neural network, and wherein training the machine learning model on the training data for the current training stage using the weight factors for the training examples comprises, for each training example: generating a prediction for the training example using the neural network; determining gradients, with respect to neural network parameters of the neural network, of an objective function that depends on the prediction for the training example; scaling the gradients using the weight factor for the training example; and updating the neural network parameters of the neural network using the scaled gradients.” which merely recites the words “apply it” (or an equivalent) with the judicial exception, merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, as discussed in MPEP § 2106.05(f). The claim recites generic application of a computer by generic neural network application.

In reference to claim 13.

Step 2A Prong 1 (Recited Judicial Exception) - Does the claim recite an abstract idea, law of nature, or natural phenomenon?

“identifying a selection criterion corresponding to the last training stage that defines a criterion for selecting training examples based on the metadata labels of the training examples;” which, but for the inclusion of generic computing equipment, is an evaluation that may be performed mentally by a human with the aid of pen and paper. Refer to MPEP 2106.04(a)(2)(III)(C) for more information about mental processes being performed on a computer.

“selecting a proper subset of the set training examples as training data for the last training stage in accordance with the selection criterion for the current training stage;” which, but for the inclusion of generic computing equipment, is an evaluation that may be performed mentally by a human with the aid of pen and paper. Refer to MPEP 2106.04(a)(2)(III)(C) for more information about mental processes being performed on a computer.

Step 2A Prong 2 (Integration into a Practical Application) - Does the claim recite additional elements that integrate the judicial exception into a practical application? & Step 2B (Significantly More or Amounting to an Inventive Concept) - Does the claim recite additional elements that amount to significantly more than the judicial exception?

“updating the machine learning model by training the machine learning model on the training data for the last training stage; and” which merely recites the words “apply it” (or an equivalent) with the judicial exception, merely includes instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, as discussed in MPEP § 2106.05(f). The claim recites generic application of a computer by generic machine learning training.

“providing the updated machine learning model for use in performing the machine learning task.” which amounts to insignificant extra-solution activity per MPEP § 2106.05(g). This is well-understood, routine, conventional computer functionality as recognized by MPEP § 2106.05(d)(II)(iv), storing and retrieving information in memory.

In reference to claim 14.

“wherein each training example in the set of training examples comprises: (i) a training input to the machine learning model, and (ii) a target output to be generated by the machine learning model by processing the training input.” which recites the same mental process of the parent claim and only provides further details on the training examples that are gathered.

In reference to claim 15.

“wherein the machine learning model performs a fire prediction task, wherein for each training example: (i) the training input characterizes a geographic region, and (ii) the target output defines, for each of one or more spatial locations in the geographic region, whether the spatial location will be impacted by fire within a time window.” which recites the same mental process of the parent claim and only provides further details on the training examples that are gathered.

In reference to claim 16.

“wherein the machine learning model performs a flood prediction task, wherein for each training example: (i) the training input characterizes a geographic region, and (ii) the target output defines, for each of one or more spatial locations in the geographic region, whether the spatial location will be impacted by flooding within a time window.” which recites the same mental process of the parent claim and only provides further details on the training examples that are gathered.

In reference to claim 17.

“wherein the machine learning model performs a health prediction task, wherein for each training example: (i) the training input comprises electronic medical record data characterizing a subject, and (ii) the target output defines a prediction for whether the subject will receive a medical diagnosis within a time window.” which recites the same mental process of the parent claim and only provides further details on the training examples that are gathered.

In reference to claim 18.

“wherein for each training example, the training input to the machine learning model comprises an image, or features derived from an image.” which recites the same mental process of the parent claim and only provides further details on the training examples that are gathered.

In reference to claims 19 and 20.

Claims 19 and 20 are substantially similar to claim 1 and thus are rejected as mental processes, and therefore abstract ideas, following the same two-step analysis.
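For orientation, the gradient scaling recited in claim 12 (scaling each training example's gradients by its weight factor before the parameter update) reduces to weighted gradient descent. The sketch below is a minimal generic illustration with assumed names and a toy linear model, not the application's actual training system:

```python
import numpy as np

def weighted_gradient_step(params, inputs, targets, weight_factors, lr=0.1):
    """One parameter update for a linear model under squared error,
    with each example's gradient scaled by its weight factor."""
    preds = inputs @ params                    # prediction per example
    errors = preds - targets                   # per-example prediction error
    grads = errors[:, None] * inputs           # gradient of 0.5*err^2, per example
    scaled = weight_factors[:, None] * grads   # claim-12-style gradient scaling
    return params - lr * scaled.mean(axis=0)   # update with the scaled gradients

# Fit y = 2*x; with equal weight factors this is ordinary gradient descent.
x = np.array([[1.0], [2.0]])
y = np.array([2.0, 4.0])
w = np.array([1.0, 1.0])
params = np.zeros(1)
for _ in range(200):
    params = weighted_gradient_step(params, x, y, w, lr=0.1)
# params converges toward [2.0]
```

Raising one example's entry in `weight_factors` makes its gradient dominate the averaged update, which is the "controls an impact of the training example" behavior recited in claim 11.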
Claim Rejections - 35 U.S.C. § 102

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action: A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1, 3-8, 11, 13, 14, 16, and 18-20 are rejected under 35 U.S.C. 102 as being anticipated by Paul.

In reference to claim 1.

“1. A method performed by one or more computers for training a machine learning model to perform a machine learning task, the method comprising:”

“obtaining a set of training examples; obtaining, for each training example, a respective metadata label that characterizes the training example; and” (Paul Abstract, “Concretely, we use a cyclical approach involving multiple stages (1) training an ensemble model of multiple U-Net architectures with the provided high confidence hand-labeled data and, generated pseudo labels or low confidence labels on the entire unlabeled test dataset,”)

“training the machine learning model over a sequence of training stages, comprising, at each training stage before a last training stage in the sequence of training stages:”

“identifying a selection criterion corresponding to the current training stage that defines a criterion for selecting training examples based on the metadata labels of the training examples, selecting a proper subset of the set training examples as training data for the current training stage in accordance with the selection criterion for the current training stage, updating the machine learning model by training the machine learning model on the training data for the current training stage, and” (Paul 2, “The dataset is imbalanced i.e., the proportion of images with some flood region presence is lower than the images without it, so during training, we ensure each batch contains at least 50% samples having some amount of flood region present through stratified sampling.”)

“providing the updated machine learning model for further training at a next training stage in the sequence of training stages.” (Paul 3, “Thus, a new training dataset is created which is composed of (1) original training data with available ground truth, referred to as “high confidence” labels, and, (2) filtered pseudo labels or “low confidence” labels on the unlabeled test dataset. This assimilated dataset is used for the next round of training individual U-Net, U-Net++ and the ensemble models.”)

In reference to claim 3.

“3. The method of claim 1, wherein for each training example, the metadata label for the training example defines a geographic feature corresponding to the training example.” (Paul 3, “The contest dataset consists of 66k tiled images from various geographic locations.”)

In reference to claim 4.

“4.
The method of claim 1, wherein for each training stage in the sequence of training stages:” “the selection criterion corresponding to the training stage specifies a set of allowable metadata labels for the training stage; and each training example is eligible for selection at the training stage only if the metadata label of the training example is included in the set of allowable metadata labels for the training stage.” (Paul 2, “The dataset is imbalanced i.e., the proportion of images with some flood region presence is lower than the images without it, so during training, we ensure each batch contains at least 50% samples having some amount of flood region present through stratified sampling.”)

In reference to claim 5.

“5. The method of claim 4, wherein for each training stage after a first training stage in the sequence of training stages: a maximum metadata label in the set of allowable metadata labels for the training stage is greater than a maximum metadata label in the set of allowable metadata labels for a preceding training stage.” (Paul 3, “Thus, a new training dataset is created which is composed of (1) original training data with available ground truth, referred to as “high confidence” labels, and, (2) filtered pseudo labels or “low confidence” labels on the unlabeled test dataset. This assimilated dataset is used for the next round of training individual U-Net, U-Net++ and the ensemble models.”) (Paul 2, “The dataset is imbalanced i.e., the proportion of images with some flood region presence is lower than the images without it, so during training, we ensure each batch contains at least 50% samples having some amount of flood region present through stratified sampling.”)

It remains unclear what is maximal regarding a “maximum metadata label”. Refer to the 35 U.S.C. § 112(b) rejections in this document.
In light of the context of flood pixel prediction discussed in reference Paul, Examiner will proceed under the interpretation that the labels are arbitrary distinct integers (e.g., 0 = not flooded; 1 = flooded) and the “maximum metadata label” is the largest integer (i.e., 1 = flooded). Examiner further interprets the “greater than” comparison in the claim language to be comparing the sizes of the label sets. Since the system of Paul increases the number of training samples each iteration and maintains a 50 percent split of flooded and non-flooded training samples, the “maximum metadata label” of an iteration will always be greater than that of the previous iterations.

In reference to claim 6.

“6. The method of claim 4, wherein for one or more training stages in the sequence of training stages:” “the selection criterion corresponding to the training stage specifies a respective selection weight for each metadata label in the set of allowable metadata labels; and selecting a proper subset of the set of training examples as training data for the current training stage comprises: determining a probability distribution, over training examples having metadata labels included in the set of allowable metadata labels for the training stage, using the selection weights for the allowable metadata labels; and sampling a plurality of training examples having metadata labels included in set of allowable metadata labels in accordance with the probability distribution.” (Paul 2, “The dataset is imbalanced i.e., the proportion of images with some flood region presence is lower than the images without it, so during training, we ensure each batch contains at least 50% samples having some amount of flood region present through stratified sampling.”)

In reference to claim 7.

“7.
The method of claim 6, wherein for one or more training stages in the sequence of training stages:” “the set of allowable metadata labels for the training stage comprises a plurality of metadata labels; and the selection criterion corresponding to the training stage specifies a higher selection weight for a maximum metadata label in the set of allowable metadata labels than for a minimum metadata label in the set of allowable metadata labels.” (Paul 2, “The dataset is imbalanced i.e., the proportion of images with some flood region presence is lower than the images without it, so during training, we ensure each batch contains at least 50% samples having some amount of flood region present through stratified sampling.”) As in the interpretation of parent claim 6, it is unclear what is meant by maximal and minimal “metadata labels”. In light of the context of flood pixel prediction discussed in reference Paul, Examiner will proceed under the interpretation that the labels are arbitrary distinct integers (e.g. 0 = not flooded; 1 = flooded), the “maximum metadata label” is the largest integer (i.e. 1 = flooded), and the “minimum metadata label” is the smallest integer (i.e. 0 = not flooded). Since the proportion of flooded samples in Paul is lower than that of the non-flooded regions, in order to maintain a 50% balance during training, the selection weight for the flooded regions is necessarily higher. In reference to claim 8. “8.
The method of claim 1, wherein for one or more training stages in the sequence of training stages, updating the machine learning model by training the machine learning model on the training data for the current training stage comprises:” “determining, for each training example in the training data for the current training stage, an error in a prediction generated by the machine learning model for the training example;” (Paul 3, “We first train two models with U-Net [30] and U-Net++ [43] both with MobileNetv2 backbones [31] and combined dice and focal loss on the available training data.”) “updating the machine learning model using the errors in the predictions generated by the machine learning model for the training examples in the training data for the current training stage.” (Paul 3, “With the training data now composed of the original training dataset and pseudo labels from the test dataset, we perform training from scratch of the U-Net and U-Net++ models, and fine-tuning of the U-Net from the previous iteration.”) In reference to claim 11. “11. 
The method of claim 8, wherein updating the machine learning model using the errors in the predictions generated by the machine learning model for the training examples in the training data for the current training stage comprises:” “determining a respective weight factor for each training example in the training data for the current training stage based on the error in the prediction generated by the machine learning model for the training example; training the machine learning model on the training data for the current training stage using the weight factors for the training examples, wherein the weight factor for a training example controls an impact of the training example on the training of the machine learning model.” (Paul 2, “The dataset is imbalanced i.e., the proportion of images with some flood region presence is lower than the images without it, so during training, we ensure each batch contains at least 50% samples having some amount of flood region present through stratified sampling.”) In reference to claim 13. “13. 
The method of claim 1, wherein training the machine learning model at the last stage in the sequence of training stages comprises:” “identifying a selection criterion corresponding to the last training stage that defines a criterion for selecting training examples based on the metadata labels of the training examples; selecting a proper subset of the set training examples as training data for the last training stage in accordance with the selection criterion for the current training stage; updating the machine learning model by training the machine learning model on the training data for the last training stage; and” (Paul 2, “The dataset is imbalanced i.e., the proportion of images with some flood region presence is lower than the images without it, so during training, we ensure each batch contains at least 50% samples having some amount of flood region present through stratified sampling.”) “providing the updated machine learning model for use in performing the machine learning task.” (Paul 2, “(3) Additionally, we benchmark the inference pipeline and show that it can be performed in real time aiding in real time disaster mitigation efforts.”) In reference to claim 14. “14. The method of claim 1, wherein each training example in the set of training examples comprises: (i) a training input to the machine learning model, and (ii) a target output to be generated by the machine learning model by processing the training input.” (Paul 2, “The contest dataset consists of 66k tiled images from various geographic locations. […] First, we train an ensemble model of multiple U-Net architectures with the provided high confidence hand-labeled data […]”) In reference to claim 16. “16. 
The method of claim 14, wherein the machine learning model performs a flood prediction task, wherein for each training example: (i) the training input characterizes a geographic region, and (ii) the target output defines, for each of one or more spatial locations in the geographic region, whether the spatial location will be impacted by flooding within a time window.” (Paul 3, “The contest dataset consists of 66k tiled images from various geographic locations.”) (Paul 2, “Motivated by prior art [44, 45, 7, 19], we look at semi-supervised techniques, that assume predicted labels with maximally predicted probability as ground truth, and we apply it to flood segmentation.”) In reference to claim 18. “18. The method of claim 14, wherein for each training example, the training input to the machine learning model comprises an image, or features derived from an image.” (Paul 2, “The contest dataset consists of 66k tiled images from various geographic locations.”) In reference to claim 19 and 20. Claims 19 and 20 are substantially similar to claim 1 and thus are rejected under the same prior art. Claim Rejections - 35 U.S.C. § 103 The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made. Claim(s) 2 is/are rejected under 35 U.S.C. 103 as being unpatentable over Paul in view of Simonson. Claim(s) 9 is/are rejected under 35 U.S.C. 103 as being unpatentable over Paul in view of Ronneberger. 
Claim(s) 10 is/are rejected under 35 U.S.C. 103 as being unpatentable over Paul in view of Ronneberger in further view of ZhiWei. Claim(s) 12 is/are rejected under 35 U.S.C. 103 as being unpatentable over Paul in view of Ronneberger in further view of Lee. Claim(s) 15 is/are rejected under 35 U.S.C. 103 as being unpatentable over Paul in view of Rashovetsky. Claim(s) 17 is/are rejected under 35 U.S.C. 103 as being unpatentable over Paul in view of Yao. In reference to claim 2. “2. The method of claim 1, wherein for each training example, the metadata label for the training example defines a timestamp corresponding to the training example.” (Simonson [0014], “Each SAR image in the SAR images 212 can be assigned a timestamp that is indicative of when the SAR image was generated.”) Motivation to combine Paul, Simonson. It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Paul, Simonson. Paul discloses a supervised learning method on synthetic aperture radar (SAR) imagery. Simonson discloses a system and method for finding anomalies in synthetic aperture radar data. One would be motivated to combine these references because Simonson provides added details regarding the format of SAR data that are not readily apparent in the disclosure of Paul. The added details of Simonson would necessarily aid an implementation of Paul.
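As a minimal illustration of the claim 2 limitation, in which each training example's metadata label is a timestamp (as in the SAR images of Simonson), a timestamp can serve directly as a selection criterion. The cutoff rule and all names below are hypothetical assumptions, not drawn from the application or the references:

```python
from datetime import datetime, timezone

# Hypothetical sketch: the metadata label is a Unix timestamp (seconds),
# and a stage's selection criterion admits only examples observed on or
# before that stage's cutoff time.
def eligible_for_stage(example_timestamp, stage_cutoff):
    return example_timestamp <= stage_cutoff

cutoff = datetime(2022, 1, 1, tzinfo=timezone.utc).timestamp()
older = datetime(2021, 6, 1, tzinfo=timezone.utc).timestamp()
newer = datetime(2022, 6, 1, tzinfo=timezone.utc).timestamp()
```

Under this assumed rule, the 2021 example is eligible for the stage and the 2022 example is not.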
Further, MPEP § 2143(I) EXAMPLES OF RATIONALES sets forth the Supreme Court rationales for obviousness, including: (A) Combining prior art elements according to known methods to yield predictable results; (B) Simple substitution of one known element for another to obtain predictable results; (F) Known work in one field of endeavor may prompt variations of it for use in either the same field or a different one based on design incentives or other market forces if the variations are predictable to one of ordinary skill in the art; (G) Some teaching, suggestion, or motivation in the prior art that would have led one of ordinary skill to modify the prior art reference or to combine prior art reference teachings to arrive at the claimed invention. In reference to claim 9. “9. The method of claim 8,” “wherein the machine learning model is an ensemble model that comprises a plurality of base models,” (Paul 2, “First, we train an ensemble model of multiple U-Net architectures […]”) “and wherein updating the machine learning model using the errors in the predictions generated by the machine learning model for training examples in the training data for the current training stage comprises:” “determining a prediction target for each training example in the training data for the current training stage based on the error in the prediction generated by the machine learning model for the training example; generating one or more new base models that are each trained to generate the prediction targets for the training examples in the training data for the current training stage; and” (Ronneberger 4, “The input images and their corresponding segmentation maps are used to train the network with the stochastic gradient descent implementation of Caffe [6].”) “adding the new base models to the ensemble model.” (Paul 3, “With the training data now composed of the original training dataset and pseudo labels from the test dataset, we perform training from scratch of the U-Net and U-Net++ 
models, and fine-tuning of the U-Net from the previous iteration.”) Motivation to combine Paul, Ronneberger. It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Paul, Ronneberger. Paul discloses a supervised learning method on synthetic aperture radar (SAR) imagery. Ronneberger discloses a specific neural architecture for image segmentation. One would be motivated to combine these references because Paul directly references Ronneberger and utilizes the neural architecture that it teaches. Further, MPEP § 2143(I) EXAMPLES OF RATIONALES sets forth the Supreme Court rationales for obviousness, including: (G) Some teaching, suggestion, or motivation in the prior art that would have led one of ordinary skill to modify the prior art reference or to combine prior art reference teachings to arrive at the claimed invention. In reference to claim 10. “10. The method of claim 9, wherein the new base models are decision trees.” (ZhiWei 423, “In this paper, a method based on ensemble of decision tree with the view from the data point is presented.”) Motivation to combine Paul, Ronneberger, ZhiWei. It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Paul, Ronneberger, ZhiWei. Paul, Ronneberger discloses an ensemble supervised learning method for segmentation on synthetic aperture radar (SAR) imagery. ZhiWei discloses ensemble learning methodologies for image segmentation. One would be motivated to combine these references because the ensemble learning of ZhiWei may improve the generalization performance of Paul (ZhiWei 423, “The purpose is to improve the generalization performance by using the otherness of simple classifiers.”).
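The ensemble-update loop mapped in claims 8-10, which derives a prediction target from each example's error, trains a new base model on those targets, and appends it to the ensemble, follows the familiar shape of gradient boosting. A minimal sketch under stated assumptions (the deliberately trivial constant-predictor base model keeps the example self-contained; a real system might use decision trees, as claim 10 recites):

```python
# Hypothetical sketch of the claimed ensemble update: compute residual
# errors, train a new base model on them, append it to the ensemble.
class Ensemble:
    def __init__(self):
        self.base_models = []

    def predict(self, x):
        # The ensemble prediction is the sum of the base models' outputs.
        return sum(m(x) for m in self.base_models)

    def add_stage(self, xs, ys):
        # Error in the current prediction for each training example
        # becomes the prediction target for the new base model.
        residuals = [y - self.predict(x) for x, y in zip(xs, ys)]
        # "Train" a trivial base model: it outputs the mean residual.
        correction = sum(residuals) / len(residuals)
        self.base_models.append(lambda x, c=correction: c)

xs = [0, 1, 2, 3]
ys = [10.0, 10.0, 10.0, 10.0]
model = Ensemble()
model.add_stage(xs, ys)  # first stage learns the mean target, 10.0
model.add_stage(xs, ys)  # residuals are now zero, so the correction is 0.0
```

The key design point the claims capture is that each new base model is fit to error-derived targets rather than the raw labels, so successive stages refine the ensemble rather than retrain it from scratch.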
Further, MPEP § 2143(I) EXAMPLES OF RATIONALES sets forth the Supreme Court rationales for obviousness, including: (A) Combining prior art elements according to known methods to yield predictable results; (C) Use of known technique to improve similar devices (methods, or products) in the same way; (D) Applying a known technique to a known device (method, or product) ready for improvement to yield predictable results; In reference to claim 12. “12. The method of claim 11,” “wherein the machine learning model comprises a neural network,” (Ronneberger 2, “In this paper, we build upon a more elegant architecture, the so-called “fully convolutional network"”) “and wherein training the machine learning model on the training data for the current training stage using the weight factors for the training examples comprises, for each training example:” “generating a prediction for the training example using the neural network; determining gradients, with respect to neural network parameters of the neural network, of an objective function that depends on the prediction for the training example; [scaling the gradients using the weight factor for the training example; and] updating the neural network parameters of the neural network using the scaled gradients.” (Ronneberger 4, “The input images and their corresponding segmentation maps are used to train the network with the stochastic gradient descent implementation of Caffe [6].”) “scaling the gradients using the weight factor for the training example; and” (Lee Abstract, “To resolve or mitigate this problem, we propose contextual gradient scaling (CxGrad), which scales gradient norms of the backbone to facilitate learning task-specific knowledge in the inner-loop. Since the scaling factors are generated from task-conditioned parameters, gradient norms of the backbone can be scaled in a task-wise fashion.”) Motivation to combine Paul, Ronneberger, Lee. 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Paul, Ronneberger, Lee. Paul, Ronneberger discloses an ensemble supervised learning method for segmentation on synthetic aperture radar (SAR) imagery. Lee discloses a gradient scaling methodology for enabling few-shot learning. One would be motivated to combine these references because Paul is concerned with a specific machine learning problem where labels are scarce and manual annotation is expensive (Paul 2, “Feature based machine learning techniques are also prominent [27] but are intractable as human annotators and featurizers cannot scale. Manual annotation in real time can easily exceed $62,500 daily, and a manual solution quickly becomes intractable.”). The methodology of Lee provides a way to alleviate this issue by decreasing the number of samples required during the training phase. Further, MPEP § 2143(I) EXAMPLES OF RATIONALES sets forth the Supreme Court rationales for obviousness, including: (A) Combining prior art elements according to known methods to yield predictable results; (C) Use of known technique to improve similar devices (methods, or products) in the same way; (D) Applying a known technique to a known device (method, or product) ready for improvement to yield predictable results; (F) Known work in one field of endeavor may prompt variations of it for use in either the same field or a different one based on design incentives or other market forces if the variations are predictable to one of ordinary skill in the art; In reference to claim 15. “15. 
The method of claim 14, wherein the machine learning model performs a [fire prediction] task, wherein for each training example: (i) the training input characterizes a geographic region, and (ii) the target output defines, for each of one or more spatial locations in the geographic region, [whether the spatial location will be impacted by fire within a time window].” (Paul 3, “The contest dataset consists of 66k tiled images from various geographic locations.”) “fire prediction”; “whether the spatial location will be impacted by fire within a time window” (Rashovetsky 7002, “1) We demonstrate the generation of an annotated wildfire dataset combining openly available satellite imagery and information from a public wildfire database. […] 3) We investigate the predictive potential of the individual satellite data sources using a fairly standard deep semantic segmentation architecture”) Motivation to combine Paul, Rashovetsky. It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Paul, Rashovetsky. Paul discloses an ensemble supervised learning method for segmentation on synthetic aperture radar (SAR) imagery. Rashovetsky discloses wildfire detection methodology on SAR data. One would be motivated to combine these references because the system of Rashovetsky provides a practical application for the segmentation disclosure of Paul. 
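Returning briefly to the gradient-scaling limitation of claim 12 mapped above: the recited mechanism, multiplying each example's gradient by its weight factor before the parameter update, can be sketched with a one-parameter model. The model, learning rate, and weighting below are illustrative assumptions, not the applicant's or Lee's implementation:

```python
# Hypothetical sketch of per-example weighted gradient descent
# (cf. claims 11-12): the gradient of a squared-error objective is
# scaled by the example's weight factor before updating the parameter,
# so higher-weight examples have a larger impact on training.
def train_weighted(examples, lr=0.1, epochs=50):
    w = 0.0  # single parameter; the model's prediction is w * x
    for _ in range(epochs):
        for x, y, weight in examples:
            pred = w * x
            grad = 2 * (pred - y) * x   # d/dw of (pred - y)**2
            w -= lr * weight * grad     # gradient scaled by weight factor
    return w

# Examples consistent with y = 3x; the third carries a higher weight.
examples = [(1.0, 3.0, 1.0), (2.0, 6.0, 1.0), (1.5, 4.5, 2.0)]
w = train_weighted(examples)
```

Because every example here is consistent with the same underlying parameter, training converges to w = 3 regardless of the weights; the weights change only how strongly each example pulls the update at every step.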
Further, MPEP § 2143(I) EXAMPLES OF RATIONALES sets forth the Supreme Court rationales for obviousness, including: (A) Combining prior art elements according to known methods to yield predictable results; (B) Simple substitution of one known element for another to obtain predictable results; (C) Use of known technique to improve similar devices (methods, or products) in the same way; (D) Applying a known technique to a known device (method, or product) ready for improvement to yield predictable results; (E) "Obvious to try" – choosing from a finite number of identified, predictable solutions, with a reasonable expectation of success; (F) Known work in one field of endeavor may prompt variations of it for use in either the same field or a different one based on design incentives or other market forces if the variations are predictable to one of ordinary skill in the art; In reference to claim 17. “17. The method of claim 14, wherein the machine learning model performs a health prediction task, wherein for each training example: (i) the training input comprises electronic medical record data characterizing a subject, and (ii) the target output defines a prediction for whether the subject will receive a medical diagnosis within a time window.” (Yao [0030], “The present invention includes methods of developing and delivering a DSS to healthcare providers and individuals. The DSS of the present invention may be a stand-alone DSS or a DSS integrated with an EHR (i.e., an EHR-driven DSS). 
An advantage of the DSS methods of the present invention is that they allow for the development and validation of clinic-specific, region-specific, and/or population-specific prediction models by using non-overlapping training and test sets that pertain to the same patient population.”) (Yao [0075], “Proper implementation and use of the EHR platforms of the present invention may have additional benefits, such as for example, employer and/or insurance incentives for healthy lifestyle changes; disease prediction tools; and/or disease management. As previously discussed, the EHR platforms may receive data from EMRs and may display lab results to show the individual's health status over time.”) Motivation to combine Paul, Yao. It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Paul, Yao. Paul discloses an ensemble supervised learning method for segmentation. Yao discloses disease prediction using historic electronic health records. One would be motivated to combine these references because the system of Yao provides a practical application for the segmentation disclosure of Paul. 
Further, MPEP § 2143(I) EXAMPLES OF RATIONALES sets forth the Supreme Court rationales for obviousness, including: (A) Combining prior art elements according to known methods to yield predictable results; (B) Simple substitution of one known element for another to obtain predictable results; (C) Use of known technique to improve similar devices (methods, or products) in the same way; (D) Applying a known technique to a known device (method, or product) ready for improvement to yield predictable results; (E) "Obvious to try" – choosing from a finite number of identified, predictable solutions, with a reasonable expectation of success; (F) Known work in one field of endeavor may prompt variations of it for use in either the same field or a different one based on design incentives or other market forces if the variations are predictable to one of ordinary skill in the art; Conclusion Any inquiry concerning this communication or earlier communications from the examiner should be directed to CODY RYAN GILLESPIE whose telephone number is (571)272-1331. The examiner can normally be reached M-F, 8 AM - 5 PM. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Viker A Lamardo can be reached on 5172705871. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. 
Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. /CODY RYAN GILLESPIE/Examiner, Art Unit 2147 /VIKER A LAMARDO/Supervisory Patent Examiner, Art Unit 2147

Prosecution Timeline

Feb 13, 2023
Application Filed
Feb 12, 2026
Non-Final Rejection — §101, §102, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12583466
VEHICLE CONTROL MODULES INCLUDING CONTAINERIZED ORCHESTRATION AND RESOURCE MANAGEMENT FOR MIXED CRITICALITY SYSTEMS
2y 5m to grant Granted Mar 24, 2026
Patent 12578751
DATA PROCESSING CIRCUITRY AND METHOD, AND SEMICONDUCTOR MEMORY
2y 5m to grant Granted Mar 17, 2026
Patent 12561162
AUTOMATED INFORMATION TECHNOLOGY INFRASTRUCTURE MANAGEMENT
2y 5m to grant Granted Feb 24, 2026
Patent 12536291
PLATFORM BOOT PATH FAULT DETECTION ISOLATION AND REMEDIATION PROTOCOL
2y 5m to grant Granted Jan 27, 2026
Patent 12393641
METHODS FOR UTILIZING SOLVER HARDWARE FOR SOLVING PARTIAL DIFFERENTIAL EQUATIONS
2y 5m to grant Granted Aug 19, 2025
Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

1-2
Expected OA Rounds
50%
Grant Probability
76%
With Interview (+25.8%)
3y 8m
Median Time to Grant
Low
PTA Risk
Based on 509 resolved cases by this examiner. Grant probability derived from career allow rate.
