Last updated: May 29, 2026
Application No. 17/822,029
MACHINE LEARNING CONTEXT BASED CONFIDENCE CALIBRATION

Final Rejection §101 §103
Filed: Aug 24, 2022
Examiner: ROSTAMI, MOHAMMAD S
Art Unit: 2154
Tech Center: 2100 (Computer Architecture & Software)
Assignee: Adobe Inc.
OA Round: 2 (Final)
Interview Optional

Interview lift: +26.1%. An interview has already been conducted in this application's prosecution history. This examiner has a 67% grant rate. Because an interview has already been tried, a written response with narrowed claims, based on precedent claim evolution patterns, is recommended.
Based on 636 resolved cases, 2023–2026
Examiner Intelligence

ROSTAMI, MOHAMMAD S
Career allowance rate: 67% (above average); 425 granted / 636 resolved; +11.8% vs TC avg
Interview lift: +26.1% (strong), resolved cases with interview vs. without
Typical timeline: 3y 9m average prosecution; 23 currently pending
Career history: 677 total applications across all art units
Statute-Specific Performance

§101: 0.7% (-39.3% vs TC avg)
§103: 93.1% (+53.1% vs TC avg)
§102: 5.1% (-34.9% vs TC avg)
§112: 0.1% (-39.9% vs TC avg)
Tech Center averages are estimates. Based on career data from 636 resolved cases.
Office Action

§101 §103
DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of Claims
	Claims 1-20 are pending of which claims 1, 9 and 16 are in independent form.
	Claims 1-20 are rejected under 35 U.S.C. 101.
	Claims 1-20 are rejected under 35 U.S.C. 103.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to a judicial exception (i.e., a law of nature, a natural phenomenon, or an abstract idea) without significantly more.  

The claims recite confidence calibration using ML models for object detection in image frames.

With respect to step 1 of the patent subject matter eligibility analysis, the claims are directed to a process, machine, manufacture, or composition of matter. 
Independent claim 1 is directed to a system, including one or more processors, which is a machine.
Independent claim 9 is directed to non-transitory computer-readable media, which fall within one of the four statutory categories.
Independent claim 16 is directed to a method, which is a process.
All other claims depend on claims 1, 9 and 16. As such, claims 1-20 are directed to a statutory category.

Regarding claims 1, 9 and 16:
With respect to step 2A, prong one (Judicial Exception), the claims recite an abstract idea, law of nature, or natural phenomenon. Specifically, the following limitations recite mathematical concepts and/or mental processes and/or certain methods of organizing human activity.
The claim recites the following limitations directed to an abstract idea:
Obtaining an image frame;
Generating, with a first machine learning model: a confidence score, a bounding box, an instance embedding (all are inference output from a model)
Computing, with a second ML model, a calibrated confidence score.

These limitations involve:
Mathematical concepts:
Generating embeddings;
Outputting bounding boxes;
ML inference operations are mathematical relationships (complex statistical functions). The USPTO repeatedly classifies ML inference, scoring, weighting, embedding and confidence calibration as mathematical concepts.
See Electric Power Group, LLC v. Alstom S.A., 830 F.3d 1350 (Fed. Cir. 2016); Content Extraction & Transmission LLC v. Wells Fargo Bank, 776 F.3d 1343 (Fed. Cir. 2014).
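For illustration only, the data flow recited in the limitations listed above (an image frame in; a confidence score, bounding box and instance embedding out of a first model; a calibrated confidence score out of a second model) can be sketched as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation: `Detection`, `first_model`, `second_model`, the fixed detector outputs, and the logistic form of the calibrator are all hypothetical stand-ins chosen to keep the example self-contained.

```python
from dataclasses import dataclass
from math import exp
from typing import List, Tuple


@dataclass
class Detection:
    """Outputs attributed to the first model in the claim language."""
    confidence: float                        # raw confidence score
    box: Tuple[float, float, float, float]   # (x_min, y_min, x_max, y_max)
    embedding: List[float]                   # instance embedding


def first_model(image_frame) -> Detection:
    """Hypothetical detector stand-in: ignores its input and returns fixed values."""
    return Detection(confidence=0.92,
                     box=(0.1, 0.2, 0.5, 0.8),
                     embedding=[0.3, -0.1, 0.7])


def second_model(det: Detection, weights: List[float], bias: float) -> float:
    """Hypothetical calibrator: a logistic function over the concatenated
    confidence score, bounding box and embedding (an assumed model form)."""
    features = [det.confidence, *det.box, *det.embedding]
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + exp(-z))


# Usage: 1 confidence + 4 box coordinates + 3 embedding values = 8 features.
detection = first_model(image_frame=None)
calibrated = second_model(detection, weights=[0.0] * 8, bias=0.0)
```

The second model consumes all three outputs of the first model, which mirrors the "based on the instance embedding, the confidence score, and the bounding box" language; everything else about the calibrator is an assumption.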

With respect to step 2A, Prong Two (Particular Application), the claims do not recite additional elements that integrate the judicial exception into a practical application. The following limitations are considered “additional elements” and explanation will be given as to why these “additional elements” do not integrate the judicial exception into a practical application.
The claims recite the use of: 
System with memory components (generic computing),
Obtaining an image frame, which is a generic data gathering step,
Using a first ML model and a second ML model is implementation of the abstract idea using a computer, which has no specific improvement to computer architectures,
What we were looking for was “an improvement to database technology”, “a specific technical solution”, “unconventional technical architecture” or “a transformation of the physical technology”; however, none of these improvements has been presented in this claim. All of the technology recited in the claims consists of generic computer components used to carry out the abstract steps recited in the claims.
The claims do not improve database technology itself and do not recite a particular improvement to computer functionality. The claims merely take data, apply two ML inference models, and produce scores. Therefore, there is no integration of the abstract idea into a practical application.

With respect to Step 2B, the claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. The recited components “hybrid database storage system”, “first database”, “second database”, “control layer” and “change logs” are merely generic computer/database elements performing their routine, well-understood, and conventional functions. See Alice; MPEP 2106.05(d).
The steps recited in the independent claims merely constitute standard distributed-database behavior, such as basic replication, mirroring, and ownership transfer. Courts have consistently held that such high-level information management operations are conventional.
The claims recite only functional, result-oriented language (“detecting”, “propagating”, “transferring”, …), without specifying any technical mechanism for performing these operations in a non-conventional manner.
Considering claims as a whole, the ordered combination of elements also reflects nothing more than the typical workflow of distributed systems, and therefore DOES NOT add “significantly more” than the abstract idea.
Such generic, high-level, and nominal involvement of a computer or computer-based elements for carrying out the invention merely serves to tie the abstract idea to a particular technological environment, which is not enough to render the claims patent-eligible, as noted at pg. 74624 of Federal Register/Vol. 79, No. 241, citing Alice, which in turn cites Mayo. Further, see, e.g., Alice Corp. Pty. Ltd. v. CLS Bank Int'l, 134 S. Ct. 2347, 2359-60, 110 USPQ2d 1976, 1984 (2014). See also OIP Techs. v. Amazon.com, 788 F.3d 1359, 1364, 115 USPQ2d 1090, 1093-94 (Fed. Cir. 2015) ("Just as Diehr could not save the claims in Alice, which were directed to 'implement[ing] the abstract idea of intermediated settlement on a generic computer', it cannot save OIP's claims directed to implementing the abstract idea of price optimization on a generic computer.") (citations omitted). See also Affinity Labs of Texas LLC v. DirecTV LLC, 838 F.3d 1253, 1257-1258 (Fed. Cir. 2016) (mere recitation of a GUI does not make a claim patent-eligible); Intellectual Ventures I LLC v. Capital One Bank, 792 F.3d 1363, 1370 (Fed. Cir. 2015) ("the interactive interface limitation is a generic computer element").
The additional elements are broadly applied to the abstract idea at a high level of generality (similar to how the recitation of the computer in the claims in Alice amounted to mere instructions to apply the abstract idea of intermediated settlement on a generic computer, as explained in MPEP § 2106.05(f)), and they operate in a well-understood, routine, and conventional manner.
MPEP § 2106.05(d)(II) sets forth the following:
The courts have recognized the following computer functions as well-understood, routine, and conventional functions when they are claimed in a merely generic manner (e.g., at a high level of generality) or as insignificant extra-solution activity.
• Receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec ... ; TLI Communications LLC v. AV Auto. LLC ... ; OIP Techs., Inc., v. Amazon.com, Inc ... ; buySAFE, Inc. v. Google, Inc ... ;
• Performing repetitive calculations, Flook ... ; Bancorp Services v. Sun Life ... ;
• Electronic recordkeeping, Alice Corp ... ; Ultramercial ... ;
• Storing and retrieving information in memory, Versata Dev. Group, Inc. v. SAP Am., Inc ... ;
• Electronically scanning or extracting data from a physical document, Content Extraction and Transmission, LLC v. Wells Fargo Bank ... ; and
• A web browser's back and forward button functionality, Internet Patent Corp. v. Active Network, Inc. ...

. . . Courts have held computer-implemented processes not to be significantly more than an abstract idea (and thus ineligible) where the claim as a whole amounts to nothing more than generic computer functions merely used to implement an abstract idea, such as an idea that could be done by a human analog (i.e., by hand or by merely thinking).
In addition, when taken as an ordered combination, the ordered combination adds nothing that is not already present when the elements are taken individually. There is no indication that the combination of elements integrates the abstract idea into a practical application. Their collective functions merely provide conventional computer implementation. Therefore, when viewed as a whole, these additional claim elements do not provide meaningful limitations that transform the abstract idea into a practical application of the abstract idea, and the ordered combination does not amount to significantly more than the abstract idea itself.
The dependent claims have been fully considered as well; however, similar to the findings for the claims above, these claims are directed to the “Mental Processes” grouping of abstract ideas set forth in the 2019 PEG, without integrating it into a practical application and with, at most, a general purpose computer that serves to tie the idea to a particular technological environment, which does not add significantly more to the claims. The ordered combination of elements in the dependent claims (including the limitations inherited from the parent claim(s)) adds nothing that is not already present when the elements are taken individually. There is no indication that the combination of elements improves the functioning of a computer or improves any other technology. Their collective functions merely provide conventional computer implementation. Accordingly, the subject matter encompassed by the dependent claims fails to amount to significantly more than the abstract idea.

Looking at the claim as a whole does not change this conclusion and the claim is ineligible.
	
	Regarding claims 2 and 10: 
	This adds only that both models are “neural networks”. This is still generic AI/ML processing. Neural networks are well established as mathematical models executed on generic hardware, and their inclusion does not:
Improve hardware performance;
Recite a new neural-net architecture;
Recite a new training scheme;
Or effect a specific technological transformation.
Therefore, the claims remain abstract, do not integrate the abstract idea into a practical application, and do not amount to significantly more.
	
Regarding claim 3:
This merely limits the type of data fed to the abstract ML process. Data source/type limitations are considered field of use, which is explicitly insufficient under the Alice framework.
Therefore, the claim remains abstract, does not integrate the abstract idea into a practical application, and does not amount to significantly more.

Regarding claims 4, 11, 18 and 19: 
	This recites sequential training of two ML models. However:
It is purely mathematical model training;
There is no novel training architecture;
Training ML models is a conventional, routine activity;
No hardware improvement is claimed.
The claims describe a training workflow, which is still an abstract mathematical practice performed by a computer.
Therefore, the claims are directed to abstract mathematical training, do not integrate the abstract idea into a practical application, and recite no inventive concept.

Regarding claims 5, 12, and 20: 
	These claims merely recite that the second model is trained using:
An annotated ground-truth bounding box;
A binary classification score;
An adjustment step;
These are standard computer vision ML techniques (supervised training with ground truth labels). Ground truth annotation and loss-based adjustment are generic ML training steps.
Therefore, the claims recite an abstract, generic ML training method and do not amount to significantly more.
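For illustration only, the training step summarized above for claims 5, 12 and 20 (a binary classification score derived from an annotated ground-truth bounding box, followed by an adjustment based on the difference between the calibrated score and that binary score) might look like the sketch below. It assumes, hypothetically, that "corresponds to" is decided by an intersection-over-union (IoU) cutoff of 0.5 and that the second model is a logistic model updated by one gradient step; neither assumption comes from the application or the Office Action.

```python
from math import exp


def iou(a, b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def binary_target(pred_box, gt_box, iou_threshold=0.5):
    """1.0 if the predicted box matches the annotated box (assumed IoU rule), else 0.0."""
    return 1.0 if iou(pred_box, gt_box) >= iou_threshold else 0.0


def update_step(weights, bias, features, target, lr=0.1):
    """One gradient step for a logistic calibrator (assumed model form).
    The update is proportional to (calibrated score - binary target)."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    calibrated = 1.0 / (1.0 + exp(-z))
    error = calibrated - target
    new_weights = [w - lr * error * x for w, x in zip(weights, features)]
    new_bias = bias - lr * error
    return new_weights, new_bias, calibrated
```

With a sigmoid output and a binary cross-entropy loss, the gradient with respect to the pre-activation is exactly the difference between the predicted score and the target, which is why the sketch scales the update by `error`.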

Regarding claims 6, and 13: 
	These claims merely recite:
Comparing scores;
Thresholding;
Querying a training set for similar images;
Generating similar sample set.
These are all routine ML data processing operations (thresholding, dataset filtering). Thresholding and dataset lookup are classic mathematical processing and generic post-processing steps.
No specialized hardware, image sensor improvement, or model architecture improvement is introduced.
Therefore, the claims are abstract, do not integrate the abstract idea into a practical application, and do not amount to significantly more.
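For illustration only, the operations summarized above for claims 6 and 13 (comparing the two scores against a threshold, then querying a training set for similar instances by embedding) can be sketched as below. Cosine similarity, the `top_k` cutoff, and the function names are hypothetical choices made for this sketch; the claims as summarized here do not specify a similarity measure.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_similar(query_embedding, training_embeddings, top_k=2):
    """Return indices of the top_k training embeddings most similar to the query."""
    scored = [(cosine_similarity(query_embedding, emb), idx)
              for idx, emb in enumerate(training_embeddings)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored[:top_k]]


def search_if_miscalibrated(confidence, calibrated, query_embedding,
                            training_embeddings, threshold=0.2, top_k=2):
    """Search the training set only when the score gap exceeds the threshold."""
    if abs(confidence - calibrated) > threshold:
        return find_similar(query_embedding, training_embeddings, top_k)
    return []
```

A large gap between the raw and calibrated scores triggers the lookup; a small gap returns an empty result without touching the training set.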

Regarding claims 7, 14 and 17: 
	These claims merely recite:
Scoring similar samples with models 1 and 2;
Comparing scores;
Generating an indication of annotation error;
This is a purely mathematical consistency-checking operation used in ML to detect labeling errors.
There is still no improvement to computer technology and no constraint on how scoring is implemented; all steps are data analysis or mathematical comparison.
Therefore, it is pure evaluation logic, well within the abstract idea domain.
Therefore, the claims are abstract, do not integrate the abstract idea into a practical application, and do not amount to significantly more.

Regarding claims 8 and 15: 
	These claims merely recite:
Generating object instance;
Comparing scores;
Generating filtered dataset;
Clustering the filtered dataset; 
Using instance embedding.
All steps are textbook ML pipeline operations: instance extraction; score evaluation; dataset filtering; clustering based on embedding.
Clustering and instance-based filtering are mathematical algorithms, repeatedly held to be abstract.
There is still no improvement to model architecture, no hardware optimization, and no rendering improvement.
Therefore, the claims are abstract, do not integrate the abstract idea into a practical application, and do not amount to significantly more.
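For illustration only, the filter-then-cluster sequence summarized above for claims 8 and 15 can be sketched as follows. The clustering shown is a simple greedy "leader" scheme over Euclidean distance, swapped in here purely to keep the example short; the application may use a different clustering method, and the dictionary keys, `radius` parameter and function names are hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+


def filter_by_gap(instances, threshold):
    """Keep instances whose raw and calibrated scores differ by more than threshold."""
    return [inst for inst in instances
            if abs(inst["confidence"] - inst["calibrated"]) > threshold]


def leader_cluster(instances, radius):
    """Greedy clustering: each instance joins the first cluster whose leader
    embedding lies within `radius`; otherwise it starts a new cluster."""
    clusters = []  # each entry: {"leader": embedding, "members": [ids]}
    for inst in instances:
        for cluster in clusters:
            if dist(cluster["leader"], inst["embedding"]) <= radius:
                cluster["members"].append(inst["id"])
                break
        else:
            clusters.append({"leader": inst["embedding"], "members": [inst["id"]]})
    return [cluster["members"] for cluster in clusters]


# Usage with made-up instances (ids, scores and embeddings are illustrative).
instances = [
    {"id": "a", "confidence": 0.90, "calibrated": 0.40, "embedding": (0.0, 0.0)},
    {"id": "b", "confidence": 0.80, "calibrated": 0.75, "embedding": (5.0, 5.0)},
    {"id": "c", "confidence": 0.70, "calibrated": 0.20, "embedding": (0.1, 0.0)},
    {"id": "d", "confidence": 0.95, "calibrated": 0.30, "embedding": (10.0, 10.0)},
]
suspect = filter_by_gap(instances, threshold=0.3)
groups = leader_cluster(suspect, radius=1.0)
```

Grouping the filtered instances by embedding places visually similar miscalibrated detections together, so they can be reviewed as a batch.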


Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claim(s) 1-4, 6, 8-11, 13, 15, 16, 18 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over KURMA; Sai Sree Bhargav et al. (US 20220019734 A1) [Kurma] in view of Joshi; Siddharth Vivek et al. (US 11928558 B1) [Joshi].

	Regarding claims 1, 9 and 16, Kurma discloses, a system comprising: a memory component; and one or more processing devices coupled to the memory component (see Fig. 1), the one or more processing devices to perform operations comprising: obtaining an image frame (receiving an image and a text corresponding to the image, wherein the image comprises one or more embedded texts and wherein the image and the text correspond to a downstream task ¶ [0006]-[0007]); 
generating, with a first machine learning model, a confidence score, a bounding box, and an instance embedding, corresponding to an object instance inferred from the image frame (receiving an image and a text corresponding to the image, wherein the image comprises one or more embedded texts and wherein the image and the text correspond to a downstream task; converting using a first deep learning model, the image into (i) one or more image captions corresponding to the image (ii) a set of bounding box coordinate tuples, wherein each bounding box coordinate tuple from the set of bounding box coordinate tuples corresponds to one of the one or more image captions and (iii) a set of confidence scores wherein each confidence score from the set of confidence scores corresponds to one of the one or more image captions; extracting one or more extracted texts from the one or more embedded texts in the image using a second deep learning model; converting using a neural network, the set of bounding box coordinate tuples into a set of positional embeddings; ordering the one or more image captions based on (i) the set of confidence scores ¶ [0006]-[0007], [0013], [0034]); and 
However, Kurma does not explicitly facilitate computing, with a second machine learning model, a calibrated confidence score for the object instance based on the instance embedding, the confidence score, and the bounding box.
Joshi discloses, computing, with a second machine learning model, a calibrated confidence score for the object instance based on the instance embedding, the confidence score, and the bounding box (In some instances, the ML models may be retrained or calibrated from a calibration set of data within the dataset. In some instances, the calibration set may include predicted outputs from the ML models as well as outputs provided by the reviewers. The calibration set may, in some instances, represent new content recently added to the dataset as well as old content within the dataset. For example, old content within the dataset may be periodically removed from the calibration set based on various expiration and/or sampling strategies. In some instances, content within the dataset may be randomly sampled for inclusion within the calibration set. Additionally, or alternatively, a percent or sampling of newly added content to the dataset may be randomly chosen for inclusion within the calibration set. Through the calibration set, the confidence thresholds of the ML models may be re-computed by iterating the data within the dataset and then comparing the predicted outputs with human review. The desired confidence thresholds may be influenced, in some instances, by accuracy, precision, and/or other recall configurations [col. 6, ll. 5-24]. Also see [col. 10, ll. 10-28], [col. 12, ll. 56-col. 13, ll. 5] and [col. 32, ll. 1-43], [col. 32, ll. 57-col. 33, ll. 2]).
It would have been obvious before the effective filing date of the claimed invention to combine the teachings of the cited references because Joshi’s system would have allowed Kurma to facilitate computing, with a second machine learning model, a calibrated confidence score for the object instance based on the instance embedding, the confidence score, and the bounding box. The motivation to combine is apparent in the Kurma reference, because there is a need to improve the universality or scalability of ML models to accept various conditional inputs when analyzing content and outputting predictions.

Regarding claims 2 and 10, the combination of Kurma and Joshi discloses, wherein the first machine learning model and the second machine learning model are executed via a neural network (Kurma: Both contextual visio-linguistic reasoner and contextual language model reasoners are neural networks and neural networks do not perform well if test data varies much from the data it is trained on. This makes a contextual language model reasoner better suited as a general-purpose model ¶ [0025]. Also see ¶ [0006]-[0007]).

Regarding claim 3, the combination of Kurma and Joshi discloses, wherein the image frame comprises an image of at least one of text, a graphic, a video image frame, and a photograph (Kurma:  receiving an image and a text corresponding to the image, wherein the image comprises one or more embedded texts and wherein the image and the text correspond to a downstream task ¶ [0006]-[0007]. Also see Fig. 8). 

Regarding claims 4, 11, 18 and 19, the combination of Kurma and Joshi discloses, wherein the first machine learning model is trained to generate the object instance based on a first training set comprising image frame samples (Joshi: At 302, the process 300 may analyze a dataset using a ML model to train the ML model to recognize one or more field(s) of interest or item(s) within content. For example, the dataset may include various forms of content, such as documents, PDFs, images, videos, and so forth that are searchable by the ML model. The ML model may be instructed to analyze the dataset or to be trained on the dataset, or content within the dataset, for use in recognizing or searching for item(s) within content at later instances. In some instances, human reviewers may label or classify samples within the dataset (e.g., a calibration set) and the ML model may accept these as inputs for training the ML model. For example, the ML model may be trained to identify certain objects within the content, such as dogs or cats. That is, utilizing the dataset and/or the labels provided by human reviews, the ML models may be trained to recognize or identify dogs or cats with presented content [col. 20, ll. 58-col. 21, ll. 8]. Also see [col. 3, ll. 6-59], [col. 4, ll. 57-col. 5, ll. 18]); and wherein the second machine learning model is trained separately from the first machine learning model using a second training set after training of the first machine learning model with the first training set is completed (Joshi: In some instances, more than one ML model(s) 126 may be utilized when carrying out requests. For example, a first ML model may identify objects within an image and a second ML model may label the objects. In some instances, each of the ML model(s) 126 may be previously trained from a specific subset of the content data 124 and/or a calibration set within the content data 124 [col. 10, ll. 10-28]. 
For example, as illustrated, at 810 the process 800 may determine a calibration set for the second ML model. The calibration set used to train the second ML model may include random samplings of content or content that has been identified with high confidences. In other instances, the calibration set may include content labeled by human reviewers. The calibration set may therefore be utilized to train the second ML model to identify, search, or review particular field(s) of interest or content [col. 32, ll. 14-43]. Also see [col. 2, ll. 34-col. 3, ll. 59]).

Regarding claims 6 and 13, the combination of Kurma and Joshi discloses, the operations further comprising: responsive to determining that a difference between the confidence score and the calibrated confidence score exceeds a first threshold, searching a set of training image samples for similar object instances based on the instance embedding, wherein the first machine learning model was trained using the set of training image samples (Joshi: If the ML models determine that the confidence score of a prediction is less than a defined confidence (e.g., threshold), the content (or a portion thereof) may be sent for human review. Alternatively, if the ML model(s) determine that the confidence score is greater than the defined confidence threshold, the content may not be sent for human review. Users may therefore define the conditions when predictions or results of the ML model(s) are sent for human review. Based on the review of the ML model(s), the ML models may be trained to increase the confidence and accuracy of the ML models [col. 2, ll. 51-61]. However, the results of the human review may be utilized to train the ML models to increase their associated accuracy. For example, if the ML model(s) are accurate, the confidence threshold for screening the results of the predicted outputs may be reduced as the outputs of the ML model(s) are accurate [col. 4, ll. 51-56]. Also see [col. 5, ll. 56-col. 6, ll. 4], [col. 28, ll. 52-col. 28, ll. 2] and [col. 30, ll. 4-15]).

Regarding claims 8 and 15, the combination of Kurma and Joshi discloses, generating, with the first machine learning model, a set of object instances from a first set of image samples, wherein for each object instance of the set of object instances, the first machine learning model computes a respective confidence score and a respective instance embedding; computing, with the second machine learning model, a respective calibrated confidence score for each object instance of the set of object instances; generating a second set of image samples based on one or more object instances from the first set of image samples for which a difference between the respective confidence score and the respective calibrated confidence score exceeds a threshold; and clustering object instances from the second set of image samples based on the respective instance embedding for each of the one or more object instances (Joshi: For example, the predictions may include text classification or labeling (e.g., assigning tags, categorizing text, mining text, etc.), image classification (e.g., categorizing images into classes), object detection (e.g., locating objects in images via bounding boxes), or semantic segmentation (e.g., locating objects in images with pixel-level precision) associated with the content. In some instances, when generating predictions or analyzing the content, the ML models may utilize conditions or user-defined criteria. For example, users may define confidence scores that are associated with the predicted outputs. If the ML models determine that the confidence score of a prediction is less than a defined confidence (e.g., threshold), the content (or a portion thereof) may be sent for human review. Alternatively, if the ML model(s) determine that the confidence score is greater than the defined confidence threshold, the content may not be sent for human review. 
Users may therefore define the conditions when predictions or results of the ML model(s) are sent for human review. Based on the review of the ML model(s), the ML models may be trained to increase the confidence and accuracy of the ML models [col. 2, ll. 34-61]. Also see [col. 4, ll. 57-col. 5, ll. 18]).


Claim(s) 5, 12, 17 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Kurma in view of Joshi in view of Lin; Zhe et al. (US 20200151448 A1) [Lin].

Regarding claims 5, 12 and 20, the combination of Kurma and Joshi teaches all the limitations of claims 4, 11 and 16.
However, neither Kurma nor Joshi explicitly discloses wherein the second machine learning model is trained by: computing a binary classification score for the bounding box responsive to determining that the bounding box corresponds to an annotated ground truth bounding box; and adjusting the second machine learning model based on a difference between the calibrated confidence score and the binary classification score.
Lin discloses, wherein the second machine learning model is trained by: computing a binary classification score for the bounding box responsive to determining that the bounding box corresponds to an annotated ground truth bounding box; and adjusting the second machine learning model based on a difference between the calibrated confidence score and the binary classification score (In one example, a conditional detection network includes a binary classifier that assigns a positive training label to detection outputs of the conditional detection network that substantially overlap with a ground truth bounding box for the word-based concept. The binary classifier assigns a negative training label to other detection outputs of the conditional detection network that are not assigned a positive training label ¶ [0026]. Training module 154 can train any suitable network according to any suitable loss function. To keep a conditional detection network label agnostic, a binary loss can be used. For instance, in conditional detection network 300 of FIG. 3, CNN 304 includes binary classifier 322. In binary classifier 322, a binary sigmoid cross-entropy loss is used, rather than a softmax cross-entropy loss of Faster R-CNN. A binary classifier, such as binary classifier 322, can assign a positive label (e.g., a positive training label) to detection outputs of the conditional detection network that substantially overlap with a ground truth bounding box that corresponds to a given word-based concept, and a negative label (e.g., a negative training label) to other detection outputs that are not assigned a positive label. Additionally or alternatively, a binary classifier can be used to train a conditional detection network for negative classes of inputs. 
For instance, a negative class for a word-based concept can be provided to a conditional detection network, and a negative label can be assigned by a binary classifier to detection outputs that substantially overlap with a ground truth bounding box corresponding to the word-based concept. Training with positive and negative training labels, and with negative classes of inputs is illustrated in FIG. 4 ¶ [0086]. Also see ¶ [0044]).
It would have been obvious before the effective filing date of the claimed invention to combine the teachings of the cited references because Lin’s system would have allowed Kurma and Joshi to facilitate wherein the second machine learning model is trained by: computing a binary classification score for the bounding box responsive to determining that the bounding box corresponds to an annotated ground truth bounding box; and adjusting the second machine learning model based on a difference between the calibrated confidence score and the binary classification score. The motivation to combine is apparent in the Kurma and Joshi references, because there is a need to improve object detectors trained using heterogeneous training datasets.

Regarding claim 17, the combination of Kurma, Joshi and Lin discloses, computing a training correction using ground truth images used by another machine learning model to generate the instance embedding, the confidence score, and the bounding box for each of the one or more object instances (Lin: In one example, a conditional detection network includes a binary classifier that assigns a positive training label to detection outputs of the conditional detection network that substantially overlap with a ground truth bounding box for the word-based concept. The binary classifier assigns a negative training label to other detection outputs of the conditional detection network that are not assigned a positive training label ¶ [0026]-[0027]. Also see ¶ [0086], [0091]-[0094]).


Claim(s) 7 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Kurma in view of Joshi in view of Mehra; Ashutosh et al. (US 20210133439 A1) [Mehra].

Regarding claims 7 and 14, the combination of Kurma and Joshi teaches discloses, determining, using the first machine learning model, a respective confidence score for each of the similar object instances from the set of similar image samples; determining, using the second machine learning model, a respective calibrated confidence score for each of the similar object instances from the set of similar image samples (Joshi: In some instances, the ML models may be retrained or calibrated from a calibration set of data within the dataset. In some instances, the calibration set may include predicted outputs from the ML models as well as outputs provided by the reviewers. The calibration set may, in some instances, represent new content recently added to the dataset as well as old content within the dataset. For example, old content within the dataset may be periodically removed from the calibration set based on various expiration and/or sampling strategies. In some instances, content within the dataset may be randomly sampled for inclusion within the calibration set. Additionally, or alternatively, a percent or sampling of newly added content to the dataset may be randomly chosen for inclusion within the calibration set. Through the calibration set, the confidence thresholds of the ML models may be re-computed by iterating the data within the dataset and then comparing the predicted outputs with human review. The desired confidence thresholds may be influenced, in some instances, by accuracy, precision, and/or other recall configurations [col. 6, ll. 5-24]. Also see [col. 10, ll. 10-28], [col. 12, ll. 56-col. 13, ll. 5] and [col. 32, ll. 1-43], [col. 32, ll. 57-col. 33, ll. 2])
However, neither Kurma nor Joshi explicitly discloses, responsive to determining that, for a first similar object instance, a difference between the respective confidence score and the respective calibrated confidence score exceeds a second threshold, generating an indication of a potential training data annotation error.
Mehra discloses, responsive to determining that, for a first similar object instance, a difference between the respective confidence score and the respective calibrated confidence score exceeds a second threshold, generating an indication of a potential training data annotation error (Training or tuning of the CNN or any machine learning model can include minimizing a loss function between the target variable or output (e.g., 0.90) and the expected output (e.g., 100%). Accordingly, it may be desirable to arrive as close to 100% confidence of a particular classification as possible so as to reduce the prediction error. This may happen overtime as more training images/documents and baseline data sets are fed into the learning models so that classification/detection can occur with higher prediction probabilities. Accordingly, in some embodiments, block 1008 represents tuning or training, which is done in various stages (e.g., a first stage and a second stage) to reduce prediction error. In these embodiments for example, a first training set can be created (e.g., a first document with content order values) and training can occur in a first stage using the first training set and then a second training set can be created (e.g., a first document with other content order values) and training can occur in a second stage using the second training set to reduce error rate or tune the model. In other embodiments, the prediction at block 1008 represents prediction on a deployed model that has already been trained ¶ [0086]. Also see ¶ [0068]).
It would have been obvious before the effective filing date of the claimed invention to combine the teachings of the cited references because Mehra’s system would have allowed Kurma and Joshi to facilitate, responsive to determining that, for a first similar object instance, a difference between the respective confidence score and the respective calibrated confidence score exceeds a second threshold, generating an indication of a potential training data annotation error. The motivation to combine is apparent in the Kurma and Joshi references, because there is a need to improve the detection of objects within documents using machine learning models.
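For reference when mapping the claim language, the limitation at issue in claims 7 and 14 can be sketched as a simple comparison. This is a minimal sketch under stated assumptions, not the applicant's or any cited reference's implementation: the claims recite only "a difference" exceeding "a second threshold", so the use of an absolute difference, the value 0.3, and the names `ScoredInstance` and `flag_possible_annotation_errors` are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ScoredInstance:
    instance_id: str
    confidence: float             # assumed output of the first model
    calibrated_confidence: float  # assumed output of the second model


def flag_possible_annotation_errors(
    instances: List[ScoredInstance],
    second_threshold: float = 0.3,  # illustrative value, not from the claims
) -> List[str]:
    """Return ids of instances whose raw and calibrated confidence scores
    differ by more than the threshold (absolute difference is assumed)."""
    return [
        inst.instance_id
        for inst in instances
        if abs(inst.confidence - inst.calibrated_confidence) > second_threshold
    ]


similar_instances = [
    ScoredInstance("a", confidence=0.90, calibrated_confidence=0.40),
    ScoredInstance("b", confidence=0.80, calibrated_confidence=0.75),
    ScoredInstance("c", confidence=0.20, calibrated_confidence=0.90),
]
flagged = flag_possible_annotation_errors(similar_instances)
print(flagged)
```

Each returned id stands in for the claimed "indication of a potential training data annotation error"; how that indication is surfaced is left open here.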

Conclusion
The examiner requests, in response to this Office action, support be shown for language added to any original claims on amendment and any new claims. That is, indicate support for newly added claim language by specifically pointing to page(s) and line no(s) in the specification and/or drawing figure(s). This will assist the examiner in prosecuting the application.
When responding to this office action, Applicant is advised to clearly point out the patentable novelty which he or she thinks the claims present, in view of the state of the art disclosed by the references cited or the objections made. He or she must also show how the amendments avoid such references or objections. See 37 CFR 1.111(c).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MOHAMMAD S ROSTAMI whose telephone number is (571)270-1980. The examiner can normally be reached Mon-Fri from 9 a.m. to 5 p.m.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Boris Gorney, can be reached at (571)270-5626. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





11/19/2025
/MOHAMMAD S ROSTAMI/               Primary Examiner, Art Unit 2154
Prosecution Timeline

Aug 24, 2022: Application Filed
Nov 24, 2025: Non-Final Rejection mailed (§101, §103)
Jan 29, 2026: Interview Requested
Feb 05, 2026: Applicant Interview (Telephonic)
Feb 06, 2026: Examiner Interview Summary
Feb 19, 2026: Response Filed
May 27, 2026: Final Rejection mailed (§101, §103) (current)
Precedent Cases

Applications granted by this same examiner with similar technology

18/623,662 (Patent 12639262): MACHINE LEARNING BASED CONVERSION OF UNSTRUCTURED VOICE DATA INTO EXECUTABLE FILES. 2y 1m to grant; granted May 26, 2026.
17/877,533 (Patent 12619920): DISTRIBUTED ADAPTIVE MACHINE LEARNING TRAINING FOR INTERACTION EXPOSURE DETECTION AND PREVENTION. 3y 9m to grant; granted May 05, 2026.
17/850,319 (Patent 12614063): IMPLEMENTATION OF ARGMAX OR ARGMIN IN HARDWARE. 3y 10m to grant; granted Apr 28, 2026.
17/981,024 (Patent 12614077): METHOD OF PROCESSING MULTIMODAL TASKS, AND AN APPARATUS FOR THE SAME. 3y 5m to grant; granted Apr 28, 2026.
18/419,378 (Patent 12613932): SEARCH RANKER WITH CROSS ATTENTION ENCODER TO JOINTLY COMPUTE RELEVANCE SCORES OF KEYWORDS TO A QUERY. 2y 3m to grant; granted Apr 28, 2026.
Based on 5 most recent grants.
Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 67%
With Interview (+26.1%): 93%
Median Time to Grant: 3y 9m (~0m remaining)
PTA Risk: Moderate
Based on 636 resolved cases by this examiner. Grant probability derived from career allowance rate.
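
The projection figures appear to follow from the examiner statistics given earlier on this page (425 granted out of 636 resolved, +26.1% interview lift). The short sketch below reproduces the 67% and 93% values under the assumption, not confirmed by the page, that the interview lift is added to the career allowance rate in percentage points; the function names are hypothetical.

```python
def allowance_rate(granted: int, resolved: int) -> float:
    """Career allowance rate as a fraction of resolved cases."""
    return granted / resolved


def with_interview(base_rate: float, lift_points: float) -> float:
    """Assumed model: additive lift in percentage points, capped at 100%."""
    return min(1.0, base_rate + lift_points)


base = allowance_rate(425, 636)        # figures from the examiner profile
boosted = with_interview(base, 0.261)  # +26.1% interview lift
print(f"Grant probability: {base:.0%}")
print(f"With interview: {boosted:.0%}")
```

If the lift were instead a relative (multiplicative) increase, the second figure would differ, so the additive reading is only one plausible interpretation.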