Prosecution Insights
Last updated: April 19, 2026
Application No. 18/165,022

METHOD AND SYSTEM FOR EVALUATING PERFORMANCE OF IMAGE TAGGING MODEL

Non-Final OA: §101, §102
Filed: Feb 06, 2023
Examiner: CAI, PHUONG HAU
Art Unit: 2673
Tech Center: 2600 — Communications
Assignee: Naver Corporation
OA Round: 3 (Non-Final)

Grant Probability: 81% (Favorable)
Predicted OA Rounds: 3-4
Predicted Time to Grant: 3y 0m
Grant Probability With Interview: 99%

Examiner Intelligence

Career Allow Rate: 81%, above average (87 granted / 107 resolved; +19.3% vs TC avg)
Interview Lift: +20.9%, strong (among resolved cases with interview)
Avg Prosecution: 3y 0m (32 applications currently pending)
Total Applications: 139 (across all art units)

Statute-Specific Performance

§101: 22.6% (-17.4% vs TC avg)
§103: 38.5% (-1.5% vs TC avg)
§102: 21.3% (-18.7% vs TC avg)
§112: 14.0% (-26.0% vs TC avg)
Tech Center averages are estimates. Based on career data from 107 resolved cases.

Office Action

§101, §102
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114

A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submissions, filed on January 6, 2026, have been entered.

Status of Claims

Claims 1-4, 6-15, and 19-20 are pending; claims 1, 4, and 20 have been amended; claims 5 and 16-18 are cancelled. Claims 1-4, 6-15, and 19-20 remain rejected.

Response to Arguments

Regarding the §101 rejections, the examiner respectfully finds the Applicant's arguments to be non-persuasive. In view of the amendments to independent claims 1 and 20, the previously applied prior art rejections are withdrawn, and the Applicant's arguments directed to them are rendered moot in view of the new grounds of rejection set forth below. However, regarding the §103 arguments, the examiner finds some to be non-persuasive, because the argued features are taught by the same prior art references (see the §102 rejections below for more details).

§101 rejection: On pages 10-15 of the remarks, the Applicant argues that the claims, such as claim 1, indicate an integration of the judicial exception into a practical application, being directed to an improvement in the functioning of a computer in comparing and evaluating performance of an image tagging model, under Step 2A Prong 2 of the §101 analysis.
In support of this argument, the Applicant states that the claimed method is directed to solving a problem in which it is difficult to compare performance between models when the models do not share the same label set, bringing in the support of paragraph [0005] of the instant application's publication. Specifically, the claimed method generates a first label-verification class mapping table based on a first output label set of the first model and the correct values, and converts output values of the first model into output values associated with a verification class set using the first label-verification class mapping table; the same is done for the second output label set (last paragraph of page 13 of the remarks). Bringing in the support of figure 10, the Applicant further states that the claimed method generates the label-verification class mapping table using output values of the first model and the second model, where the mapping table serves to map the different labels used by the first model and the second model to common classes. Therefore, the Applicant argues, the claimed limitation of "generating a label-verification class mapping table and converting outputs of the first model and the second model using the mapping table" constitutes a technical means for resolving label inconsistencies between models, which is a practical application under Step 2A Prong 2, and enables model performance to be compared without additional training ("significantly more" under Step 2B).

Examiner's reply: The examiner respectfully disagrees with the Applicant's arguments and finds them to be incommensurate with the scope of the claims.
Importantly, the Applicant is reminded that the claims are construed under the broadest reasonable interpretation (BRI) in light of the specification. The specification is used to understand the claimed method, not to be imported into the scope of the claims; the claims are interpreted based on their own language, as claim interpretation under BRI requires for the §101 analysis. Therefore, bringing in support from the disclosure, such as paragraph [0005], to assert that the problem being solved is comparing performance between models that do not share the same label set, and bringing in figure 10 to further assert that the claimed "generating a label-verification class mapping table and converting outputs of the first model and the second model using the mapping table" constitutes a technical means for resolving label inconsistencies between models (a practical application under Step 2A Prong 2) and enables model performance to be compared without additional training ("significantly more" under Step 2B), relies on subject matter that is not reflected in the instant scope of the claims. The claims do not recite any problem of inconsistencies between models, do not recite that the models fail to share the same label set, do not recite solving any label-inconsistency problem, and do not recite any training at all from which a comparison "without additional training" could be found.
The examiner notes that the claims do recite some related features, such as that the first label set and the second label set are different from each other; however, the claims do not explicitly indicate that the label sets are inconsistent in any particular aspect. Moreover, the first label set and the second label set are recited as merely "associated with" the first image tagging model and the second image tagging model. Under BRI, this does not establish a specific relation or meaning of these label sets sufficient to present a problem of inconsistency that the claimed method could then solve when comparing models. Importantly, the models are not even recited to be machine learning models or neural networks, so the claims cannot be concluded to involve any training or learning. Many other features in the claims are likewise recited as merely "associated with," which does not require any direct relation or involvement of outputting or inputting under the BRI scope of the claims. Furthermore, the "label-verification class mapping table" is recited as merely being used to convert certain values into other output values associated with a class set; this cannot be held significant enough to persuasively constitute a technical means for resolving label inconsistencies between models (a practical application under Step 2A Prong 2) or to enable model performance to be compared without additional training ("significantly more" under Step 2B). The important point of the Step 2A Prong 2 requirement is that there must be additional elements recited in the claim that are indicative of an integration of the judicial exception/abstract idea into a practical application; the same applies under Step 2B, where there must be additional elements that render the claim significantly more.
In the §101 remarks, the Applicant fails to identify any additional elements meeting the requirements of these two steps. Therefore, the arguments do not meet the requirements and are not persuasive.

Examiner's suggestion: The Applicant is suggested to incorporate the inventive concept of the Applicant's argument into the claims: that the first label set and the second label set are training data sets for the first model and the second model, respectively (not just "associated with," but directly the training data sets for these models), which differ in labels from each other and from the verification data set's labels, and that the first model and the second model are machine learning models, according to the disclosure's [0095]. Further, that the label-verification class mapping table defines a mapping relationship from the first label set to the verification data set by comparing correct values of the first verification class with the output values in the output set of the first label set to calculate a performance score of each first label for the first verification class, repeating this process for each label, and mapping to the first verification class either the label having the highest performance score or the labels having performance scores equal to or greater than a threshold value, with the same process applied for the second label set, according to [0097]. Then, indicate the improvement aspect: that this allows quantitatively comparing and evaluating the performance of the models, based on the mapping result, without additional training, according to [0046].
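For concreteness, the mapping-table construction suggested above (per the cited [0097]) can be sketched as follows: for each verification class, score every model label against the correct values, then map either the single highest-scoring label or all labels meeting a threshold to that class. This is an illustrative sketch only; the function names, data structures, and the simple per-label scoring rule are assumptions, not the application's actual implementation.

```python
def label_score(label, v_class, outputs, correct_values):
    """Illustrative per-label score for one verification class: among images
    whose correct value is v_class, the fraction the model labeled `label`."""
    hits = [out == label for out, ref in zip(outputs, correct_values) if ref == v_class]
    return sum(hits) / len(hits) if hits else 0.0

def build_mapping(label_set, class_set, outputs, correct_values, threshold=None):
    """Build a label -> verification-class mapping table.

    threshold=None maps the single best-scoring label per class (argmax);
    a numeric threshold maps every label scoring at or above it.
    """
    table = {}
    for v_class in class_set:
        scores = {lbl: label_score(lbl, v_class, outputs, correct_values)
                  for lbl in label_set}
        if threshold is None:
            # map the label with the highest performance score for this class
            table[max(scores, key=scores.get)] = v_class
        else:
            # map every label whose score meets the threshold
            for lbl, s in scores.items():
                if s >= threshold:
                    table[lbl] = v_class
    return table

# Hypothetical example: two model labels, two verification classes.
mapping = build_mapping(
    label_set=["feline", "canine"],
    class_set=["cat", "dog"],
    outputs=["feline", "feline", "canine", "canine"],
    correct_values=["cat", "cat", "dog", "dog"],
)
```

The same routine, run on the second model's outputs, would yield the second half of the table; the two halves together map both models' label sets onto the common verification class set.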
§103 rejection: On pages 15-20 of the remarks, the Applicant centrally argues that the proposed prior art references, in combination, do not teach or suggest the claimed feature: "the verification class set, the first label set…, and the second label set…are different from each other." In support of this argument, the Applicant asserts that Labatut discloses that the confusion matrices must have the same format, meaning that the rows and columns of the confusion matrices must be defined based on the same label set, which differs from the claimed feature above. Moreover, the Applicant further states that Ferri discloses performance metrics in classification and relates to correlation analysis between evaluation metrics, and therefore does not remedy the deficiencies of Labatut.

Examiner's reply: The examiner respectfully disagrees with the Applicant's arguments. As previously explained in the previous Office Action, under BRI the claims can be read on a teaching from the prior art whose analogous features still fall within the scope of the claimed method.
Therefore, the examiner finds that Labatut teaches the recited verification class set (Labatut's section 7.4, 1st paragraph, discloses a dataset being used for the classifiers, which, under BRI, is analogous to the recited verification class set), the first label set of the first image tagging model (Labatut's classifier output is analogous to the label set; therefore, any one of the classifiers can be understood to be the first tagging model and its output to be the first label set), and the second label set of the second image tagging model (any of the other classifiers is analogous to the second tagging model and its output to the recited second label set, under BRI), these being different from each other (the input dataset used for the classifiers [previously discussed as the verification class set] and each of the classifiers' outputs are different from each other, which, under BRI, covers the recited feature that they are different from each other). See the §102 rejections below for more details. Therefore, the prior art teaches these claimed features under BRI: any data sets used for different models, used at different times, and produced by different outputs are already different from each other. Moreover, the examiner finds the arguments to be incommensurate with the scope of the claims, since the claimed feature does not exclude Labatut's disclosure that the confusion matrices must have the same format (i.e., that their rows and columns must be defined based on the same label set). The claimed feature simply recites that these sets are different from each other; it does not recite that their formats, structures, or definitions differ in any specific way.

Claim Rejections - 35 USC § 101

35 U.S.C.
101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-20 are rejected under 35 U.S.C. 101.

Regarding independent claim 1 and its dependent claims 2-19:

Step 1 Analysis: Claim 1 is directed to a method/process, which falls within one of the four statutory categories.

Step 2A Prong 1 Analysis: Claim 1 recites, in part: "comparing the first output values from the first image tagging model with the correct values for each of the plurality of verification images and calculating a first performance score for the first image tagging model; and comparing the second output values from the second image tagging model with the correct values for each of the plurality of verification images and calculating a second performance score for the second image tagging model; and evaluating a difference in performance between the first image tagging model and the second image tagging model by quantitively comparing the first performance score and the second performance score; using the first image tagging model when the first performance score is greater than the second performance score; and using the second image tagging model when the second performance score is greater than the first performance score, wherein each of the correct values is associated with at least one verification class of a verification class set, wherein the first image tagging model is associated with a first label set, wherein the second image tagging model is associated with a second label set, and wherein the verification class set, the first label set, and the second label set are different from each other." These limitations, as drafted, are processes that, under the broadest reasonable interpretation, cover performance of the limitations in the mind, and they fall within the "Mathematical Concepts" and "Mental Processes" groupings of abstract ideas. The calculating steps shown above are steps that, under BRI (broadest reasonable interpretation), a human mind can perform mentally through observation and evaluation: a human can observe values or data and compare them based on a condition or rule with pen and paper, following the steps recited in the claim, and the calculating steps are explicitly mathematical calculations. The "wherein" clauses merely further specify the data on which the mental processes and mathematical calculations operate, and hence remain part of the mental process and mathematical calculation. Accordingly, the claim recites an abstract idea.

Step 2A Prong 2 Analysis: This judicial exception is not integrated into a practical application. In particular, the claim recites the following additional elements: one or more processors; receiving a verification data set including a plurality of verification images and a plurality of correct values associated with the plurality of verification images, wherein each of the correct values is associated with at least one verification class of a verification class set; receiving a first image tagging model and a second image tagging model; inputting the plurality of verification images to the first image tagging model and the second image tagging model and outputting first output values from the first image tagging model and second output values from the second image tagging model; generating a label-verification class mapping table based on the plurality of correct values, the first output values, and the second output values, the label-verification class mapping table defining a mapping relationship from a first label set associated with the first image tagging model and a second label set associated with the second image tagging model to the verification class set; and converting, using the label-verification class mapping table, the first output values into converted first output values associated with the verification class set and converting the second output values into converted second output values associated with the verification class set. These additional elements are merely insignificant extra-solution activities of data gathering and generic computer components, such as a processor performing the generic functions of a generic processor, recited at a high level of generality. The data-gathering steps include receiving and inputting data, generating data, and converting data according to a condition or rule; further specifying what the data are leaves them mere data-gathering steps. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. See MPEP §2106.04(a)(2).III.C. In view of the foregoing, the additional elements do not integrate the abstract idea into a practical application.

Step 2B Analysis: There are no additional elements that amount to significantly more than the judicial exception, and the additional elements identified above do not amount to significantly more for the claim as a whole. See MPEP §2106.05. The claim is directed to an abstract idea. For all of the foregoing reasons, claim 1 does not comply with the requirements of 35 U.S.C. 101. Accordingly, the dependent claims 2-19 do not provide elements that overcome the deficiencies of independent claim 1.
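As context for the elements analyzed above, the overall flow recited in claim 1 (convert each model's outputs through the mapping table, score the converted outputs against the correct values, and use the better-scoring model) can be sketched as follows. This is an illustrative sketch only: the names, data structures, and the simple accuracy metric are assumptions, not taken from the claims or the specification.

```python
# Minimal sketch of the claim 1 pipeline. All names are hypothetical;
# the claims do not specify data structures or a particular metric.

def convert_outputs(outputs, mapping_table):
    """Convert each model-specific output label to its verification class."""
    return [mapping_table[label] for label in outputs]

def performance_score(converted, correct_values):
    """Fraction of verification images whose converted output matches the
    correct verification class (one simple possible performance score)."""
    matches = sum(1 for out, ref in zip(converted, correct_values) if out == ref)
    return matches / len(correct_values)

# Hypothetical data: two models with different label vocabularies,
# mapped onto a shared verification class set {"cat", "dog"}.
correct = ["cat", "dog", "cat", "dog"]
map_a = {"feline": "cat", "canine": "dog"}   # first model's mapping table
map_b = {"kitty": "cat", "puppy": "dog"}     # second model's mapping table

out_a = ["feline", "canine", "feline", "canine"]  # first model raw outputs
out_b = ["kitty", "kitty", "kitty", "puppy"]      # second model raw outputs

score_a = performance_score(convert_outputs(out_a, map_a), correct)
score_b = performance_score(convert_outputs(out_b, map_b), correct)
better = "first model" if score_a > score_b else "second model"
```

The final comparison and selection step is what the rejection treats as mental/mathematical; the receiving, generating, and converting steps are what it treats as additional elements.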
Moreover, claims 2, 5, 10-11, 13-15, and 17-19 recite "wherein" clauses that further specify or provide information or conditions for the limitations of the claims from which they depend; hence they are no more than abstract ideas and additional elements of the same type as the elements they further specify. Claim 3 recites, in part, "wherein the calculating the first performance score includes: inputting the plurality of verification images to the first image tagging model to generate an output value set including the first output values; and determining the first label set associated with the first image tagging model based on the output value set." This further specifies the calculation step to include an inputting step which, under Step 2A Prong 2, is an additional element constituting an insignificant extra-solution activity of data gathering; the determining step is a mental process that can be performed with pen and paper through observation and evaluation, wherein the human mind can observe and evaluate to determine an association between pieces of information. Claim 4 recites, in part, "converting labels in the output value set into verification classes using the label-verification class mapping table," which is an insignificant extra-solution activity of converting data. Claim 6 recites, in part, "wherein the calculating the first performance score further includes calculating the first performance score standardized in terms of a verification class based on the converted output value set and the plurality of correct values," which recites a mathematical calculation falling under the mathematical-operations type of abstract idea.
Claim 7 recites, in part, "wherein the generating the label-verification class mapping table includes: calculating a performance score of each of first labels in the first label set for a first verification class; mapping a first label having a highest performance score for the first verification class to the first verification class; calculating a performance score of each of the first labels in the first label set for a second verification class; and mapping a second label having a highest performance score for the second verification class to the second verification class." This recites a series of calculation and mapping steps: the calculation steps are mathematical-operations-type abstract ideas, and the mapping steps can be understood either as mathematical-relationships-type abstract ideas or as mental processes of observation and judgment that the human mind can perform with pen and paper, since a human can map information together as recited in the claim. Hence they are all, essentially, abstract ideas. Claim 8 recites, in part, "wherein the calculating the performance score of each of the first labels in the first label set for the first verification class includes calculating a performance score of the first label for the first verification class by comparing correct values associated with the first verification class with output values in the output value set which are associated with the first label," which includes a mathematical-operations-type abstract idea.
Claim 9 recites, in part, "wherein the generating the label-verification class mapping table includes: calculating a performance score of each of first labels in the first label set for a first verification class; mapping labels having performance scores equal to or greater than a threshold value for the first verification class to the first verification class; calculating a performance score of each of the first labels in the first label set for a second verification class; and mapping labels having performance scores equal to or greater than a threshold value for the second verification class to the second verification class." As with claim 7, this recites a series of calculation and mapping steps: the calculation steps are mathematical-operations-type abstract ideas, and the mapping steps can be understood either as mathematical-relationships-type abstract ideas or as mental processes of observation and judgment performable with pen and paper. Hence they are all, essentially, abstract ideas. Claim 12 recites, in part, "based on the first performance score and the second performance score, which are performance scores standardized in terms of verification classes, quantitatively evaluating a difference in performance between the first image tagging model and the second image tagging model," which is an evaluation step, based on criteria such as those recited in the claim, that a human mind can also perform.
Claim 16 recites, in part, "wherein the calculating the first performance score includes: determining a first label set associated with the first image tagging model; generating a label-verification class mapping table defining a mapping relationship from the first label set to the verification data set; and converting an output of the first image tagging model using the label-verification class mapping table." The determining step is one a human mind can also perform: with pen and paper, the human mind can determine a label set based on a certain association, so it is a mental-process abstract idea. The "generating" and "converting" steps are additional elements which, under Step 2A Prong 2, are insignificant extra-solution activities of data gathering and data conversion. Accordingly, the dependent claims 2-19 are not patent eligible under §101.

Regarding independent claim 20: Claim 20 is a system/device claim reciting limitations analogous to those of independent claim 1; hence it can be analyzed under the same approach for the §101 analysis discussed above. Moreover, claim 20 recites further additional elements of "a memory, one or more processors, …, one or more programs includes instructions …," which are generic computer components recited at a high level of generality; they are not indicative of a practical application. Therefore, claim 20 is rejected under §101.

Claim Rejections - 35 USC § 102

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action: A person shall be entitled to a patent unless – (a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention. Claims 1-20 are rejected under 35 U.S.C.
102(a)(1) as being anticipated by Vincent Labatut et al. ("Evaluation of Performance Measures for Classifiers Comparison," Dec. 2011, arXiv, Computer Science, Machine Learning, Cornell University; hereinafter "Labatut"), as further evidenced and explained by C. Ferri et al. ("An Experimental Comparison of Performance Measures for Classification," Jan. 2009, Pattern Recognition Letters, Vol. 30, Issue 1, pp. 27-38; hereinafter "Ferri"). Regarding claim 1, Labatut discloses a method for comparing and evaluating performance of an image tagging model, the method being performed by one or more processors and comprising (title): receiving a verification data set including a plurality of verification images and a plurality of correct values associated with the plurality of verification images (section 7.4, 1st paragraph, discloses that the goal of the paper is to compare classifiers on a given dataset; the same dataset is used for the classifiers to assess them and identify the best classifier; this dataset, used for assessment of the classifiers, is analogous to the recited "verification data set"; the dataset includes true classes and test data, the true data being used to construct the confusion matrix as disclosed in section 3, which is analogous to the plurality of verification images and the set of correct values [the true classes]), wherein each of the correct values is associated with at least one verification class of a verification class set (the true classes are, as discussed above, the recited correct values; moreover, the true classes can be understood to be associated with the dataset used for the classifiers, previously discussed as the verification class set, and each true class corresponds to a class in the dataset; hence, under BRI, this covers the scope of the claimed limitation); receiving a first image tagging model and a second image tagging model (any of the classifiers can be understood as the first image tagging model, and any of the others as the second image tagging model, under BRI); inputting the plurality of verification images to the first image tagging model and the second image tagging model and outputting first output values from the first image tagging model and second output values from the second image tagging model (the classifiers, as discussed above, are used with data from the dataset to test their performance; hence, under BRI, the classifiers can be understood to produce outputs, so any classifier can be the recited first image tagging model with its output analogous to the recited first output values, and any other classifier the second image tagging model with its output analogous to the second output values, which, under BRI, covers the scope of the claimed limitation); generating a label-verification class mapping table based on the plurality of correct values, the first output values, and the second output values (section 3, table 1, shows a confusion matrix, which is analogous to the recited label-verification class mapping table, since both are based on the correct values and an output value set [the estimated classes] including the first output values and the second output values; since the classifiers, as discussed for claim 2, are used to assess and evaluate performance, each classifier is used on the dataset to classify the data and output classes, which is analogous to inputting the images into the first image tagging model to generate an output value set as claimed [the output of the classifier] including the output values as discussed), the label-verification class mapping table defining a mapping relationship from a first label set associated with the first image tagging model and a second label set associated with
the second image tagging model to the verification class set (the confusion matrix discussed above, section 3, table 1, shows the relationship from the estimated classes [the first label set] to the true classes, i.e., the verification class set discussed above; the same holds for the second label set, the label sets being respectively associated with the first tagging model and the second tagging model as previously discussed); converting, using the label-verification class mapping table, the first output values into converted first output values associated with the verification class set and converting the second output values into converted second output values associated with the verification class set (section 3, 3rd paragraph, discloses inverting the estimated and true classes to yield a transposed matrix; the inverting is analogous to the recited "converting" of the labels, the transposed matrix resulting in verification classes based on the original confusion matrix [analogous to using the label-verification class mapping table as claimed, under BRI]; section 4, 1st paragraph, discloses that the measures are calculated from the confusion matrices, which is analogous to the claimed "first performance score standardized in terms of the verification class"; where the transposed matrix is used, as in section 3, 3rd paragraph, the measure is calculated based on the transposed matrix, i.e., the performance score is calculated based on the converted output value set together with the correct values, which, under BRI, covers the claimed limitation; section 7.3, 2nd paragraph, discloses that the dataset can be fixed-bound or normalized, and where normalized, it can be understood to be standardized in terms of the dataset [including the classes, i.e., the verification classes as claimed, under BRI] for calculation of the measures [the claimed scores]; hence the scores are standardized according to the dataset including its classes, i.e., in terms of a verification class as claimed, under BRI); comparing the first output values from the first image tagging model with the correct values for each of the plurality of verification images (the discrimination plot discussed above for claim 7 is used to compare the behaviors of the classifiers [section 7.4, 1st paragraph] based on the confusion matrix; therefore, the correct values associated with the first class are compared with the estimated class, which, under BRI, is analogous to the claimed comparison of the first image tagging model's outputs with the correct values) and calculating a first performance score for the first image tagging model (section 6, 1st paragraph, and section 6.1, 1st paragraph, disclose computing measures for each of the classifiers to evaluate their performance; the measure can be understood as the performance score, so the measure for the first classifier [any of the classifiers, as discussed] is analogous to the first performance score as claimed, under BRI); comparing the second output values from the second image tagging model with the correct values for each of the plurality of verification images (likewise, the discrimination plot is used to compare the behaviors of the classifiers [section 7.4, 1st paragraph] based on the confusion matrix, which, under BRI, is analogous to the claimed limitation) and calculating a second performance score for the second image tagging model (the measure for the second classifier [any of the other classifiers, as discussed] is analogous to the second performance score as claimed, under BRI); evaluating a difference in performance between the first image tagging model and the second
image tagging model (section 4, first two paragraphs, disclose comparing the classifiers based on several measures to determine their performances; therefore, the comparing is analogous to the recited comparing, and taking the measures is analogous to the evaluating to find the difference in performance of the classifiers, hence, by BRI, covers the scope of the claimed limitation) by quantitatively comparing the first performance score and the second performance score (section 4 discloses that the comparing as discussed is based on comparing the values, i.e., the scores [quantitatively], of the first and the second performances as discussed previously), using the first image tagging model when the first performance score is greater than the second performance score; and using the second image tagging model when the second performance score is greater than the first performance score (in comparing the performance of the models, it can be understood that when one model performs better than the other it is preferably used; therefore, when the performance score of either model is greater than a certain threshold, that model is preferably used over the other, as discussed in section 7.1), wherein each of the correct values is associated with at least one verification class of a verification class set (the correct classes [as discussed above] are associated with each class of the dataset, which is analogous to being associated with at least one verification class of a verification class set as claimed, by BRI), wherein the first image tagging model is associated with a first label set (Labatut's classifier output is analogous to the label set; therefore, any of the classifiers can be understood to be the first tagging model and its output to be the first label set), wherein the second image tagging model is associated with a second label set (any of Labatut's other classifiers is analogous to the second tagging model and its
output to be the recited second label set, by BRI), and wherein the verification class set, the first label set, and the second label set are different from each other (the input, i.e., the dataset used for the classifiers [as discussed previously, the verification class set], and each of the classifiers' outputs are different from each other, which, by BRI, covers the recited feature that they are different from each other). However, Labatut does not explicitly disclose that the tagging models are image tagging models and that the verification data set includes a plurality of verification images. In the same field of performance measures for classification (title, Ferri), Ferri discloses that the tagging models are image tagging models and that the verification data set includes a plurality of verification images (table 2 shows the datasets used for classification include, for example, segmentation, which includes images for the classification models analogous to the models of Labatut; therefore, it can be understood that Labatut's classification models cover image classification models and that their dataset includes images). Regarding claim 2, Labatut, as evidenced by Ferri, discloses the method according to claim 1, wherein the first performance score and the second performance score are scores standardized in terms of a verification class (section 7.3, 2nd paragraph, discloses the dataset can be fixed bound or normalized; in the case where it is normalized, it can be understood to be standardized in terms of the dataset [including the classes, or the verification class as claimed, by BRI] for the calculation of the measures [the scores as claimed]; therefore, it's analogous to the scores being standardized according to the dataset, including different classes, or in terms of a verification class as claimed, by BRI).
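The claim 1 limitation mapped above — scoring each model against the correct values of a verification set and using whichever model scores higher — can be illustrated with a minimal sketch. This is purely illustrative; the function names, the accuracy-style score, and the example tags are our own assumptions, not the applicant's or the examiner's implementation.

```python
# Illustrative sketch (hypothetical names): selecting between two
# "image tagging models" by comparing performance scores computed
# against the correct values of a verification data set.

def accuracy(outputs, correct):
    """Fraction of model outputs that match the correct (true) values."""
    return sum(o == c for o, c in zip(outputs, correct)) / len(correct)

def select_model(first_outputs, second_outputs, correct):
    """Use the model whose performance score is greater."""
    first_score = accuracy(first_outputs, correct)
    second_score = accuracy(second_outputs, correct)
    return "first" if first_score > second_score else "second"

correct = ["cat", "dog", "cat", "bird"]          # correct values
first = ["cat", "dog", "dog", "bird"]            # 3 of 4 correct
second = ["cat", "dog", "cat", "bird"]           # 4 of 4 correct
chosen = select_model(first, second, correct)    # "second"
```

Any monotone performance measure could stand in for `accuracy`; the selection step only depends on the score comparison.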
Regarding claim 3, Labatut, as evidenced by Ferri, discloses the method according to claim 1, wherein the calculating the first performance score includes: inputting the plurality of verification images to the first image tagging model to generate an output value set including the first output values (since the classifiers as discussed in claim 2 are used to assess and evaluate their performances, each classifier is used on the dataset to classify the data and output classes; therefore, this is analogous to the images being input into the first image tagging model to generate an output value set as claimed [the output of the classifier], which includes the output values as discussed above in claim 1); and determining the first label set associated with the first image tagging model based on the output value set (as discussed previously, each of the classifiers receives the input from the dataset to generate an output dataset, i.e., the classes or the label set associated with the image tagging model based on the output value set, by BRI). Regarding claim 4, Labatut, as evidenced by Ferri, discloses the method according to claim 3, wherein the calculating the first performance score further includes: converting labels in the output value set into verification classes using the label-verification class mapping table (section 3, 3rd paragraph, discloses inverting the estimated and true classes to result in a transposed matrix, where the inverting is analogous to the recited "converting" of the labels into the transposed matrix, which results in verification classes based on the original confusion matrix [analogous to using the label-verification class mapping table as claimed, by BRI]).
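The confusion-matrix reading that runs through the claim 1 and claim 4 mappings — rows indexed by estimated classes (a "label set"), columns by true classes (the "verification class set"), with transposition swapping the two roles — can be sketched as follows. The class names and data are hypothetical examples, not taken from Labatut.

```python
# Minimal sketch of a confusion matrix as a label-to-class relationship:
# entry [e][t] counts items the model estimated as class e whose true
# class is t. Transposing "inverts" estimated and true classes.
from collections import Counter

def confusion_matrix(estimated, true, classes):
    """Rows = estimated classes, columns = true classes."""
    counts = Counter(zip(estimated, true))
    return [[counts[(e, t)] for t in classes] for e in classes]

def transpose(matrix):
    """Swap the roles of estimated and true classes."""
    return [list(row) for row in zip(*matrix)]

classes = ["a", "b"]
est = ["a", "a", "b", "b"]     # model outputs (labels)
true = ["a", "b", "b", "b"]    # correct values (verification classes)
cm = confusion_matrix(est, true, classes)   # [[1, 1], [0, 2]]
```

Reading the matrix row-wise answers "where do this label's items truly belong?", which is the sense in which it acts as a mapping table from labels to verification classes.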
Regarding claim 5, Labatut, as evidenced by Ferri, discloses the method according to claim 4, wherein the label-verification class mapping table defines a mapping relationship from the first label set to the verification class set (the confusion matrix as discussed above in claim 4 shows the relationship from the estimated classes [the first label set] to the true classes, or the verification class set as discussed above in claim 1). Regarding claim 6, Labatut, as evidenced by Ferri, discloses the method according to claim 4, wherein the calculating the first performance score further includes calculating the first performance score standardized in terms of a verification class based on the converted output value set and the plurality of correct values (section 4, 1st paragraph, discloses the measures are calculated from the confusion matrices, therefore analogous to the claimed "first performance score standardized in terms of the verification class," wherein if the transposed matrix is used as disclosed in section 3, 3rd paragraph, then the measure is calculated based on the transposed matrix, i.e., the performance score is calculated based on the converted output value set together with the correct values, which, by BRI, covers the claimed limitation; section 7.3, 2nd paragraph, discloses the dataset can be fixed bound or normalized; in the case where it is normalized, it can be understood to be standardized in terms of the dataset [including the classes, or the verification class as claimed, by BRI] for the calculation of the measures [the scores as claimed]; therefore, it's analogous to the scores being standardized according to the dataset, including different classes, or in terms of a verification class as claimed, by BRI).
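One plausible way to picture the "fixed bound or normalized" measures the examiner equates with scores "standardized in terms of a verification class": compute a per-class measure from the confusion matrix, then rescale it to a fixed [0, 1] bound so scores over different class sets are comparable. The formulas below are generic sketches under that assumption, not Labatut's exact measures.

```python
# Hypothetical sketch: a per-class measure from a confusion matrix,
# plus min-max normalization onto a fixed [0, 1] bound.

def per_class_recall(cm, idx):
    """True-positive rate for one class (rows = true classes,
    columns = estimated classes)."""
    row_total = sum(cm[idx])
    return cm[idx][idx] / row_total if row_total else 0.0

def normalize(value, lo, hi):
    """Rescale a measure with known bounds [lo, hi] onto [0, 1]."""
    return (value - lo) / (hi - lo)

cm = [[8, 2],
      [1, 9]]
score = per_class_recall(cm, 0)   # 8 / 10 = 0.8, already in [0, 1]
```

A measure with bounds other than [0, 1] (e.g., a kappa-like statistic on [-1, 1]) would pass through `normalize` before the scores of two models are compared.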
Regarding claim 7, Labatut, as evidenced by Ferri, discloses the method according to claim 4, wherein the generating the label-verification class mapping table includes: calculating a performance score of each of first labels in the first label set for a first verification class (section 4.7, 2nd paragraph, discloses calculating a class-specific measure based on the specific class's confusion matrix; in this case the class-specific measure is the performance score as claimed, which is calculated for each class estimated by the classifier for that class [for a first verification class as claimed, by BRI]); mapping a first label having a highest performance score for the first verification class to the first verification class (section 7.4, 1st paragraph, discloses that comparing classifiers can be done through numerical values, which are enough to assess which classifier is best based on measured accuracies, which are based on a discrimination plot [according to section 8, 2nd paragraph] such as shown in FIG.
1, and, per section 6.2, 1st paragraph, for each class the highest measure is shown for the best-performing classifier; therefore, it can be understood that this classifier has its estimated class being correct, i.e., the same as the true class, so the most accurate or highest performance score is associated with the classifier with the most accurate estimation, analogous to the recited mapping of the first label to the first verification class as claimed); calculating a performance score of each of the first labels in the first label set for a second verification class (section 5.2, 1st paragraph, discloses 3 classes being used for the method; therefore, any of the classes can be understood to be the first verification class and any of the others to be the second verification class as claimed, by BRI); and mapping a second label having a highest performance score for the second verification class to the second verification class (for the other of the three classes, the same step can be understood to happen, wherein the label with the highest performance score, i.e., the highest measure, for the second verification class is mapped to the second verification class as claimed, by BRI).
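The claim 7 mapping step — for each verification class, score every label and map the label with the highest score to that class — reduces to an argmax per class. A minimal sketch, with made-up labels, classes, and scores:

```python
# Illustrative sketch of the claimed label-to-verification-class
# mapping: each verification class receives the label that achieves
# the highest performance score for it. All values are hypothetical.

def build_mapping(scores):
    """scores[verification_class][label] -> performance score.
    Returns {verification_class: label with the highest score}."""
    return {
        vclass: max(label_scores, key=label_scores.get)
        for vclass, label_scores in scores.items()
    }

scores = {
    "animal":  {"cat": 0.9, "vehicle": 0.1},
    "machine": {"cat": 0.2, "vehicle": 0.8},
}
mapping = build_mapping(scores)   # {"animal": "cat", "machine": "vehicle"}
```

Under this reading, the "label-verification class mapping table" is just the resulting dictionary, one entry per verification class.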
Regarding claim 8, Labatut, as evidenced by Ferri, discloses the method according to claim 7, wherein the calculating the performance score of each of the first labels in the first label set for the first verification class includes calculating a performance score of the first label for the first verification class by comparing correct values associated with the first verification class with output values in the output value set which are associated with the first label (the discrimination plot as discussed above in claim 7 is used to compare the behaviors of the classifiers [section 7.4, 1st paragraph] based on the confusion matrix; therefore, the correct values associated with the first class are compared with the estimated class, which is analogous to the claimed limitation, by BRI). Regarding claim 9, Labatut, as evidenced by Ferri, discloses the method according to claim 4, wherein the generating the label-verification class mapping table includes: calculating a performance score of each of first labels in the first label set for a first verification class (the measures as discussed above are computed for each of the first labels, i.e., the output of the classifier as discussed above, for a first verification class, i.e., the true class, as disclosed in section 3, wherein for each class a measure is computed from the confusion matrix, which represents the instances distributed over estimated and true classes for expressing accuracy measures); mapping labels having performance scores equal to or greater than a threshold value for the first verification class to the first verification class (section 5.2 discloses table 7 showing the accuracy values associated with the confusion matrix of table 6, wherein the different classes are mapped according to the classifiers; section 5.2, 3rd paragraph, discloses the low TPR indicates that the classifier has trouble recognizing all instances and only 56% are correctly
classified; therefore, it can be understood that the accuracy score indicates how well the classifier performs, and hence, for it to be recognized as a high-performance classifier, its accuracy measure has to be higher than a certain number [greater than or equal to a threshold as claimed, by BRI], which is analogous to the claim's limitation, by BRI); calculating a performance score of each of the first labels in the first label set for a second verification class (the same step is repeated for any of the other classifiers with the second verification class; the measures as discussed above are computed for each of the first labels, i.e., the output of the classifier, for a verification class, i.e., the true class, as disclosed in section 3, wherein for each class a measure is computed from the confusion matrix, which represents the instances distributed over estimated and true classes for expressing accuracy measures); and mapping labels having performance scores equal to or greater than a threshold value for the second verification class to the second verification class (the same step as disclosed in section 5.2 can be understood to happen for the second verification class, which, by BRI, covers the scope of the claim; any of the classes in table 7 can be understood to be the second verification class, with the accuracy measure having to be greater than a certain value for the classifier to be recognized as high-performance at that class).
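Claim 9's variant differs from claim 7 in that every label whose score for a verification class meets a threshold is mapped to that class, so a class can receive several labels (or none). A hypothetical sketch of that reading:

```python
# Illustrative sketch of the threshold-based mapping in claim 9:
# labels scoring at or above a threshold for a verification class
# are all mapped to that class. Names and scores are made up.

def threshold_mapping(scores, threshold):
    """scores[verification_class][label] -> performance score.
    Returns {verification_class: [labels with score >= threshold]}."""
    return {
        vclass: sorted(l for l, s in label_scores.items() if s >= threshold)
        for vclass, label_scores in scores.items()
    }

scores = {
    "animal": {"cat": 0.9, "dog": 0.7, "car": 0.1},
}
mapped = threshold_mapping(scores, 0.56)   # {"animal": ["cat", "dog"]}
```

The 0.56 threshold here merely echoes the 56% TPR figure the examiner cites; any cutoff would do, and labels below it are left unmapped (the claim 10 situation).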
Regarding claim 10, Labatut, as evidenced by Ferri, discloses the method according to claim 4, wherein a label in the first label set which is not mapped to the verification class is excluded when calculating the first performance score (section 4.6, 1st paragraph, discloses that computing the measure [calculating the first performance score as discussed above] includes removing the chance agreement of the classes from the observed ones for the computing of the measure; therefore, it can be understood as the first label set which is not mapped to the verification class as claimed, by BRI, since the observed agreements and the chance agreement are based on the classes of the classification of the classifiers). Regarding claim 11, Labatut, as evidenced by Ferri, discloses the method according to claim 4, wherein a label in the first label set which is not mapped to the verification class is mapped to a specific verification class having the highest performance score (as discussed above, section 4.1, 2nd paragraph, discloses that for an association measure, a high association with the maximum association value does not always indicate that the estimated and true classes match; therefore, this covers the instances wherein the estimated and the true classes do not match [the first label set and the verification class are not mapped to each other], and the label is then mapped to the perfect classification having the maximal association value [highest performance score as claimed, by BRI], the perfect classification here being the "specific verification class" as claimed, by BRI).
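The "removing the chance agreement" language the examiner relies on for claim 10 corresponds to kappa-style agreement measures, where observed agreement is corrected by the agreement expected by chance. Assuming that reading, a generic one-line sketch (not Labatut's specific formula):

```python
# Hypothetical sketch of a chance-corrected agreement measure:
# kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed
# agreement and p_e the agreement expected by chance.

def chance_corrected(observed, expected):
    """Cohen's-kappa-style correction of observed agreement."""
    return (observed - expected) / (1.0 - expected)

# e.g., 80% observed agreement where 50% would occur by chance:
kappa = chance_corrected(0.8, 0.5)   # 0.3 / 0.5 = 0.6
```

The correction leaves a score of 0 for a classifier no better than chance and 1 for perfect classification, which is the "fixed bound" behavior discussed elsewhere in the action.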
Regarding claim 12, Labatut, as evidenced by Ferri, discloses the method according to claim 1, further comprising, based on the first performance score and the second performance score, which are performance scores standardized in terms of verification classes (section 7.3, 2nd paragraph, discloses the dataset can be fixed bound or normalized; in the case where it is normalized, it can be understood to be standardized in terms of the dataset [including the classes, or the verification class as claimed, by BRI] for the calculation of the measures [the scores as claimed]; therefore, it's analogous to the scores being standardized according to the dataset, including different classes, or in terms of a verification class as claimed, by BRI; which is done for different classes [verification classes] according to section 6.2, 1st paragraph), quantitatively evaluating a difference in performance between the first image tagging model and the second image tagging model (section 7.4, 1st paragraph, discloses numerically evaluating the difference in performance between the classifiers). Regarding claim 13, Labatut, as evidenced by Ferri, discloses the method according to claim 1, wherein the first image tagging model and the second image tagging model are trained using different training data (the classifiers are trained using different training data [section 1, 3rd to the last paragraph]).
Regarding claim 14, Labatut, as evidenced by Ferri, discloses the method according to claim 1, wherein the first image tagging model is associated with a first label set (as discussed above in claim 1, the output of any of the classifiers can be understood to be the first label set associated with the first image tagging model as claimed, by BRI), the second image tagging model is associated with a second label set (the output of any of the other classifiers is analogous to the second label set associated with the second image tagging model as claimed, by BRI), and the verification class set, the first label set, and the second label set are different from each other (therefore, the first label set, the second label set, and the verification class set are different from each other, since the verification class set is the correct value set as discussed previously above). Regarding claim 15, Labatut, as evidenced by Ferri, discloses the method according to claim 14, wherein the first label set and the second label set include different numbers of labels from each other (as discussed above, the outputs of the first classifier and the second classifier would include different outputs or different numerical results; hence, by BRI, they can be understood to have different results or numbers classified under different classes, which covers the scope of the claim wherein they have different numbers of labels from each other, by BRI).
Regarding claim 16, Labatut, as evidenced by Ferri, discloses the method according to claim 1, wherein the calculating the first performance score includes: determining a first label set associated with the first image tagging model (as discussed above in claim 1, the output of any of the classifiers can be understood to be the first label set associated with the first image tagging model as claimed, by BRI); generating a label-verification class mapping table (section 3, table 1, shows a confusion matrix, which is analogous to the recited label-verification class mapping table since they are both based on the correct values and the output value set [the estimated classes]) defining a mapping relationship from the first label set to the verification data set (the confusion matrix as discussed above in claim 4 shows the relationship from the estimated classes [the first label set] to the true classes, or the verification class set as discussed above in claim 1); and converting an output of the first image tagging model using the label-verification class mapping table (section 3, 3rd paragraph, discloses inverting the estimated and true classes to result in a transposed matrix, where the inverting is analogous to the recited "converting" of the labels into the transposed matrix, which results in verification classes based on the original confusion matrix [analogous to using the label-verification class mapping table as claimed, by BRI]).
Regarding claim 17, Labatut, as evidenced by Ferri, discloses the method according to claim 16, wherein the first performance score and the second performance score are scores standardized in terms of a verification class (section 7.3, 2nd paragraph, discloses the dataset can be fixed bound or normalized; in the case where it is normalized, it can be understood to be standardized in terms of the dataset [including the classes, or the verification class as claimed, by BRI] for the calculation of the measures [the scores as claimed]; therefore, it's analogous to the scores being standardized according to the dataset, including different classes, or in terms of a verification class as claimed, by BRI). Regarding claim 18, Labatut, as evidenced by Ferri, discloses the method according to claim 16, wherein the second image tagging model is associated with a second label set (as discussed above in claim 1, the output of any of the classifiers can be understood to be the first label set associated with the first image tagging model as claimed, by BRI; likewise, the output of any of the other classifiers is analogous to the second label set associated with the second image tagging model as claimed, by BRI), and the verification class set, the first label set, and the second label set are different from each other (therefore, the first label set, the second label set, and the verification class set are different from each other, since the verification class set is the correct value set as discussed previously above).
Regarding claim 19, Labatut, as evidenced by Ferri, discloses a non-transitory computer-readable recording medium storing instructions that, when executed by one or more processors, cause the one or more processors to perform the method according to claim 1 (the method of claim 1 as discussed above; moreover, this invention is a computer method for the classification task for the network and hence can be understood to use a computer, which includes a non-transitory computer-readable recording medium such as a RAM or ROM storing instructions to be executed by a processor). Regarding claim 20, Labatut discloses an information processing system, comprising: a memory; and one or more processors connected to the memory and configured to execute one or more computer-readable programs stored in the memory, wherein the one or more programs include instructions for (this invention is a computer method for the classification task for the network and hence can be understood to use a computer, which includes a non-transitory computer-readable recording medium such as a RAM or ROM storing instructions to be executed by a processor): receiving a verification data set including a plurality of verification images and a plurality of correct values associated with the plurality of verification images (section 7.4, 1st paragraph, discloses the goal of the paper is to compare classifiers on a given dataset; hence, the same dataset is used for the classifiers to then assess the classifiers and identify the best classifier; therefore, this dataset, used for assessment of the classifiers, is analogous to the recited "verification data set" as claimed, wherein the dataset includes true classes and test data, with the true data used for constructing the confusion matrix as disclosed in section 3, which is analogous to the plurality of verification images and a set of correct values [the true classes]), wherein each of the correct values is associated
with at least one verification class of a verification class set (the true classes are discussed previously to be the recited correct values; moreover, it can be understood that the true classes are associated with the dataset used for the classifiers, previously discussed to be the verification class set, and each of the true classes corresponds to each of the classes in the dataset; hence, by BRI, this covers the scope of the claimed limitation); receiving a first image tagging model and a second image tagging model (any of the classifiers can be understood as the first image tagging model and any of the others as the second image tagging model as claimed, by BRI); inputting the plurality of verification images to the first image tagging model and the second image tagging model and outputting first output values from the first image tagging model and second output values from the second image tagging model (the classifiers, as discussed previously, are used with the data from the datasets to test the performance of the classifiers; hence, by BRI, it can be understood that the classifiers produce output; therefore, any of the classifiers can be understood to be the recited first image tagging model, with its output analogous to the recited first output values as claimed, and any of the other classifiers is analogous to the second image tagging model, with its output analogous to the second output values as claimed, which, by BRI, covers the scope of the claimed limitation); generating a label-verification class mapping table based on the plurality of correct values, the first output values, and the second output values (section 3, table 1, shows a confusion matrix, which is analogous to the recited label-verification class mapping table since they are both based on the correct values and the output value set [the estimated classes], including the first output values and the second output values; since the classifiers as discussed in claim 2 are
used to assess and evaluate their performances, each classifier being used on the dataset to classify the data and output classes, this is analogous to the images being input into the first image tagging model to generate an output value set as claimed [the output of the classifier], which includes the output values as discussed), the label-verification class mapping table defining a mapping relationship from a first label set associated with the first image tagging model and a second label set associated with the second image tagging model to the verification class set (the confusion matrix as discussed above, section 3, table 1, shows the relationship from the estimated classes [the first label set] to the true classes, or the verification class set as discussed above, and the same applies to the second label set, which, respectively, are associated with the first tagging model and the second tagging model as discussed previously); converting, using the label-verification class mapping table, the first output values into converted first output values associated with the verification class set and converting the second output values into converted second output values associated with the verification class set (section 3, 3rd paragraph, discloses inverting the estimated and true classes to result in a transposed matrix, where the inverting is analogous to the recited "converting" of the labels into the transposed matrix, which results in verification classes based on the original confusion matrix [analogous to using the label-verification class mapping table as claimed, by BRI]; section 4, 1st paragraph, discloses the measures are calculated from the confusion matrices, therefore analogous to the claimed "first performance score standardized in terms of the verification class," wherein if the transposed matrix is used as disclosed in section 3, 3rd paragraph, then the measure is calculated based on the transposed
matrix, i.e., the performance score is calculated based on the converted output value set together with the correct values, which, by BRI, covers the claimed limitation; section 7.3, 2nd paragraph, discloses the dataset can be fixed bound or normalized; in the case where it is normalized, it can be understood to be standardized in terms of the dataset [including the classes, or the verification class as claimed, by BRI] for the calculation of the measures [the scores as claimed]; therefore, it's analogous to the scores being standardized according to the dataset, including different classes, or in terms of a verification class as claimed, by BRI); comparing the first output values from the first image tagging model with the correct values for each of the plurality of verification images (the discrimination plot as discussed above in claim 7 is used to compare the behaviors of the classifiers [section 7.4, 1st paragraph] based on the confusion matrix; therefore, the correct values associated with the first class are compared with the estimated class, which is analogous to the claimed limitation, by BRI, of the first output values of a first image tagging model and the correct values being compared as claimed) and calculating a first performance score for the first image tagging model (section 6, 1st paragraph, and section 6.1, 1st paragraph, disclose computing measures for each of the classifiers to evaluate their performances; therefore, the measure can be understood as the performance score, and the measure for the first classifier [as discussed previously, any of the classifiers] is analogous to the first performance score as claimed, by BRI); comparing the second output values from the second image tagging model with the correct values for each of the plurality of verification images (the discrimination plot as discussed above in claim 7 is used to compare the behaviors of the classifiers [section 7.4, 1st paragraph] based on
the confusion matrix; therefore, the correct values associated with the first class are compared with the estimated class, which is analogous to the claimed limitation, by BRI) and calculating a second performance score for the second image tagging model using the verification data set (the measure for the second classifier [as discussed previously, any of the other classifiers] is analogous to the second performance score as claimed, by BRI); and evaluating a difference in performance between the first image tagging model and the second image tagging model (section 4, first two paragraphs, disclose comparing the classifiers based on several measures to determine their performances; therefore, the comparing is analogous to the recited comparing, and taking the measures is analogous to the evaluating to find the difference in performance of the classifiers, hence, by BRI, covers the scope of the claimed limitation; the correct classes [as discussed above] are associated with each class of the dataset, which is analogous to being associated with at least one verification class of a verification class set as claimed, by BRI), by quantitatively comparing the first performance score and the second performance score (section 4 discloses that the comparing as discussed is based on comparing the values, i.e., the scores [quantitatively], of the first and the second performances as discussed previously), using the first image tagging model when the first performance score is greater than the second performance score; and using the second image tagging model when the second performance score is greater than the first performance score (in comparing the performance of the models, it can be understood that when one model performs better than the other it is preferably used; therefore, when the performance score of either model is greater than a certain threshold, that model is preferably used over the other, as
discussed in section 7.1), wherein the first image tagging model is associated with a first label set (Labatut's classifier output is analogous to the label set; therefore, any of the classifiers can be understood to be the first tagging model and its output to be the first label set), wherein the second image tagging model is associated with a second label set (any of Labatut's other classifiers is analogous to the second tagging model, with its output being the recited second label set, by BRI), and wherein the verification class set, the first label set, and the second label set are different from each other (the input, i.e., the dataset used for the classifiers [as discussed previously, the verification class set], and each of the classifiers' outputs are different from each other, which, by BRI, covers the recited feature that they are different from each other). However, Labatut does not explicitly disclose that the tagging models are image tagging models and that the verification data set includes a plurality of verification images. In the same field of performance measures for classification (title, Ferri), Ferri discloses that the tagging models are image tagging models and that the verification data set includes a plurality of verification images (table 2 shows the datasets used for classification include, for example, segmentation, which includes images for the classification models analogous to the models of Labatut; therefore, it can be understood that Labatut's classification models cover image classification models and that their dataset includes images). Pertinent Prior Art(s) The prior art made of record and not relied upon is considered pertinent to applicant's disclosure: Cheng Ma et al. ("US 11983742 B1") discloses training one or more machine learning models to predict association data (abstract) to determine which model performs better (column 11, 2nd par.)
based on different training data sets (column 12, 3rd-to-last par.) and based on conversion predictions and conversion outcomes of the attribute conversions (column 18, last two paragraphs).

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to PHUONG HAU CAI, whose telephone number is (571) 272-9424. The examiner can normally be reached M-F 8:30 am - 5:00 pm. Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Chineyere Wills-Burns, can be reached at (571) 272-9752. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/PHUONG HAU CAI/
Examiner, Art Unit 2673

/CHINEYERE WILLS-BURNS/
Supervisory Patent Examiner, Art Unit 2673
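The claim limitation mapped above reduces to a simple selection procedure: compute a performance score for each of two tagging models on a shared verification set, compare the scores quantitatively, and use whichever model scores higher. The following is a minimal sketch of that flow, assuming overall accuracy derived from a confusion matrix as the performance score (the claim does not fix a particular metric, and all names and numbers here are illustrative, not from the application).

```python
def accuracy_from_confusion(confusion):
    """Performance score as overall accuracy: correct predictions
    (the diagonal of the confusion matrix) over all predictions."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total

def select_model(model_a, model_b, score_a, score_b):
    """Use the model whose verification score is greater."""
    return model_a if score_a > score_b else model_b

# Hypothetical 3-class confusion matrices from the same verification
# image set: rows = true verification class, columns = predicted label.
conf_a = [[40, 5, 5], [4, 42, 4], [6, 4, 40]]
conf_b = [[38, 6, 6], [5, 40, 5], [7, 5, 38]]

score_a = accuracy_from_confusion(conf_a)  # 122/150
score_b = accuracy_from_confusion(conf_b)  # 116/150
chosen = select_model("model_a", "model_b", score_a, score_b)  # "model_a"
```

Note that each model may emit its own label set while the verification images carry a separate verification class set; only the per-model scores, computed against the same verification data, need to be comparable.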

Prosecution Timeline

Feb 06, 2023
Application Filed
Apr 18, 2025
Non-Final Rejection — §101, §102
Jul 17, 2025
Response Filed
Oct 05, 2025
Final Rejection — §101, §102
Jan 06, 2026
Request for Continued Examination
Jan 22, 2026
Response after Non-Final Action
Feb 06, 2026
Non-Final Rejection — §101, §102 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602833
IMAGE ANALYSIS DEVICE AND IMAGE ANALYSIS METHOD
2y 5m to grant Granted Apr 14, 2026
Patent 12602940
SINGLE CELL IDENTIFICATION FOR CELL SORTING
2y 5m to grant Granted Apr 14, 2026
Patent 12597223
INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, AND COMPUTER PROGRAM
2y 5m to grant Granted Apr 07, 2026
Patent 12592064
METHOD AND APPARATUS FOR TRAINING TARGET DETECTION MODEL, METHOD AND APPARATUS FOR DETECTING TARGET
2y 5m to grant Granted Mar 31, 2026
Patent 12591616
METHOD, SYSTEM AND NON-TRANSITORY COMPUTER-READABLE RECORDING MEDIUM FOR SEARCHING SIMILAR PRODUCTS USING A MULTI TASK LEARNING MODEL
2y 5m to grant Granted Mar 31, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

3-4
Expected OA Rounds
81%
Grant Probability
99%
With Interview (+20.9%)
3y 0m
Median Time to Grant
High
PTA Risk
Based on 107 resolved cases by this examiner. Grant probability derived from career allow rate.
