DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. Claims 1 and 4-14 are pending and have been examined.
Amendments
This Office Action is in response to the amendment filed on August 13, 2025.
Claims 1, 4-11, 13, and 14 have been amended.
Claims 2 and 3 have been cancelled.
No new claims have been added.
The objections and rejections from the prior correspondence that are not restated herein are withdrawn.
Response to Arguments
Applicant's arguments filed on August 13, 2025 have been fully considered.
Applicant’s arguments regarding the claims that were interpreted as invoking 35 U.S.C. 112(f) in the previous Office action have been fully considered and are persuasive. Accordingly, the claims, as amended, are definite and do not invoke an interpretation under 35 U.S.C. 112(f).
Applicant’s arguments regarding the 35 U.S.C. 112(b) rejections of the previous Office action have been fully considered and are persuasive. Accordingly, the rejections under 35 U.S.C. 112(b) are withdrawn.
Applicant’s arguments regarding the 35 U.S.C. 101 rejections of the previous Office action have been fully considered but are not persuasive. Applicant argues that the claims now require a display device to display an evaluation result screen including a first index value and a second index value, the evaluation result screen further including a first accepter configured to accept a user's designation of at least one of the first index value and the second index value which the user desires to raise at a time of re-training; determining a training policy of the training model to raise the index value accepted using the first accepter; and outputting an instruction for performing training based on the determined training policy, thereby improving the functioning of the overall model lifecycle management process in a manner that is rooted in computer technology.
Examiner respectfully disagrees. A display device to display an evaluation result screen with the calculated first index value and second index value is considered insignificant extra-solution activity of presenting information on a screen. The evaluation result screen including a first accepter configured to accept a user’s designation of at least one of the first index value and the second index value which the user desires to raise at a time of re-training amounts to mere instructions to apply an exception. Determining a training policy of the training model to raise the index value accepted using the first accepter is a mental process that can be practically performed in the human mind. Outputting an instruction for performing training is considered insignificant extra-solution activity to the judicial exception. Further, the additional elements, individually or in combination, do not integrate the abstract idea into a practical application and do not amount to significantly more than the judicial exception, as shown in the 101 rejections below.
Applicant’s arguments regarding the 35 U.S.C. 102 and 103 rejections of the previous Office action have been fully considered but are not persuasive. Applicant argues that the references of Guelman, Dasgupta, Janson, Ludwig, Hsieh, and Chen, either individually or in combination, fail to disclose or suggest at least the following features recited in amended independent claims 1, 13, and 14:
(a) calculate a first index value representing the functional quality;
(b) calculate at least one second index value representing the non-functional quality;
(c) cause the display device to display the evaluation result screen including the first index value and the second index value, the evaluation result screen further including a first accepter configured to accept user's designation of at least one of the first index value and the second index value which the user desires to raise at a time of re-training;
(d) determine a training policy of the training model to raise the index value accepted using the first accepter at the time of re-training; and
(e) output an instruction for performing training based on the determined training policy.
Examiner respectfully disagrees. The combined references of Guelman and Dasgupta teach or suggest all features (a)-(e) mentioned above as shown in the rejections below for claims 1, 13, and 14.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1 and 4-14 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1: Claims 1, 4-12 and 14 are directed to a machine or an article of manufacture. Claim 13 is directed to a process.
With respect to claims 1, 13, and 14:
2A Prong 1: The claims recite an abstract idea. Specifically:
evaluate a functional quality of the training model based on output data acquired by inputting the evaluation data to the training model; (Mental process – evaluating a functional quality can be practically performed in the human mind, or by a human using a pen and paper as a physical aid – see MPEP § 2106.04(a)(2)(III))
evaluate a non-functional quality of the training model based on the output data; (Mental process – evaluating a non-functional quality can be practically performed in the human mind, or by a human using a pen and paper as a physical aid – see MPEP § 2106.04(a)(2)(III))
calculate a first index value representing the functional quality, (Mathematical concept – calculating a first index value involves mathematical calculations – see MPEP § 2106.04(a)(2)(I))
calculate at least one second index value representing the non-functional quality, (Mathematical concept – calculating a second index value involves mathematical calculations – see MPEP § 2106.04(a)(2)(I))
determine a training policy of the training model to raise the index value accepted […] (Mental process – determining a training policy of the training model can be practically performed in the human mind, or by a human using a pen and paper as a physical aid – see MPEP § 2106.04(a)(2)(III))
2A Prong 2: The additional elements recited in the claims do not integrate the abstract idea into a practical application, individually or in combination.
Additional elements:
(Claim 1) An evaluation device comprising a central processing unit (CPU) (Mere recitation of a generic computer component – see MPEP § 2106.05(b)(I))
(Claim 14) A non-transitory computer-readable storage medium storing a program (Mere recitation of a generic computer component – see MPEP § 2106.05(b)(I))
acquire a training model that is an evaluation target and evaluation data; (Mere data gathering – see MPEP § 2106.05(g).)
[…] output data acquired by inputting the evaluation data to the training model; (Adding insignificant extra-solution activity to the judicial exception – see MPEP § 2106.05(g).)
output an evaluation result screen including a first evaluation result of the functional quality and a second evaluation result of the non-functional quality cause a display device to display the evaluation result screen; (Adding insignificant extra-solution activity to the judicial exception – see MPEP § 2106.05(g).)
cause the display device to display the evaluation result screen including the first index value and the second index value, (Adding insignificant extra-solution activity to the judicial exception – see MPEP § 2106.05(g).)
the evaluation result screen further including a first accepter configured to accept user's designation of at least one of the first index value and the second index value which the user desires to raise at a time of re-training; (Adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea – see MPEP 2106.05(f).)
[…] using the first accepter at the time of re-training; (Adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea – see MPEP 2106.05(f).)
output an instruction for performing training based on the determined training policy. (Adding insignificant extra-solution activity to the judicial exception – see MPEP § 2106.05(g).)
2B: The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception.
Additional elements:
(Claim 1) An evaluation device comprising a central processing unit (CPU) (Mere recitation of a generic computer component – see MPEP § 2106.05(b)(I))
(Claim 14) A non-transitory computer-readable storage medium storing a program (Mere recitation of a generic computer component – see MPEP § 2106.05(b)(I))
acquire a training model that is an evaluation target and evaluation data; (Simply appending well-understood, routine, conventional activities previously known to the industry, specified at a high level of generality, to the judicial exception (WURC) – see MPEP § 2106.05(d)(II)(i) – Receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec, 838 F.3d at 1321, 120 USPQ2d at 1362 (utilizing an intermediary computer to forward information).)
[…] output data acquired by inputting the evaluation data to the training model; (Simply appending well-understood, routine, conventional activities previously known to the industry, specified at a high level of generality, to the judicial exception (WURC) – see MPEP § 2106.05(d)(II)(i) – Receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec, 838 F.3d at 1321, 120 USPQ2d at 1362 (utilizing an intermediary computer to forward information).)
output an evaluation result screen including a first evaluation result of the functional quality and a second evaluation result of the non-functional quality cause a display device to display the evaluation result screen; (Simply appending well-understood, routine, conventional activities previously known to the industry, specified at a high level of generality, to the judicial exception (WURC) – see MPEP § 2106.05(d)(II)(i) – Receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec, 838 F.3d at 1321, 120 USPQ2d at 1362 (utilizing an intermediary computer to forward information).)
cause the display device to display the evaluation result screen including the first index value and the second index value, (Simply appending well-understood, routine, conventional activities previously known to the industry, specified at a high level of generality, to the judicial exception (WURC) – see MPEP § 2106.05(d)(II) – Presenting offers and gathering statistics, OIP Techs., 788 F.3d at 1362-63, 115 USPQ2d at 1092-93.)
the evaluation result screen further including a first accepter configured to accept user's designation of at least one of the first index value and the second index value which the user desires to raise at a time of re-training; (Adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea – see MPEP 2106.05(f).)
[…] using the first accepter at the time of re-training; (Adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea – see MPEP 2106.05(f).)
output an instruction for performing training based on the determined training policy. (Simply appending well-understood, routine, conventional activities previously known to the industry, specified at a high level of generality, to the judicial exception (WURC) – see MPEP § 2106.05(d)(II)(i) – Receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec, 838 F.3d at 1321, 120 USPQ2d at 1362 (utilizing an intermediary computer to forward information).)
Therefore, the claim is ineligible.
With respect to claim 4:
2A Prong 2: The additional elements recited in the claim do not integrate the abstract idea into a practical application, individually or in combination.
Additional elements:
wherein the first accepter does not accept the user's designation of an index value that is unable to improve quality through a training process. (Adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea - see MPEP 2106.05(f).)
2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
Additional elements:
wherein the first accepter does not accept the user's designation of an index value that is unable to improve quality through a training process. (Adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea - see MPEP 2106.05(f).)
Therefore, the claim is ineligible.
With respect to claim 5:
2A Prong 1: The claim recites an abstract idea. Specifically:
evaluate a functional quality of each of the plurality of training models based on a plurality of pieces of output data acquired by inputting the evaluation data to the plurality of training models, (Mental process – evaluating a functional quality is a judgment/observation that can be practically performed in the human mind, or by a human using a pen and paper as a physical aid – see MPEP § 2106.04(a)(2)(III))
evaluate a non-functional quality of each of the plurality of training models based on the plurality of pieces of output data, (Mental process – evaluating a non-functional quality is a judgment/observation that can be practically performed in the human mind, or by a human using a pen and paper as a physical aid – see MPEP § 2106.04(a)(2)(III))
2A Prong 2: The additional elements recited in the claim do not integrate the abstract idea into a practical application, individually or in combination.
Additional elements:
acquire a plurality of training models that are evaluation targets, (Mere data gathering – see MPEP § 2106.05(g).)
cause the display device to display the evaluation result screen in which evaluation results of the plurality of training models overlap each other. (Adding insignificant extra-solution activity to the judicial exception – see MPEP § 2106.05(g).)
2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
Additional elements:
acquire a plurality of training models that are evaluation targets, (Simply appending well-understood, routine, conventional activities previously known to the industry, specified at a high level of generality, to the judicial exception (WURC) – see MPEP § 2106.05(d)(II)(i) – Receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec, 838 F.3d at 1321, 120 USPQ2d at 1362 (utilizing an intermediary computer to forward information).)
cause the display device to display the evaluation result screen in which evaluation results of the plurality of training models overlap each other. (Simply appending well-understood, routine, conventional activities previously known to the industry, specified at a high level of generality, to the judicial exception (WURC) – see MPEP § 2106.05(d)(II) – Presenting offers and gathering statistics, OIP Techs., 788 F.3d at 1362-63, 115 USPQ2d at 1092-93.)
Therefore, the claim is ineligible.
With respect to claim 6:
2A Prong 2: The additional elements recited in the claim do not integrate the abstract idea into a practical application, individually or in combination.
Additional elements:
wherein the evaluation result screen includes a second accepter that accepts user's designation of a training model used for the operation among the plurality of training models, and (Adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea - see MPEP 2106.05(f).)
the CPU […] (Mere recitation of a generic computer component – see MPEP § 2106.05(b)(I))
[…] is further configured to output an instruction for performing an operation using the designated training model based on the designation of the training model that has been accepted by the second accepter. (Adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea - see MPEP 2106.05(f).)
2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
Additional elements:
wherein the evaluation result screen includes a second accepter that accepts user's designation of a training model used for the operation among the plurality of training models, and (Adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea - see MPEP 2106.05(f).)
the CPU […] (Mere recitation of a generic computer component – see MPEP § 2106.05(b)(I))
[…] is further configured to output an instruction for performing an operation using the designated training model based on the designation of the training model that has been accepted by the second accepter. (Adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea - see MPEP 2106.05(f).)
Therefore, the claim is ineligible.
With respect to claim 7:
2A Prong 1: The claim recites an abstract idea. Specifically:
[…] generate first augmented data by adding, to the evaluation data, first noise that is perceivable for persons (Mental process/Mathematical concept – generating augmented data can be practically performed in the human mind, or by a human using a pen and paper as a physical aid – see MPEP § 2106.04(a)(2)(III). Additionally, adding […] first noise involves mathematical calculations – see MPEP § 2106.04(a)(2)(I))
and evaluates resistance to the first noise based on output data acquired by inputting the first augmented data to the training model. (Mental process – evaluating resistance to noise is an observation that can be practically performed in the human mind, or by a human using a pen and paper as a physical aid – see MPEP § 2106.04(a)(2)(III))
Therefore, the claim is ineligible.
With respect to claim 8:
2A Prong 1: The claim recites an abstract idea. Specifically:
[…] generates second augmented data by adding, to the evaluation data, second noise that is unperceivable for persons (Mental process/Mathematical concept – generating augmented data can be practically performed in the human mind, or by a human using a pen and paper as a physical aid – see MPEP § 2106.04(a)(2)(III). Additionally, adding […] second noise involves mathematical calculations – see MPEP § 2106.04(a)(2)(I))
and evaluates resistance to the second noise based on output data acquired by inputting the second augmented data to the training model. (Mental process – evaluating resistance to noise is an observation that can be practically performed in the human mind, or by a human using a pen and paper as a physical aid – see MPEP § 2106.04(a)(2)(III))
Therefore, the claim is ineligible.
With respect to claim 9:
2A Prong 1: The claim recites an abstract idea. Specifically:
[…] convert the at least one second index value […] calculated based on the output data into the second index value represented using one axis. (Mathematical concept – converting an index value requires mathematical calculations – see MPEP § 2106.04(a)(2)(I))
2A Prong 2: The additional elements recited in the claim do not integrate the abstract idea into a practical application, individually or in combination.
Additional elements:
[…] which is represented using multiple axes […] (Generally linking the use of a judicial exception to a particular technological environment or field of use, as discussed in MPEP § 2106.05(h).)
2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
Additional elements:
[…] which is represented using multiple axes […] (Generally linking the use of a judicial exception to a particular technological environment or field of use, as discussed in MPEP § 2106.05(h).)
Therefore, the claim is ineligible.
With respect to claim 10:
2A Prong 2: The additional elements recited in the claim do not integrate the abstract idea into a practical application, individually or in combination.
Additional elements:
wherein the CPU is further configured to cause the display device to display the evaluation result screen that displays an evaluation result using first evaluation data used at the time of training the training model and an evaluation result using second evaluation data, which is different from the first evaluation data, prepared at the time of a comparative evaluation of the training model in a comparable manner. (Adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea - see MPEP 2106.05(f).)
2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
Additional elements:
wherein the CPU is further configured to cause the display device to display the evaluation result screen that displays an evaluation result using first evaluation data used at the time of training the training model and an evaluation result using second evaluation data, which is different from the first evaluation data, prepared at the time of a comparative evaluation of the training model in a comparable manner. (Adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea - see MPEP 2106.05(f).)
Therefore, the claim is ineligible.
With respect to claim 11:
2A Prong 2: The additional elements recited in the claim do not integrate the abstract idea into a practical application, individually or in combination.
Additional elements:
further comprising a notifier configured to perform notification for prompting re-training of the training model when an evaluation result of the training model is below a predetermined threshold. (Adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea - see MPEP 2106.05(f).)
2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
Additional elements:
further comprising a notifier configured to perform notification for prompting re-training of the training model when an evaluation result of the training model is below a predetermined threshold. (Adding the words “apply it” (or an equivalent) with the judicial exception, or mere instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea - see MPEP 2106.05(f).)
Therefore, the claim is ineligible.
With respect to claim 12:
2A Prong 2: The additional elements recited in the claim do not integrate the abstract idea into a practical application, individually or in combination.
Additional elements:
further comprising the display device. (Mere recitation of a generic computer component – see MPEP § 2106.05(b)(I))
2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
Additional elements:
further comprising the display device. (Mere recitation of a generic computer component – see MPEP § 2106.05(b)(I))
Therefore, the claim is ineligible.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1, 5-6, and 11-14 are rejected under 35 U.S.C. 103 as being unpatentable over Guelman (US 20190279109 A1) in view of Dasgupta (US 11537506 B1).
Regarding Claim 1, Guelman teaches:
An evaluation device comprising a central processing unit (CPU) configured to: (Guelman [0198] teaches: "The embodiments of the devices, systems and methods described herein may be implemented in a combination of both hardware and software. These embodiments may be implemented on programmable computers, each computer including at least one processor, a data storage system (including volatile memory or non-volatile memory or other data storage elements or a combination thereof), and at least one communication interface.")
acquire a training model that is an evaluation target and evaluation data; (Guelman [0043] teaches: "A platform 110 configured for model performance monitoring, receiving a one or more machine learning models 130 (e.g., stored in the form of one or more model data sets) through network 115 is provided." Examiner’s note: under BRI, “acquire a training model” can be interpreted as one or more machine learning models received for model performance monitoring.)
evaluate a functional quality of the training model based on output data acquired by inputting the evaluation data to the training model; (Guelman [0071] teaches: "Performance metrics unit 315 may be a comprehensive set of metrics to measure the accuracy of the models over time." Additionally, Guelman [0072] teaches: "In some embodiments, performance metrics unit 315 may be configured to output one or more metrics [...]". Furthermore, Guelman [0049] teaches: "As described above, MPM system 100 is configured to receive one or more model data sets representative of one or more machine learning models 130, and may receive additional data from external database(s) 120 through network 115." Moreover, Guelman [0027] teaches: "execute the one or more model data sets representative of the machine learning model to generate an output based on a set of input data". Examiner's note: under BRI, the functional quality of the training model can be interpreted as the accuracy of the models.)
evaluate a non-functional quality of the training model based on the output data; (Guelman [0065] teaches: "Population stability index unit 310 may be configured to assess the stability of the output of the machine learning model (hereinafter “model”) over time." Examiner's note: under BRI, a non-functional quality can be interpreted as the population stability index (PSI).)
output an evaluation result screen including a first evaluation result of the functional quality and a second evaluation result of the non-functional quality cause a display device to display the evaluation result screen; (Guelman [0077] teaches: "[...] MPM application unit 323 may be configured to process and display one or more output datasets from population stability index unit 310, feature analysis unit 312, performance matrix unit 315, and model calibration unit 317." Examiner’s note: under BRI, the first evaluation result of the functional quality can be interpreted as the metric output of the performance matrix unit 315 (also referred to as performance metrics unit 315 throughout Guelman) and the second evaluation result of the non-functional quality can be interpreted as the output of the population stability index unit 310.)
calculate a first index value representing the functional quality; (Guelman [0071] teaches: "Performance metrics unit 315 may be a comprehensive set of metrics to measure the accuracy of the models over time." Additionally, Guelman [0072] teaches: "In some embodiments, performance metrics unit 315 may be configured to output one or more metrics [...]". Furthermore, Guelman [0049] teaches: "As described above, MPM system 100 is configured to receive one or more model data sets representative of one or more machine learning models 130, and may receive additional data from external database(s) 120 through network 115." Moreover, Guelman [0027] teaches: "execute the one or more model data sets representative of the machine learning model to generate an output based on a set of input data". Examiner's note: page 5, column 20 of the present application recites: “The functional quality is an accuracy of a function and, for example, includes an accuracy of an output result of a training model (a correct answer rate of an inference result). Therefore, under BRI, “calculate a first index value representing the functional quality” can be interpreted as measuring the accuracy of the models.)
calculate at least one second index value representing the non-functional quality; (Guelman [0065] teaches: "Population stability index unit 310 may be configured to assess the stability of the output of the machine learning model (hereinafter “model”) over time." Examiner's note: under BRI, “calculate at least one second index value representing the non-functional quality” can be interpreted as assessing the stability of the output of the machine learning model.)
cause the display device to display the evaluation result screen including the first index value and the second index value, (Guelman [0077] teaches: "[...] MPM application unit 323 may be configured to process and display one or more output datasets from population stability index unit 310, feature analysis unit 312, performance matrix unit 315, and model calibration unit 317." Examiner’s note: under BRI, “the first index value” can be interpreted as the metric output of the performance matrix unit 315 (also referred to as performance metrics unit 315 throughout Guelman) and “the second index value” can be interpreted as the output of the population stability index unit 310.)
Guelman is not relied upon for teaching:
the evaluation result screen further including a first accepter configured to accept user's designation of at least one of the first index value and the second index value which the user desires to raise at a time of re-training;
determine a training policy of the training model to raise the index value accepted using the first accepter at the time of re-training; and
output an instruction for performing training based on the determined training policy.
However, the combination of Guelman and Dasgupta teaches: the evaluation result screen further including a first accepter configured to accept user's designation of at least one of the first index value and the second index value which the user desires to raise at a time of re-training; (Dasgupta [col. 4, lines 20-26] teaches: "In some embodiments, the MDE provides a model experimentation interface that allows users to configure and run model experiments, which performs a training run of a model and then tests the model to determine its performance. In some embodiments, the MDE provides a model diagnosis interface to present the model's performance metrics and allows users to visually diagnose the model's exhibited errors." Dasgupta [col. 25, lines 2-14] teaches: "In various embodiments, different types of performances metrics may be used, including for example, precision, accuracy, recall, F1-scores, and the like. In this example, two performance metrics are displayed in the graph: the model's precision and recall. In some embodiments, the graph may also display a performance goal 826, which may be a performance level specified for the sub-goal group as a whole. Different performance goal levels may be specified for different performance metrics. Accordingly, the user may quickly determine from the graph 820 how quickly the model development process is progressing towards its desired goal." Dasgupta [col. 25, lines 24-29] teaches: "Finally, a performance goal button 834 may allow the user to view, edit, or toggle in the graph the performance goal 826 of a group of experiments. In some embodiments, the performance goal may be a composite goal that is dependent on a combination of multiple performance metrics." Dasgupta [col. 21, lines 62-65] teaches: "In some embodiments of the MDE, user interface 700 may be used to define and configure a model experiment to run as one iteration of a model development process." Additionally, Dasgupta [col. 
22, lines 47-48] teaches: "the user interface 700 also includes in this example a training configuration section 730." Examiner's note: the "first accepter" can be interpreted as the performance goal button, which allows the user to view, edit, or toggle in the graph the performance goal of a group of experiments. Further, the "first index value" can be interpreted as Guelman's accuracy of the models, and the "second index value" can be interpreted as Guelman's stability of the output of the machine learning model. Furthermore, the "user's designation of at least one of the first index value and the second index value which the user desires to raise at a time of re-training" can be interpreted as the user's desired goal based on the model experiment configuration used to run iterations of the model development process.)
determine a training policy of the training model to raise the index value accepted using the first accepter at the time of re-training; (Dasgupta [col. 25, lines 2-14] teaches: "In various embodiments, different types of performances metrics may be used, including for example, precision, accuracy, recall, F1-scores, and the like. In this example, two performance metrics are displayed in the graph: the model's precision and recall. In some embodiments, the graph may also display a performance goal 826, which may be a performance level specified for the sub-goal group as a whole. Different performance goal levels may be specified for different performance metrics. Accordingly, the user may quickly determine from the graph 820 how quickly the model development process is progressing towards its desired goal." Dasgupta [col. 25, lines 24-29] teaches: "Finally, a performance goal button 834 may allow the user to view, edit, or toggle in the graph the performance goal 826 of a group of experiments. In some embodiments, the performance goal may be a composite goal that is dependent on a combination of multiple performance metrics." Dasgupta [col. 21, lines 62-65] teaches: "In some embodiments of the MDE, user interface 700 may be used to define and configure a model experiment to run as one iteration of a model development process." Additionally, Dasgupta [col. 22, lines 47-48] teaches: "the user interface 700 also includes in this example a training configuration section 730." Examiner's note: under BRI, determining a training policy can be interpreted as defining and configuring a model experiment by specifying performance goal levels using the performance goal button, thus determining a training policy for the training model. Furthermore, "to raise the index value" can be interpreted as the user's desired goal based on the model experiment configuration used to run iterations of the model development process.)
output an instruction for performing training based on the determined training policy. (Dasgupta [col. 22, line 64 through col. 23, line 5] teaches: "As shown, the user interface 700 also includes a simulation section 734. In this section, the user may specify that a simulation should be performed on the model, for example, by placing the model in an environment that is similar to a production environment, and providing production input data to the model to obtain performance results." Examiner's note: under BRI, the instruction output for performing training based on the determined training policy can be interpreted as the performing of the simulation according to the training configuration section 730.)
Accordingly, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention, having the teachings of Guelman and Dasgupta before them, to include Dasgupta's performance goal button into Guelman's monitoring system. One would have been motivated to make such a combination in order to "provide the following benefits: fast data annotation; quick iterations; Choice of algorithms; intuitive progress interface; customizable metrics; one click training and deployment." (Dasgupta [col. 4-5]).
Regarding Claim 5, Guelman in view of Dasgupta teaches the elements of claim 1 as outlined above. Guelman in view of Dasgupta also teaches:
The evaluation device according to claim 1, wherein the CPU is further configured to: (Guelman [0198] teaches: "The embodiments of the devices, systems and methods described herein may be implemented in a combination of both hardware and software. These embodiments may be implemented on programmable computers, each computer including at least one processor, a data storage system (including volatile memory or non-volatile memory or other data storage elements or a combination thereof), and at least one communication interface.")
acquire a plurality of training models that are evaluation targets; (Guelman [0043] teaches: "[…] receiving a one or more machine learning models 130 (e.g., stored in the form of one or more model data sets)".)
evaluate a functional quality of each of the plurality of training models based on a plurality of pieces of output data acquired by inputting the evaluation data to the plurality of training models; (Guelman [0071] teaches: "Performance metrics unit 315 may be a comprehensive set of metrics to measure the accuracy of the models over time." Additionally, Guelman [0072] teaches: "In some embodiments, performance metrics unit 315 may be configured to output one or more metrics [...]". Furthermore, Guelman [0049] teaches: "As described above, MPM system 100 is configured to receive one or more model data sets representative of one or more machine learning models 130, and may receive additional data from external database(s) 120 through network 115." Furthermore, Guelman [0027] teaches: "execute the one or more model data sets representative of the machine learning model to generate an output based on a set of input data". Additionally, Guelman [0057] teaches: "[…] Scoring data 305 may include model input features, model output or predictions (i.e., plurality of pieces of output data)".)
evaluate a non-functional quality of each of the plurality of training models based on the plurality of pieces of output data and (Guelman [0065] teaches: "Population stability index unit 310 may be configured to assess the stability of the output of the machine learning model (hereinafter "model") over time." Additionally, Guelman [0057] teaches: "[…] Scoring data 305 may include model input features, model output or predictions (i.e., plurality of pieces of output data)".)
cause the display device to display the evaluation result screen […] (Guelman [0077] teaches: "[...] MPM application unit 323 may be configured to process and display one or more output datasets from population stability index unit 310, feature analysis unit 312, performance matrix unit 315, and model calibration unit 317.")
[…] in which evaluation results of the plurality of training models overlap each other. (Dasgupta [col. 26, lines 39-45] teaches: "As shown, the user interface 1000 may also provide a performance comparison graph 1020. In this example, each performance metric value for the two models are grouped together. This view allows the user to visually see the difference between the performance results of the two models.")
Regarding Claim 6, Guelman in view of Dasgupta teaches the elements of claim 5 as outlined above. Guelman in view of Dasgupta also teaches:
The evaluation device according to claim 5, wherein the evaluation result screen includes a second accepter that accepts user's designation of a training model used for the operation among the plurality of training models, and the CPU is further configured to output an instruction for performing an operation using the designated training model based on the designation of the training model that has been accepted by the second accepter. (Dasgupta [col. 10, lines 36-42] teaches: "In some embodiments, the model experiment interface 144 may allow the user to specify a variety of model experiment parameters, and then launch a model experiment. For example, an experiment definition user interface may allow a user to select a model for the experiment, which may be a model that was the result of a previous experiment, stored in the model repository 164." Additionally, Fig. 2 teaches a user interface with an option to select a model from various model options (i.e., plurality of training models), which can be used for training, hyperparameter adjusting, run simulations, or deploy (i.e., output an instruction for performing an operation using the designated training model).)
Regarding Claim 11, Guelman in view of Dasgupta teaches the elements of claim 1 as outlined above. Guelman in view of Dasgupta also teaches:
The evaluation device according to claim 1, wherein the CPU is further configured to perform notification for prompting re-training of the training model when an evaluation result of the training model is below a predetermined threshold. (Dasgupta [col. 30, lines 5-7] teaches: "In some embodiments, if one or more performance metrics fall below a specified threshold, an aberration may be detected." Additionally, Dasgupta [col. 30, lines 20-27] teaches: "At operation 1180, when a performance aberration is detected, a user interface is generated to report the performance aberration of the production model. In some embodiments, the user interface may be a graphical user interface of the MDE. In some embodiments, the user interface may be a notification interface of the MDE, which may be configured to generates an email, a text, an event, etc. to registered users.")
Regarding Claim 12, Guelman in view of Dasgupta teaches the elements of claim 1 as outlined above. Guelman in view of Dasgupta also teaches:
The evaluation device according to claim 1, further comprising the display device. (Guelman [0045] teaches: "The performance data sets may be processed to output one or more values that can be displayed on a display device".)
Regarding Claim 13, the claim recites similar limitations as corresponding claim 1 and is rejected for similar reasons as claim 1 using similar teachings and rationale. Additionally, Guelman teaches:
An evaluation method using a computer […] (Guelman [0198] teaches: "The embodiments of the devices, systems and methods described herein may be implemented in a combination of both hardware and software.")
Regarding Claim 14, the claim recites similar limitations as corresponding claims 1 and 13 and is rejected for similar reasons as claims 1 and 13 using similar teachings and rationale. Additionally, Guelman teaches:
A non-transitory computer-readable storage medium storing a program causing a computer to execute: (Guelman [0005] teaches: "In one aspect, there is provided a computer implemented system for monitoring and improving a performance of one or more machine learning models, the system including: at least one memory storage device storing one or more model data sets representative of a machine learning model; at least one training engine configured to train the machine learning model; and at least one computer processor configured to, when executing a set of machine-readable instructions: receive or store the one or more model data sets representative of the machine learning model, wherein the machine learning model has being trained with a first set of training data; analyze the first set of training data, based on one or more performance parameters for the machine learning model, to generate one or more performance data sets; and process the one or more performance data sets to determine one or more values representing a performance of the machine learning model.")
Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Guelman in view of Dasgupta as applied to claim 1 above, and further in view of Janson (US 20070168917 A1), hereafter Janson.
Regarding Claim 4, Guelman in view of Dasgupta teaches the elements of claim 1 as outlined above. However, Guelman in view of Dasgupta does not explicitly teach the following limitation. Janson, however, teaches:
The evaluation device according to claim 1, wherein the first accepter does not accept the user's designation of an index value that is unable to improve quality through a training process. (Janson [0024] teaches: "According to a further improved embodiment of the second aspect, the method further comprises the steps of receiving an input value for the computer program, validating the received input value against the interface description and rejecting the received input value, if the input value violates any of the constraints specified on a corresponding input parameter in the interface description." Additionally, Janson [0034] teaches: "FIG. 3, an interface description, which rejects potentially harmful input values".)
Accordingly, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention, having the teachings of Guelman, Dasgupta, and Janson before them, to include Janson's input validation in Guelman and Dasgupta's monitoring system. One would have been motivated to make such a combination in order to use input validation as a safeguard against user errors, because providing incorrect input parameters often results in program crashes, which are associated with a loss of work time, among other problems (Janson [0024]).
Claims 7 and 8 are rejected under 35 U.S.C. 103 as being unpatentable over Guelman in view of Dasgupta as applied to claim 1 above, and further in view of Ludwig (US 20220114259 A1), hereafter Ludwig.
Regarding Claim 7, Guelman in view of Dasgupta teaches the elements of claim 1 as outlined above. However, Guelman in view of Dasgupta does not explicitly teach the following limitation. Ludwig, however, teaches:
The evaluation device according to claim 1, wherein the CPU is further configured to generate first augmented data by adding, to the evaluation data, first noise that is perceivable for persons […] (Ludwig [0024] teaches: "Here, program 150 generates a plurality of interpolated images ranging between a pair of images each from different classes. [...] In a further embodiment, for a robust model, said perturbations cause perceivable changes (e.g., visible to a human eye) to the original image.")
[…] and evaluates resistance to the first noise based on output data acquired by inputting the first augmented data to the training model. (Ludwig [0023] teaches: "In this embodiment, tolerance is a measure of model robustness to adversarial attacks of increasing strength. In an embodiment, program 150 determines tolerance by utilizing validation data to test the model and calculate one or more error rates." Examiner's note: under BRI, evaluating resistance of the noise perceivable for persons can be interpreted as determining tolerance of the images with perturbations that cause perceivable changes to the human eye.)
Accordingly, it would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention, having the teachings of Guelman, Dasgupta, and Ludwig before them, to include Ludwig's generating of adversarial samples into Guelman and Dasgupta's monitoring system. One would have been motivated to make such a combination because it "improves robustness to adversarial samples (i.e., images) while maintaining generalization performance for original example." (Ludwig [0016]).
Regarding Claim 8, Guelman in view of Dasgupta and Ludwig teaches the elements of claim 7 as outlined above. Guelman in view of Dasgupta and Ludwig also teaches:
The evaluation device according to claim 7, wherein the CPU is further configured to generate second augmented data by adding, to the evaluation data, second noise that is unperceivable for persons […] (Ludwig [0011] teaches: "Adversarial attacks add a human imperceptible perturbation to the testing data such that data inputs are easily misclassified in the testing phase." Additionally, Ludwig [0026] teaches: "In an embodiment, if the generated adversarial images do not reveal interpretable (i.e., detectable by a human) perturbations to the input, then program 150 utilizes the generated images to perform adversarial training and repeat the steps above until perturbations are interpretable." Examiner's note: Ludwig discloses adding perturbations that humans cannot detect to the testing data. Therefore, under broadest reasonable interpretation, the secon