Prosecution Insights
Last updated: April 19, 2026
Application No. 17/645,740

INFORMATION PROCESSING SYSTEM, INFORMATION PROCESSING METHOD AND INFORMATION PROCESSING DEVICE FOR TRAINING A MACHINE LEARNING MODEL IN A RESPONSE SERVER

Status: Non-Final OA (§103)
Filed: Dec 22, 2021
Examiner: KIM, JONATHAN J
Art Unit: 2141
Tech Center: 2100 — Computer Architecture & Software
Assignee: Rakuten Group Inc.
OA Round: 3 (Non-Final)
Grant Probability: 33% (At Risk)
OA Rounds: 3-4
To Grant: 3y 3m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 33% (grants only 33% of cases; 2 granted / 6 resolved; -21.7% vs TC avg)
Interview Lift: +80.0% (strong lift, with vs. without interview, across resolved cases)
Avg Prosecution: 3y 3m (typical timeline; 30 currently pending)
Total Applications: 36 (career history, across all art units)

Statute-Specific Performance

§101: 36.7% (-3.3% vs TC avg)
§103: 38.6% (-1.4% vs TC avg)
§102: 15.9% (-24.1% vs TC avg)
§112: 8.7% (-31.3% vs TC avg)
Deltas shown vs. Tech Center average estimate • Based on career data from 6 resolved cases

Office Action

§103
DETAILED ACTION

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 01/06/2026 has been entered.

The status of the claims is as follows: Claims 1, 8, 9 and 11 are amended. Claims 12-14 have been added. Claims 1, 3-14 are currently pending.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1, 3-4, 8-10, and 12-14 are rejected under 35 U.S.C. 103 as being unpatentable over Barson et al.
(US20030014377 A1, hereinafter "Barson") in view of Lajevardi et al. (US20220115148A1, hereinafter "Lajevardi") in view of Aili et al. (US20180089572 A1, hereinafter "Aili").

Regarding Claim 1, Barson discloses an information processing system comprising: at least one processor; and at least one memory device that stores a plurality of instructions, which when executed by the at least one processor, causes the at least one processor to: (Barson [0112]; "As shown in FIG. 4 the engine administrator 34 comprises a data manager 41; a training/retraining processor 42; an evaluator 43; and a processor for creating a neural network 44", Barson [0177]; "As a minimum requirement the application specific software must allow the user to give any of the 8 API method instructions or calls to the ADE." wherein software containing 8 API method instructions or calls reads on a memory device storing instructions),

Obtain the training data set (Barson [0138]; "This instruction requires that information about the location of an anomaly detector creation specification and a training data set is supplied when the instruction is made"),

Train the machine learning model on the training data set (Barson [0140]; "This instruction causes the training/retraining process 42 to train or retrain the neural network using the training data set and any retraining data that is available"),

Input test data to the machine learning model trained on the training data set (Barson [0095]; "Once the neural network has been trained it is validated to check that the training has been successful. This is done by presenting a new set of profiles, that are known to be anomalous or not, to the trained neural network"),

Evaluate whether performance of the machine learning model satisfies a predetermined condition based on an output of the machine learning model to which the test data is entered (Barson [0143]; "EvaluatePerformance[:] When this instruction is given to the ADE the performance evaluator 43 carries out an evaluation using the evaluation data set 45. When the performance evaluation is completed a classification error is returned to the application specific software. This gives an indication as to how many mis-classifications were made by the neural network. A mis-classification occurs when the neural network returns a detection result based on a known input-output pair, which does not match the correct output for that particular input"),

Deploy the trained machine learning model into the response server when the performance of the machine learning model is evaluated to satisfy the predetermined condition (Barson [0146]; "When this instruction is given to the ADE a recently trained second neural network (that was created during the retaining process and is contained in a second anomaly detector) is switched with the current active neural network.
That is, the current active neural network is replaced by the newly trained neural network.", Barson [0196]; "If the neural network performance reaches a level required by the user then the window sizes are deemed correct and are used for profiles in all data sets"),

Barson fails to explicitly disclose, but Lajevardi discloses, when the performance of the machine learning model is evaluated to not satisfy the predetermined condition, detect a problem with the training data by determining whether the training data satisfies a detection condition (Lajevardi [0029]; "If the device management system 104 determines at 418 that the first characteristic data and the second characteristic data satisfy the one or more consistency criteria, the device 106 continues operating using the trained machine learning model, and routine returns to 412 to process the next set of input data at the next designated time. If the device management system 104 determines at 418 that the first characteristic data and the second characteristic data do not satisfy the one or more consistency criteria, the device management system 104 determines, at 420, properties for a new set of training data for retraining the machine learning model running on the device 106. The device management system 104 determines the properties for the new set of training data in dependence on the second characteristic data and, optionally, the first characteristic data. In a first example, the device management system 104 may determine that the machine learning model should be retained, either from scratch or in a continued manner, using a new set of training data with properties consistent with those of the set of input data. This may be suitable if, for example, the device 106 is deployed in a new environment, and the properties of the input data generated by the sensors 304 in the new environment are not consistent with those of the training data (which may correspond to a different environment). The device management system 104 may, for example, send a request to the device 106 to send input data generated at the device 106 to the device management system 104, for use as new training data for the machine learning model. Alternatively, the device management system 104 may generate simulated training data with properties corresponding to those of the input data, or may output a request to a human user or automated system to collect new training data based on the determined properties." wherein the device management system's initial determination of first and second characteristic data not satisfying the one or more consistency criteria thus reads on evaluation of the model as not satisfying a predetermined condition; wherein determination of updates to the training data set in response to issues associated with the consistency criteria thus implicitly reads on detection of a problem with the training data by determining that the training data does not satisfy the detection condition (characteristic data presently not satisfying consistency criteria henceforth read as a detection condition); Lajevardi [Figure 4]),

automatically update the training data set when the detection condition is determined to be satisfied (Lajevardi [0029], quoted above; Lajevardi [Figure 4]; wherein the determined properties of the new training data being used in the next iteration of obtaining training data and training a ML model using aforementioned new training data properties thus reads on automatically updating the training data set when the detection condition is satisfied (determined properties in view of satisfying characteristic criteria)),

retrain the machine learning model on the updated training data set, wherein the information processing system repeats, in response to the evaluation, updating the training data set, retraining the machine learning model, and evaluating the performance of the machine learning model (Lajevardi [Figure 4]; wherein the iterative nature of obtaining training data, training a ML model, evaluating the ML model to determine characteristic data by which the training data set is conditionally updated thus reads on the repetition of cyclically updating the training data set in response to evaluations, retraining the model, and evaluating the performance for the next iterative update).

It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify Barson's method of neural network training and evaluation of the trained neural network's performance across a test dataset to determine if it should be deployed by using Lajevardi's method of retraining the machine learning model when model performance is insufficient through an updated data set configured only when a detection condition is fulfilled. The motivation to do so is because if the "input data no longer sufficiently resembles the training data upon which the machine learning model was trained, the machine learning model may not be competent for use with the new input data. This may result in erroneous outputs from the machine learning model, which may in turn result in suboptimal performance or malfunctioning of the device" (Lajevardi [0004]).

The combination of Barson/Lajevardi fails to explicitly disclose, but Aili discloses, an information processing system that includes a training server that trains a machine learning model on a training data set including input data and a label, which is ground truth data for the input data (Aili [0117]; "In some aspects, clients 33 or servers 32 (or both) may make use of one or more specialized services or appliances that may be deployed locally or remotely across one or more networks 31.", Aili [0077]; "ML service 150 may comprise a ML training engine 151 and a ML storage 152.
ML training engine 151 may be configured to automatically create new NLU models or improve existing ML models through self-training based on, for example, the structure of a runtime solution, user-provided example inputs, new interaction data, classified and annotated historical natural language data, and the like. ML storage 152 may store ML models and model-related information, such as parameters and hyper-parameters, and allow for retrieval by other components of system 105. ML service 150 may be configured to extract data from log service" which reads on a training server that trains a machine learning model using user data inputs and accompanying annotations which are read as labels),

and a response server that inputs input data, which is entered by a user, to the trained machine learning model and outputs response data based on a label that is output by the machine learning model (Aili [0075]; "a plurality of runtime solutions 133 created by publisher 122, which may be used to process user inputs; and a prediction API 134, which may be configured to predict confidence ratings, annotations, classifications, and the like for user input. Runtime environment 131 may be a user-facing natural language application hosted on interfaces 166[a-n], which may be used to receive and collecting interaction inputs. Preprocessor 132 may be used to preprocess inputs by, for example, normalizing, performing spelling correction, tokenization, sentence splitting, morphological annotation, name entity detection, sentiment analysis, and the like. This may allow users some freedom in how an input is worded and phrased to make it natural to that particularly user with minimal adverse effects on response accuracy.", which discloses runtime user-input as natural language understandings, Aili [0031]; "receive a natural language input from an external interface, process the natural language input by at least annotating and classifying the input, generate a log dataset based at least on how the runtime solution and processed input interprets the natural language input, and store the natural language input and log dataset to a log storage; wherein, the machine learning service may automatically request the log dataset from the log storage to retrain and improve the natural language understanding model dataset", wherein a server inputs natural language input read on as user input to a server hosting a machine learning service, Aili [0084]; "At step 506, the runtime environment sends the input to a preprocessor … At step 539, the runtime service sends the response, with detailed log information, to a log service for storage, and possible use in improving models.", wherein a server outputs the response data with log information containing annotated data read as labels),

It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify Barson/Lajevardi's method of neural network training that iteratively updates its training dataset and subsequently retrains its models by using Aili's dedicated training server to train the model and a response server to handle model inputs and outputs.
The motivation to do so is to allow Barson/Lajevardi's neural network training method to "be configured to communicate with a plurality of other computing devices, such as clients or servers, over communications networks such as a wide area network a metropolitan area network, a local area network, a wireless network, the Internet, or any other network, using known protocols for such communication, whether wireless or wired." (Aili [0106]), thus allowing trained model creation and deployments to occur over a wider effective area.

Regarding Claim 3, the Barson/Lajevardi/Aili combination of Claim 1 teaches the system of Claim 1 (and thus the rejection of Claim 1 is incorporated). Barson further discloses to determine whether a number of items of input data for each label in the training data set satisfies the detection condition (Barson [0143], quoted above; wherein the identification of misclassified input-output pairs reads on determining a number of items (interpreted as a portion of the dataset) for which the label satisfies the detection condition (mis-classification); Barson [0114]; "The data manager 41 maintains two data sets: an evaluation data set 45, and an example data set 46 which is also referred to as a training data set. The data manager receives inputs of detection data 40 and validated results 48. The validated results comprise information about whether anomaly candidates identified by the neural network 47 are real anomalies or not. These validated results 48 are also referred to as "profile identification and category" information; they are used to update the example data 46, the evaluation data 45 and for other purposes as described below. The evaluation data set 45 is created by splitting the detection data set 40 into two parts; an evaluation data set 45 and an example or training set 46. Both these sets of data contain profiles and information about whether each profile in the set is anomalous or not. The example or training data set 46 is used to train the neural network 47 using the training processor 42. Adding new examples of anomalous behaviour 48 to this data set enables the detection to be updated with new information. This aids the general performance of the ADE; examples from false positive identifications can be added to the example data set to reduce the probability that the false identification recurs. Adding results from positive identifications reinforces the ability of the neural network 47 to make similar positive identifications." wherein the input data is deemed to be associated with its own profile labels, all evaluated under the anomaly classification detection condition),

update the training data set when the detection condition is determined to be satisfied (Barson [0122]; "Detector 35[:] Once the data from the two profiles has been prepared, the neural network has been created and evaluated by the administrator 34, the neural network 47 is simply presented with the new detection data 40. Referring to FIG. 3, the detector 35 receives the detection data 40 and using the trained and validated neural network 47 carries out the detection process to produce potential anomaly candidates 41. The neural network classifies each recent profile either as an anomaly or not and the neural network 47 also gives an associated confidence value for each classification. Anomaly threshold parameters 52 are input to the detector 35 from application specific software. These parameters 52 are used to filter the potential anomaly candidates 41 to remove the majority of false positive identifications. For example, all anomaly candidates with a very low confidence rating could be filtered out.", Barson [0115]; "The example or training data set 46 is used to train the neural network 47 using the training processor 42. Adding new examples of anomalous behaviour 48 to this data set enables the detection to be updated with new information. This aids the general performance of the ADE; examples from false positive identifications can be added to the example data set to reduce the probability that the false identification recurs. Adding results from positive identifications reinforces the ability of the neural network 47 to make similar positive identifications.", wherein the data set is updated by adding anomalous behaviour detected by the detector),

Regarding Claim 3, in addition, Lajevardi in the Barson/Lajevardi/Aili combination also teaches to determine whether a number of items of input data for each label in the training data set satisfies the detection condition (Lajevardi [Column 2 Line 58]; "When the number of observational data fed during an inference component process is equal to or more than a preset reference value"), update the training data set when the detection condition is determined to be satisfied (Lajevardi [Column 3 Line 30]; "The learning component 210 may be configured to generate a learning model M(.) by learning elements included in an input learning data set. In an embodiment, the learning component 210 may generate the learning model M(.) by learning data based on an artificial neural network method. The artificial neural network method indicates a method that models data by repeating a process of estimating a result by applying a weight to input data and detecting an error of the estimated value to correct the weight. The inference component 220 may apply learning model M(.) to observational data, and output a recognition result D(M(.)) of the observational data. The determination component 230 may perform analysis and determination based on the input data and the recognition result D(M(.)) of the inference component 220, and output a determination result value (Ground Truth) GT(.). The model update component 240 may generate a cascaded learning model M(A.sup.N) by cascading the recognition results of the first learning model generated from the previous learning data set and the second learning model generated from the current update learning data set, and provide the cascaded learning model M(A.sup.N) to the inference component 220. Here, N may indicate the number of times that the learning model is updated. In an embodiment, when the number of observational data is equal to or more than a preset reference value, the model update component 240 may construct an update learning data set. When the update learning data set is constructed, the model update component 240 may generate the second learning model, and update the learning model by cascading the recognition results of the first learning model generated at the previous point of time and the second learning model" wherein the training data set is evaluated upon a detection condition of exceeding a preset reference value and subsequently an updated training data set is configured),

Regarding Claim 4, the Barson/Lajevardi/Aili combination of Claim 1 teaches the system of Claim 1 (and thus the rejection of Claim 1 is incorporated).
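For illustration only, the Claim 3-style check discussed above (count the items of input data for each label in the training set, and update the set when a label trips the detection condition) can be sketched in Python. The `min_per_label` threshold and both helper names are hypothetical stand-ins, not drawn from Barson or Lajevardi.

```python
from collections import Counter

def count_detection(training_set, min_per_label=50):
    """Detection condition sketch: flag labels whose item count is below a floor."""
    counts = Counter(label for _, label in training_set)
    return {lbl: n for lbl, n in counts.items() if n < min_per_label}

def update_training_set(training_set, new_items, deficient):
    """Update sketch: add only items whose labels were flagged as deficient."""
    added = [(x, lbl) for x, lbl in new_items if lbl in deficient]
    return training_set + added

# toy data: label "b" is under-represented relative to label "a"
data = [("x%d" % i, "a") for i in range(60)] + [("y%d" % i, "b") for i in range(10)]
deficient = count_detection(data)   # label "b" falls below the floor
data = update_training_set(data, [("z%d" % i, "b") for i in range(40)], deficient)
```

The point of the sketch is only that the detection condition operates on per-label counts, and the data-set update is conditioned on that check, as the rejection maps onto the references.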
Barson further discloses to update the training data set based on an improvement parameter when the performance of the machine learning model is evaluated not to satisfy the predetermined condition; update the improvement parameter in response to an update of the training data set (Barson [0170]; "1. target error--this is a threshold error value which must be achieved before training stops. If the target error is set to 0 then the threshold is ignored. The target error is specified as the sum of squared errors over the training set. That is, the training set is presented to the neural network and the output values are subtracted from the expected output values to give a set of errors. The sum of the squares of these errors is then calculated." wherein training includes a retraining data set's creation or update; thereby, a target error constituting an improvement parameter dictating training consequently dictates updates on a training data set. Target error is dynamically updated according to the calculated sum of squares of an updated data set, Barson [0044]; "(iv) determining when a predetermined threshold which relates to the level of correspondence between the output values and their respective target output values is reached; (v) automatically retraining the neural network using the set of training data. This provides the advantage that it is not necessary for the user to make a decision about when to retrain. This removes the need for an expert user to be available to maintain the system while it is in use.", wherein an evaluation of the performance according to a threshold is done before retraining the neural network which encompasses updating the training dataset if threshold target output values are not reached),

Regarding Claim 4, in addition, Lajevardi in the Barson/Lajevardi/Aili combination also teaches to update the training data set based on an improvement parameter when the performance of the machine learning model is evaluated not to satisfy the predetermined condition; update the improvement parameter in response to an update of the training data set (Lajevardi [Column 4 Line 60]; "Referring to Fig. 5, the model update component 240 may include a BKS (Behavior Knowledge Space) element generation component 241 and a mapping component 243. In an embodiment, the model update component 240 may use a BKS method to derive the cascaded learning model M(A.sup.N). The BKS method indicates a method that stores determination result values for recognition results of the inference component 220 in a table, and provides a recognition result by referring to the table, when new (observational) data are introduced. In the BKS method, the recognition results of the inference component 220 may become key data, and the BKS element generation component 241 may construct BKS elements by calculating a statistics vector for each key data. BKS: Set of BKS elements BKS element: {(key data, statistics vector)} In an embodiment, the BKS element generation component 241 may construct the BKS elements based on recognition results obtained by inferring all data k of the previous learning data set A and the update learning data set B" wherein the model update component comprising the BKS element reads on updating the training data set based on an improvement parameter; wherein the BKS constructed in part by the model evaluation "recognition results" of the previous and updated learning sets reads on the improvement parameter being updated in response to updated learning sets),

Claim 8 recites a method comprising the same executable instructions as the system of Claim 1, and is thus rejected for reasons set forth in the rejection of Claim 1. Claim 9 recites a device comprising the same processor and memory device, the memory device storing the same executable instructions as the system of Claim 1, and is thus rejected for reasons set forth in the rejection of Claim 1.

Regarding Claim 10, the Barson/Lajevardi/Aili combination of Claim 1 teaches the system of Claim 1 (and thus the rejection of Claim 1 is incorporated). The combination already discloses to acquire a question data based on a user input; input the question data into the training machine learning model and obtain an answer as output; and output the answer to the user (Aili [0095]; "FIG. 13 is a diagram illustrating an exemplary interface 1300 displaying a number of natural language resources used by the hybrid system in various embodiments of the invention. NLI resources may comprise a large assortment of lists and natural language components such as phrases, for example a number of question resources based on phrases such as "how do I . . . " or "what is . . . ", that may then be utilized in recognizing and answering questions from a user. An extended library of language objects may be used to provide the building blocks of natural language, enabling complex recognition of natural language as provided by a user, avoiding common pitfalls associated with virtual assistants that require questions to be asked in a particular way, or only accept a certain number or arrangement of arguments within a question", wherein the question resources provided used as input data for the model reads on question data based on user input; Aili [0077], quoted above, which reads on inputting the natural language question data into a training machine learning model; Aili [0076]; "Log service 140 may comprises a query engine 141 and a log storage 142; and may be used to store and query interaction data and metadata collected during runtime. Metadata may include triggers activated, flows activated, classifications, prediction results, variables set, external data collected, and the like which may be used in improving NL models and classifiers" wherein the log comprising outputted prediction results of the models during runtime reads on outputting the obtained answer responses to inputted question data to the user),

Regarding Claim 12, the Barson/Lajevardi/Aili combination of Claim 1 teaches the system of Claim 1 (and thus the rejection of Claim 1 is incorporated).
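For illustration only, the Claim 10 flow mapped above (acquire question data from user input, feed it to the trained model, output the answer to the user) can be sketched as below. The stand-in model and the function name are hypothetical and do not reflect Aili's NLU stack.

```python
def respond(question: str, model) -> str:
    """Sketch of the Claim 10 response-server flow."""
    question_data = question.strip().lower()  # acquire question data from user input
    label = model(question_data)              # input question data to the trained model
    return f"Answer: {label}"                 # output the answer to the user

# toy stand-in "model": a keyword lookup, not a trained classifier
toy_model = lambda q: "reset your password" if "password" in q else "unknown"
print(respond("How do I change my Password?", toy_model))
```

The normalization step stands in for the preprocessing (spelling correction, tokenization, and so on) that Aili [0075] attributes to its preprocessor.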
The combination already discloses wherein the predetermined condition is satisfied based on whether a correct answer rate exceeds a predetermined threshold value; (Lajevardi [0028]; “The device management system 104 determines, at 418, whether the first characteristic data and the second characteristic data satisfy one or more consistency criteria. The consistency criteria are designed to measure whether the training data and the input data are sufficiently similar that the trained model is deemed competent for use with the input data. The consistency criteria may include, for example, a difference between a value of a characteristic for the training data and a value of the same characteristic for the input data being less than a specified threshold value. In this way, the consistency criteria can measure whether the set of input data sufficiently resembles the set of training data. The consistency criteria may include a predetermined distance between values of one or more characteristics for the two data sets being less than a specified value, or any other suitable metric for measuring a distance between distributions, such as a Kullback-Leibler divergence or other measure of divergence. Additionally, or alternatively, the consistency criteria may include values of one or more characteristics for the set of input data, for example a range or confidence interval, falling within limits depending on corresponding values for the set of training data. 
In this way, the consistency criteria can determine whether values for the set of input extend beyond a region for which the set of training data is deemed competent.” wherein the criteria is satisfied in part when the input data matches the training data to a sufficient threshold, thus reading on the predetermined condition satisfied (criteria fulfilled) when the model offers a correct output rate (answer rate) exceeding some threshold) and wherein the detection condition is determined to be satisfied based on whether a number of data items exceeds an upper limit value (Lajevardi [0028]; “In this way, the consistency criteria can measure whether the set of input data sufficiently resembles the set of training data. The consistency criteria may include a predetermined distance between values of one or more characteristics for the two data sets being less than a specified value, or any other suitable metric for measuring a distance between distributions, such as a Kullback-Leibler divergence or other measure of divergence. Additionally, or alternatively, the consistency criteria may include values of one or more characteristics for the set of input data, for example a range or confidence interval, falling within limits depending on corresponding values for the set of training data. In this way, the consistency criteria can determine whether values for the set of input extend beyond a region for which the set of training data is deemed competent.” wherein the measure of divergence between the distributions of data items and their measured consistencies thus reads on the detection condition determined to be satisfied based on whether a number of data items within the compared distributions exceeds the upper limit value threshold)

Regarding Claim 13, The Barson/Lajevardi/Aili combination of Claim 1 teaches the system of Claim 1 (and thus the rejection of Claim 1 is incorporated).
The combination already discloses wherein the detection condition includes both an upper limit value and a lower limit value of a number of data items in the training data set; (Lajevardi [0028]; “In this way, the consistency criteria can measure whether the set of input data sufficiently resembles the set of training data. The consistency criteria may include a predetermined distance between values of one or more characteristics for the two data sets being less than a specified value, or any other suitable metric for measuring a distance between distributions, such as a Kullback-Leibler divergence or other measure of divergence. Additionally, or alternatively, the consistency criteria may include values of one or more characteristics for the set of input data, for example a range or confidence interval, falling within limits depending on corresponding values for the set of training data. In this way, the consistency criteria can determine whether values for the set of input extend beyond a region for which the set of training data is deemed competent.” wherein the use of the Kullback-Leibler measure of divergence between the distributions of data items to determine if the training vs. input data meet some threshold criteria thus reads on the detection condition including upper and lower limit values of data items in the training data set (items in the training distribution exceed the threshold criteria by deviating too much in either direction from the input distributions, thus reading on a lower and upper limit of allowed deviation consistent with the consistency criteria) and when the performance of the machine learning model is evaluated as not satisfying the predetermined condition, the upper limit value and the lower limit value are updated (Lajevardi [0028], [Figure 4]; wherein the determination of training data properties inherently results in the recalculation of threshold boundaries of consistency between the new data sets and future input data sets)

Regarding Claim 14, The Barson/Lajevardi/Aili combination of Claim 1 teaches the system of Claim 1 (and thus the rejection of Claim 1 is incorporated). The combination already discloses wherein the system updates the training data set based on the detection condition (Lajevardi [Figure 4]; wherein the determination of training data properties and the update of the training data set with new training data properties in mind thus reads on updating the training data set based on the consistency detection condition associated with the criteria)

Claims 5-7 are rejected under 35 U.S.C. 103 as being unpatentable over Barson et al. (US20030014377 A1, hereinafter “Barson”) in view of Lajevardi et al. (US20220115148A1, hereinafter “Lajevardi”) in view of Aili et al. (US20180089572 A1, hereinafter “Aili”) and further in view of DiCorpo et al. (US20120150773 A1, hereinafter “DiCorpo”).

Regarding Claim 5, The Barson/Lajevardi/Aili combination of Claim 1 teaches the method of Claim 1 (and thus the rejection of Claim 1 is incorporated).
The combination of Barson/Lajevardi/Aili fails to disclose but Aili further discloses to store input data in a log storage when the user inputs the input data (Aili [0031]; “receive a natural language input from an external interface, process the natural language input by at least annotating and classifying the input, generate a log dataset based at least on how the runtime solution and processed input interprets the natural language input, and store the natural language input and log dataset to a log storage; wherein, the machine learning service may automatically request the log dataset from the log storage to retrain”)

It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the Barson/Lajevardi/Aili combination’s neural network training method using servers to incorporate Aili’s method to store all input data in log storage. The motivation to do so is to ensure that input data “is continuously used by the machine learning service to retrain available models as more interaction data is received.” (Aili [0032]), thus allowing interaction data to be stored for future retraining epochs of the model.

The combination of Barson/Lajevardi/Aili fails to disclose but DiCorpo discloses to determine whether there is a label for which a number of items of the input data is insufficient in the training data set; (DiCorpo [0055]; “A user may review this information to determine additional data to add to the training data set. Certain categories of documents may have been underrepresented in the training data set 352.”) extract an input data item corresponding to the label from the input data stored in the log storage; (DiCorpo [0055]; “In one embodiment, quality analyzer 340 identifies particular files, documents, etc. from the negative data 350 that caused false positives and identifies particular files, documents, etc. from the positive data 345 that caused false negatives. A user may review this information to determine additional data to add to the training data set”, wherein the positive and negative data labels from the input data are stored in the log storage of Aili) and add training data including the extracted input data item and the label to the training data set when it is determined that there is a label for which a number of items of the input data is insufficient (DiCorpo [0055]; “The user may correct this by adding additional examples of product documentation to the negative data set”)

It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the Barson/Lajevardi/Aili combination’s neural network training method to add training data of identified insufficient class labels into the dataset. The motivation to do so is to prevent “certain categories of documents [that] may have been underrepresented in the training data set 352.” (DiCorpo [0055]), thus allowing for more accurate datasets.

Regarding Claim 6, The Barson/Lajevardi/Aili/DiCorpo combination of Claim 5 teaches the system of Claim 5 (and thus the rejection of Claim 5 is incorporated). The combination already discloses to extract an input data item corresponding to the label from the input data stored in the log storage based on the input data item corresponding to the label in the training data set and the input data stored in the log storage; (DiCorpo [0055]; “In one embodiment, quality analyzer 340 identifies particular files, documents, etc. from the negative data 350 that caused false positives and identifies particular files, documents, etc. from the positive data 345 that caused false negatives”) when it is determined that there is a label for which a number of items of the input data is insufficient. (DiCorpo [0055]; “A user may review this information to determine additional data to add to the training data set.
Certain categories of documents may have been underrepresented in the training data set 352.”)

Regarding Claim 7, The Barson/Lajevardi/Aili combination of Claim 1 teaches the method of Claim 1 (and thus the rejection of Claim 1 is incorporated). The combination of Barson/Lajevardi/Aili fails to disclose but Aili further discloses to store, when the user inputs input data, the input data in a log storage (Aili [0031]; “receive a natural language input from an external interface, process the natural language input by at least annotating and classifying the input, generate a log dataset based at least on how the runtime solution and processed input interprets the natural language input, and store the natural language input and log dataset to a log storage; wherein, the machine learning service may automatically request the log dataset from the log storage to retrain”)

It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the Barson/Lajevardi/Aili combination’s neural network training method using servers to incorporate Aili’s method to store all input data in log storage. The motivation to do so is to ensure that input data “is continuously used by the machine learning service to retrain available models as more interaction data is received.” (Aili [0032]), thus allowing interaction data to be stored for future retraining epochs of the model.

The combination of Barson/Lajevardi/Aili fails to disclose but DiCorpo discloses to extract an input data item corresponding to one of the labels from the input data stored in the log storage; (DiCorpo [0055]; “In one embodiment, quality analyzer 340 identifies particular files, documents, etc. from the negative data 350 that caused false positives and identifies particular files, documents, etc. from the positive data 345 that caused false negatives”) and add a set of the extracted input data item and the label to the training data (DiCorpo [0055]; “The user may correct this by adding additional examples of product documentation to the negative data set”)

It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the Barson/Lajevardi/Aili combination’s neural network training method to identify a particular label’s data and add it into the dataset. The motivation to do so is “to improve the quality metric … the computing device modifies the training set of data in response to user input if the quality metric fails to meet a quality threshold” (DiCorpo [0007]), thus allowing for datasets to improve in quality by adding new data to learn from.

Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over Barson et al. (US20030014377 A1, hereinafter “Barson”) in view of Lajevardi et al. (US20220115148A1, hereinafter “Lajevardi”) in view of Aili et al. (US20180089572 A1, hereinafter “Aili”) and further in view of Yamamoto (US20220083580A1).

Regarding Claim 11, The Barson/Lajevardi/Aili combination of Claim 1 teaches the method of Claim 1 (and thus the rejection of Claim 1 is incorporated). The combination of Barson/Lajevardi/Aili fails to disclose but Yamamoto discloses to determine an upper limit value and a lower limit value of the number of items of data, which are improvement parameters (Yamamoto [0079]; “As illustrated in FIG. 7, the user inputs, in boxes 58 on the user interface 50, values for designating an upper limit number and a lower limit number of documents (similar pre-training data) to be acquired. In the example in FIG. 7, the user designates “100,000” as the upper limit number and designates “30,000” as the lower limit number” Yamamoto [0085]; “For example, if the user wants to perform pre-training using only similar pre-training data with a high similarity or if the user wants to reduce a time of the training process, it is possible to reduce the upper limit number for the search. Further, if the user wants to perform pre-training using a larger number of pieces of similar pre-training data, it is possible to increase the upper limit number for the search”)

It would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the Barson/Lajevardi/Aili combination’s neural network training method using servers to incorporate Yamamoto’s method of determining upper and lower limits to the inputted number of items of data improvement parameters. The motivation to do so is “to adjust quality and an amount of the similar pre-training data depending on a need of the user, so that it is possible to perform pre-training appropriate for the purpose of the user” (Yamamoto [0085]).

Response to Arguments

Applicant’s amendment to the title of the invention is acknowledged, and found to be sufficiently descriptive. The Examiner acknowledges the Applicant’s amendments in which Claims 1, 8, 9 and 11 are amended and Claims 12-14 have been added. Applicant’s arguments filed January 6th, 2026, traversing the rejection of claim 11 under 35 U.S.C. § 112(b) have been fully considered, and are fully persuasive. Applicant’s arguments filed January 6th, 2026, traversing the rejection of claims 1, 3-11 under 35 U.S.C. § 101 have been fully considered, and are fully persuasive. Applicant’s arguments regarding the 35 U.S.C. § 103 rejection of claims 1, 3-11 of the previous office action and new claims 12-14 have been considered, but are not fully persuasive.
New reference Lajevardi has been combined with primary reference Barson to disclose the following elements argued by the applicant as not being disclosed by the prior art previously relied upon: “when the performance of the machine learning model is evaluated to not satisfy the predetermined condition, detecting a problem with the training data by determining whether the training data set satisfies a detection condition.”

The rejection of Claim 1 under 35 U.S.C. § 103 has been maintained. Similarly, the rejections of Claims 8 and 9 under 35 U.S.C. § 103 have been maintained. The rejections under 35 U.S.C. § 103 of Claims 3-7 and 10-11, which depend directly or indirectly from Claim 1, have been maintained. Regarding new claims 12-14, new secondary reference Lajevardi in combination with primary reference Barson discloses the elements argued by the applicant as not being disclosed by the prior art previously relied upon.

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:

“TRAINING A MACHINE LEARNING MODEL USING A BATCH BASED ACTIVE LEARNING APPROACH” (US20210089960A1), which discloses analysis of machine learning model performance for active learning and updates to a training data set for re-training

“AUTOMATIC DETECTION OF LEARNING MODEL DRIFT” (US20190147357A1), which discloses retraining of a machine learning model involving training data updates in response to detection conditions associated with a predetermined problem’s occurrence

Any inquiry concerning this communication or earlier communications from the examiner should be directed to JONATHAN J KIM whose telephone number is (571) 272-0523. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kieu Vu, can be reached on (571) 272-4057. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/JONATHAN J KIM/
Examiner, Art Unit 2141

/MATTHEW ELL/
Supervisory Patent Examiner, Art Unit 2141
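For readers less familiar with the techniques the rejection maps onto the claims, the two cited mechanisms, Lajevardi's distribution-consistency check (e.g., a Kullback-Leibler divergence kept under a threshold) and the DiCorpo/Yamamoto idea of upper and lower limits on per-label item counts, can be sketched in Python. This is a rough illustration of the cited concepts only, not code from any reference; the function names, labels, and threshold values here are invented for the example.

```python
import math
from collections import Counter

def kl_divergence(p, q, eps=1e-9):
    """Kullback-Leibler divergence D(P || Q) between two discrete
    probability vectors; eps guards against zero probabilities."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def label_distribution(items, labels):
    """Normalize label counts over a fixed label vocabulary."""
    counts = Counter(items)
    total = len(items) or 1
    return [counts[lbl] / total for lbl in labels]

def consistency_satisfied(training_items, input_items, labels, max_divergence=0.5):
    """Lajevardi-style consistency criterion: the input data 'sufficiently
    resembles' the training data when the divergence between the two label
    distributions stays under a designated threshold."""
    p = label_distribution(training_items, labels)
    q = label_distribution(input_items, labels)
    return kl_divergence(p, q) < max_divergence

def labels_out_of_bounds(training_items, labels, lower=10, upper=100_000):
    """DiCorpo/Yamamoto-style count check: flag labels whose item counts fall
    outside user-designated lower/upper limits (under- or over-represented)."""
    counts = Counter(training_items)
    return [lbl for lbl in labels if not (lower <= counts[lbl] <= upper)]

labels = ["shipping", "refund", "account"]            # hypothetical answer classes
train = ["shipping"] * 50 + ["refund"] * 30 + ["account"] * 20
live = ["shipping"] * 45 + ["refund"] * 35 + ["account"] * 20

print(consistency_satisfied(train, live, labels))     # similar distributions
print(labels_out_of_bounds(train, labels, lower=25))  # "account" (20 items) is under the limit
```

When the consistency check fails, or a label falls outside its count limits, a system of the kind the claims describe would extract more items for that label from a log storage and retrain, which is the combination the Office Action asserts.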

Prosecution Timeline

Dec 22, 2021
Application Filed
Apr 14, 2025
Non-Final Rejection — §103
Jul 03, 2025
Interview Requested
Jul 15, 2025
Applicant Interview (Telephonic)
Jul 15, 2025
Examiner Interview Summary
Jul 18, 2025
Response Filed
Oct 01, 2025
Final Rejection — §103
Dec 17, 2025
Interview Requested
Dec 23, 2025
Examiner Interview Summary
Dec 23, 2025
Applicant Interview (Telephonic)
Jan 06, 2026
Request for Continued Examination
Jan 23, 2026
Response after Non-Final Action
Feb 19, 2026
Non-Final Rejection — §103 (current)

Prosecution Projections

3-4
Expected OA Rounds
33%
Grant Probability
99%
With Interview (+80.0%)
3y 3m
Median Time to Grant
High
PTA Risk
Based on 6 resolved cases by this examiner. Grant probability derived from career allow rate.
