Last updated: May 29, 2026
Application No. 17/852,024
METHOD AND APPARATUS FOR GENERATING PROCESS SIMULATION MODELS

Non-Final OA §103
Filed: Jun 28, 2022
Priority: Jul 20, 2021 (RE 10-2021-0095160)
Examiner: MEYER, JACQUELINE CHRISTINE
Art Unit: 2144
Tech Center: 2100 (Computer Architecture & Software)
Assignee: Samsung Electronics Co., Ltd.
OA Round: 3 (Non-Final)
Interview Optional

Interview already conducted in this application's prosecution history. This examiner has a 67% grant rate with a +70.0% interview lift. Since an interview has already been tried, a written response with narrowed claims, based on precedent claim evolution patterns, is recommended.
Based on 15 resolved cases, 2023–2026
Examiner Intelligence

MEYER, JACQUELINE CHRISTINE
Grants 67% — above average
Career Allowance Rate
10 granted / 15 resolved
+11.7% vs TC avg
Strong +70% interview lift
Typical timeline
3y 10m
Avg Prosecution
14 currently pending
Statute-Specific Performance

§101: 5.3% (-34.7% vs TC avg)
§103: 85.5% (+45.5% vs TC avg)
§102: 5.3% (-34.7% vs TC avg)
§112: 1.3% (-38.7% vs TC avg)
Based on career data from 15 resolved cases
Office Action

§103
DETAILED ACTION
	This nonfinal rejection is responsive to the amendment filed on January 26, 2026. Claims 1-11 and 17-25 are pending. Claims 1, 11, and 18 are independent.
	Claim rejections under 35 USC §103 are withdrawn. However, a new ground of rejection has been made in light of applicant’s amendment.

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on January 26, 2026 has been entered.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on February 3, 2026 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


	Claims 1-5, 7-9, 18-22, and 24-25 are rejected under 35 U.S.C. 103 as being unpatentable over Bhaskar et al. (WO2017117568), hereinafter Bhaskar, in view of Dunne et al. (US20200272899), hereinafter Dunne, in view of Gupta et al. (Improving Surrogate Model Accuracy for the LCLS-II Injector Frontend Using Convolutional Neural Networks and Transfer Learning), hereinafter Gupta.

Regarding claim 1, Bhaskar teaches:
	A method of generating a simulation model based on simulation data and measurement data of a target, the method comprising: (Bhaskar, page 4, lines 12-14: “The one or more components include a machine learning based model configured for performing one or more simulations for specimens.” And page 26, lines 4-7: “In this manner, the simulated measurement(s) may represent measurements, images, output, data, etc. that may be generated for the specimens by a metrology system described herein.”)
	A pre-learning model based on the simulation data (Bhaskar, page 26, lines 14-23: “the information for the non-nominal instances will be used for re-training of the machine learning based model thereby performing transfer learning of the non-nominal instances to the machine learning based model. Therefore, acquiring the information for the non-nominal instances may essentially be transfer learning training input generation. Transfer learning training input generation can be performed in a number of ways described further herein including: a) empirical simulation of real defect events on wafers and masks using process design of experiments (DOEs); b) introduction of virtual defect events in design/simulation space by using synthetic approaches; and c) hybrid approaches using empirical plus synthetic methods in concert.” – The model that is subsequently retrained through transfer learning is analogous to the pre-learning model, and input generation options (a) and (b) supply the simulation data on which it is based.)

Bhaskar does not explicitly teach:
	classifying weight parameters, included in a pre-learning model learned based on the simulation data, as a first weight group and a second weight group based on a degree of significance;
	retraining the first weight group of the pre-learning model based on the simulation data; and
	training the second weight group of a transfer learning model based on the measurement data, wherein the transfer learning model includes the first weight group of the pre-learning model retrained based on the simulation data,
	wherein, in the retraining of the first weight group and the training of the second weight group: 
training based on the simulation data is separate from training based on the measurement data, and
	weight parameters of the first weight group are trained on different data from weight parameters of the second weight group based on classification into the first weight group or the second weight group.

However, Dunne teaches:
	classifying weight parameters, included in a pre-learning model learned based on the simulation data, as a first weight group and a second weight group based on a degree of significance; (Dunne, paragraph 0131: “Thus, in some embodiments, the centralized site/device 604 may be configured to freeze one or more weights in one or more layers of the neural network during training in order to prevent the weights from changing during the re-training process. The centralized site/device 604 may select or determine weights to freeze by implementing or using a minimum size technique or a minimum delta technique.” – Bhaskar already teaches that the pre-learning model is based on the simulation data, the freezing of one or more weights would indicate that there are two weight groups, one weight group of frozen layers and one weight group of non-frozen layers. A minimum delta technique to decide which weights to freeze is analogous to the groups being based on a degree of significance.)
retraining the first weight group of the pre-learning model based on the simulation data; and (Dunne, paragraph 0132: “The centralized site/device 604 may train or retrain the neural network and freeze the determined/selected weights. The retraining may be performed incrementally by freezing groups of weights (e.g. within a layer) and performing a re-training cycle, or by freezing all selected weights initially and then completing re-training.”)
	training the second weight group of a transfer learning model …, wherein the transfer learning model includes the first weight group of the pre-learning model retrained based on the simulation data, (Dunne, paragraph 0132: “The centralized site/device 604 may train or retrain the neural network and freeze the determined/selected weights. The retraining may be performed incrementally by freezing groups of weights (e.g. within a layer) and performing a re-training cycle, or by freezing all selected weights initially and then completing re-training.” – The training or retraining of the neural network by freezing weight groups indicates training the second weight group, e.g., the non-frozen weights, of a transfer model. Since this re-training cycle is performed incrementally, the weight groups which are frozen would be different weight groups with each training, which Bhaskar already teaches as being done with the simulation data.)
training based on the simulation data is separate from training based on the measurement data, and (Dunne, paragraph 0119: “The baseline for this technique (e.g., updating a neural network based on a neural network difference model) is based on patches. In this case, after re-training with the new data in the centralized site/device 604, a patch may be generated between the original neural network and the retrained neural network.” – While Dunne does not explicitly teach the measurement data, it does teach that the re-training is done using new data which would be separate data from the simulation data taught by Bhaskar above.)
weight parameters of the first weight group are trained on different data from weight parameters of the second weight group based on classification into the first weight group or the second weight group. (Dunne, paragraph 0132: “acquiring new data on which to additionally train or re-train the neural network in the centralized site/device 604… The retraining may be performed incrementally by freezing groups of weights (e.g. within a layer) and performing a re-training cycle, or by freezing all selected weights initially and then completing re-training.” – The new data that is acquired to retrain the neural network by freezing groups of weights is analogous to the weight parameters of the first weight group and weight parameters of the second weight group being trained on different data.)
	Dunne is considered analogous to the claimed invention as it is in the same field of endeavor, machine learning. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date to have modified Bhaskar, which already teaches generating a simulation model based on simulation data and measurement data but does not explicitly teach grouping the weight parameters and retraining each weight group based on either simulation data or measurement data, to include the teachings of Dunne which does teach retraining weights of the model by freezing weight groups and retraining based on new data in order to generate a patch to update the neural network on the edge device which requires transferring less data. (Dunne, paragraph 0120)

Bhaskar and Dunne do not explicitly teach:
	that the training of the second weight group is based on the measurement data; and
	that the new data used for training, which is separate from the training based on simulation data, is measurement data.

However, Gupta teaches:
	that the training of the second weight group is based on the measurement data and that the new data used for training, which is separate from the training based on simulation data, is measurement data (Gupta, page 11, column 2, last paragraph: “In this case, after training the base model, we combine the training data sets such that the neural network is trained on both simulation data sets simultaneously” and page 8, column 2, paragraph 2: “Because we have very little measured data, we generated an initial model trained on simulation data and then modify it to be consistent with measured data afterward. Here, we develop and demonstrate a transfer learning procedure to accomplish this.” – The neural network being trained on the simulation data sets after being initially trained is analogous to the retraining and is explicitly done using simulation data, thus the retraining of the first weight group, which is taught by Dunne above, is done using simulation data. Likewise, the model is then transferred and modified (trained) using measured data, thus the training of the second weight group, which is taught by Dunne above, is done using measurement data.) 
	Gupta is considered analogous to the claimed invention as it is in the same field of endeavor, machine learning. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date to have modified Bhaskar and Dunne, which already teaches retraining the first weight group of the pre-learning model and training the second weight group of the transfer learning model but does not explicitly teach that the second weight group is trained using measurement data and that it is separate from training based on the simulation data, to include the teachings of Gupta which does teach that the second weight group is trained using measurement data and that it is separate from training based on the simulation data in order to “compensate for the difference between the injector simulations and measurements by using transfer learning [24, 30], resulting in a surrogate model that is more representative of the real machine and can interpolate between VCC images more accurately than a model trained only on measured data.” (Gupta, page 2, column 2, paragraph 1)

Regarding claim 2, Bhaskar, Dunne, and Gupta teach the method of claim 1, as cited above.
Bhaskar does not explicitly teach:
	the classifying of the weight parameters comprises extracting the first weight group from the weight parameters based on sizes of the weight parameters.

However, Dunne further teaches:
	the classifying of the weight parameters comprises extracting the first weight group from the weight parameters based on sizes of the weight parameters. (Dunne, paragraph 0131: “The minimum size technique may include ranking, selecting or determining the weights to be frozen based on an aggregate measure of weight size.” – ranking the weights based on the weight size is analogous to extracting the weight parameters based on sizes of the weight parameters which is then used to determine the weights to freeze, e.g., the first weight group.)
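For illustration only, the size-based grouping discussed for claim 2 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not code from Dunne or from the application: the function name split_by_size and the fraction parameter are hypothetical, and placing the largest-magnitude weights in the first group is an assumed convention.

```python
def split_by_size(weights, fraction=0.5):
    """Partition weight indices into (first_group, second_group) by magnitude.

    `fraction` is a hypothetical knob: the share of weights, largest first,
    that is placed in the first group.
    """
    # Rank indices from largest to smallest absolute value.
    ranked = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    k = round(len(weights) * fraction)
    first_group = sorted(ranked[:k])
    second_group = sorted(ranked[k:])
    return first_group, second_group


# Made-up example weights, not data from any cited reference.
weights = [0.02, -1.30, 0.75, -0.01, 0.40, 0.03]
first, second = split_by_size(weights, fraction=0.5)
print(first, second)
```

With these made-up values, the three largest-magnitude weights (indices 1, 2, and 4) fall in the first group and the remaining indices in the second.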

Regarding claim 3, Bhaskar, Dunne, and Gupta teach the method of claim 1, as cited above.
Bhaskar does not explicitly teach:
the classifying of the weight parameters comprises sorting the weight parameters in ascending order thereof based on sizes of the weight parameters, generating a reference weight value based on a degree of variation of each of sizes of the sorted weight parameters, and classifying weight parameters, which are greater than or equal to the reference weight value, as the first weight group. 

However, Dunne further teaches:
the classifying of the weight parameters comprises sorting the weight parameters in ascending order thereof based on sizes of the weight parameters, generating a reference weight value based on a degree of variation of each of sizes of the sorted weight parameters, and classifying weight parameters, which are greater than or equal to the reference weight value, as the first weight group. (Dunne, paragraph 0133: “In some embodiments the centralized site/device 604 may be configured to perform a weights freezing method based on a minimum delta technique, which may include the centralized site/device 604 training the original neural network, sending the original neural network to the edge device 602 and deploying the edge device 602, acquiring new data on which to additionally train or re-train the neural network in the centralized site/device 604, performing additional training or re-training without any weights freezing (i.e., without restricting or freezing any of the weights), analyzing the retrained neural network weight-by-weight to rank the weights in ascending order of the magnitude of their change before and after re-training, according to some change metric, and optionally select the top x % of the weights from the ordered list (with x possibly based on a user specification for the desired maximum patch size or the desired accuracy reduction).” – The weights being ranked in ascending order is analogous to sorting the weight parameters based on sizes, the change metric is analogous to the reference weight value based on a degree of variation which is further used to determine which weights are being frozen and, therefore, classifying the weights into the first weight group.)
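For illustration only, the three steps recited in claim 3 (sort ascending, derive a reference weight value from the variation of the sorted sizes, classify weights at or above the reference) can be sketched in Python. This is a sketch under stated assumptions rather than the applicant's or Dunne's implementation: the function name classify_by_reference is hypothetical, and reading "degree of variation" as the largest jump between consecutive sorted sizes is an assumption made only for this example.

```python
def classify_by_reference(weights):
    """Return (reference, first_group, second_group) for a list of weights.

    Assumes at least two weights. The reference is the size immediately
    after the largest jump in the ascending list of sizes (an assumption).
    """
    # Step 1: sort weight magnitudes in ascending order.
    sizes = sorted(abs(w) for w in weights)
    # Step 2: measure the variation between consecutive sorted sizes and
    # take the size just after the largest jump as the reference value.
    gaps = [b - a for a, b in zip(sizes, sizes[1:])]
    jump = max(range(len(gaps)), key=gaps.__getitem__)
    reference = sizes[jump + 1]
    # Step 3: weights at or above the reference form the first group.
    first = [i for i, w in enumerate(weights) if abs(w) >= reference]
    second = [i for i, w in enumerate(weights) if abs(w) < reference]
    return reference, first, second


# Made-up example weights, not data from any cited reference.
weights = [0.02, -0.90, 0.75, -0.01, 0.80, 0.03]
reference, first_group, second_group = classify_by_reference(weights)
print(reference, first_group, second_group)
```

Note that this sketch sorts by weight size, whereas the quoted minimum delta technique of Dunne ranks weights by the magnitude of their change before and after re-training.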

Regarding claim 4, Bhaskar, Dunne, and Gupta teach the method of claim 1, as cited above.
Bhaskar does not explicitly teach:
the retraining of the first weight group of the pre-learning model comprises initializing values of the weight parameters of the second weight group before retraining the first weight group.

However, Dunne further teaches:
the retraining of the first weight group of the pre-learning model comprises initializing values of the weight parameters of the second weight group before retraining the first weight group. (Dunne, paragraph 0131: “Thus, in some embodiments, the centralized site/device 604 may be configured to freeze one or more weights in one or more layers of the neural network during training in order to prevent the weights from changing during the re-training process.” – Freezing the weights before re-training indicates that the weights have already been initialized. Thus, when the non-frozen weights are being retrained this is the first weight group and the frozen weight groups are the second weight group.)


Regarding claim 5, Bhaskar, Dunne, and Gupta teach the method of claim 1, as cited above.
Bhaskar does not explicitly teach:
the training of the second weight group of the transfer learning model comprises maintaining values of the weight parameters of the first weight group learned in the pre-learning model and retraining the weight parameters of the second weight group.

However, Dunne further teaches:
the training of the second weight group of the transfer learning model comprises maintaining values of the weight parameters of the first weight group learned in the pre-learning model and retraining the weight parameters of the second weight group. (Dunne, paragraph 0132: “The centralized site/device 604 may train or retrain the neural network and freeze the determined/selected weights. The retraining may be performed incrementally by freezing groups of weights (e.g. within a layer) and performing a re-training cycle, or by freezing all selected weights initially and then completing re-training” – The training being performed incrementally by freezing groups of weights indicates that when the second weight group is being trained on the measurement data, the first weight group is frozen and therefore being maintained.)
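For illustration only, maintaining the first weight group while training the second can be sketched in Python as a gradient update restricted to a set of trainable indices. This is a minimal sketch under stated assumptions: the linear model, the function name train_group, the learning rate, and the sample values are all hypothetical and do not come from Bhaskar, Dunne, Gupta, or the application.

```python
def train_group(weights, trainable, data, lr=0.1, epochs=200):
    """Gradient descent on a linear model y = w . x.

    Only indices listed in `trainable` are updated; every other weight
    is frozen at its incoming value.
    """
    w = list(weights)
    for _ in range(epochs):
        for x, y in data:
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            for i in trainable:  # frozen indices are never touched
                w[i] -= lr * err * x[i]
    return w


# Stand-in for weights learned on simulation data (hypothetical values).
pretrained = [2.0, 0.0]
first_group, second_group = [0], [1]
# Hypothetical "measurement" samples generated by y = 2*x0 + 0.5*x1.
measured = [((1.0, 1.0), 2.5), ((0.0, 1.0), 0.5), ((1.0, 0.0), 2.0), ((1.0, 2.0), 3.0)]
transfer = train_group(pretrained, second_group, measured)
print(transfer)
```

In this toy case the first-group weight keeps its pretrained value exactly, while the second-group weight moves toward the value implied by the measurement samples.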

Regarding claim 7, Bhaskar, Dunne, and Gupta teach the method of claim 1, as cited above.
Bhaskar further teaches:
the target is a semiconductor process, (Bhaskar, page 21, lines 17-21: “As described above, therefore, the one or more computer subsystems described herein may be included in a system with one or more other subsystems with actual wafer handling and/or processing capability (e.g., imaging subsystems, inspection subsystems, defect review subsystems, metrology subsystems, semiconductor fabrication process subsystems).”) and the simulation data comprises at least one of semiconductor process parameters or characteristic data of a semiconductor device manufactured based on the semiconductor process parameters, and (Bhaskar, page 26, lines 18-23: “Transfer learning training input generation can be performed in a number of ways described further herein including: a) empirical simulation of real defect events on wafers and masks using process design of experiments (DOEs); b) introduction of virtual defect events in design/simulation space by using synthetic approaches; and c) hybrid approaches using empirical plus synthetic methods in concert.” – The training input data including simulation data of defect events on wafers and process design of experiments is analogous to the simulation data comprising semiconductor process parameters (the process design) and characteristic data of the semiconductor device (the defect events on wafers).)
the characteristic data comprises at least one of a doping profile or a voltage-current characteristic of the semiconductor device. (Bhaskar, page 26, lines 1-6: “For example, the one or more simulations performed by the machine learning based model may generate simulated measurement(s) (e.g., image(s), output, data, etc.) representing output generated by one of the systems described herein for specimen(s). In this manner, the simulated measurement(s) may represent measurements, images, output, data, etc. that may be generated for the specimens by a metrology system described herein.” – The measurement data would be indicative of characteristic data, while the measurements generated by a metrology system would include measurements such as the doping profile and voltage-current characteristic.)

Regarding claim 8, Bhaskar, Dunne, and Gupta teach the method of claim 7, as cited above.
Bhaskar further teaches:
the pre-learning model or the transfer learning model is configured to infer at least one of the doping profile (Bhaskar, page 20, lines 20-22: “For example, the embodiments described herein may be configured for training a machine learning based model that performs one or more simulations for the purposes of mask inspection, wafer inspection, and wafer metrology,” – wafer metrology includes doping profiles, therefore this is analogous to the model being configured to infer the doping profile.) or the voltage-current characteristic of the semiconductor device.  (Bhaskar, page 21, lines 21-25: “In this manner, the embodiments described herein may be configured as predictive systems including data in situ inside any semiconductor platform such as a metrology tool, an inspection tool, an etch chamber, etc... that has detectors and a computational platform to learn a model of its world (e.g., defects on a wafer in the case of a semiconductor inspector).” – metrology tools inside a semiconductor platform would include the voltage-current characteristics of the semiconductor device.)

Regarding claim 9, Bhaskar, Dunne, and Gupta teach the method of claim 1, as cited above.
Bhaskar further teaches:
	the transfer learning model comprises a first transfer learning model configured to infer a voltage-current characteristic of a semiconductor device and a second transfer learning model configured to infer a doping profile of the semiconductor device, by using semiconductor process parameters as inputs. (Bhaskar, page 23, lines 13-16: “The one or more nominal instances of the specimens and the training performed using the one or more nominal instances may vary depending on the simulations that will be performed by the machine learning based model and the machine learning based model itself.” And page 21, lines 21-25: “In this manner, the embodiments described herein may be configured as predictive systems including data in situ inside any semiconductor platform such as a metrology tool, an inspection tool, an etch chamber, etc... that has detectors and a computational platform to learn a model of its world (e.g., defects on a wafer in the case of a semiconductor inspector).” And page 20, lines 20-22: “For example, the embodiments described herein may be configured for training a machine learning based model that performs one or more simulations for the purposes of mask inspection, wafer inspection, and wafer metrology,” – This indicates that the method is capable of creating different transfer learning models based on the simulations and instances that need to be performed. The metrology tools inside the semiconductor platform would include the voltage-current characteristics, so this is indicative of a transfer learning model to infer voltage-current characteristics, while the model trained to perform wafer metrology is indicative of a transfer learning model to infer a doping profile, as wafer metrology includes the doping profile.)

Regarding claim 18, claim 18 has all the same limitations of claim 1 which are taught by Bhaskar, Dunne, and Gupta – see claim 1 above.
Bhaskar further teaches:
	A neural network device, comprising: (Bhaskar, page 14, lines 9-10: “The computer subsystems shown in Fig. 1 (as well as other computer subsystems described herein) may also be referred to herein as computer system(s).” – This is analogous to the neural network device.)
a memory configured to store a neural network program; and a processor configured to execute the neural network program stored in the memory, wherein the processor is configured to execute the neural network program to (Bhaskar, page 14, lines 13-15: “In general, the term “computer system” may be broadly defined to encompass any device having one or more processors, which executes instructions from a memory medium,”)

Regarding claim 19, Bhaskar, Dunne, and Gupta teach the neural network device of claim 18, as cited above.
Claim 19 additionally has the same limitations of claim 2 which are taught by Bhaskar, Dunne, and Gupta – see claim 2 above.

Regarding claim 20, Bhaskar, Dunne, and Gupta teach the neural network device of claim 18, as cited above.
Claim 20 additionally has the same limitations of claim 3 which are taught by Bhaskar, Dunne, and Gupta – see claim 3 above.

Regarding claim 21, Bhaskar, Dunne, and Gupta teach the neural network device of claim 18, as cited above.
Claim 21 additionally has the same limitations of claim 4 which are taught by Bhaskar, Dunne, and Gupta – see claim 4 above.

Regarding claim 22, Bhaskar, Dunne, and Gupta teach the neural network device of claim 18, as cited above.
Claim 22 additionally has the same limitations of claim 5 which are taught by Bhaskar, Dunne, and Gupta – see claim 5 above.

Regarding claim 24, Bhaskar, Dunne, and Gupta teach the neural network device of claim 18, as cited above.
Claim 24 additionally has the same limitations of claim 7 which are taught by Bhaskar, Dunne, and Gupta – see claim 7 above.

Regarding claim 25, Bhaskar, Dunne, and Gupta teach the neural network device of claim 24, as cited above.
Claim 25 additionally has the same limitations of claim 8 which are taught by Bhaskar, Dunne, and Gupta – see claim 8 above.

	Claims 6 and 23 are rejected under 35 U.S.C. 103 as being unpatentable over Bhaskar in view of Dunne in view of Gupta in view of Salimans et al. (Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks), hereinafter Salimans.

Regarding claim 6, Bhaskar, Dunne, and Gupta teach the method of claim 1, as cited above.
Bhaskar, Dunne, and Gupta do not explicitly teach:
the training of the transfer learning model comprises normalizing values of weight parameters of the trained second weight group.  

However, Salimans teaches:
the training of the transfer learning model comprises normalizing values of weight parameters of the trained second weight group. (Salimans, section 2, paragraph 2: “This reparameterization has the effect of fixing the Euclidean norm of the weight vector w: we now have ||w|| = g, independent of the parameters v. We therefore call this reparameterization weight normalization.” – The weight vector corresponds with the second weight group which is then normalized by the taught method.)
	Salimans is considered analogous to the claimed invention as it is in the same field of endeavor, machine learning. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date to have modified Bhaskar, Dunne, and Gupta, which already teach training the transfer learning model but do not explicitly teach normalizing the values of the weight parameters, to include the teachings of Salimans, which does teach normalizing the values of the weight parameters, as “doing so improves the conditioning of the gradient and leads to improved convergence of the optimization procedure.” (Salimans, section 2, paragraph 2)
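For illustration only, the reparameterization quoted from Salimans, in which the weight vector is written as w = g * v / ||v|| so that ||w|| = g regardless of v, can be checked numerically in Python. This is a minimal sketch: the function name weight_norm and the sample values are hypothetical, and applying it to a "second weight group" is an assumption drawn from the rejection's mapping, not from Salimans.

```python
import math


def weight_norm(v, g):
    """Return w = g * v / ||v||, where ||v|| is the Euclidean norm of v."""
    norm = math.sqrt(sum(x * x for x in v))
    return [g * x / norm for x in v]


# Hypothetical direction parameters v and scale g.
v = [3.0, 4.0]
g = 2.0
w = weight_norm(v, g)
# The Euclidean norm of w equals g up to floating-point rounding.
w_norm = math.sqrt(sum(x * x for x in w))
print(w, w_norm)
```

Rescaling v (for example, doubling every entry) leaves w unchanged, which is the sense in which the norm of w is independent of the parameters v.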

Regarding claim 23, Bhaskar, Dunne, and Gupta teach the neural network device of claim 18, as cited above.
Claim 23 additionally has the same limitations of claim 6 which are taught by Bhaskar, Dunne, Gupta, and Salimans – see claim 6 above.

	Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Bhaskar in view of Dunne in view of Gupta in view of Aghdasi et al. (US20210089921), hereinafter Aghdasi.

Regarding claim 10, Bhaskar, Dunne, and Gupta teach the method of claim 9, as cited above.
Bhaskar, Dunne, and Gupta do not explicitly teach:
	the training of the transfer learning model comprises inferring the voltage-current characteristic based on the first transfer learning model and generating the second transfer learning model based on a difference between the pre-learning model and the first transfer learning model.

However, Aghdasi teaches:
the training of the transfer learning model comprises inferring the voltage-current characteristic based on the first transfer learning model and generating the second transfer learning model based on a difference between the pre-learning model and the first transfer learning model.  (Aghdasi, paragraph 0035: “If performance of that pre-trained model already satisfies at least a minimum performance criterion with respect to that additional training data, the additional training data may not be needed and that pre-trained model can be utilized. In at least some embodiments, an attempt can still be made to prune this model then retrain to attempt to ensure no significant loss of accuracy (e.g., less than 1% loss in accuracy) due to the pruning” – Claim 9 already teaches a transfer learning model inferring the voltage-current characteristics and a second transfer learning model. The retraining following pruning is analogous to training the second transfer learning model based on a difference between the pre-learning model and first transfer learning model as the pre-trained model is analogous to the pre-learning model and the pruned model before retraining is analogous to the first transfer learning model. The accuracy is indicative of the retraining being based on the difference between the two.)
	Aghdasi is considered analogous to the claimed invention as it is in the same field of endeavor, machine learning. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date to have modified Bhaskar, Dunne, and Gupta, which already teaches training a transfer learning model based on a first transfer learning model but does not explicitly teach generating the second transfer learning model based on a difference between the pre-learning model and the first transfer learning model, to include the teachings of Aghdasi which does teach generating the second transfer learning model based on a difference between the pre-learning model and the first transfer learning model in order to "obtain a machine learning model that is fully trained for an intended inferencing task without having to train the model from scratch." (Aghdasi, abstract)

	Claims 11 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Bhaskar in view of Aghdasi in view of Dunne in view of Gupta.

Regarding claim 11, Bhaskar teaches:
A method of generating a simulation model based on simulation data and measurement data of a target, the method comprising: (Bhaskar, page 4, lines 12-14: “The one or more components include a machine learning based model configured for performing one or more simulations for specimens.” And page 26, lines 4-7: “In this manner, the simulated measurement(s) may represent measurements, images, output, data, etc. that may be generated for the specimens by a metrology system described herein.”)

Bhaskar does not explicitly teach:
generating a common model, learning a common feature of a first characteristic and a second characteristic based on simulation data, and generating a first pre-learning model inferring the first characteristic and a second pre-learning model inferring the second characteristic, based on the common model;
classifying weight parameters, included in the first pre-learning model, as a first weight group and a second weight group based on the first characteristic and a degree of association;
initializing weight parameters included in the second weight group and retraining the first pre-learning model and the second pre-learning model based on the first weight group and the simulation data;
retraining the second pre-learning model based on the second weight group and the simulation data;
training a first transfer learning model corresponding to the first pre-learning model based on the first weight group and measurement data of the first characteristic; and
training a second transfer learning model corresponding to the second pre-learning model based on the first transfer learning model,
wherein, in the retraining of the first pre-learning model and the second pre-learning model and the training of the first transfer learning model;
training based on the simulation data is separate from training based on the measurement data of the first characteristic, and
weight parameters included in the first weight group are trained on different data from the weight parameters included in the second weight group based on classification into the first weight group or the second weight group. 

However, Aghdasi teaches:
	generating a common model, learning a common feature of a first characteristic and a second characteristic based on simulation data, and generating a first pre-learning model inferring the first characteristic and a second pre-learning model inferring the second characteristic, based on the common model; (Aghdasi, paragraph 0020: “A user or other entity can obtain one or more of these pre-trained models, and further train them to be able to make inferences for one or more additional classes or types of input data.” – The pre-trained model would be analogous to a common model, then once it is further trained it can be used as an inference for different classes which is analogous to creating different pre-learning models for inferring different characteristics.)
training a first transfer learning model corresponding to the first pre-learning model …; and (Aghdasi, paragraph 0030: “As illustrated in the system 400 of FIG. 4A, a transfer learning toolkit 406 can be utilized that contains one or more training modules 408, 412, as well as a pruning module 410, that can produce a fully trained model 414 from a pre-trained model and additional training data.” – Retraining the pre-trained model using transfer learning is analogous to training a first transfer learning model.)
training a second transfer learning model corresponding to the second pre-learning model based on the first transfer learning model, (Aghdasi, figure 6 and paragraph 0033: “If it is determined 610, through an evaluation, that the training was not successful, such as where the accuracy or confidence of the trained model does not at least meet a minimum threshold value, then further training may occur with additional training data.” – This iterative process shows that the pre-trained model could be further pre-trained in another step (fig. 6, number 610) which would correspond with an updated, or second, pre-learning model. This would further lead to a second transfer learning model.)
	Aghdasi is considered analogous to the claimed invention as it is in the same field of endeavor, machine learning. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date to have modified Bhaskar, which already teaches generating a simulation model but does not explicitly teach generating a common model and pre-learning models inferring characteristics, to include the teachings of Aghdasi which does teach generating a common model and pre-learning models inferring characteristics in order to “obtain a machine learning model that is fully trained for an intended inferencing task without having to train the model from scratch.” (Aghdasi, abstract)

Bhaskar and Aghdasi do not explicitly teach:
classifying weight parameters, included in the first pre-learning model, as a first weight group and a second weight group based on the first characteristic and a degree of association;
initializing weight parameters included in the second weight group and retraining the first pre-learning model and the second pre-learning model based on the first weight group and the simulation data;
retraining the second pre-learning model based on the second weight group and the simulation data;
that training of the first transfer model is based on the first weight group and measurement data of the first characteristic
wherein, in the retraining of the first pre-learning model and the second pre-learning model and the training of the first transfer learning model;
training based on the simulation data is separate from training based on the measurement data of the first characteristic, and
weight parameters included in the first weight group are trained on different data from the weight parameters included in the second weight group based on classification into the first weight group or the second weight group. 

However, Dunne teaches:
classifying weight parameters, included in the first pre-learning model, as a first weight group and a second weight group based on the first characteristic and a degree of association; (Dunne, paragraph 0131: “Thus, in some embodiments, the centralized site/device 604 may be configured to freeze one or more weights in one or more layers of the neural network during training in order to prevent the weights from changing during the re-training process. The centralized site/device 604 may select or determine weights to freeze by implementing or using a minimum size technique or a minimum delta technique.” – Bhaskar already teaches that the pre-learning model is based on the simulation data, the freezing of one or more weights would indicate that there are two weight groups, one weight group of frozen layers and one weight group of non-frozen layers. A minimum delta technique to decide which weights to freeze is analogous to the groups being based on a degree of significance.)
initializing weight parameters included in the second weight group (Dunne, paragraph 0131: “Thus, in some embodiments, the centralized site/device 604 may be configured to freeze one or more weights in one or more layers of the neural network during training in order to prevent the weights from changing during the re-training process.” – Freezing the weights before re-training indicates that the weights have already been initialized. Thus, when the non-frozen weights are being retrained this is the first weight group and the frozen weight groups are the second weight group.) and retraining the first pre-learning model and the second pre-learning model based on the first weight group and the simulation data; (Dunne, paragraph 0132: “The centralized site/device 604 may train or retrain the neural network and freeze the determined/selected weights. The retraining may be performed incrementally by freezing groups of weights (e.g. within a layer) and performing a re-training cycle, or by freezing all selected weights initially and then completing re-training.”)
retraining the second pre-learning model based on the second weight group and the simulation data; (Dunne, paragraph 0132: “The centralized site/device 604 may train or retrain the neural network and freeze the determined/selected weights. The retraining may be performed incrementally by freezing groups of weights (e.g. within a layer) and performing a re-training cycle, or by freezing all selected weights initially and then completing re-training.” – The training or retraining of the neural network by freezing weight groups indicates training the second weight group, e.g., the non-frozen weights, of a transfer learning model. Since this re-training cycle is performed incrementally, the weights which are frozen would be different weight groups with each training, which Bhaskar already teaches as being done with the simulation data.)
wherein, in the retraining of the first pre-learning model and the second pre-learning model and the training of the first transfer learning model: training based on the simulation data is separate from training based on the measurement data of the first characteristic, and (Dunne, paragraph 0119: “The baseline for this technique (e.g., updating a neural network based on a neural network difference model) is based on patches. In this case, after re-training with the new data in the centralized site/device 604, a patch may be generated between the original neural network and the retrained neural network.” – While Dunne does not explicitly teach the measurement data, it does teach that the re-training is done using new data which would be separate data from the simulation data taught by Bhaskar above.)
weight parameters included in the first weight group are trained on different data from the weight parameters included in the second weight group based on classification into the first weight group or the second weight group. (Dunne, paragraph 0132: “acquiring new data on which to additionally train or re-train the neural network in the centralized site/device 604… The retraining may be performed incrementally by freezing groups of weights (e.g. within a layer) and performing a re-training cycle, or by freezing all selected weights initially and then completing re-training.” The new data that is acquired to retrain the neural network by freezing groups of weights is analogous to the weight parameters of the first weight group and weight parameters of the second weight group being trained on different data.)
	Dunne is considered analogous to the claimed invention as it is in the same field of endeavor, machine learning. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date to have modified Bhaskar and Aghdasi, which already teaches a simulation model based on simulation data and measurement data but does not explicitly teach classifying weight parameters into weight groups based on a degree of significance and retraining each weight group based on either simulation data or measurement data, to include the teachings of Dunne which does teach classifying weight parameters into weight groups based on a degree of significance and retraining each weight group based on either simulation data or measurement data in order to generate a patch to update the neural network on the edge device which requires transferring less data. (Dunne, paragraph 0120)
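For illustration only, the weight-freezing retraining that the cited Dunne paragraphs 0131-0132 describe can be sketched as follows. This is a minimal sketch and not Dunne's implementation: the linear model, the `retrain` helper, the learning rate, and the sample data are all hypothetical, chosen only to show one group of weights being held fixed while the other group is updated during retraining.

```python
def retrain(weights, frozen, xs, ys, lr=0.1, steps=200):
    """Gradient descent on a linear model y = w . x.

    Indices listed in `frozen` form the frozen weight group and are never
    updated; all other indices form the trainable group.
    (Hypothetical helper, not taken from any cited reference.)
    """
    w = list(weights)
    n = len(xs)
    for _ in range(steps):
        grads = [0.0] * len(w)
        for x, y in zip(xs, ys):
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            for j, xj in enumerate(x):
                grads[j] += 2 * err * xj / n
        for j in range(len(w)):
            if j not in frozen:  # skip the frozen weight group
                w[j] -= lr * grads[j]
    return w


# Assumed toy data generated by y = 1*x0 + 3*x1.
xs = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
ys = [1.0, 3.0, 4.0]

# Weight 0 is frozen at its earlier value; weight 1 is retrained.
w = retrain([1.0, 0.0], frozen={0}, xs=xs, ys=ys)
```

In this toy case the frozen weight keeps its original value exactly, while the non-frozen weight moves to fit the retraining data.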

Bhaskar, Aghdasi, and Dunne do not explicitly teach:
that the new data used for training, which is separate from the training based on simulation data, is measurement data

However, Gupta teaches:
that the new data used for training, which is separate from the training based on simulation data, is measurement data (Gupta, page 11, column 2, last paragraph: “In this case, after training the base model, we combine the training data sets such that the neural network is trained on both simulation data sets simultaneously” and page 8, column 2, paragraph 2: “Because we have very little measured data, we generated an initial model trained on simulation data and then modify it to be consistent with measured data afterward. Here, we develop and demonstrate a transfer learning procedure to accomplish this.” – The neural network being trained on the simulation data sets after being initially trained is analogous to retraining and is explicitly done using simulation data, thus the retraining of the first weight group, which is taught by Dunne above, is done using simulation data. Likewise, the model is then transferred and modified (trained) using measured data, thus the training of the second weight group, which is taught by Dunne above, is done using measurement data.) 
	Gupta is considered analogous to the claimed invention as it is in the same field of endeavor, machine learning. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date to have modified Bhaskar, Aghdasi, and Dunne, which already teaches retraining the pre-learning models and training the transfer learning model but does not explicitly teach that the training based on the simulation data is separate from training based on measurement data, to include the teachings of Gupta which does teach that the training based on the simulation data is separate from training based on measurement data in order to “compensate for the difference between the injector simulations and measurements by using transfer learning [24, 30], resulting in a surrogate model that is more representative of the real machine and can interpolate between VCC images more accurately than a model trained only on measured data.” (Gupta, page 2, column 2, paragraph 1)
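For illustration only, the two-stage procedure quoted from Gupta (an initial model trained on simulation data, then modified to be consistent with measured data) can be sketched as follows. This is a minimal sketch under stated assumptions and not Gupta's procedure: the scalar model y = a*x + b, the `fit` helper, the hyperparameters, and both data sets are hypothetical.

```python
def fit(params, data, lr=0.1, steps=2000):
    """Least-squares gradient descent for y = a*x + b, starting from `params`.

    (Hypothetical helper; model form and hyperparameters are assumptions.)
    """
    a, b = params
    n = len(data)
    for _ in range(steps):
        grad_a = sum(2 * (a * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (a * x + b - y) for x, y in data) / n
        a, b = a - lr * grad_a, b - lr * grad_b
    return a, b


# Stage 1: plentiful simulation data (assumed to follow y = 2x).
simulation = [(x, 2.0 * x) for x in (0.0, 1.0, 2.0, 3.0)]
base = fit((0.0, 0.0), simulation)

# Stage 2: a few measured points (assumed offset of 0.5 from simulation).
# Training continues from the simulation-trained parameters, and the
# measured data is used separately from the simulation data.
measured = [(1.0, 2.5), (2.0, 4.5)]
tuned = fit(base, measured)
```

The point of the sketch is only the ordering: the simulation-trained parameters serve as the starting point, and the measured data is applied in a separate, later training pass.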


Regarding claim 17, Bhaskar, Aghdasi, Dunne, and Gupta teach the method of claim 11, as cited above.
Bhaskar does not explicitly teach:
the training of the second transfer learning model comprises generating the second transfer learning model based on the first pre-learning model, variation data of a weight parameter of the first transfer learning model, and the second weight group of the second pre-learning model.

However, Aghdasi further teaches:
	the training of the second transfer learning model comprises generating the second transfer learning model based on the first pre-learning model, (Aghdasi, figure 6 and paragraph 0033: “If it is determined 610, through an evaluation, that the training was not successful, such as where the accuracy or confidence of the trained model does not at least meet a minimum threshold value, then further training may occur with additional training data.” – As taught above for training a second transfer learning model, the first transfer learning model that is used to create the second pre-learning model, the iterative process means that the first transfer learning model is based on the first pre-learning model, therefore, the training of the second transfer learning model is based on the first pre-learning model.)

Bhaskar and Aghdasi do not explicitly teach:
	the training of the second transfer learning model comprises generating the second transfer learning model based on … variation data of a weight parameter of the first transfer learning model, and the second weight group of the second pre-learning model.


However, Dunne further teaches:
	the training of the second transfer learning model comprises generating the second transfer learning model based on … variation data of a weight parameter of the first transfer learning model, (Dunne, paragraph 0133: “a weights freezing method based on a minimum delta technique, which may include the centralized site/device 604 training the original neural network, sending the original neural network to the edge device 602 and deploying the edge device 602, acquiring new data on which to additionally train or re-train the neural network in the centralized site/device 604, performing additional training or re-training without any weights freezing (i.e., without restricting or freezing any of the weights), analyzing the retrained neural network weight-by-weight to rank the weights in ascending order of the magnitude of their change before and after re-training, according to some change metric, and optionally select the top x % of the weights from the ordered list (with x possibly based on a user specification for the desired maximum patch size or the desired accuracy reduction).” – The minimum delta technique is a form of variation data. Since it is looking at the minimum delta of the initial weights this is analogous to the variation data of a weight parameter of the first transfer learning model.) and the second weight group of the second pre-learning model. (Dunne, paragraph 0132: “The centralized site/device 604 may train or retrain the neural network and freeze the determined/selected weights. The retraining may be performed incrementally by freezing groups of weights (e.g. within a layer) and performing a re-training cycle, or by freezing all selected weights initially and then completing re-training.”)
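For illustration only, the ranking step in the quoted Dunne paragraph 0133 (ordering weights by the magnitude of their change before and after re-training, then selecting the top x %) can be sketched as follows. This is a minimal sketch and not Dunne's implementation: the function names, the use of absolute difference as the change metric, and the sample values are hypothetical. The quoted passage does not say what is done with the selected weights, so the sketch stops at the selection, and it assumes "top" means the first entries of the ascending list.

```python
def rank_by_delta(original, retrained):
    """Return weight indices in ascending order of |retrained - original|.

    Absolute difference is an assumed change metric; the quoted passage
    only says "according to some change metric".
    """
    deltas = [abs(r - o) for o, r in zip(original, retrained)]
    return sorted(range(len(deltas)), key=lambda i: deltas[i])


def select_top_percent(ranked, x_percent):
    """Take the first x % of entries from the ranked list (rounded down)."""
    count = int(len(ranked) * x_percent / 100)
    return ranked[:count]


# Hypothetical weights before and after re-training without any freezing.
original = [0.50, -1.20, 0.30, 2.00]
retrained = [0.52, -0.70, 0.30, 1.90]

ranked = rank_by_delta(original, retrained)   # smallest change first
selected = select_top_percent(ranked, 50)     # first half of the ranking
```

With these sample values the per-weight changes are roughly 0.02, 0.5, 0.0, and 0.1, so the weight at index 2 ranks first and the weight at index 1 ranks last.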

Response to Arguments
Applicant’s arguments, on page 9 of Applicant’s Remarks, with respect to the rejection(s) of the independent claim(s) under 35 USC § 103 regarding the combination of Bhaskar in view of Xu in view of Mishra in view of Gupta have been fully considered and are persuasive as the combination does not teach the newly amended claim limitation. Therefore, the rejection has been withdrawn.  However, upon further consideration, a new ground(s) of rejection is made in view of Bhaskar in view of Dunne in view of Gupta. See section Claim Rejections – 35 USC §103 above.

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Bird et al. (From Simulation to Reality: CNN Transfer Learning for Scene Classification)
Gordon et al. (SplitNet: Sim2Sim and Task2Task Transfer for Embodied Visual Navigation)
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JACQUELINE MEYER whose telephone number is (703)756-5676. The examiner can normally be reached M-F 8:00 am - 4:30 pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Tamara Kyle can be reached at 571-272-4241. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/J.C.M./Examiner, Art Unit 2144
/TAMARA T KYLE/Supervisory Patent Examiner, Art Unit 2144
Prosecution Timeline

Dec 19, 2025: Interview Requested
Jan 07, 2026: Applicant Interview (Telephonic)
Jan 08, 2026: Examiner Interview Summary
Jan 26, 2026: Request for Continued Examination
Jan 31, 2026: Response after Non-Final Action
Apr 07, 2026: Non-Final Rejection mailed, §103
May 06, 2026: Examiner Interview Summary
May 06, 2026: Applicant Interview (Telephonic)
Precedent Cases

Applications granted by this same examiner with similar technology

17/368,168, Patent 12639619: ARTIFICIAL INTELLIGENCE-BASED MULTI-GOAL-AWARE DEVICE SAMPLING (4y 10m to grant; granted May 26, 2026)
17/539,971, Patent 12608611: MULTISCALE DIMENSIONAL REDUCTION OF DATA (4y 4m to grant; granted Apr 21, 2026)
17/381,240, Patent 12585981: MANAGING AN INSTALLED BASE OF ARTIFICIAL INTELLIGENCE MODULES (4y 8m to grant; granted Mar 24, 2026)
17/570,468, Patent 12468941: SYSTEMS AND METHODS FOR DYNAMICS-AWARE COMPARISON OF REWARD FUNCTIONS (3y 10m to grant; granted Nov 11, 2025)
Study what changed to get past this examiner. Based on 4 most recent grants.
Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 67%
With Interview (+70.0%): 99%
Median Time to Grant: 3y 10m (~0m remaining)
PTA Risk: High
Based on 15 resolved cases by this examiner. Grant probability derived from career allowance rate.