Prosecution Insights
Last updated: April 19, 2026
Application No. 17/334,574

METHOD FOR DECREASING UNCERTAINTY IN MACHINE LEARNING MODEL PREDICTIONS

Status: Non-Final Office Action (§101, §103, §112)
Filed: May 28, 2021
Examiner: OCHOA, JUAN CARLOS
Art Unit: 2186
Tech Center: 2100 — Computer Architecture & Software
Assignee: ASML Netherlands B.V.
OA Round: 5 (Non-Final)

Predictions:
Grant probability: 68% (favorable)
Expected OA rounds: 5-6
Expected time to grant: 4y 2m
Grant probability with interview: 91%

Examiner Intelligence

Career allow rate: 68% (above average; 354 granted / 520 resolved; +13.1% vs TC avg)
Interview lift: strong, +22.8% allowance lift on resolved cases with an interview
Typical timeline: 4y 2m average prosecution; 41 applications currently pending
Career history: 561 total applications across all art units

Statute-Specific Performance

§101: 27.8% (-12.2% vs TC avg)
§103: 35.1% (-4.9% vs TC avg)
§102: 5.1% (-34.9% vs TC avg)
§112: 29.5% (-10.5% vs TC avg)
Black line = Tech Center average estimate • Based on career data from 520 resolved cases
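For readers checking the arithmetic, the headline allow-rate figures above are simple proportions of the raw counts shown on this page. A minimal sketch (the helper name is illustrative, not from any analytics product):

```python
# Illustrative only: how the examiner statistics shown above follow from
# the raw counts on this page. The function name is hypothetical.

def allow_rate(granted: int, resolved: int) -> float:
    """Career allowance rate as a percentage of resolved cases."""
    return 100.0 * granted / resolved

rate = allow_rate(354, 520)    # 354 granted of 520 resolved cases
implied_tc_avg = rate - 13.1   # "+13.1% vs TC avg" implies a TC average near 55%

print(round(rate))             # 68, the "Career Allow Rate" shown above
```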

Office Action

Rejections under §101, §103, and §112
DETAILED ACTION

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. The amendment filed 03/09/2026 has been received and considered. Claims 1-20 are presented for examination.

Continued Examination Under 37 CFR 1.114

A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 03/09/2026 has been entered.

Claim Rejections - 35 USC § 112

The following is a quotation of 35 U.S.C. 112(b):

(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which applicant regards as the invention.

Claim 1 recites the limitation "the photolithography simulation" in lines 11 and 14. There is insufficient antecedent basis for this limitation in the claim; the anteceding limitation was amended out.

Claim 1 recites the limitation "the adjusted parameterized model" in line 18. There is insufficient antecedent basis for this limitation in the claim.
While there is a parameterized model with adjusted one or more parameters anteceding this limitation in the claim, there is no "adjusted parameterized model" anteceding this limitation in the claim. As to claim 15, the same deficiency applies. Dependent claims inherit the defect of the claim from which they depend.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows:

Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Independent claim 1, Step 1: a method (process = 2019 PEG Step 1 = yes).

Independent claim 1, Step 2A, Prong One: the claim recites: "for quantifying uncertainty in parameterized model predictions… determining a variability of the predicted multiple posterior distributions for the given input by sampling from the distribution of distributions; using the determined variability of the predicted multiple posterior distributions to quantify uncertainty of image predictions associated with the photolithography simulation from the parameterized model; and determining one or more photolithography process parameters based on predictions from the adjusted parameterized model."

As to the limitation "determining one or more photolithography process parameters based on predictions from the adjusted parameterized model", the term "determining" is not elaborated but merely repeated in the Specification. Determinations are mental in nature. These limitations, as drafted and under a broadest reasonable interpretation, can be characterized as entailing a user analyzing and deciding/determining (judgments, opinions), which can be performed in the human mind or by a human using pen and paper.
Claim 1 is substantially drawn to mental concepts: observation, evaluation, judgment, opinion. Information and/or data also fall within the realm of abstract ideas because information and data are intangible. See Electric Power Group (Electric Power hereinafter): "Information… is an intangible."

As to quantifying uncertainty, it is mental in nature. See, for example, the Specification (underline emphasis added): "[00106]… determining a variability of the sampled distributions. For example, Fig. 6C illustrates an example expected distribution p(z|x) 600, and a variability 602 of sampled distributions from a distribution of distributions for p(z|x) 600. Variability 602 may be caused by an uncertainty of the machine learning model, for example… using the determined variability in the predicted multiple posterior distributions to quantify the uncertainty in the parameterized model predictions comprises using the determined variability in the first and second sets of predicted multiple posterior distributions (e.g., the distribution of distributions for p(z|x) 600 shown in Fig. 6C, and a similar distribution of distributions for a p(y|z)) to quantify the uncertainty in the machine learning model predictions".

As to sampling from the distribution of distributions, it is mental in nature. See, for example, the claimed invention (underline emphasis added): "12… wherein sampling comprises randomly selecting distributions from the distribution of distributions, wherein the sampling is gaussian or non-gaussian".

As to the limitations "for a photolithography apparatus" and "for adjusting the photolithography apparatus to change focus or pupil shape based on the one or more determined photolithography process parameters", they are no more than intended use.
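Whether or not these steps are mental in nature, the claimed sequence of sampling from a distribution of distributions and measuring its variability is mechanically simple. A minimal numpy sketch of the concept as characterized above (the random weights and the dropout scheme are illustrative assumptions, not the applicant's implementation):

```python
import numpy as np

# Illustrative sketch only, not the applicant's implementation: generate
# multiple posterior predictions for one input by running a model with
# random parameter dropout (a "distribution of distributions"), then
# quantify uncertainty as the variability (spread) across those samples.

rng = np.random.default_rng(0)

def predict_with_dropout(x: np.ndarray, w: np.ndarray, p_drop: float = 0.5) -> np.ndarray:
    """One stochastic forward pass: randomly drop weights, then predict."""
    mask = rng.random(w.shape) >= p_drop       # random dropout mask
    return x @ (w * mask) / (1.0 - p_drop)     # rescaled prediction

x = rng.normal(size=(1, 8))                    # a single given input
w = rng.normal(size=(8, 4))                    # model parameters

# Sample many posterior predictions for the same input.
samples = np.stack([predict_with_dropout(x, w) for _ in range(200)])

variability = samples.std(axis=0)              # spread across sampled predictions
uncertainty = float(variability.mean())        # scalar uncertainty score
```

The adjustment step recited in the claim would then amount to retraining or re-parameterizing the model until this variability shrinks.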
If a claim limitation, under its broadest reasonable interpretation, covers mental processes, then it falls within the "(c) Mental processes" grouping of abstract ideas (2019 PEG Step 2A, Prong One: Abstract Idea Grouping? = Yes, (c) Mental processes—concepts performed in the human mind, including an observation, evaluation, judgment, or opinion).

Independent claim 1, Step 2A, Prong Two: As to the limitations "receiving a given input, wherein the given input comprises a mask image and an image of an actual wafer pattern produced using the mask image", these limitations describe the concept of "mere data gathering", which corresponds to the concepts identified as abstract ideas by the courts. Data gathering, even when limited to particular content (which does not change its character as information), is also within the realm of abstract ideas, and has not been held by the courts to be enough to qualify as "significantly more". See Electric Power.

As to the limitations "causing a parameterized model to predict multiple posterior distributions from the parameterized model for the given input, the multiple posterior distributions comprising a distribution of distributions", "adjusting one or more parameters of the parameterized model to reduce the uncertainty of the image predictions associated with the photolithography simulation", and "simulating a photolithography process using the parameterized model with the adjusted one or more parameters", the limitations appear to be just "apply it" limitations, because the limitations invoke computers merely as a tool to perform an existing process. This judicial exception is not integrated into a practical application (2019 PEG Step 2A, Prong Two: Additional elements that integrate the judicial exception/abstract idea into a practical application? = NO).
Independent claim 1, Step 2B: As discussed with respect to Step 2A, Prong Two, claim 1 recites data gathering; these limitations are recited at a high level of generality and therefore remain insignificant extra-solution activity even upon reconsideration. As discussed with respect to Step 2A, Prong Two, limitations invoking computers or other machinery merely as a tool to perform an existing process are just "apply it" limitations: simply adding a general-purpose computer or computer components after the fact to an abstract idea (e.g., a mathematical equation). See MPEP 2106.05(f)(2).

As to the limitations "causing a parameterized model to predict multiple posterior distributions from the parameterized model", these limitations amount to computer implementation of mental concepts. Predictions are mental in nature. See, for example, the Specification (underline emphasis added): "[00104]… causing the model to predict multiple posterior distributions pΘ(z|x) for a given input, multiple posterior distributions pφ(y|z) for a given input, and/or other posterior distributions".

As to the limitations "adjusting one or more parameters of the parameterized model to reduce the uncertainty of the predictions", these limitations amount to computer implementation of mental concepts. See, for example, the Specification (underline emphasis added): "[0027]… using the determined variability in the predicted multiple posterior distributions to adjust the one or more parameters of the machine learning model to decrease the uncertainty of the machine learning model comprises training the machine learning model with additional and more diverse training samples".

As to the limitations "simulating a photolithography process using the parameterized model with the adjusted one or more parameters", these limitations amount to computer implementation of mathematical concepts.
See, for example, the Specification (underline emphasis added): "[00155] The concepts disclosed herein may simulate or mathematically model any generic imaging system for imaging sub wavelength features, and may be especially useful with emerging imaging technologies capable of producing increasingly shorter wavelengths".

As discussed with respect to Step 2A, Prong Two, the intended use limitations remain intended use even upon reconsideration, because no actual adjusting of a physical apparatus is performed in the body of the claim. As to the limitation "adjusting the photolithography apparatus", the "adjusting" is of "determined photolithography process parameters". The specification reads (underline emphasis added): "[0016]… the one or more determined photolithography process parameters comprise the focus, and adjusting the photolithography apparatus based on the focus comprises changing the focus from a first focus to a second focus… [00115]… the one or more determined photolithography process parameters comprise the pupil shape, and adjusting the photolithography apparatus based on the pupil shape comprises changing the pupil shape from a first pupil shape to a second pupil shape".

The Examiner notes that the claimed "photolithography apparatus" is not associated with a physical apparatus in the specification; a physical apparatus in the specification is not referred to as a "photolithography apparatus". The specification reads (underline emphasis added): "[0003] A lithographic projection apparatus can be used, for example, in the manufacture of integrated circuits (ICs)… [0055] Fig.
1 shows a block diagram of various subsystems of a lithography system… [00156] While the concepts disclosed herein may be used for imaging on a substrate such as a silicon wafer, it shall be understood that the disclosed concepts may be used with any type of lithographic imaging systems, e.g., those used for imaging on substrates other than silicon wafers".

Thus, taken alone, the individual additional elements do not amount to significantly more than the above-identified judicial exception (the abstract idea). Looking at the additional elements as an ordered combination adds nothing that is not already present when looking at the additional elements taken individually. There is no indication that their combination improves the functioning of a computer itself or improves any other technology. Therefore, the claim does not amount to significantly more than the abstract idea itself (2019 PEG Step 2B: NO).

Claim 15 recites substantially the same elements as claim 1 and is rejected for the same reasons above. Further, the additional element "computer readable medium" is addressed below.

Independent claim 15, Step 2A, Prong Two and Step 2B: the claim recites the additional element "computer readable medium" at a high level of generality, and it is recited as performing generic computer functions routinely used in computer applications. Generic computer components recited as performing generic computer functions that are well-understood, routine, and conventional activities amount to no more than implementing the abstract idea with a computerized system. The use of a computer to implement the abstract idea of a mental algorithm has not been held by the courts to be enough to qualify as "significantly more".
The implementation on a computing system is described in the specification (underline emphasis added): "[00124]… methods described… performed by computer system… description herein is not limited to any specific combination of hardware circuitry and software".

Dependent claims, Step 2A, Prong One: the dependent claims' limitations further the mental concepts of their independent claims (see Independent claim 1, Step 2A, Prong One above). If a claim limitation, under its broadest reasonable interpretation, covers mental processes, then it falls within the "(c) Mental processes" grouping of abstract ideas (2019 PEG Step 2A, Prong One: Abstract Idea Grouping? = Yes, (c) Mental processes—concepts performed in the human mind, including an observation, evaluation, judgment, or opinion).

Dependent claims, Step 2A, Prong Two: As to the limitations "4/10/17… for a/the given input" and "5… wherein the given input comprises one or more of an image, a clip, an encoded image, an encoded clip, or data from a prior layer of the parameterized model", these limitations describe the concept of "mere data gathering" (see Independent claim 1, Step 2A, Prong Two above).
As to the limitations "2… wherein the parameterized model is a machine learning model", "3/16… wherein causing the parameterized model to predict the multiple posterior distributions comprises causing the parameterized model to generate the distribution of distributions using parameter dropout", "4/17… causing the parameterized model to predict the multiple posterior distributions from the parameterized model… causing the parameterized model to predict a first set of multiple posterior distributions corresponding to a first posterior distribution pΘ(z|x), and a second set of multiple posterior distributions corresponding to a second posterior distribution pφ(y|z)", "6… using the determined variability in the predicted multiple posterior distributions and/or the quantified uncertainty to adjust the one or more parameters of the parameterized model to reduce the uncertainty of the image predictions by making the parameterized model more descriptive or including more diverse training data", "7… wherein the parameterized model comprises encoder-decoder architecture", "8… wherein the encoder-decoder architecture comprises variational encoder-decoder architecture, the method further comprising training the variational encoder-decoder architecture with a probabilistic latent space, which generates realizations in an output space", "9… wherein the latent space comprises a low dimensional encoding", "11… determining a conditional probability using a decoder part of the encoder-decoder architecture", "14… increasing a training set size and/or adding to a dimensionality of the latent space; adding additional dimensionality to the latent space; or training the parameterized model with additional and more diverse training samples", and "18… wherein: the parameterized model comprises encoder-decoder architecture; and the encoder-decoder architecture comprises variational encoder-decoder architecture, the operations further comprising training the variational encoder-decoder architecture with a probabilistic latent space, which generates realizations in an output space", the limitations appear to be just "apply it" limitations, because the limitations invoke computers merely as a tool to perform an existing process. This judicial exception is not integrated into a practical application of the exception (2019 PEG Step 2A, Prong Two: Additional elements that integrate the judicial exception/abstract idea into a practical application? = NO).

Dependent claims, Step 2B: As discussed with respect to Step 2A, Prong Two, the claims recite data gathering; these limitations are recited at a high level of generality and therefore remain insignificant extra-solution activity even upon reconsideration. As discussed with respect to Step 2A, Prong Two, limitations invoking computers merely as a tool to perform an existing process are just "apply it" limitations (see Independent claim 1, Step 2B above). In the dependent claims, the additional elements do not provide an inventive concept in Step 2B. Therefore, the claims do not amount to significantly more than the abstract idea itself (2019 PEG Step 2B: NO).

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S.
1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:

1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

The Examiner would like to point out that any reference to specific figures, columns, and lines should not be considered limiting in any way; the entire reference is considered to provide disclosure relating to the claimed invention.

Claims 1-3, 5-16, and 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Sandeep Madireddy et al. (Sandeep hereinafter), "Modeling I/O Performance Variability Using Conditional Variational Autoencoders" (see IDS dated 05/28/2021), taken in view of International Pub. No. WO2018208869 (WO2018208869 hereinafter) (see IDS dated 05/28/2021), and further in view of William S. Wong (Wong hereinafter), U.S. Pre-Grant Publication 20080309897.

As to claim 1, Sandeep discloses a method for quantifying uncertainty in parameterized model predictions (see "resulting model provides the I/O performance prediction as well as distributional quantities (e.g., credible intervals) that quantify the uncertainty in the prediction" in page 112, col. 2, 4th paragraph)… the method comprising: causing a parameterized model (see "parameterized model" as "encoder and decoder networks", "CVAE model consists of the encoder and decoder networks" in page 111, col.
1, last paragraph) to predict multiple posterior distributions from the parameterized model for the given input, the multiple posterior distributions comprising (see “input“ as “metrics“, “CVAE method to build predictive models for I/O performance and its variability as a function of system-wide I/O metrics, while leveraging data from multiple dissimilar applications” in page 110, col. 2, 2nd paragraph; “The training objective for a VAE is obtained using a variational Bayesian approach where the true posterior distribution of the latent variable is assumed to be intractable and hence approximated using a parametric distribution (referred to as a variational distribution)… encoder network outputs the parameters in the variational distribution (e.g., µz(X) and ∑z(X) for a Gaussian distribution” in page 110, last paragraph)… determining a variability of the predicted multiple posterior distributions for the given input by sampling from the distribution of distributions; using the determined variability of the predicted multiple posterior distributions (see “input“ as “metrics“, “modeling… predicts the performance variability in addition to the expected performance as a function of the concurrent system-wide metrics. Formally, we… define the application I/O time by φ = f(α, ζ) + ϵ(α, ζ), (1) where α represents the metrics related to application-specific characteristics, ζ is a feature set that represents the file system-specific metrics for which the data is observable, and ϵ (α, ζ) represents a stochastic error term associated with the variability of φ” in page 110, col. 2, 1st paragraph; “We adopt a variant of CVAE called the IWAE [8] that weights the samples drawn from the latent variable according to their probabilities” in page 111, 2nd paragraph) to quantify uncertainty of (see “The predictive accuracy of the IW-CVAE model is evaluated in the context of the I/O data” in page 111, col. 
2, 2nd paragraph) predictions… from the parameterized model (see “model the input-dependent noise and handle dissimilar applications. The resulting model provides the I/O performance prediction as well as distributional quantities (e.g., credible intervals) that quantify the uncertainty in the prediction” in page 112, col. 2, 4th paragraph) adjusting one or more parameters of the parameterized model to reduce the uncertainty of the (see “The predictive accuracy of the IW-CVAE model is evaluated in the context of the I/O data” in page 111, col. 2, 2nd paragraph) predictions (see “reduce the uncertainty of the… predictions”, “hyperparameters can have a significant impact on the predictive performance of the model, and hence need to be carefully chosen. We tune the hyperparameters… where the model is trained for 1000 different configuration choices of the hyperparameters and the one that gave the best testing accuracy was chosen… obtaining the global dataset by aggregating training (testing) data across all application groups” in page 111, col. 1, last paragraph to col. 2, 1st paragraph). 
About Examiner's interpretation of "adjusting one or more parameters of the parameterized model to reduce the uncertainty of the… predictions", see the Sandeep mapping above.

Sandeep fails to disclose, but WO2018208869 discloses: for a photolithography apparatus (see "optical and electron beam tools may be configured for directing energy (e.g., light, electrons) to and/or scanning energy over a physical version of the specimen thereby generating actual (i.e., not simulated) output and/or images for the physical version of the specimen" in page 19, 3rd paragraph) receiving a given input, wherein the given input comprises a mask image (see 'an image of a reticle acquired by a reticle inspection system and/or derivatives thereof can be used as a "proxy" or "proxies" for the design' in page 8, 1st paragraph) and an image of an actual wafer pattern produced using the mask image (see "mask image" as "first modality includes broadband optical imaging" and "image of an actual wafer pattern" as scanning electron microscope (SEM) imaging, "the first modality includes broadband optical imaging, and the second modality includes SEM.
Broadband optical imaging as a modality generally refers to any optical imaging that is performed with broadband (BB) light source such as that generated by a BBP light source described herein" in page 25, last paragraph)… a distribution of distributions (see "To avoid overfitting and reduce redundancy in the extracted features, sparsity in the feature maps may be enforced by using a drop out layer at the end of the encoder and also including a L1 regularization on the codes in the L2 cost function" in page 36, last paragraph)… and determining one or more photolithography process parameters based on predictions from the adjusted parameterized model (see "learning based models may also include machine learning models… Machine learning explores the study and construction of algorithms that can learn from and make predictions on data" in page 30, 4th paragraph) for adjusting the photolithography apparatus (see "optical and electron beam tools may be configured for directing energy (e.g., light, electrons) to and/or scanning energy over a physical version of the specimen thereby generating actual (i.e., not simulated) output and/or images for the physical version of the specimen" in page 19, 3rd paragraph) to change focus (see "Light from optical element 18 may be focused onto specimen 14 by lens 20. Although lens 20 is shown in Fig. 1 as a single refractive optical element, it is to be understood that, in practice, lens 20 may include a number of refractive and/or reflective optical elements that in combination focus the light from the optical element to the specimen" in page 11, 2nd paragraph) (see "embodiments described herein may be particularly useful for… metrology" in page 47, 3rd paragraph; "Metrology processes are also used at various steps during a semiconductor manufacturing process to monitor and control the process… metrology processes are used to measure one or more characteristics of specimens such as a dimension (e.g., line width, thickness, etc.) of features formed on the specimens during a process such that the performance of the process can be determined from the one or more characteristics. In addition, if the one or more characteristics of the specimens are unacceptable (e.g., out of a predetermined range for the characteristic(s)), the measurements of the one or more characteristics of the specimens may be used to alter one or more parameters of the process such that additional specimens manufactured by the process have acceptable characteristic(s)" in page 2, 2nd paragraph).

About Examiner's interpretation of "a distribution of distributions", Examiner notes that, as per dependent claim 3, "wherein causing the parameterized model to predict the multiple posterior distributions comprises causing the parameterized model to generate the distribution of distributions using parameter dropout".

Therefore, it would have been obvious to one of ordinary skill in this art before the effective filing date of the claimed invention to use WO2018208869 with Sandeep, because WO2018208869 "uses a deep regression neural network or other learning based model… trained with pairs of corresponding images from different modalities to transform image 1 to image domain of image 2 for registration" (see page 38, last paragraph), and as a result, WO2018208869 reports that "we can replace the currently used heuristic renderings and hardware-dependent approaches with a data driven-based and hardware-independent approach" (see page 39, 1st paragraph).

Sandeep and WO2018208869 fail to disclose, but Wong discloses image (see "[0036]… resist image may be simulated using a photolithography simulation system") associated with the photolithography simulation; and simulating a photolithography process using the parameterized model with the adjusted one or more parameters (see "[0024]… simulating a photolithography process using a mask layout to produce a first simulated resist image").
Sandeep, WO2018208869, and Wong are analogous art because they are related to model predictions. Therefore, it would have been obvious to one of ordinary skill in this art before the effective filing date of the claimed invention to use Wong with Sandeep and WO2018208869, because Wong discloses that "[0023] the invention tracks how the collective movement of edge segments in a mask layout alters the resist image values at control points in the layout and simultaneously determines a correction amount for each edge segment in the layout. A multisolver matrix that represents the collective effect of movements of each edge segment in the mask layout is used to simultaneously determine the correction amount for each edge segment in the mask layout", and as a result, Wong reports that his "[0044]… method… can be applied to any mask layout for any type of mask layer, for example contact layers and poly layers, and any type of mask, for example bright-field masks or dark-field masks… also be applied to mask layouts for masks used in double-dipole illumination photolithography. Correction delta vectors can be determined for each of the two mask layouts used in double-dipole illumination such that the aerial image at low contrast is equal to zero and the resist image values from both masks is equal to zero. [0045] FIG. 7 is a diagram of plots of resist image values versus iteration number for a typical edge segment in a contact features, determined by a prior art single-variable solver and by the method of FIG. 6. As shown, the RI values resulting from corrections determined using the multisolver method of FIG. 6 converges faster (at iteration 11) to the desired RI value of zero than the prior art single-variable solver (at iteration 29)". 
As to claim 2, Sandeep discloses wherein the parameterized model is a machine learning model (see "An autoencoder is an unsupervised ML method that first encodes the input to a typically lower-dimensional (latent) representation and then decodes the input back from this latent representation… enables the latent space to compactly represent the maximum information in the training data, which is then used by the decoder for efficient reconstruction. Both the encoder and decoder are neural networks, hence an autoencoder is trained by measuring the reconstruction error and back-propagating the error, as is standard for training neural networks" in page 110, next to last paragraph).

As to claim 3, WO2018208869 discloses the distribution of distributions using parameter dropout (see "To avoid overfitting and reduce redundancy in the extracted features, sparsity in the feature maps may be enforced by using a drop out layer at the end of the encoder and also including a L1 regularization on the codes in the L2 cost function… learning based model may vary in type and parameter values" in page 36, last paragraph to page 37, 1st paragraph).

As to claim 5, Sandeep discloses wherein the given input comprises one or more of an image, a clip, an encoded image, an encoded clip, or data from a prior layer of the parameterized model (see "An autoencoder is an unsupervised ML method that first encodes the input to a typically lower-dimensional (latent) representation and then decodes the input back from this latent representation… enables the latent space to compactly represent the maximum information in the training data, which is then used by the decoder for efficient reconstruction" in page 110, next to last paragraph).

As to claim 6, Sandeep discloses using the determined variability in the predicted multiple posterior distributions (see "modeling… predicts the performance variability in addition to the expected performance as a function of the concurrent system-wide metrics.
Formally, we… define the application I/O time by φ = f(α, ζ) + ϵ(α, ζ), (1) where α represents the metrics related to application-specific characteristics, ζ is a feature set that represents the file system-specific metrics for which the data is observable, and ϵ(α, ζ) represents a stochastic error term associated with the variability of φ” in page 110, col. 2, 1st paragraph) (see “reduce the uncertainty of the predictions”, “hyperparameters can have a significant impact on the predictive performance of the model, and hence need to be carefully chosen. We tune the hyperparameters… where the model is trained for 1000 different configuration choices of the hyperparameters and the one that gave the best testing accuracy was chosen… obtaining the global dataset by aggregating training (testing) data across all application groups” in page 111, col. 1, last paragraph to col. 2, 1st paragraph).

As to claim 7, Sandeep discloses wherein the parameterized model comprises encoder-decoder architecture (see “CVAE model consists of the encoder and decoder networks” in page 111, col. 1, last paragraph).

As to claim 8, Sandeep discloses wherein the encoder-decoder architecture comprises variational encoder-decoder architecture, the method further comprising training the variational encoder-decoder architecture with a probabilistic latent space, which generates realizations in an output space (see “An autoencoder is an unsupervised ML method that first encodes the input to a typically lower-dimensional (latent) representation and then decodes the input back from this latent representation… enables the latent space to compactly represent the maximum information in the training data, which is then used by the decoder for efficient reconstruction. Both the encoder and decoder are neural networks, hence an autoencoder is trained by measuring the reconstruction error and back-propagating the error, as is standard for training neural networks.
A variational autoencoder (VAE) is a probabilistic version of an autoencoder, where the latent dimension (P(Z|X)) is modeled as a probability distribution, as opposed to the deterministic values. Consequently, the reconstructed inputs are also distributions (P(X|Z))” in page 110, next to last paragraph).

As to claim 9, Sandeep discloses wherein the latent space comprises a low dimensional encoding (see “An autoencoder is an unsupervised ML method that first encodes the input to a typically lower-dimensional (latent) representation and then decodes the input back from this latent representation” in page 110, next to last paragraph).

As to claim 10, Sandeep discloses determining, for the given input, a conditional probability of a latent variable using an encoder part of the encoder-decoder architecture (see “A variational autoencoder (VAE) is a probabilistic version of an autoencoder, where the latent dimension (P(Z|X)) is modeled as a probability distribution, as opposed to the deterministic values” in page 110, next to last paragraph).

As to claim 11, Sandeep discloses determining a conditional probability using a decoder part of the encoder-decoder architecture (see “decoder is now designed to reconstruct the outputs conditioned on the inputs and latent variable(s) (i.e., P(Y|Z,X)). Thus when the likelihood is assumed to be a Gaussian distribution, the decoder outputs μy(X,Z) and Σy(X,Z), which yields an input-dependent distribution of Y. We adopt a variant of CVAE called the IWAE [8] that weights the samples drawn from the latent variable according to their probabilities” in page 111, 2nd paragraph).
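The variational encoder-decoder mechanics cited for claims 8-11 (an encoder that outputs a conditional latent distribution rather than a deterministic code, and a decoder whose repeated decodings of that distribution spread into an output distribution) can be sketched in a few lines of numpy. This toy is illustrative only: the linear "encoder" and "decoder" with random weights are hypothetical stand-ins for the neural networks described in the references.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "encoder": maps an input x to the parameters (mu, sigma) of the
# conditional latent distribution q(z|x) -- a probabilistic latent space,
# as opposed to a single deterministic code.
W_enc = rng.normal(size=(2, 4))

def encode(x):
    h = W_enc @ x
    mu, log_sigma = h[0:1], h[1:2]
    return mu, np.exp(log_sigma)

# Toy "decoder": maps a latent sample z back to output space. Because z is
# drawn from a distribution, repeated decodings of the same input differ;
# that spread is the model's predictive variability for the input.
W_dec = rng.normal(size=(4, 1))

def decode(z):
    return W_dec @ z

x = np.array([0.5, -1.0, 0.3, 0.8])
mu, sigma = encode(x)

# Reparameterization: z = mu + sigma * eps, with eps ~ N(0, 1)
samples = np.stack([decode(mu + sigma * rng.normal(size=1))
                    for _ in range(1000)])

# Variability across the decoded realizations quantifies uncertainty.
print(samples.std(axis=0))
```

The per-dimension standard deviation printed at the end plays the role of the "determined variability" in the claim language: it is nonzero exactly because the latent code is a distribution rather than a point.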
As to claim 12, while Sandeep discloses wherein sampling comprises (see “decoder network then seeks to reconstruct the input distribution by sampling from the latent variable with an objective to maximize the probability that the observed data is from this input distribution… decoder is now designed to reconstruct the outputs conditioned on the inputs and latent variable(s) (i.e., P(Y|Z,X)). Thus when the likelihood is assumed to be a Gaussian distribution, the decoder outputs μy(X,Z) and Σy(X,Z), which yields an input-dependent distribution of Y. We adopt a variant of CVAE called the IWAE [8] that weights the samples drawn from the latent variable according to their probabilities” in page 111, 1st-2nd paragraphs), WO2018208869 discloses randomly (see “sampling for training may be performed in any suitable manner known in the art” in page 44, 1st paragraph).

As to claim 13, Sandeep discloses wherein the uncertainty of the image predictions (see “model the input-dependent noise and handle dissimilar applications. The resulting model provides the I/O performance prediction as well as distributional quantities (e.g., credible intervals) that quantify the uncertainty in the prediction” in page 112, col. 2, 4th paragraph) is related to an uncertainty of weights of parameters of the parameterized model, and a size and descriptiveness of the latent space (see “decoder network then seeks to reconstruct the input distribution by sampling from the latent variable with an objective to maximize the probability that the observed data is from this input distribution… decoder is now designed to reconstruct the outputs conditioned on the inputs and latent variable(s) (i.e., P(Y|Z,X)). Thus when the likelihood is assumed to be a Gaussian distribution, the decoder outputs μy(X,Z) and Σy(X,Z), which yields an input-dependent distribution of Y.
We adopt a variant of CVAE called the IWAE [8] that weights the samples drawn from the latent variable according to their probabilities” in page 111, 1st-2nd paragraphs).

As to claim 14, Sandeep discloses adjusting the one or more parameters of the parameterized model (see “modeling… predicts the performance variability in addition to the expected performance as a function of the concurrent system-wide metrics. Formally, we… define the application I/O time by φ = f(α, ζ) + ϵ(α, ζ), (1) where α represents the metrics related to application-specific characteristics, ζ is a feature set that represents the file system-specific metrics for which the data is observable, and ϵ(α, ζ) represents a stochastic error term associated with the variability of φ” in page 110, col. 2, 1st paragraph) to reduce the uncertainty of the image predictions comprises: increasing a training set size (see “reduce the uncertainty of the predictions”, “hyperparameters can have a significant impact on the predictive performance of the model, and hence need to be carefully chosen. We tune the hyperparameters… where the model is trained for 1000 different configuration choices of the hyperparameters and the one that gave the best testing accuracy was chosen… obtaining the global dataset by aggregating training (testing) data across all application groups” in page 111, col. 1, last paragraph to col. 2, 1st paragraph).

As to claims 15 and 16, these claims recite a computer readable medium for performing the method of claims 1 and 3. WO2018208869 discloses a computer readable medium (see “computer-readable medium” in page 6, last paragraph) for performing a method that teaches claims 1 and 3. Therefore, claims 15 and 16 are rejected for the same reasons given above.

As to claim 18, Sandeep discloses wherein: the parameterized model comprises encoder-decoder architecture (see “CVAE model consists of the encoder and decoder networks” in page 111, col.
1, last paragraph); and the encoder-decoder architecture comprises variational encoder-decoder architecture, the operations further comprising training the variational encoder-decoder architecture with a probabilistic latent space, which generates realizations in an output space (see “An autoencoder is an unsupervised ML method that first encodes the input to a typically lower-dimensional (latent) representation and then decodes the input back from this latent representation… enables the latent space to compactly represent the maximum information in the training data, which is then used by the decoder for efficient reconstruction. Both the encoder and decoder are neural networks, hence an autoencoder is trained by measuring the reconstruction error and back-propagating the error, as is standard for training neural networks. A variational autoencoder (VAE) is a probabilistic version of an autoencoder, where the latent dimension (P(Z|X)) is modeled as a probability distribution, as opposed to the deterministic values. Consequently, the reconstructed inputs are also distributions (P(X|Z))” in page 110, next to last paragraph).

As to claims 19 and 20, Wong discloses wherein the image predictions comprise predictions of (see "[0024]… simulating a photolithography process using a mask layout to produce a first simulated resist image").

Claims 4 and 17 are rejected under 35 U.S.C. 103(a) as being unpatentable over Sandeep taken in view of WO2018208869 in view of Wong as applied to claims 1 and 15 above, and further in view of Y. Burda, R. Grosse, and R. Salakhutdinov (Burda hereinafter), “Importance weighted autoencoders” (see PTO-892 Notice of References Cited dated 06/06/2024).
As to claims 4 and 17, Sandeep discloses … determining the variability of the predicted multiple posterior distributions for the given input by sampling from the distribution of distributions comprises determining the variability of the first and second sets of predicted multiple posterior distributions (“We adopt a variant of CVAE called the IWAE [8] that weights the samples drawn from the latent variable according to their probabilities” in page 111, 2nd paragraph) for the given input by sampling from the distribution of distributions for the first and second sets (see “input“ as “metrics“, “modeling… predicts the performance variability in addition to the expected performance as a function of the concurrent system-wide metrics. Formally, we… define the application I/O time by φ = f(α, ζ) + ϵ(α, ζ), (1) where α represents the metrics related to application-specific characteristics, ζ is a feature set that represents the file system-specific metrics for which the data is observable, and ϵ (α, ζ) represents a stochastic error term associated with the variability of φ” in page 110, col. 2, 1st paragraph); and using the determined variability in the predicted multiple posterior distributions (see “input“ as “metrics“, “modeling… predicts the performance variability in addition to the expected performance as a function of the concurrent system-wide metrics. Formally, we… define the application I/O time by φ = f(α, ζ) + ϵ(α, ζ), (1) where α represents the metrics related to application-specific characteristics, ζ is a feature set that represents the file system-specific metrics for which the data is observable, and ϵ (α, ζ) represents a stochastic error term associated with the variability of φ” in page 110, col. 
2, 1st paragraph) to quantify uncertainty of the (“We adopt a variant of CVAE called the IWAE [8] that weights the samples drawn from the latent variable according to their probabilities” in page 111, 2nd paragraph) to quantify the uncertainty of the (see “model the input-dependent noise and handle dissimilar applications. The resulting model provides the I/O performance prediction as well as distributional quantities (e.g., credible intervals) that quantify the uncertainty in the prediction” in page 112, col. 2, 4th paragraph) and Wong discloses image (see "[0036]… resist image may be simulated using a photolithography simulation system"). Sandeep, WO2018208869, and Wong fail to disclose, but in a NPL cited by Sandeep, Burda discloses causing the parameterized model to predict the multiple posterior distributions from the parameterized model for a given input comprises causing the parameterized model to predict a first set of multiple posterior distributions corresponding to a first posterior distribution pΘ(z|x), and a second set of multiple posterior distributions corresponding to a second posterior distribution pφ(y|z) (see “The VAE objective… heavily penalizes approximate posterior samples which fail to explain the observations. This places a strong constraint on the model, since the variational assumptions must be approximately satisfied in order to achieve a good lower bound… This VAE criterion may be too strict… If we lower our standards… this may give us additional flexibility to train a generative network whose posterior distributions do not fit the VAE assumptions. This is the motivation behind our proposed algorithm, the Importance Weighted Autoencoder (IWAE)” in page 3, 3 IMPORTANCE WEIGHTED AUTOENCODER, 1st paragraph). Sandeep, WO2018208869, Wong, and Burda are analogous art because they are related to machine learning model predictions. 
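Burda's importance-weighted bound, relied on above, can be reproduced in a few lines: averaging k importance weights inside the logarithm yields a lower bound on log p(x) that is tighter than the k = 1 VAE ELBO, which is what "weights the samples drawn from the latent variable according to their probabilities" buys. The toy Gaussian model and deliberately mismatched proposal below are hypothetical choices made so the exact log-likelihood is computable for comparison.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy model: p(z) = N(0, 1), p(x|z) = N(z, 1), so exactly p(x) = N(0, 2).
# The proposal q(z|x) = N(x/3, 1) is deliberately off, making the ELBO loose.
x = 1.5

def log_norm(v, mean, var):
    return -0.5 * (np.log(2 * np.pi * var) + (v - mean) ** 2 / var)

def log_w(z):  # log importance weight: log p(x, z) - log q(z|x)
    return log_norm(z, 0, 1) + log_norm(x, z, 1) - log_norm(z, x / 3, 1)

def iwae_bound(k, n=20000):
    z = x / 3 + rng.normal(size=(n, k))   # k proposal samples per estimate
    lw = log_w(z)
    # log of the average of k importance weights (log-sum-exp for stability)
    m = lw.max(axis=1, keepdims=True)
    return np.mean(m.squeeze() + np.log(np.mean(np.exp(lw - m), axis=1)))

elbo = iwae_bound(1)    # k = 1 recovers the standard VAE lower bound
iwae5 = iwae_bound(5)   # k = 5 importance-weighted bound
exact = log_norm(x, 0, 2)

print(elbo, iwae5, exact)  # the bounds tighten toward log p(x) as k grows
```

With this mismatched proposal the ordering elbo < iwae5 < exact holds with wide margin, mirroring Burda's observation that the importance-weighted objective penalizes a poor posterior fit less severely than the VAE criterion.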
Therefore, it would have been obvious to one of ordinary skill in this art before the effective filing date of the claimed invention to use Burda with Sandeep, WO2018208869, and Wong, because Burda "presented the importance weighted autoencoder, a variant on the VAE trained by maximizing a tighter log-likelihood lower bound derived from importance weighting. We showed empirically that IWAEs learn richer latent representations and achieve better generative performance than VAEs with equivalent architectures and training time. We believe this method may improve the flexibility of other generative models currently trained with the VAE objective" (see page 8, 6 CONCLUSION).

Response to Arguments

Regarding the rejections under § 101, Applicant's arguments have been considered, but they are not persuasive. Applicant argues (see page 10, 1st paragraph to page 13, 3rd paragraph): ‘… At best, the Office identified some elements of the claims as corresponding to mental concepts. But this is not enough because the Office has not identified any abstract idea that Applicant's claims allegedly monopolize, thereby not meeting its burden of establishing a prima facie case under Prong One. Accordingly, the rejection of claims under § 101 should be withdrawn for at least this reason. Instead of identifying the abstract idea of the claim as a whole, the Office improperly dissects the claims into discrete elements. The Office's allegation relied on only one of those discrete elements of claim 1, rather than analyzing claim 1 as a whole. For example, as to claim 1, the Office stated… the Office focused only on the "determining a variability... " and "using the determined variability ... " elements of claim 1, but simply ignored the rest of the claim, e.g., "causing a parameterized model to predict multiple posterior distributions ... ," "adjusting one or more parameters ... ," "simulating a photolithography process ... ," etc.
… the Office incorrectly asserted that the claims recite an abstract idea by alleging some claim elements are categorized as a "mental process." See Office Action at 3. Contrary to the Office's allegations, the claims are not directed to a mental process. Instead, the claims include specific technical features that are used to address issues in wafer processing using a photolithography apparatus. Amendments to independent claims 1 and 15 clarify additional features that further remove the claims from the realm of a mental process, such as "receiving a given input, wherein the given input comprises a mask image and an image of an actual wafer pattern produced using the mask image" and "determining one or more photolithography process parameters based on predictions from the adjusted parameterized model for adjusting a photolithography apparatus to change focus or pupil shape based on the one or more determined photolithography process parameters." E.g., Applicant amended claim 1. At least these claim features cannot feasibly be performed mentally so as to adequately address problems in the semiconductor wafer processing using a photolithography apparatus and improving the performance of the apparatus using the machine-learning-based predictions. See e.g., specification as filed, ¶¶[0009]-[0016] and [0114]-[0121], and independent claims 1 and 15. Neither of as-amended claims 1 or 15 recites an abstract idea under Prong One when the claims are evaluated as a whole. A person of ordinary skill in the art in the field of mask manufacturing and patterning processes would understand that the above-noted specific technical language cannot practically be performed mentally. 
"Claims do not recite a mental process when they do not contain limitations that can practically be performed in the human mind, for instance when the human mind is not equipped to perform the claim limitations"… … the human mind is not equipped to "receiving a given input, wherein the given input comprises a mask image and an image of an actual wafer pattern produced using the mask image," "simulating a photolithography process using the parameterized model with the adjusted one or more parameters," and "determining one or more photolithography process parameters based on predictions from the adjusted parameterized model for adjusting a photolithography apparatus to change focus or pupil shape based on the one or more determined photolithography process parameters," as recited in Applicant's claims…’

The MPEP reads (underline emphasis added): ‘2106.04… II… A… 2. Prong Two asks does the claim recite additional elements that integrate the judicial exception into a practical application?… If the additional elements in the claim integrate the recited exception into a practical application of the exception, then the claim is not directed to the judicial exception (Step 2A: NO) and thus is eligible at Pathway B… For a claim reciting a judicial exception to be eligible, the additional elements (if any) in the claim must "transform the nature of the claim" into a patent-eligible application of the judicial exception, Alice… either at Prong Two or in Step 2B’

‘2106.05(f) Mere Instructions To Apply An Exception [R-10.2019]… In addition to the abstract idea, the claims also recited the additional element of…’.
‘2106.07(a)… II… After identifying the judicial exception in the rejection, identify any additional elements (features/limitations/steps) recited in the claim beyond the judicial exception and explain why they do not integrate the judicial exception into a practical application and do not add significantly more to the exception’

About "additional elements", BASCOM2 (BASCOM hereinafter) reads: “the ‘elements of each claim both individually and ‘as an ordered combination’’ [are considered] to determine whether the additional elements [beyond those that recite the abstract idea] ‘transform the nature of the claim’ into a patent-eligible application”.

Examiner's response: Applicant’s argument is not persuasive, because Applicant’s arguments conflate judicial exception(s) or abstract idea(s) (Step 2A, Prong One) with additional elements (Step 2A, Prong Two or Step 2B). Throughout the prosecution of this application, in accordance with the guidance set forth in the MPEP (supra) and in several decisions, BASCOM (supra) for example, the Examiner has not conflated judicial exception(s) or abstract idea(s) (Step 2A, Prong One) with additional elements (Step 2A, Prong Two or Step 2B). Applicant argues that the additional elements are not judicial exception(s) or abstract idea(s), but the additional elements were addressed in the Examiner's rejection at Step 2A, Prong Two and/or Step 2B. Applicant's arguments do not address these limitations as additional elements, as pointed out by the Examiner.

Applicant further argues (see page 13, next to last paragraph to page 19, 2nd paragraph): ‘… the Office has failed to provide the requisite analysis when it only focused on what it alleged to be additional elements without taking into consideration the other elements… … the claims describe the improvements in a specific manner.
Amended claims 1 and 15 recite in part, "receiving a given input, wherein the given input comprises a mask image and an image of an actual wafer pattern produced using the mask image," "simulating a photolithography process using the parameterized model with the adjusted one or more parameters," and "determining one or more photolithography process parameters based on predictions from the adjusted parameterized model for adjusting a photolithography apparatus to change focus or pupil shape based on the one or more determined photolithography process parameters." At least these features are directed to an improvement in the technical field of "mask manufacturing and patterning processes." Specification as filed, ¶ [0002]; see also id. ¶ [0009] ("According to some embodiments, there is provided a method for adjusting a photolithography apparatus.") (emphasis added). The approach of quantifying and reducing uncertainty in parameterized model image predictions to improve photolithography apparatus for semiconductor fabrication helps with overcoming "the ultimate functionality of a wafer." Specification as filed, ¶ [0071]. This approach creates a technological feedback loop where uncertainty quantification drives optimization of a physical system (the photolithography apparatus). As such, Applicant's claims are directed to, among other features, solving technical problems of the semiconductor wafer processing apparatus and method due to uncertainties in model predictions… … claims 1 and 15 specifically address the technical problems of the photolithography apparatus in the semiconductor wafer processing, rather than merely using a general purpose computer as a tool. … claims 1 and 15 are eligible for reasons analogous to the claims held eligible in Ex parte Desjardins… … Applicant's present Specification explains technical problems in conventional photolithography systems where "the certainty of predictions made by the machine learning model is not clear.
That is, given an input, it is not clear whether prior machine learning models generate accurate and consistent output." Specification as filed, ¶ [0071]. The Specification further explains that "uncertainties about the predictions of a machine learning model may produce uncertainties in a proposed mask layout" and that "[t]hese uncertainties may result in questions about the ultimate functionality of a wafer." Id. Indeed, the Specification notes that "[u]ntil now, however, there was no method to determine variability (or uncertainty) in the output from a model." Id. Like Desjardins, Applicant's Specification describes improvements to address these technical problems in conventional photolithography apparatus, including "receiving a given input, wherein the given input comprises a mask image and an image of an actual wafer pattern produced using the mask image," "determining one or more photolithography process parameters based on predictions from the adjusted machine learning model based on the given input; and adjusting the photolithography apparatus based on the one or more determined photolithography process parameters." Specification as filed, ¶ [0009]… Thus, as a whole, claim 1 provides improvements to how the recited parameterized model controls photolithography apparatus operation, addressing the technical problem of unknown prediction certainty that affects wafer functionality, analogous to the eligible claims of Desjardins. Accordingly, Applicant submits claims 1 and 15 are eligible under Prong Two… … claimed features, in an ordered combination, recite an inventive concept of a method for adjusting a photolithography apparatus based on quantification of uncertainty in parameterized model image predictions. 
Specification as filed, ¶ [0009]-[0016], [0071], [0080], [0114]-[0121]… an improvement of ultimate functionality of a wafer fabricated by the claimed photolithography apparatus by improving the accuracy of the parametrized model's predictive capability…’

Examiner's response: Applicant's argument is not persuasive, because claim 1 does not read "optimization of a physical system (the photolithography apparatus)”, “improvements to how the recited parameterized model controls photolithography apparatus operation”, or "a wafer fabricated by the claimed photolithography apparatus", as argued. Claim 1 does read "for a photolithography apparatus" and “for adjusting the photolithography apparatus to change focus or pupil shape based on the one or more determined photolithography process parameters", but these recitations are no more than an intended use. As to the limitation "adjusting the photolithography apparatus", the "adjusting" is of "determined photolithography process parameters". The Examiner notes that, in the specification, the claimed "photolithography apparatus" is not associated with a physical apparatus. A physical apparatus in the specification is not referred to as a "photolithography apparatus" (see Independent claim 1, Step 2B above). Therefore, the rejections are maintained.

Regarding the arguments with respect to the rejection under § 103, Applicant’s arguments with respect to the independent claims have been fully considered, but they are not persuasive. Applicant argues that the prior art disclosures in the previous rejection fail to teach the newly added limitations. However, these claim features and arguments were newly added by amendment; the previous Office Action could not have pointed out disclosures of a limitation that was not claimed before. See rejection supra.
Independent claims are rejected over Sandeep taken in view of WO2018208869 and further in view of Wong, instead of over Sandeep taken in view of EP3343456 and further in view of Tsai; WO2018208869 is newly cited.

Conclusion

Examiner would like to point out that any reference to specific figures, columns, and lines should not be considered limiting in any way; the entire reference is considered to provide disclosure relating to the claimed invention.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to JUAN CARLOS OCHOA whose telephone number is (571) 272-2625. The examiner can normally be reached Mondays, Tuesdays, Thursdays, and Fridays 9:30 AM - 7:00 PM.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Renee Chavez, can be reached at 571-270-1104. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/JUAN C OCHOA/
Primary Examiner, Art Unit 2186

1 Electric Power Group, LLC v. Alstom S.A., 119 USPQ2d 1739 (Fed. Cir. 2016)
2 BASCOM Global Internet Services, Inc. v. AT&T Mobility LLC, No. 2015-1763 (Fed. Cir. June 27, 2016)

Prosecution Timeline

May 28, 2021
Application Filed
Jun 03, 2024
Non-Final Rejection — §101, §103, §112
Sep 04, 2024
Response Filed
Nov 21, 2024
Final Rejection — §101, §103, §112
Jan 24, 2025
Response after Non-Final Action
Feb 06, 2025
Request for Continued Examination
Feb 11, 2025
Response after Non-Final Action
Jun 26, 2025
Non-Final Rejection — §101, §103, §112
Sep 23, 2025
Response Filed
Nov 10, 2025
Final Rejection — §101, §103, §112
Feb 12, 2026
Response after Non-Final Action
Mar 09, 2026
Request for Continued Examination
Mar 13, 2026
Response after Non-Final Action
Mar 20, 2026
Non-Final Rejection — §101, §103, §112 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12584379
SYSTEMS AND METHODS FOR NOTCHING A TARGET WELLBORE IN A SUBSURFACE FORMATION
2y 5m to grant Granted Mar 24, 2026
Patent 12566898
Associativity and Resolution of Computer-Based Models and Data
2y 5m to grant Granted Mar 03, 2026
Patent 12468867
SIMULATION METHOD, SIMULATION APPARATUS, COMPUTER READABLE MEDIUM, FILM FORMING APPARATUS, AND METHOD OF MANUFACTURING ARTICLE
2y 5m to grant Granted Nov 11, 2025
Patent 12419687
NASAL IMPLANT DESIGN METHOD OF MANUFACTURING PATIENT-CUSTOMIZED NASAL IMPLANT
2y 5m to grant Granted Sep 23, 2025
Patent 12379718
MODEL PREDICTIVE MAINTENANCE SYSTEM FOR BUILDING EQUIPMENT
2y 5m to grant Granted Aug 05, 2025
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

5-6
Expected OA Rounds
68%
Grant Probability
91%
With Interview (+22.8%)
4y 2m
Median Time to Grant
High
PTA Risk
Based on 520 resolved cases by this examiner. Grant probability derived from career allow rate.
