DETAILED ACTION
Claims 1-20 (filed 04/30/2026) have been considered in this action. Claims 1, 7, 11 and 17 have been amended. Claims 2-6, 8-10, 12-16 and 18-20 remain as previously filed.
Response to Arguments
Applicant’s arguments, see page 8 paragraph 1, filed 04/30/2026, with respect to rejection of claims 7 and 17 under 35 U.S.C. 112(a) and 35 U.S.C. 112(b) have been fully considered and are persuasive. The rejection of claims 7 and 17 under 35 U.S.C. 112(a) and 35 U.S.C. 112(b) has been withdrawn.
Applicant's arguments with regard to the rejection of claim 1 over David in view of Liano under 35 U.S.C. 103, see page 8 paragraph 2, filed 04/30/2026, have been fully considered but they are not persuasive.
Applicant has argued that David does not teach:
receiving, by a processing device, expected output data defining a target value for an attribute of a semiconductor device to be manufactured by at least one semiconductor device manufacturing process performed within at least one processing chamber
Specifically, applicant has argued that the target data of David does not qualify as “expected output data defining a target value for an attribute of a semiconductor device to be manufactured” and pointed to references where David states the target data is historical measurement data without providing consideration to other mentions within David that “[0068] In step 602, a target is selected. In one embodiment, the target is an overlay measurement (e.g., IBO measurement, DBO measurement, CD-SEM, TEM, etc.) and could be a linear overlay offset in the x and y direction. The target could also be other lithography apparatus parameters that need to be controlled to minimize overlay error, such as reticle position, reticle rotation, or reticle magnification. The target could be parametric data such as on/off current of the transistor, transistor thresholds, or some other parameter that quantifies the health of the transistor. The target could also be yield information, such as the functionality of a given die or area on the wafer (sometimes measured as either pass or fail). The target could also be semiconductor device performance data”. David uses the selected target for collecting data to train machine learning models to learn the correlations between the target and input parameters for configuring a semiconductor device as outlined in Figure 6. David then deploys that model in a cascaded manner so that predictions of outcomes from one semiconductor manufacturing device are used to control an upstream semiconductor manufacturing device by producing the inputs for configuring that upstream device as David states “[0096] virtual metrology predictions generated from upstream process equipment and metrology data can be used as inputs to the model. This essentially represents a multi-step model or algorithm, where first the virtual metrology predictions are determined by a first algorithm. 
For example, the outputs can be used as inputs to another algorithm designed for overlay error compensation, overlay error measurement, or yield prediction” and “[0092] As previously discussed, the convolution of CD error and overlay error can affect device performance. In order to optimize the device performance, it may be necessary to adjust the overlay for a given CD. In one embodiment, machine learning algorithms could be used with all or some of the above mentioned input data, along with CD error measurement and overlay error measurement to create a model whose target is a lithography apparatus control parameter, such as focus, power, or x-y direction control.” In other words, the input for the machine learning algorithm that determines the parameters for adjusting overlay errors in a semiconductor manufacturing device is based on the expected output of the virtual metrology predictions from upstream devices. A PHOSITA would not consider a virtual metrology prediction to be a measured value, because it is a prediction.
David therefore teaches that the parameters that define the values for reticle position/rotation/magnification are those targets, and are expected outputs because they are parameters that “need to be controlled to minimize overlay error” and relate to a predicted amount of overlay error, meaning they are for processes which have not been performed yet. A PHOSITA would understand that the control parameters used to produce a product cannot be adjusted after the product has been made. This is further supported by paragraph [0050] of David, which states “[0050] In yet another example, machine learning algorithms can be used to control a manufacturing process step. As noted above, virtual metrology can be used to predict a critical dimension or film thickness for a manufacturing process step. Before or during processing of this manufacturing step, the prediction can then be used to set and/or control any number of processing parameters (e.g. run time) for that processing step. For example, in the case of CMP, if virtual metrology predicts that a dielectric film thickness will be 100 Angstroms thicker than the target thickness if the wafer was to be polished at the nominal polish time, then a calculation can be made to lengthen the polish time so that the final polished thickness can be closer to the target thickness.” In this example, the thickness expected at the nominal polish time is predicted, and that prediction is then put into the machine learning algorithms to determine a new input parameter, the polish time, that will get the process to the actual target thickness.
This directly contradicts the applicant's argument that “[page 10 paragraph 2] David’s entire disclosure is related to analyzing data from processes that have already occurred, not to receiving target values for future semiconductor device manufacturing”, because the film thickness of the wafer is predicted to exceed the target value by 100 Angstroms, the polish time is lengthened in response to this prediction that the target value will not be achieved, and David thereby provides the control parameters (input data) necessary for controlling the polishing to reach the target thickness.
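For illustration only, the calculation described in paragraph [0050] of David can be sketched as follows. David does not disclose a specific formula; the constant removal rate, the function name `adjusted_polish_time`, and all numerical values other than the 100 Angstrom excess are assumptions introduced here.

```python
def adjusted_polish_time(nominal_time_s, predicted_thickness_A,
                         target_thickness_A, removal_rate_A_per_s):
    """Return a polish time that compensates for a predicted thickness error.

    Assumption (not from David): material is removed at a constant rate,
    so each extra Angstrom of film costs 1 / removal_rate seconds.
    """
    if removal_rate_A_per_s <= 0:
        raise ValueError("removal rate must be positive")
    # Positive excess means the film is predicted to end up too thick.
    excess_A = predicted_thickness_A - target_thickness_A
    return nominal_time_s + excess_A / removal_rate_A_per_s


# Hypothetical numbers: the prediction is 100 Angstroms above target,
# with an assumed removal rate of 20 Angstroms per second.
new_time = adjusted_polish_time(60.0, 5100.0, 5000.0, 20.0)
print(new_time)
```

Under these assumed values the polish time is lengthened before the step is completed, which is the forward-looking use of a prediction that the passage relies on.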
The examiner further does not find convincing the argument that David does not teach “wherein each inverted machine learning model of the plurality of homogeneous inverted machine learning models shares a model architecture and is trained using a different set of data”, because David specifically recites “[0085] In step 612, the data is then fed into the algorithm for training. The algorithm could be one of many different types of algorithms. Examples of machine learning algorithms include… and Ensemble, including Boosting/Bagging”. In particular, the ensemble bagging technique is a well-known technique that utilizes a plurality of machine learning algorithms with the same/homogenous architecture that are fed different sets of data in their learning/training phase to produce different models of the same architecture, but with different configuration parameters (hyperparameters, etc.). For example, the provided Dieckman reference (Ensemble learning: Bagging and Boosting) explains that the Bagging technique is known to “[page 6] Let us focus first on the Bagging technique called bootstrap aggregation. Bootstrap aggregation aims to solve the right extreme of the previous chart by reducing the variance of the model to avoid overfitting. With this purpose, the idea is to have multiple models of the same learning algorithm that are trained by random subsets of the original training data. Those random subsets are called bags and can contain any combination of the data. Each of those datasets is then used to fit an individual model which produces individual predictions for the given data. Those predictions are then aggregated into one final classifier.”. 
In other words, because David states that an ensemble bagging technique is used, a PHOSITA would recognize this as meaning “a plurality of homogeneous inverted machine learning models that model the at least one semiconductor device manufacturing process… wherein each inverted machine learning model of the plurality of homogeneous inverted machine learning models shares a model architecture and is trained using a different set of data” in the context of David using a bagging technique of machine learning algorithms to determine expected input parameters for configuring a semiconductor manufacturing device to achieve a target expected output.
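The bagging arrangement discussed above can be pictured with the following minimal sketch. It is not taken from David or Dieckman: the one-variable linear model, the function names, and the synthetic data are all assumptions chosen only to show several models of one architecture, each fit to a different bootstrap sample, mapping a target output to a candidate input.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (the shared architecture)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:  # degenerate bag: every sampled x is identical
        return 0.0, my
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def train_bagged(outputs, inputs, n_models=5, seed=0):
    """Fit n_models lines, each on its own bootstrap sample ("bag")."""
    rng = random.Random(seed)
    n = len(outputs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        models.append(fit_line([outputs[i] for i in idx],
                               [inputs[i] for i in idx]))
    return models


def predict(models, target_output):
    """Return each model's candidate input and the aggregated (mean) input."""
    per_model = [a * target_output + b for a, b in models]
    return per_model, mean(per_model)


# Hypothetical history: measured outputs and the inputs that produced them.
history_out = [float(i) for i in range(10)]
history_in = [2.0 * o + 1.0 for o in history_out]
models = train_bagged(history_out, history_in)
candidates, aggregated = predict(models, 20.0)  # target outside the history
print(len(candidates), aggregated)
```

Each entry of `candidates` corresponds to one model's set of input data, and the mean is one possible aggregation.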
Applicant’s arguments with respect to the use of Liano in claims 1 and 11 have been considered but are moot because the new ground of rejection does not rely on the Liano reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
With the change in scope provided by the claim amendments, and the introduction of Sardeshmukh et al. (US 20200133248), a new ground of rejection under 35 U.S.C. 103 over David in view of Sardeshmukh is applied. See below for a mapping of features to the applied prior art.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claim(s) 1, 5-7, 11 and 15-17 are rejected under 35 U.S.C. 103 as being unpatentable over David (US 20170109646, hereinafter David) in view of Sardeshmukh et al. (US 20200133248, hereinafter Sardeshmukh).
In regards to Claim 1, David teaches “A method comprising: receiving, by a processing device, expected output data defining target value of an attribute of a semiconductor device to be manufactured by at least one semiconductor device manufacturing process performed within at least one processing chamber” ([0029] machine learning algorithms can be used to create new approaches to data analysis by incorporating new types of input data, and the data can be more effectively correlated, organized and pre-processed, then used to make process adjustments. Data from prior production runs can be used to create a model for a target parameter, and data from a current production run can be input to the model to generate a prediction for the target parameter, and to correlate the prediction with the actual data [0068] In step 602, a target is selected. In one embodiment, the target is an overlay measurement (e.g., IBO measurement, DBO measurement, CD-SEM, TEM, etc.) and could be a linear overlay offset in the x and y direction. The target could also be other lithography apparatus parameters that need to be controlled to minimize overlay error, such as reticle position, reticle rotation, or reticle magnification. The target could be parametric data such as on/off current of the transistor, transistor thresholds, or some other parameter that quantifies the health of the transistor. [0096] virtual metrology predictions generated from upstream process equipment and metrology data can be used as inputs to the model. This essentially represents a multi-step model or algorithm, where first the virtual metrology predictions are determined by a first algorithm. For example, the outputs can be used as inputs to another algorithm designed for overlay error compensation) “wherein the expected output data corresponds to an unexplored portion of a process space associated with the at least one semiconductor device manufacturing process” ([0088] FIG. 
7 illustrates one example of collecting input data for an input feature set 710, which is a matrix 712 having a number of input parameters 712a, 712b . . . 712x, which are relevant to a specified target, which may be a measurement, a calculated parameter, or a modeled parameter. The input data may be collected during wafer fabrication, at or before wafer test and sort and/or wafer probe testing. For example, input data can be collected from the process equipment 720 during steps for etch, CMP, gap fill, blanket, RTP, etc., and may include process variables such as process duration, temperature, pressure, RF frequency, etc.... In step 802, specified input data is collected, e.g., as an input vector, then fed into the model in step 804. If some of the specified data is not present in the 1×n vector, there are a number of techniques that can replace or estimate the missing data in the input vector; wherein missing data is an unexplored portion of a process space; [0104] In an embodiment, as new input data and corresponding target data is generated, the algorithm can be retrained so as to produce a better model that will give better scores; wherein the new target data is inherently from an unexplored space, because it is new and thus has not been used for training [0117] If a given input has a large number of missing or corrupted values, then that input feature may be removed from consideration in training the model. For example, if more than 50% of the data is not present for a given input feature, then that input feature can be thrown out. Alternatively, the missing data fields may be filled in with nominal values, or the records that do not contain values may be completely removed from the training dataset. 
A determination of which technique to use can be decided based on a human judgment of the importance of a given input feature; wherein the missing data fields imply an unexplored portion because those values are missing) “and identifying, by the processing device, expected input data by using the expected output data as input to a plurality of homogeneous inverted machine learning models that model the at least one semiconductor device manufacturing process” ([0050] machine learning algorithms can be used to control a manufacturing process step. As noted above, virtual metrology can be used to predict a critical dimension or film thickness for a manufacturing process step. Before or during processing of this manufacturing step, the prediction can then be used to set and/or control any number of processing parameters (e.g. run time) for that processing step. For example, in the case of CMP, if virtual metrology predicts that a dielectric film thickness will be 100 Angstroms thicker than the target thickness if the wafer was to be polished at the nominal polish time, then a calculation can be made to lengthen the polish time so that the final polished thickness can be closer to the target thickness. [0096] In some embodiments, virtual metrology predictions generated from upstream process equipment and metrology data can be used as inputs to the model. This essentially represents a multi-step model or algorithm, where first the virtual metrology predictions are determined by a first algorithm. For example, the outputs can be used as inputs to another algorithm designed for overlay error compensation, overlay error measurement, or yield prediction [0098] The algorithm can be a classification or regression algorithm, which are types of machine learning algorithms, but could be one of many different types of algorithms. 
Examples of some of these algorithms that can be used include: Decision Trees, CART (Classification and Regression Trees), C5.0, C4.5, CHAID, Support Vector Regression, Artificial Neural Networks, Perceptron, Back Propagation, Deep Learning, Ensemble, Boosting/Bagging, Random Forests, GBM (Gradient Boosting Machine), AdaBoos; wherein the models of David are inverted machine learning models, as they determine an expected configuration parameter from historical training data of machine learning models trained with previous manufacturing runs and the ensemble with bagging techniques implies plurality of homogenous models that are trained with different training data) “wherein each inverted machine learning model of the plurality of homogeneous inverted machine learning models is trained to determine, by performing linear extrapolation based on the expected output data, a respective set of input data of a plurality of sets of input data for configuring the semiconductor device manufacturing process to manufacture the semiconductor device having the attribute” ([0090] In a typical situation, the score can be the overlay offset prediction, for example, an offset in the x direction or the y direction. In step 808, the score is used to determine an adjustment to be made to one or more components of the lithographic apparatus. For example, the offset data could be applied to a control system to make an adjustment to the lithography apparatus parameters or “control knobs” to adjust for the overlay error. [0092] In one embodiment, machine learning algorithms could be used with all or some of the above mentioned input data, along with CD error measurement and overlay error measurement to create a model whose target is a lithography apparatus control parameter, such as focus, power, or x-y direction control. 
The goal is to optimize the lithography apparatus control parameter (given a measured CD) such that the lithography apparatus output results in the best semiconductor device performance or yield. [0120] There are also a number of approaches to feature selection. One approach is implementing random forests which identify which input features are most relevant to predicting overlay error. Another technique is the CHAID decision tree, which will also identify features that are important. Linear regression is another technique. ANOVA is another technique) “wherein each inverted machine learning model of the plurality of homogenous inverted machine learning models shares a model architecture and is trained using a different set of data” ([0098] The algorithm can be a classification or regression algorithm, which are types of machine learning algorithms, but could be one of many different types of algorithms. Examples of some of these algorithms that can be used include:.. Ensemble, Boosting/Bagging; [0156] In some embodiments, the model is trained on a portion of the data. It is then tested on a different portion of the data that is blind to the training phase. K-fold cross validation can also be applied to determine the robustness of the model. In the case of boosted on bagged algorithms, a training, testing, and validation dataset can be partitioned, where the validation set is completely blind while the testing set is used to optimize the model; wherein a bagging technique is known to use a plurality of homogeneous machine learning models that share an architecture, each of which is trained with different training data).
David fails to explicitly teach “…machine learning models is trained to determine, by performing linear extrapolation based on the expected output data, a respective set of input data…”. That is, while David suggests linear regression type mathematical feature identification for identifying parameters for configuring the semiconductor manufacturing device, it is deficient in teaching the performance of linear extrapolation for determining such parameters.
Sardeshmukh teaches “wherein the expected output data corresponds to an unexplored portion of a process space” ([0005] An alternative is to use fast and approximate data-driven models learnt from data generated from carefully designed limited number of expensive simulations. Hence, inverse prediction model is useful in problems with large design spaces, for narrowing down the design space so that expensive simulations and experiments could be used to explore the narrowed down design space.) “…machine learning models is trained to determine, by performing linear extrapolation based on the expected output data, a respective set of input data…” ([0016] Referring FIG. 1, a system (100) to predict a configuration of a manufacturing process for desired properties of a product is provided. Further herein, the system is configured for an inverse inference of a chain of manufacturing process to predict the configuration of the chain of the manufacturing processes for the desired properties of the product. A variant of the conditional Linear Gaussian Bayesian network to be used. As all the variables (process parameters and properties) being modeled are continuous variables, a Bayesian network variant that supports continuous distributions is needed. Further, the system comprises a model capable of representing non-linear relationships by learning piecewise linear approximations. [0052] The embodiments of present disclosure herein addresses unresolved problem of prediction of outputs/properties given the processing parameters (forward prediction problem) and prediction of inputs/process parameters required to achieve desired outputs/properties of a product. Moreover, the embodiments herein further provides a system and method to predict a configuration of a manufacturing process for desired properties of the product. 
Further herein, the system is configured for an inverse inference of a chain of manufacturing process to predict the configuration of the chain of the manufacturing processes for the desired properties of the product. A variant of the conditional Linear Gaussian Bayesian network to be used. As all the variables (process parameters and properties) being modeled are continuous variables, a Bayesian network is needed for continuous variables. Further, the system comprises a model capable of representing non-linear relationships and the model is capable of learning piecewise linear approximations; wherein the learning/training of piecewise linear approximation between the inputs and outputs of the model is a linear extrapolation).
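As a simplified picture of the interpretation stated above, the following sketch extends a piecewise linear mapping from desired output to process input beyond the sampled range by continuing the nearest end segment. It is not Sardeshmukh's conditional Linear Gaussian Bayesian network; the function name, the knot values, and the choice to extend the end segments are assumptions made only for illustration.

```python
from bisect import bisect_right


def inverse_piecewise_linear(knots, target_output):
    """Map a desired output to an input using a piecewise linear fit.

    knots: (output_value, input_value) pairs sorted by strictly
    increasing output_value. Targets outside the sampled range are
    handled by extending the nearest end segment (linear extrapolation).
    """
    if len(knots) < 2:
        raise ValueError("need at least two knots")
    outs = [o for o, _ in knots]
    # Segment containing the target, clamped to the first or last
    # segment when the target lies outside the sampled outputs.
    i = max(0, min(bisect_right(outs, target_output) - 1, len(knots) - 2))
    (o0, x0), (o1, x1) = knots[i], knots[i + 1]
    slope = (x1 - x0) / (o1 - o0)
    return x0 + slope * (target_output - o0)


# Hypothetical knots: (property of the product, process parameter).
knots = [(100.0, 10.0), (200.0, 30.0), (300.0, 40.0)]
print(inverse_piecewise_linear(knots, 250.0))  # inside the sampled range
print(inverse_piecewise_linear(knots, 400.0))  # beyond the sampled range
```

The second call returns a value for a target that no knot covers, which is the sense in which a learned linear segment can supply a solution for a portion of the space not represented in the data.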
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the system that utilizes multiple machine learning models to determine control parameters for a semiconductor process as taught by David, with the use of machine learning models that extrapolate solutions of unexplored/unknown space using linear extrapolation as taught by Sardeshmukh, because it would provide the stated improvement of Sardeshmukh, namely “[0006] Embodiments of the present disclosure present technological improvements as solutions to one or more of the above-mentioned technical problems recognized by the inventors in conventional systems. For example, in one embodiment, a system to predict a configuration of a manufacturing process for desired properties of a product is provided.”. In other words, an improved model that is able to make determinations outside the training data set would be realized by David when incorporating the features of Sardeshmukh. Furthermore, both David and Sardeshmukh use machine learning to determine control/configuration parameters for a manufacturing process, which supports their combination. Combining these elements amounts to taking the known use of a machine learning model that applies linear extrapolation into unknown space to determine control parameters of a manufacturing process, and incorporating these features into the known machine learning models that determine control parameters for semiconductor manufacturing processes using expected outputs as inputs to a plurality of homogeneous machine learning models, in a known way that achieves predictable results.
In regards to Claim 11, the servers and clients of David ([0165]) teach the recited structures. Claim 11 corresponds with a system that performs the method of claim 1, and thus claim 11 is rejected under 35 U.S.C. 103 using a similar analysis as applied to claim 1.
In regards to Claim 5, the combination of David and Sardeshmukh teaches the method as incorporated by claim 1 above. David further teaches “The method of claim 1, wherein each set of input data comprises data related to performing the semiconductor device manufacturing process that is indicative of at least one of: time, energy, temperature, voltage, gas flow rate, wafer spin speed, distance, pressure, a precursor, a reactant, or a dilutant” ([0046] The algorithm can be a supervised learning algorithm, where a model can be trained using a set of input data and measured targets. The targets can be the critical dimensions that are to be controlled. The input data can be upstream metrology measurements, or data from process equipment (such as temperatures and run times)).
In regards to Claim 15, the servers and clients of David ([0165]) teach the recited structures. Claim 15 corresponds with a system that performs the method of claim 5, and thus claim 15 is rejected under 35 U.S.C. 103 using a similar analysis as applied to claim 5.
In regards to Claim 6, the combination of David and Sardeshmukh teaches the method as incorporated by claim 1 above. David further teaches “The method of claim 1, wherein the expected output data for the manufacturing process comprises one or more values that indicate a layer thickness, a layer uniformity, or a structural width of a product that will be output by the manufacturing process.” ([0046] In another example, virtual metrology can use machine learning algorithms to predict metrology metrics such as film thickness and critical dimensions (CD) without having to take actual measurements, in real-time. This can have a big impact on throughput and also lessen the need for expensive TEM or SEM x-section measurements; [0050] In yet another example, machine learning algorithms can be used to control a manufacturing process step. As noted above, virtual metrology can be used to predict a critical dimension or film thickness for a manufacturing process step. Before or during processing of this manufacturing step, the prediction can then be used to set and/or control any number of processing parameters (e.g. run time) for that processing step. For example, in the case of CMP, if virtual metrology predicts that a dielectric film thickness will be 100 Angstroms thicker than the target thickness if the wafer was to be polished at the nominal polish time, then a calculation can be made to lengthen the polish time so that the final polished thickness can be closer to the target thickness.).
In regards to Claim 16, the servers and clients of David ([0165]) teach the recited structures. Claim 16 corresponds with a system that performs the method of claim 6, and thus claim 16 is rejected under 35 U.S.C. 103 using a similar analysis as applied to claim 6.
In regards to Claim 7, the combination of David and Sardeshmukh teaches the method as incorporated by claim 1 above. David further teaches “The method of claim 1, wherein each inverted machine learning model of the plurality of homogenous machine learning models is trained using at least one of: a different hyperparameter, a different initialization value, or different training data” ([0085] In step 612, the data is then fed into the algorithm for training. The algorithm could be one of many different types of algorithms. Examples of machine learning algorithms include... and Ensemble, including Boosting/Bagging, Random Forests, and GBM (Gradient Boosting Machine). The best algorithm may not be a single algorithm, but can be an ensemble of algorithms; wherein bagging is a form of ensemble learning in which models of the same architecture are trained with different training data).
In regards to Claim 17, the servers and clients of David ([0165]) teach the recited structures. Claim 17 corresponds with a system that performs the method of claim 7, and thus claim 17 is rejected under 35 U.S.C. 103 using a similar analysis as applied to claim 7.
Claims 2-4, 10, 12-14 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over David and Sardeshmukh as applied to claims 1 and 11 above, and further in view of Tristan et al. (US 20190095805, hereinafter Tristan).
In regards to Claim 2, the combination of David and Sardeshmukh teaches the method as incorporated by claim 1 above.
Sardeshmukh teaches “…manufacturing process inputs defining an extrapolated solution corresponding to the unexplored portion of the process space” ([0015] Indeed, the model 14 may be capable of control, prediction, and optimization of the system 12. For example, the model 14 may be capable of process control, quality control, energy use optimization (e.g., electricity use optimization, fuel use optimization), product mix management, financial optimization, and so forth. [0027] By incorporating the asymptotic behavior of the system 10, the resulting modeled system 34 may be capable of a substantially improved extrapolation behavior, including the ability to more closely model the actual system 10. [0032] If the model 14 is deemed not suitable for use, then the logic 40 may loop to block 42 to repeat the model's training process. Indeed, the model 14 may be iteratively trained so as to achieve an accuracy, a high-order behavior, and extrapolation properties suitable for modeling the system 10. The model 14 may include neural network and/or support vector machine embodiments capable of employing the techniques described herein, including asymptotic analysis techniques, capable of superior extrapolation properties that may be especially useful in control, prediction, and optimization applications.).
The combination of David and Sardeshmukh fails to teach “The method of claim 1, further comprising: combining, by the processing device, at least a first set of input data of the plurality of sets of input data with a second set of input data of the plurality of sets of input data to generate a set of semiconductor device manufacturing process inputs …, wherein the set of semiconductor device manufacturing process inputs comprises a plurality of candidate values; and storing, by the processing device, the set of semiconductor device manufacturing process inputs in a storage device”.
Cay teaches “The method of claim 1, further comprising: combining, by the processing device, at least a first set of input data of the plurality of sets of input data with a second set of input data of the plurality of sets of input data to generate a set of semiconductor device manufacturing process inputs” ([col 3 line 63] Certain aspects and features of the present disclosure relate to optimizing a manufacturing process for an object (e.g., a physical product) using a combination of an optimization model and one or more machine learning models, such as a neural network...The recommended set of values can be the combination of values for the configurable settings that best meets a user-defined goal (e.g., a particular quality level or price point), as compared to all of the other combinations of values analyzed during the optimization process....More specifically, a computing system can execute an optimization model to identify a recommended set of values for configurable settings of a manufacturing process. Executing the optimization model can involve implementing an iterative process for maximizing or minimize an objective function) “wherein the set of semiconductor device manufacturing process inputs comprises a plurality of candidate values; and storing, by the processing device, the set of semiconductor device manufacturing process inputs in a storage device” ([col 2 line 15] Each iteration of the iterative process can include selecting a current set of candidate values for the configurable settings from within a current region of a search space defined by the optimization model, the current set of candidate values being selected for use in a current iteration of the iterative process; [col 4 line 41] during each iteration of the optimization model, the optimization model can first determine a current set of values for the configurable settings to analyze. 
In a typical optimization process, the optimization model may next input the current set of values to an objective function that is a predefined linear equation. But in some examples described herein, the optimization model can instead provide the current set of values as input to one or more trained machine learning models that are separate from the optimization model. The optimization model may communicate with the one or more trained machine learning models via an application programming interface (API). The trained machine learning models can receive the current set of values and generate respective output values based on the current set of values; [col 7 line 7] Network-attached data stores 110 can store data to be processed by the computing environment 114 as well as any intermediate or final data generated by the computing system in non-volatile memory. But in certain examples, the configuration of the computing environment 114 allows its operations to be performed such that intermediate and final data results can be stored solely in volatile memory (e.g., RAM), without a requirement that intermediate or final data results be stored to non-volatile types of memory (e.g., disk)).
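For illustration only, the iterative process described in the cited passages of Cay (select a current set of candidate values from a current region of a search space, provide them to a separate trained model, and retain the results) can be sketched as follows. This is a minimal sketch and not Cay's implementation: the names `surrogate_model` and `optimize`, the random-search strategy, the region-shrinking rule, the in-memory `history` list and the quadratic stand-in for the trained model are all assumptions made for the example.

```python
import random


def surrogate_model(values):
    """Hypothetical stand-in for a trained machine learning model.

    Returns a predicted target characteristic for a set of candidate
    settings; here a simple quadratic with its minimum at (2.0, -1.0).
    """
    x, y = values
    return (x - 2.0) ** 2 + (y + 1.0) ** 2


def optimize(model, center, radius, iterations=200, shrink=0.98, seed=0):
    """Random search that minimizes model(values) over a shrinking region."""
    rng = random.Random(seed)
    best_values, best_score = tuple(center), model(center)
    history = [(best_values, best_score)]  # stored candidate sets (assumed in-memory)
    for _ in range(iterations):
        # Select a current set of candidate values within the current region.
        candidate = tuple(c + rng.uniform(-radius, radius) for c in best_values)
        # Provide the candidate to the separate model and read its prediction.
        score = model(candidate)
        history.append((candidate, score))
        if score < best_score:
            best_values, best_score = candidate, score
        radius *= shrink  # narrow the search region for the next iteration
    return best_values, best_score, history


recommended, predicted, stored = optimize(
    surrogate_model, center=(0.0, 0.0), radius=1.0
)
```

In this sketch `stored` holds every candidate set that was evaluated, and `recommended` is the stored set with the lowest predicted value.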
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the system that determines semiconductor manufacturing parameters using an ensemble of neural networks as taught by David and Sardeshmukh with the use of saving sets of candidate values for control parameters determined via machine learning that have been combined until a most optimized solution is found as taught by Cay, because it would gain the benefit of Cay, namely finding a most optimized solution that yields improvements to the manufacturing process or manufactured object ([col 4]). Combining these elements amounts to taking the known ability to generate sets of candidate input data for configuring a semiconductor manufacturing process and saving them to a memory, and using these features in the machine learning ensemble of David and Sardeshmukh in a known way that achieves predictable results.
In regards to Claim 12, the servers and clients of David ([0165]) teach the recited structures. Claim 12 corresponds with a system that performs the method of claim 2, and thus claim 12 is rejected under 35 U.S.C. 103 using a similar analysis as applied to claim 2.
In regards to Claim 3, the combination of David, Sardeshmukh and Cay teaches the method as incorporated by claim 2 above. Cay further teaches “The method of claim 2, further comprising clustering, by the processing device, the first set of input data and the second set of input data into a plurality of groups, wherein each group of the plurality of groups comprises a respective value for the first set of input data and a respective value for the second set of input data” ([col 2 line 11] The operations can include executing an optimization model to identify a recommended set of values for configurable settings of a manufacturing process associated with an object. The optimization model can be configured to determine the recommended set of values by implementing an iterative process using an objective function. Each iteration of the iterative process can include selecting a current set of candidate values for the configurable settings from within a current region of a search space defined by the optimization model, the current set of candidate values being selected for use in a current iteration of the iterative process; providing the current set of candidate values as input to a trained machine learning model that is separate from the optimization model, the trained machine learning model being configured to predict a value for a target characteristic of the object or the manufacturing process based on the current set of candidate values; [col 26 line 7] FIG. 11 is a flow chart of an example of a process for generating and using a machine learning model according to some aspects. Machine learning is a branch of artificial intelligence that relates to mathematical models that can learn from, categorize, and make predictions about data. 
Such mathematical models, which can be referred to as machine learning models, can classify input data among two or more classes; cluster input data among two or more groups; [col 27 line 64] In block 1112, the trained machine learning model is used to analyze the new data and provide a result. For example, the new data can be provided as input to the trained machine learning model. The trained machine learning model can analyze the new data and provide a result that includes a classification of the new data into a particular class, a clustering of the new data into a particular group, a prediction based on the new data, or any combination of these).
In regards to Claim 13, the servers and clients of David ([0165]) teach the recited structures. Claim 13 corresponds with a system that performs the method of claim 3, and thus claim 13 is rejected under 35 U.S.C. 103 using a similar analysis as applied to claim 3.
In regards to Claim 4, the combination of David, Sardeshmukh and Cay teaches the method as incorporated by claim 2 above. David further teaches “The method of claim 2, wherein the plurality of candidate values comprises a range of values for the first set of input data and a range of values for the second set of input data.” ([0073] If the reflectometry data is collected by illuminating the target with unpolarized broadband light and has a detectable wavelength range of 250 nm to 850 nm, then the user could choose to sample that light from 250 nm to 850 nm at 2 nm intervals, to get a total of 301 spectral intensity measurements for that wavelength range. These 301 samples would each be an input to the algorithm. An example of how the input data is associated with a target is shown in Table III. [0090] If the target was a parametric test value, then the score will be a prediction of that parametric test value. In a typical situation, the score can be the overlay offset prediction, for example, an offset in the x direction or the y direction. In step 808, the score is used to determine an adjustment to be made to one or more components of the lithographic apparatus. For example, the offset data could be applied to a control system to make an adjustment to the lithography apparatus parameters or “control knobs” to adjust for the overlay error. [0202] In the case where a prediction can be made, that prediction may then be checked to ensure that the prediction is within acceptable bounds set by the user or the system. In the case that the prediction is beyond these bounds, the client will execute its “safe mode” action. In the case that the prediction is within the expected ranges, the prediction is delivered in accordance to the user specified manner. A log of the data, model used, and the prediction may be kept; wherein any set of multiple values inherently spans a range, or alternatively the range corresponds to the acceptable bounds defined by David).
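As an arithmetic check on the cited paragraph [0073] of David, sampling from 250 nm to 850 nm inclusive at 2 nm intervals gives (850 - 250)/2 + 1 = 301 values. The short Python sketch below reproduces that count and the span of the sampled values; the variable names are illustrative only and do not come from David.

```python
# Sampling grid described in David [0073]: 250 nm to 850 nm at 2 nm intervals.
start_nm, stop_nm, step_nm = 250, 850, 2

# range() excludes its stop value, so add 1 to include the 850 nm endpoint.
wavelengths_nm = list(range(start_nm, stop_nm + 1, step_nm))

sample_count = len(wavelengths_nm)  # number of spectral intensity inputs
value_range = (min(wavelengths_nm), max(wavelengths_nm))  # span of the samples
```

Here `sample_count` evaluates to 301 and `value_range` to `(250, 850)`, consistent with the figures quoted from David.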
In regards to Claim 14, the servers and clients of David ([0165]) teach the recited structures. Claim 14 corresponds with a system that performs the method of claim 4, and thus claim 14 is rejected under 35 U.S.C. 103 using a similar analysis as applied to claim 4.
In regards to Claim 10, the combination of David and Sardeshmukh teaches the method as incorporated by claim 1 above.
The combination of David and Sardeshmukh fails to teach “The method of claim 1, further comprising: providing, by the processing device for display, a plurality of candidate input value sets, wherein each candidate input value set of the plurality of candidate input value sets corresponds to the expected output data for the semiconductor device manufacturing process; receiving, by the processing device, a user selection of a candidate input value set of the plurality of candidate input value sets to obtain a selected candidate input value set; and initiating, by the processing device, a run of the semiconductor device manufacturing process using the selected candidate input value set”.
Cay teaches “The method of claim 1, further comprising: providing, by the processing device for display, a plurality of candidate input value sets, wherein each candidate input value set of the plurality of candidate input value sets corresponds to the expected output data for the semiconductor device manufacturing process” ([col 32 line 28] the processing device can transmit the electronic communication over a network to a remote user device (e.g., a laptop computer, mobile phone, or tablet) associated with an operator of the manufacturing process. The user device can receive the electronic communication and responsively output the recommended set of values on a display device to the operator, who may be located on the manufacturing floor or otherwise close to a control panel associated with the manufacturing process. Based on the output, the operator can adjust the configurable settings to the recommended set of values to improve the manufacturing process. As still another example, the electronic communication can be a display signal for generating a graphical user interface on a display device, such as a touch-screen display or a liquid crystal display. The graphical user interface can include the recommended set of values. An operator of the manufacturing process can view the graphical user interface on the display device and tune the configurable settings to the recommended set of values, to improve the manufacturing process...(151) The iterative process can begin at block 1402, in which a processing device executing the optimization model can select a current set of candidate values for the configurable settings to be used in the current iteration of the iterative process. 
The current set of candidate values can be selected from within a current region of a search space defined by the optimization model; [col 2 line 23] the trained machine learning model being configured to predict a value for a target characteristic of the object or the manufacturing process based on the current set of candidate values) “receiving, by the processing device, a user selection of a candidate input value set of the plurality of candidate input value sets to obtain a selected candidate input value set” ([col 22 line 47] a user may interact with one or more user interface windows presented to the user in a display under control of the ESPE independently or through a browser application in an order selectable by the user. For example, a user may execute an ESP application, which causes presentation of a first user interface window, which may include a plurality of menus and selectors such as drop down menus, buttons, text boxes, hyperlinks, etc. associated with the ESP application as understood by a person of skill in the art) “and initiating, by the processing device, a run of the semiconductor device manufacturing process using the selected candidate input value set” ([col 4 line 5] The optimization model and the machine learning models can cooperate with one another to determine a recommended set of values for configurable settings of the manufacturing process. The recommended set of values can be the combination of values for the configurable settings that best meets a user-defined goal (e.g., a particular quality level or price point), as compared to all of the other combinations of values analyzed during the optimization process. In some examples, the recommended set of values can be the optimal set of values as determined by the optimization process. The recommended set of values can then be applied to the manufacturing process, which can yield significant improvements to the manufacturing process or the manufactured object).
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention to have modified the system which determines control parameters for a semiconductor manufacturing process using an ensemble of machine learning models as taught by David and Sardeshmukh, with the use of a user interface that allows user selection of candidate values and which optimizes the candidates until a user-defined goal is achieved as taught by Cay, because it would gain the obvious benefit of providing a user interface for controlling and testing different candidate parameters, thus improving the user experience. Combining these elements amounts to taking the known display methods of Cay and applying them to David in a known way that achieves predictable results.
Allowable Subject Matter
Claims 8-9 and 18-19 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
In particular, taking the totality of claim 1 in mind, the use of Feed Forward Neural Networks as an ensemble has not been found in relation to the plurality of homogenous neural networks that receive expected output of a manufacturing process as inputs to the ensemble, and output sets of input data used in configuring the manufacturing process to manufacture the product.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JONATHAN M SKRZYCKI whose telephone number is (571)272-0933. The examiner can normally be reached M-Th 7:30-3:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ken Lo, can be reached at 571-272-9774. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/JONATHAN MICHAEL SKRZYCKI/ Examiner, Art Unit 2116