DETAILED ACTION
This action is written in response to the remarks and amendments dated 10/31/25. This action is made final. The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
In view of the Applicant’s arguments as well as the 2019 Revised Patent Subject Matter Eligibility Guidance (84 Fed. Reg. 50, Jan. 7, 2019), the Examiner withdraws all outstanding rejections under §101.
The Applicants argue that the previous art of record does not anticipate or render obvious the claims as currently amended. The Examiner provides updated prior art rejections below necessitated by the current amendments.
Subject Matter Eligibility
In determining whether the claims are subject matter eligible, the examiner has considered and applied the 2019 USPTO Patent Eligibility Guidelines, as well as guidance in the MPEP chapter 2106. The examiner finds that the independent claims are directed to the practical application of identifying deficiencies in semiconductor substrates.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all obviousness rejections set forth in this Office action:
(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are such that the subject matter as a whole would have been obvious at the time the invention was made to a person having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the manner in which the invention was made.
The following are the references relied upon in the rejections below:
Zhang (US 2020/0327654 A1)
Melville (Melville, Prem, and Raymond J. Mooney. "Diverse ensembles for active learning." In Proceedings of the twenty-first international conference on Machine learning, p. 74. 2004.)
Saqlain (Saqlain, Muhammad, Bilguun Jargalsaikhan, and Jong Yun Lee. "A voting ensemble classifier for wafer map defect patterns identification in semiconductor manufacturing." IEEE Transactions on Semiconductor Manufacturing 32, no. 2 (2019): 171-182.)
Claims 1-3, 5-9 and 11 are rejected under 35 U.S.C. 103 as being unpatentable over Zhang and Saqlain.
Regarding claim 1, Zhang discloses an apparatus for treating a substrate, comprising:
at least one sensor configured to measure a condition of the substrate or the apparatus in a process of the treating of the substrate;
Fig. 1 (reproduced below).
[Figure: Zhang, Fig. 1]
substrate or apparatus :: specimen 14.
[0025] “In one embodiment, the specimen is a wafer.”
sensor :: detector 28 and/or detector 34.
[0035] “Therefore, the imaging system may include the detection channel that includes collector 24, element 26, and detector 28 and that is centered in the plane of incidence and configured to collect and detect light at scattering angle(s) that are at or close to normal to the specimen surface.” (Emphasis added.)
[0003]-[0004] “Fabricating semiconductor devices such as logic and memory devices typically includes processing a substrate such as a semiconductor wafer using a large number of semiconductor fabrication processes to form various features and multiple levels of the semiconductor devices. …. Inspection processes are used at various steps during a semiconductor manufacturing process to detect defects on wafers to drive higher yield in the manufacturing process and thus higher profits. Inspection has always been an important part of fabricating semiconductor devices.” (Emphasis added.)
a data collecting unit configured to execute a program stored on a computer-readable recording medium to collect in a time series data measured by the at least one sensor; and
[0037] “As such, the output that is generated by each of the detectors may be signals or data, …. in other instances, the detectors may be configured as imaging detectors that are configured to generate imaging signals”. (Emphasis added.)
See also fig. 1, ‘Computer subsystem 36’.
a data processing unit configured to detect an occurrence of an issue on the apparatus for treating the substrate by learning the data collected by the data collecting unit and detecting a change in current data measured by the at least one sensor
[0001] “The present invention generally relates to methods and systems for learnable defect detection for semiconductor applications. Certain embodiments relate to systems and methods for detecting defects on a specimen using a deep metric learning defect detection model and/or a learnable low-rank reference image generator.” (Emphasis added.)
[0075] “The outputs of networks 228 and 230 are input to fully connected layer and SoftMax layer combination 234 in Block C 236, which generates output 238, which includes the final labels described above (i.e., defective or not).” (Emphasis added.)
wherein the data collecting unit sequentially defines and samples pairs of the data collected in the time series,
[0143] “For example, as shown in FIG. 3, a pair of test and corresponding, LPCA generated reference images (e.g., a pair of test image 306 and corresponding, LPCA-generated reference image 308) may be input to DL Feature Finders 310, which may output features 312 for the test image and features 314 for the reference image.”
Saqlain discloses the following further limitation which Zhang does not disclose:
a plurality of normality tests are performed on the pairs of the data collected in the time series to determine a frequency of abnormal I/O, and
P. 174, fig. 1.
P. 174, fig. 2, caption: “Experimental data frequencies of different wafer map defect classes.”
an issue association is determined based on the frequency of the abnormal I/O.
Id. “Ensemble results.”
P. 174, first col. “All base classifiers have some implementation limitation, so the use of an ensemble approach (combination of base classifiers) is a common practice. Fig. 1 shows the basic steps of an ensemble-based WM classification approach.”
At the time of filing, it would have been obvious to a person of ordinary skill to combine the ensemble learning technique disclosed by Saqlain with the semiconductor defect detection system of Zhang because ensemble classifiers can yield improved classification results over any individual classifier. Both disclosures pertain to semiconductor defect detection.
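For illustration only, the general idea of combining base classifiers by voting can be sketched as follows. This is a minimal sketch that assumes a hard majority vote; it is not asserted to be the specific voting scheme of Saqlain. The base classifiers, feature names, and class labels are hypothetical.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label; ties go to the label seen first."""
    counts = Counter(labels)
    top = max(counts.values())
    for label in labels:
        if counts[label] == top:
            return label


def ensemble_predict(classifiers, sample):
    """Ask every base classifier for a label and combine them by majority vote."""
    return majority_vote([clf(sample) for clf in classifiers])


# Hypothetical base classifiers: simple threshold rules over assumed features.
def by_edge(sample):
    return "edge" if sample["edge_fail_ratio"] > 0.5 else "none"


def by_center(sample):
    return "center" if sample["center_fail_ratio"] > 0.5 else "none"


def by_total(sample):
    return "none" if sample["total_fail_ratio"] < 0.1 else "edge"


wafer = {"edge_fail_ratio": 0.8, "center_fail_ratio": 0.1, "total_fail_ratio": 0.3}
print(ensemble_predict([by_edge, by_center, by_total], wafer))
```

Two of the three hypothetical rules vote for the same class, so the ensemble returns that class even though one base classifier disagrees.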
Regarding claim 2, Zhang discloses the following further limitation wherein the data processing unit comprises:
a data learning unit configured to learn past data collected by the data collecting unit using a Siamese network; and
[0068] “In one embodiment, the DML defect detection model has a Siamese network architecture. In another embodiment, the DML defect detection model has a triplet network architecture. In an additional embodiment, the DML defect detection model has a quadruplet network architecture. For example, the DML can be constructed by Siamese network, triplet network, quadruplet network, etc.” (Emphasis added.)
a data inspecting unit configured to detect the issue on the apparatus for treating the substrate has occurred in the current data based on learned data.
Id. See generally [0068]-[0070].
Regarding claim 3, Zhang discloses the following further limitation wherein the data collecting unit
collects a first data before the issue on the apparatus for treating the substrate and a second data after the issue on the apparatus for treating the substrate and
[0070] “Illustrated by FIG. 2, in one construction of a Siamese detection model that may be used in the embodiments described herein, the test images may include N BBP images 202 and 204 from N adjacent dies and one design image 200. These images may be selected from the same die coordinates with identical fields of view (FOV) (i.e., the same die coordinates in multiple dies centered on the same within die location). Blocks A, B, and C are three different deep CNNs. The two networks shown in Block B have the same architecture configuration. In addition to the two networks in Block B having the same architecture, the weights for both networks have to be shared by the networks for the network to have a Siamese architecture. First, the test images go through Block A to calculate the reference features, which is the average of N outputs from Block A. Second, both test and reference features go through Block B to measure the distance between them based on the outputs from Block B. Third, Block C is applied to generate the final labels (defective vs. non-defective) for each image pixel location.” (Emphasis added.)
the data learning unit learns the first data and the second data using the Siamese network, and learns whether a data related to the issue on the apparatus for treating the substrate indicates that the change has occurred.
Id.
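For illustration only, the three-stage flow described in Zhang [0070] (reference features averaged over N outputs, a distance measured through two branches that share weights, and a per-pixel label) can be sketched as follows. This is a simplified sketch, not Zhang's implementation: the deep CNNs are replaced by trivial placeholder functions, and all function names, weights, and the threshold are assumptions.

```python
import math


def reference_features(adjacent_die_images):
    """Stand-in for Block A: average the N per-die outputs pixel by pixel."""
    n = len(adjacent_die_images)
    return [sum(pixels) / n for pixels in zip(*adjacent_die_images)]


def shared_embedding(features, weights):
    """Stand-in for one branch of Block B: map each pixel into a latent vector.

    Both branches call this function with the SAME weights, which is what
    makes the arrangement Siamese.
    """
    return [tuple(w * f for w in weights) for f in features]


def pixel_distances(test_features, ref_features, weights):
    """Euclidean distance in latent space between test and reference, per pixel."""
    test_emb = shared_embedding(test_features, weights)
    ref_emb = shared_embedding(ref_features, weights)
    return [math.dist(t, r) for t, r in zip(test_emb, ref_emb)]


def label_pixels(distances, threshold):
    """Stand-in for Block C: 1 = defective, 0 = non-defective."""
    return [1 if d > threshold else 0 for d in distances]


# Hypothetical data: two adjacent-die images and one test image, three pixels each.
adjacent = [[1.0, 1.0, 1.0], [1.0, 1.0, 3.0]]
test_image = [1.0, 1.0, 6.0]
weights = (1.0, 0.5)  # assumed shared weights

ref = reference_features(adjacent)
labels = label_pixels(pixel_distances(test_image, ref, weights), threshold=1.0)
print(labels)
```

Only the third pixel of the test image lies far from the averaged reference in the latent space, so only that pixel is labeled defective.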
Regarding claim 5, Zhang discloses the following further limitation wherein the data learning unit sets any one of the first data as a reference value, and learns by setting a relationship between another first data except for the any one of the first data and the reference value as 0, and by setting a relationship between the reference value and the second data as 1.
[0067] “For example, the distance in latent space is used to decide whether each portion (e.g., each pixel) in the test image is defective with respect to the reference image. In this manner, the DML defect detection model may make a binary decision of whether each pixel is a defect or not.”
[0070] “Third, Block C is applied to generate the final labels (defective vs. non-defective) for each image pixel location.”
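For illustration only, the pair-labeling scheme recited in claim 5 can be sketched as follows. This is one possible reading of the claim language: the first element of the first data is assumed to be the reference value, and the function name and sample values are hypothetical.

```python
def build_labeled_pairs(first_data, second_data):
    """Pair a reference value with every other sample and attach a 0/1 label.

    0: the pair is (reference, another first data), i.e. no change.
    1: the pair is (reference, second data), i.e. a change has occurred.
    """
    reference = first_data[0]  # assumption: any one of the first data may serve
    pairs = [((reference, x), 0) for x in first_data[1:]]
    pairs += [((reference, y), 1) for y in second_data]
    return pairs


# Hypothetical readings collected before and after an issue.
before_issue = [20.0, 20.1, 19.9]
after_issue = [25.0, 25.2]
for pair, label in build_labeled_pairs(before_issue, after_issue):
    print(pair, label)
```

The binary labels correspond to the defective versus non-defective decision that Zhang describes at [0067] and [0070].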
Regarding claim 6, Zhang discloses the following further limitation wherein the data inspecting unit tests a validity test of the first data learned by the data learning unit using the current data measured by the at least one sensor.
[0010] “Another embodiment relates to a computer-implemented method for detecting defects on a specimen. The method includes projecting a test image generated for a specimen and a corresponding reference image into latent space. The method also includes, for one or more different portions of the test image, determining a distance in the latent space between the one or more different portions and corresponding one or more portions of the corresponding reference image. In addition, the method includes detecting defects in the one or more different portions of the test image based on distances determined for the one or more different portions of the test image, respectively. The projecting, determining, and detecting steps are performed by a DML defect detection model that is included in one or more components executed by one or more computer systems.”
Regarding claim 7, Zhang discloses the following further limitation wherein the data inspecting unit checks a first output by inputting a plurality of data recognized at the at least one sensor as an input value of the Siamese network learned at the data learning unit after the validity test is completed.
[0070] “Illustrated by FIG. 2, in one construction of a Siamese detection model that may be used in the embodiments described herein, the test images may include N BBP images 202 and 204 from N adjacent dies and one design image 200. These images may be selected from the same die coordinates with identical fields of view (FOV) (i.e., the same die coordinates in multiple dies centered on the same within die location). Blocks A, B, and C are three different deep CNNs. The two networks shown in Block B have the same architecture configuration. In addition to the two networks in Block B having the same architecture, the weights for both networks have to be shared by the networks for the network to have a Siamese architecture. First, the test images go through Block A to calculate the reference features, which is the average of N outputs from Block A. Second, both test and reference features go through Block B to measure the distance between them based on the outputs from Block B. Third, Block C is applied to generate the final labels (defective vs. non-defective) for each image pixel location.”
Regarding claim 8, Zhang discloses the following further limitation wherein the data inspecting unit detects from a sensor of the at least one sensor in which the change has occurred by checking the first output.
[0067] “For example, the distance in latent space is used to decide whether each portion (e.g., each pixel) in the test image is defective with respect to the reference image. In this manner, the DML defect detection model may make a binary decision of whether each pixel is a defect or not.”
[0070] “Third, Block C is applied to generate the final labels (defective vs. non-defective) for each image pixel location.”
Regarding claim 9, Zhang discloses the following further limitation wherein the data inspecting unit sets a case when the first output is 1 as a fourth data, and based on this sets a previous data as a third data, and checks an issue occurrence time point through checking a second output through a consecutive sampling.
[0070] “Illustrated by FIG. 2, in one construction of a Siamese detection model that may be used in the embodiments described herein, the test images may include N BBP images 202 and 204 from N adjacent dies and one design image 200. These images may be selected from the same die coordinates with identical fields of view (FOV) (i.e., the same die coordinates in multiple dies centered on the same within die location). Blocks A, B, and C are three different deep CNNs. The two networks shown in Block B have the same architecture configuration. In addition to the two networks in Block B having the same architecture, the weights for both networks have to be shared by the networks for the network to have a Siamese architecture. First, the test images go through Block A to calculate the reference features, which is the average of N outputs from Block A. Second, both test and reference features go through Block B to measure the distance between them based on the outputs from Block B. Third, Block C is applied to generate the final labels (defective vs. non-defective) for each image pixel location.”
[0094] “The one or more different parameters used to form the different device areas may be selected such that defect detection-related data generated using such a wafer simulates process variations and drift.”
The continuous iterative operation of the disclosed system is suggested by [0094] and is inherent throughout the disclosure.
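For illustration only, locating an issue occurrence time point by consecutive sampling can be sketched as follows. This is one possible reading of the claim 9 language, not a construction adopted in this action: consecutive samples are compared in order, and the first pair whose output is 1 marks the time point. The comparison function is a hypothetical stand-in for a learned model, and the series values are invented.

```python
def first_change_index(series, is_changed):
    """Return the index of the first sample that differs from its predecessor.

    `is_changed(previous, current)` returns 1 when a change is detected and 0
    otherwise. Returns None when no consecutive pair is flagged.
    """
    for i in range(1, len(series)):
        if is_changed(series[i - 1], series[i]) == 1:
            return i
    return None


def threshold_compare(previous, current, limit=1.0):
    """Hypothetical comparator standing in for a learned pairwise model."""
    return 1 if abs(current - previous) > limit else 0


readings = [20.0, 20.1, 19.9, 25.0, 25.2]
onset = first_change_index(readings, threshold_compare)
print(onset, readings[onset - 1], readings[onset])
```

Under this reading, the sample at the returned index plays the role of the data for which the output is 1, and the sample immediately before it plays the role of the previous data.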
Regarding claim 11, Zhang discloses the following further limitation wherein the data collected from the at least one sensor is a numeric data related to numbers.
[0037] “As such, the output that is generated by each of the detectors may be signals or data, …. in other instances, the detectors may be configured as imaging detectors that are configured to generate imaging signals”. (Emphasis added.)
See also fig. 1, ‘Computer subsystem 36’.
Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Zhang, Saqlain and Melville.
Regarding claim 10, Melville discloses the following further limitation which Zhang/Saqlain does not disclose wherein the data inspecting unit withholds a determination when the first output is different from a result learned by the data learning unit.
P. 2, query by committee. In the case of disagreement among a plurality of classification models in an ensemble, instead of outputting results the example at hand is sent (by the ‘query’) to be labeled by a human oracle.
At the time of filing, it would have been obvious to a person of ordinary skill to apply active learning—and query by committee in particular—to the Zhang/Saqlain system because this can improve classification performance where labeled training data is limited. All three disclosures pertain to machine learning.
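For illustration only, withholding a determination when committee members disagree can be sketched as follows. This is a minimal sketch of the behavior as characterized above, not Melville's algorithm: the committee members, the labeling queue, and all names are hypothetical.

```python
def committee_decision(votes):
    """Return the label when all members agree, otherwise None (withhold)."""
    if len(set(votes)) == 1:
        return votes[0]
    return None


def classify_or_query(models, sample, label_queue):
    """Classify a sample, or queue it for labeling by an oracle on disagreement."""
    votes = [model(sample) for model in models]
    decision = committee_decision(votes)
    if decision is None:
        label_queue.append(sample)  # the oracle labels this sample later
    return decision


# Hypothetical committee of three threshold rules over a single reading.
committee = [
    lambda x: "issue" if x > 22.0 else "normal",
    lambda x: "issue" if x > 23.0 else "normal",
    lambda x: "issue" if x > 24.0 else "normal",
]

queue = []
print(classify_or_query(committee, 20.0, queue))  # all members agree
print(classify_or_query(committee, 23.5, queue))  # members disagree
print(queue)
```

A reading on which the members split produces no determination and is placed in the queue, while a reading on which they agree is classified directly.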
Additional Relevant Prior Art
The following references were identified by the Examiner as being relevant to the disclosed invention, but are not relied upon in any particular prior art rejection:
Hart discloses a survey of various applications of machine learning (ML) techniques to alloy fabrication (including semiconductors). (Hart, Gus LW, Tim Mueller, Cormac Toher, and Stefano Curtarolo. "Machine learning for alloys." Nature Reviews Materials 6, no. 8 (2021): 730-755.)
Conclusion
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Vincent Gonzales whose telephone number is (571) 270-3837. The examiner can normally be reached on Monday-Friday 7 a.m. to 4 p.m. MT. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Miranda Huang, can be reached at (571) 270-7092.
Information regarding the status of an application may be obtained from the USPTO Patent Center.
/Vincent Gonzales/Primary Examiner, Art Unit 2124