Prosecution Insights
Last updated: April 18, 2026
Application No. 18/171,543

DEEP LEARNING TECHNIQUES FOR ANALYSES OF EXPERIMENTAL DATA GENERATED BY TWO OR MORE DETECTORS

Non-Final OA — §103, §112

Filed: Feb 20, 2023
Examiner: ALABI, OLUWATOSIN O
Art Unit: 2129
Tech Center: 2100 — Computer Architecture & Software
Assignee: Fei Company
OA Round: 1 (Non-Final)
Grant Probability: 58% (Moderate)
OA Rounds: 1-2
To Grant: 3y 8m
With Interview: 85%

Examiner Intelligence

Career Allow Rate: 58% — grants 58% of resolved cases (116 granted / 199 resolved; +3.3% vs TC avg)
Interview Lift: +26.3% — strong lift among resolved cases with interview
Avg Prosecution: 3y 8m typical timeline (45 currently pending)
Total Applications: 244 career history, across all art units

Statute-Specific Performance

§101: 21.9% (-18.1% vs TC avg)
§103: 40.0% (+0.0% vs TC avg)
§102: 9.5% (-30.5% vs TC avg)
§112: 23.2% (-16.8% vs TC avg)
Tech Center average is an estimate • Based on career data from 199 resolved cases
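One detail worth noting: if each “vs TC avg” delta is a simple difference from the statute rate, the implied Tech Center average works out to the same flat 40.0% for every statute, consistent with the average being a single estimate rather than per-statute data. A quick check (an editorial sketch, not the tool's code):

```python
# Recover the implied Tech Center average from each statute's rate and delta.
rates  = {"101": 21.9, "103": 40.0, "102": 9.5, "112": 23.2}     # examiner rate, %
deltas = {"101": -18.1, "103": 0.0, "102": -30.5, "112": -16.8}  # shown "vs TC avg", %
for s in rates:
    print(f"§{s}: implied TC avg = {rates[s] - deltas[s]:.1f}%")  # 40.0% for each statute
```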

Office Action

§103 §112
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Drawings

The drawings were received on 02/20/2023. These drawings are acceptable.

Information Disclosure Statement

The information disclosure statements (IDS) submitted on 11/17/2025, 08/15/2025, and 06/21/2023 have been considered by the examiner.

Claim Interpretation

The following is a quotation of 35 U.S.C. 112(f):

(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The following is a quotation of pre-AIA 35 U.S.C. 112, sixth paragraph:

An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is invoked.

As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph: (A) the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; (B) the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and (C) the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function.

Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C.
112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material, or acts to entirely perform the recited function.

Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.

This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitation(s) is/are listed below, where the generic placeholder is in bold and the functional language italicized:

Claim 1: An apparatus, comprising: an electron-beam column configured to scan an electron beam across a sample; a plurality of detectors configured to measure signals caused by interaction of the electron beam with the sample, the plurality of detectors including a first detector for a first modality, a second detector for a second modality, and a third detector for an imaging modality; and an electronic controller connected to receive streams of measurements from the plurality of detectors and configured to: for each pixel of a base image of the sample generated using the imaging modality, map, with an autoencoder, a respective first input vector and a respective second input vector to a respective probability density in a latent space, with the respective first input vector, the respective second input vector, and the base image being obtained based on the streams of measurements, the respective first input vector corresponding to the first modality, the respective second input vector corresponding to the second modality; identify, with the autoencoder, a respective latent-space cluster to which the respective probability density belongs; and generate a cluster-mapped image of the sample based on the base image and further based on latent-space clusters identified for different pixels of the base image.
Claim 4: wherein the autoencoder comprises: a neural network encoder configured to jointly map the respective first input vector and the respective second input vector to the respective probability density in the latent space; and a neural network decoder configured to generate reconstructed spectra based on mappings, with the neural network encoder, of training spectra to the latent space.

Claim 5: wherein the autoencoder comprises: a first neural network encoder configured to map the respective first input vector to a first probability density in a first private subspace of the latent space; and a second neural network encoder configured to map the respective second input vector to a second probability density in a second private subspace of the latent space; and wherein the first neural network encoder and the second neural network encoder are further configured to jointly map the respective first input vector and the respective second input vector to the respective probability density in a shared subspace of the latent space.

Claim 9: wherein the electronic controller is configured to generate the cluster-mapped image of the sample by coloring each pixel of the base image in accordance with a color code of the plurality of clusters.

Claim 10: wherein the electronic controller is configured to apply processing to the streams of measurements to obtain the respective first input vector and the respective second input vector, the processing including one or more operations selected from the group consisting of removal of outlier peaks, subtraction of an estimated background, scaling, normalization, averaging, fitting with a selected function, binning or re-binning, and Gaussian kernel filtering.

Claim 13: an interface device configured to receive streams of measurements from a plurality of detectors of the scientific instrument, the plurality of detectors being configured to measure signals caused by interaction of an electron beam with a sample and including a first detector for a first spectroscopic modality of the scientific instrument, a second detector for a second spectroscopic modality of the scientific instrument, and a third detector for an imaging modality of the scientific instrument; and a processing device configured to: for each pixel of a base image of the sample generated using the imaging modality, map, with an autoencoder, a respective first input vector and a respective second input vector to a respective probability density in a latent space, with the respective first input vector, the respective second input vector, and the base image being obtained based on the streams of measurements, the respective first input vector corresponding to the first spectroscopic modality, the respective second input vector corresponding to the second spectroscopic modality; identify, with the autoencoder, a respective latent-space cluster to which the respective probability density belongs; and generate a cluster-mapped image of the sample based on the base image and further based on latent-space clusters identified for different pixels of the base image.
Claim 14: wherein the autoencoder comprises: a neural network encoder configured to jointly map the respective first input vector and the respective second input vector to the respective probability density in the latent space; and a neural network decoder configured to generate reconstructed spectra based on mappings, with the neural network encoder, of training spectra to the latent space; and wherein the neural network encoder and the neural network decoder have been trained using a loss function, the training spectra, and the reconstructed spectra, the loss function including a sum of a term representing reconstruction loss and a regularizer term.

Claim 15: wherein the autoencoder comprises: a first neural network encoder configured to map the respective first input vector to a first probability density in a first private subspace of the latent space; and a second neural network encoder configured to map the respective second input vector to a second probability density in a second private subspace of the latent space; and wherein the first neural network encoder and the second neural network encoder are further configured to jointly map the respective first input vector and the respective second input vector to the respective probability density in a shared subspace of the latent space.

Claim 18: further comprising a display device, wherein the processing device is configured to: generate the cluster-mapped image of the sample by coloring each pixel of the base image in accordance with a color code of the plurality of clusters; and cause the cluster-mapped image of the sample to be displayed by the display device.

Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof. If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, applicant may: (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 112

The following is a quotation of 35 U.S.C. 112(b):

(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Regarding claim 1, the term “cluster-mapped image” in the limitation “generate a cluster-mapped image of the sample based on the base image and further based on latent-space clusters identified for different pixels of the base image” renders the claim indefinite because it is unclear what a cluster-mapped image is, given that the claim recites that the mapping is completed using an autoencoder (“for each pixel of a base image of the sample generated using the imaging modality, map, with an autoencoder …”) with respect to input vectors. Where are the clusters generated with respect to the autoencoder? An autoencoder generates an image by mapping elements to a latent space, as known in the art, but the term “cluster-mapped image” is not a term of art. Examiner notes that any image generated using an autoencoder reads on the claimed limitation/term.

Regarding claims 13 and 19, the limitations are similar to those of claim 1 and are thus rejected under the same rationale.

Regarding the dependent claims of claims 1, 13, and 19, the claims fail to resolve the noted deficiencies, and the ones that recite the same term are deemed indefinite under the same rationale. The rejection noted above is incorporated.

The claim limitations of claims 1, 4-5, 9-10, 13-15, and 18 noted above invoke 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. However, the written description fails to disclose the corresponding structure, material, or acts for performing the entire claimed function and to clearly link the structure, material, or acts to the function. The generic placeholders noted above are recited as performing the claimed functions, but there is no recitation linking the generic placeholders to the corresponding structure, material, or acts for performing the entire claimed function. Therefore, the claims are indefinite and are rejected under 35 U.S.C. 112(b) or pre-AIA 35 U.S.C. 112, second paragraph.

Applicant may: (a) Amend the claim so that the claim limitation will no longer be interpreted as a limitation under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph; (b) Amend the written description of the specification such that it expressly recites what structure, material, or acts perform the entire claimed function, without introducing any new matter (35 U.S.C. 132(a)); or (c) Amend the written description of the specification such that it clearly links the structure, material, or acts disclosed therein to the function recited in the claim, without introducing any new matter (35 U.S.C. 132(a)).

If applicant is of the opinion that the written description of the specification already implicitly or inherently discloses the corresponding structure, material, or acts and clearly links them to the function so that one of ordinary skill in the art would recognize what structure, material, or acts perform the claimed function, applicant should clarify the record by either: (a) Amending the written description of the specification such that it expressly recites the corresponding structure, material, or acts for performing the claimed function and clearly links or associates the structure, material, or acts to the claimed function, without introducing any new matter (35 U.S.C. 132(a)); or (b) Stating on the record what the corresponding structure, material, or acts, which are implicitly or inherently set forth in the written description of the specification, perform the claimed function. For more information, see 37 CFR 1.75(d) and MPEP §§ 608.01(o) and 2181.
Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 10-13, and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Ha et al. (US 20180330511, hereinafter ‘Ha’) in view of Higuchi et al. (US 20220301288, hereinafter ‘Hig’).

Regarding independent claim 1, Ha teaches an apparatus, comprising: an electron-beam column configured to scan an electron beam across a sample; a plurality of detectors configured to measure signals caused by interaction of the electron beam with the sample, the plurality of detectors including a first detector for a first modality, a second detector for a second modality, and a third detector for an imaging modality; (in [0037] In some instances, the optical tool may be configured to direct light to the specimen [an apparatus, comprising: an electron-beam column configured to scan an electron beam across a sample] at more than one angle of incidence at the same time. For example, the illumination subsystem may include more than one illumination channel, one of the illumination channels may include light source 16, optical element 18, and lens 20 as shown in FIG. 1 and another of the illumination channels (not shown) may include similar elements, which may be configured differently or the same, or may include at least a light source and possibly one or more other components such as those described further herein. If such light is directed to the specimen at the same time as the other light, one or more characteristics (e.g., wavelength, polarization, etc.) of the light directed to the specimen at different angles of incidence may be different such that light resulting from illumination of the specimen at the different angles of incidence can be discriminated from each other at the detector(s) [plurality of detectors configured to measure signals caused by interaction of the electron beam with the sample]…; And in [0052] As also shown in FIG. 1, the electron column includes electron beam source 126 configured to generate electrons that are focused to specimen 128 [plurality of detectors configured to measure signals caused by interaction of the electron beam with the sample] by one or more elements 130.
The electron beam source may include, for example, a cathode source or emitter tip, and one or more elements 130 may include, for example, a gun lens, an anode, a beam limiting aperture, a gate valve, a beam current selection aperture, an objective lens, and a scanning subsystem, all of which may include any such suitable elements known in the art … in [0067] As further noted above, the optical tool may be configured to generate output for the specimen with multiple modes or “different modalities [the plurality of detectors including a first detector for a first modality, a second detector for a second modality, and a third detector for an imaging modality].” In this manner, in some embodiments, the optical images include images generated by the optical tool with two or more different values of a parameter of the optical tool. In general, a “mode” or “modality” (as those terms are used interchangeably herein) of the optical tool can be defined by the values of parameters of the optical tool used for generating output and/or images for a specimen. Therefore, modes that are different may be different in the values for at least one of the optical parameters of the tool …) and an electronic controller connected to receive streams of measurements from the plurality of detectors (in [0069] The optical and electron beam tools described herein may be configured as inspection tools [and an electronic controller connected to receive streams of measurements from the plurality of detectors]. In addition, or alternatively, the optical and electron beam tools described herein may be configured as defect review tools. Furthermore, the optical and electron beam tools described herein may be configured as metrology tools. In particular, the embodiments of the optical and electron beam tools described herein and shown in FIG. 1 may be modified in one or more parameters to provide different imaging capability [and an electronic controller connected to receive streams of measurements from the plurality of detectors] depending on the application for which they will be used. In one such example, the optical tool shown in FIG. 1 … [0072] The one or more computer subsystems (e.g., computer subsystem(s) 36, 102, and 124 shown in FIG. 1) included in the system are configured for acquiring information for a specimen. The information for the specimen includes at least first and second images for the specimen. In the case of actual images, the computer subsystem may be configured for acquiring the actual images by using one or more of the tools [and an electronic controller connected to receive streams of measurements from the plurality of detectors] described herein for directing energy (e.g., light or electrons) to a specimen and detecting energy (e.g., light or electrons) from the specimen. Therefore, acquiring the actual images may include generating the images using a physical version of the specimen and some sort of imaging hardware. However, acquiring the actual images may include acquiring the actual images from a storage medium (including any of the storage media described herein) in which the actual images have been stored by an actual imaging system (e.g., optical tool 10) …
) and configured to: for each pixel of a base image of the sample generated using the imaging modality, map, with an autoencoder, a respective first input vector and a respective second input vector to a respective probability density in a latent space, with the respective first input vector, the respective second input vector, (in [0074] The different modalities are different in at least one imaging parameter of at least one imaging system. In one embodiment, the first and second modalities generate the first and second images with different pixel sizes… In one such example, an image captured using an optical imaging system and an image captured using an electron beam imaging system are captured at different frequencies … [0091] Deep learning is part of a broader family of machine learning methods based on learning representations of data. An observation (e.g., an image) can be represented in many ways such as a vector of intensity values per pixel, or in a more abstract way as a set of edges, regions of particular shape, etc. Some representations are better than others at simplifying the learning task (e.g., face recognition or facial expression recognition) … [0110] As shown in FIG. 3, SEM image 300 (or a first image acquired for a specimen with a first modality) is input to learning based model 302, which transforms the SEM image to thereby render it into the common space of CAD image 306. In other words, learning based model 302 transforms SEM image 300 to rendered image 304 by mapping SEM image 300 from SEM image space to CAD image space [and configured to: for each pixel of a base image of the sample generated using the imaging modality, map, with an autoencoder, a respective first input vector and a respective second input vector to a respective probability density in a latent space, with the respective first input vector, the respective second input vector]… Since the rendered image and the CAD image now look as if they were acquired from the same modality prior to alignment, alignment can be performed relatively easily as described further herein. [0111] In the embodiment shown in FIG. 3, the learning based model may be a regression model or any of the learning based models described herein. In one such example, the learning based model may be in the form of a deep convolution autoencoder (DCAE) [and configured to: for each pixel of a base image of the sample generated using the imaging modality, map, with an autoencoder, a respective first input vector and a respective second input vector to a respective probability density in a latent space, with the respective first input vector, the respective second input vector]. The encoder portion of the learning based model may include, for example, five convolutional layers with kernel sizes of, for example, 5×5, a stride of 2, and no zero padding … [0115] The embodiment shown in FIG. 3 shows a run time mode of one of the alignment approaches described herein, FIG. 4 shows one possible method for training such an alignment approach. As shown in FIG. 4, the training may include inputting SEM image 400 into learning based model 402, which may be a regression model or another learning based model described herein. In this embodiment, the learning based model includes encoder 404 and decoder 408 … Features 406 are input to decoder 408, which transforms the image into a different space. In this case, the decoder transforms the input SEM image from features 406 to image 410 in design space.
In this manner, image 410 may be a CAD image.) and the base image being obtained based on the streams of measurements, the respective first input vector corresponding to the first modality, the respective second input vector corresponding to the second modality; (in [0110] As shown in FIG. 3, SEM image 300 (or a first image acquired for a specimen with a first modality) is input to learning based model 302, which transforms the SEM image to thereby render it into the common space of CAD image 306. In other words, learning based model 302 transforms SEM image 300 to rendered image 304 by mapping SEM image 300 from SEM image space to CAD image space. In this manner, the common space in this embodiment is CAD image space. As such, in this embodiment, the second image is the CAD image generated for the specimen with a second modality. Rendered image 304 and CAD image 306 are then input to alignment step 308, which performs alignment or registration of the two images to thereby generate alignment results 310 [the base image being obtained based on the streams of measurements, the respective first input vector corresponding to the first modality, the respective second input vector corresponding to the second modality] … [0115] The embodiment shown in FIG. 3 shows a run time mode of one of the alignment approaches described herein, FIG. 4 shows one possible method for training such an alignment approach. As shown in FIG. 4, the training may include inputting SEM image 400 into learning based model 402, which may be a regression model or another learning based model described herein. In this embodiment, the learning based model includes encoder 404 and decoder 408 [with an autoencoder, a respective first input vector and a respective second input vector to a respective probability density in a latent space, with the respective first input vector, the respective second input vector], which may be configured as described further herein. In addition, although an auto-encoder is shown in this figure in the learning based model, any regression model such as CGAN or demise convolutional auto-encoder can be used in the embodiments described herein. Image 400 is input to encoder 404, which determines features 406 (i.e., learning or deep learning based features) of the image. Features 406 are input to decoder 408, which transforms the image into a different space. In this case, the decoder transforms the input SEM image from features 406 to image 410 in design space. In this manner, image 410 may be a CAD image.) identify, with the autoencoder, a respective latent-space cluster to which the respective probability density belongs; (As depicted in Fig. 4 and in [0115] The embodiment shown in FIG. 3 shows a run time mode of one of the alignment approaches described herein, FIG. 4 shows one possible method for training such an alignment approach. As shown in FIG. 4, the training may include inputting SEM image 400 into learning based model 402, which may be a regression model or another learning based model described herein. In this embodiment, the learning based model includes encoder 404 and decoder 408, which may be configured as described further herein. In addition, although an auto-encoder is shown in this figure in the learning based model, any regression model such as CGAN or demise convolutional auto-encoder can be used in the embodiments described herein.
Image 400 is input to encoder 404, which determines features 406 (i.e., learning or deep learning based features) [identify, with the autoencoder, a respective latent-space cluster to which the respective probability density belongs] of the image. Features 406 are input to decoder 408, which transforms the image into a different space. In this case, the decoder transforms the input SEM image from features 406 [identify, with the autoencoder, a respective latent-space cluster to which the respective probability density belongs] to image 410 in design space. In this manner, image 410 may be a CAD image.) and generate a cluster-mapped image of the sample based on the base image and further based on latent-space clusters identified for different pixels of the base image. (in [0115] The embodiment shown in FIG. 3 shows a run time mode of one of the alignment approaches described herein, FIG. 4 shows one possible method for training such an alignment approach. As shown in FIG. 4, the training may include inputting SEM image 400 into learning based model 402 [based on the base image and further based on latent-space clusters identified for different pixels of the base image], which may be a regression model or another learning based model described herein. In this embodiment, the learning based model includes encoder 404 and decoder 408 [… based on the base image and further based on latent-space clusters identified for different pixels of the base image], which may be configured as described further herein. In addition, although an auto-encoder is shown in this figure in the learning based model, any regression model such as CGAN or demise convolutional auto-encoder can be used in the embodiments described herein. Image 400 is input to encoder 404, which determines features 406 (i.e., learning or deep learning based features) of the image. Features 406 are input to decoder 408, which transforms the image into a different space. In this case, the decoder transforms the input SEM image from features 406 to image 410 in design space [generate a cluster-mapped image of the sample based on the base image and further based on latent-space clusters identified for different pixels of the base image]. In this manner, image 410 may be a CAD image… [0117] The embodiments described above provide a number of differences and improvements compared to the currently used methods. For example, different from the currently used methods that are based on either heuristic renderings or physics-based rendering approaches, the embodiments described above use a deep regression neural network or other learning based model described further herein trained with pairs of corresponding images from different modalities to transform image 1 to image domain of image 2 for registration, e.g., from SEM to CAD images, from SEM to broadband optical images, etc… And in [0145] Another embodiment relates to a computer-implemented method for aligning images for a specimen acquired with different modalities … The method also includes inputting the information for the specimen into a learning based model [… based on the base image and further based on latent-space clusters identified for different pixels of the base image]. The learning based model is included in one or more components executed by one or more computer systems … [0151] Further modifications and alternative embodiments of various aspects of the invention will be apparent to those skilled in the art in view of this description.
For example, methods and systems for aligning images for a specimen acquired with different modalities are provided …)

Ha teaches processing image data using machine learning models, including autoencoder models. One of ordinary skill in the art would understand that the latent space of an autoencoder can model corresponding features using a probability density function/distribution. Hig expressly teaches that the latent space of an autoencoder can model corresponding features using a probability density function/distribution, in [0080] A variational autoencoder 151 is one type of autoencoder. An autoencoder [for each pixel of a base image of the sample generated using the imaging modality, map, with an autoencoder, …] is a multilayer neural network that is created with machine learning such that input data and output data are identical to each other. The autoencoder compresses the input data into a vector having fewer dimensions than the input data, and restores the output data from the vector. Here, the variational autoencoder 151 is created such that a set of vectors follows a specific probability distribution [… a respective first input vector and a respective second input vector to a respective probability density in a latent space, with the respective first input vector, the respective second input vector, and the base image being obtained based on the streams of measurements, the respective first input vector corresponding to the first modality, the respective second input vector corresponding to the second modality; …]. The variational autoencoder 151 includes an encoder 152 and a decoder 153. … [0082] A vector 155 is calculated between the encoder 152 and the decoder 153. The vector 155 is a representation of the features of the image 157 in low dimensions. For example, the vector 155 has 48 dimensions. The vector 155 may be called a latent variable, feature value, feature vector, or another. The vector 155 is mapped to a latent space 154. The latent space 154 is a vector space such as a 48-dimensional space. [0083] When a set of images of the same type (for example, a set of face photos or a set of handwritten characters) is input to the encoder 152, a set of vectors [… a respective first input vector and a respective second input vector to a respective probability density in a latent space, with the respective first input vector, the respective second input vector, and the base image being obtained based on the streams of measurements …] corresponding to the set of images has a specific probability distribution such as a normal distribution in the latent space 154 [… map, with an autoencoder, … identify, with the autoencoder, a respective latent-space cluster to which the respective probability density belongs]. For example, the probability distribution in the latent space 154 is a multivariate normal distribution that has the vector 155 as a probability variable and that is specified by a specific mean vector and variance-covariance matrix. Here, a probability distribution other than the normal distribution may be assumed. The probability of occurrence of a specified vector in the set of vectors is approximated to a probability density calculated by a probability density function [… map, with an autoencoder, … identify, with the autoencoder, a respective latent-space cluster to which the respective probability density belongs].
In general, a vector closer to the mean vector has a higher probability density, whereas a vector farther away from the mean vector has a lower probability density [identify, with the autoencoder, a respective latent-space cluster to which the respective probability density belongs].

Hig and Ha are analogous art because both involve developing information retrieval and data modeling techniques using machine learning systems and algorithms. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of the prior art for processing data for a detection system based on feature vectors calculated by the neural network, as disclosed by Hig, with the method of developing information retrieval and data modeling techniques using machine learning systems and algorithms as disclosed by Ha. One of ordinary skill in the art would have been motivated to combine the methods disclosed by Hig and Ha as noted above; doing so allows for developing information processing and modeling techniques to reduce the possibility of generating image data that is clearly dissimilar to sample image data included in the training data and to reduce the possibility that an image converges to a local solution and the inference of the sample image data fails (Hig, [0048]).

Regarding claim 10, the rejection of claim 1 is incorporated and Ha in combination with Hig teaches the apparatus of claim 1, wherein the electronic controller is configured to apply processing to the streams of measurements to obtain the respective first input vector and the respective second input vector, the processing including one or more operations selected from the group consisting of removal of outlier peaks, subtraction of an estimated background, scaling, normalization, averaging, fitting with a selected function, binning or re-binning, and Gaussian kernel filtering. (in [0112] To avoid overfitting and reduce redundancy in the extracted features, sparsity in the feature maps may be enforced by using a drop out layer at the end of the encoder and also including a L1 regularization on the codes in the L2 cost function [wherein the electronic controller is configured to apply processing to the streams of measurements to obtain the respective first input vector and the respective second input vector, the processing including one or more operations selected from the group consisting of …, fitting with a selected function]. Again, these specific learning based model configurations are not meant to be limiting to the learning based models that are appropriate for use in the embodiments described herein. The learning based model may vary in type and parameter values from those described above and still be used in the embodiments described herein.)

Regarding claim 11, the rejection of claim 1 is incorporated and Ha in combination with Hig teaches the apparatus of claim 1, wherein the first detector is positioned downstream from the sample with respect to a propagation direction of the electron beam; and wherein the second detector is positioned upstream from the sample with respect to the propagation direction of the electron beam.
(in [0037] In some instances, the optical tool may be configured to direct light to the specimen at more than one angle of incidence at the same time [wherein the first detector is positioned downstream from the sample with respect to a propagation direction of the electron beam; and wherein the second detector is positioned upstream from the sample with respect to the propagation direction of the electron beam]. For example, the illumination subsystem may include more than one illumination channel, one of the illumination channels may include light source 16, optical element 18, and lens 20 as shown in FIG. 1 and another of the illumination channels (not shown) may include similar elements, which may be configured differently or the same, or may include at least a light source and possibly one or more other components such as those described further herein. If such light is directed to the specimen at the same time as the other light, one or more characteristics (e.g., wavelength, polarization, etc.) of the light directed to the specimen at different angles of incidence may be different such that light resulting from illumination of the specimen at the different angles of incidence [wherein the first detector is positioned downstream from the sample with respect to a propagation direction of the electron beam; and wherein the second detector is positioned upstream from the sample with respect to the propagation direction of the electron beam] can be discriminated from each other at the detector(s) … [0044] Although FIG. 1 shows an embodiment of the optical tool that includes two detection channels, the optical tool may include a different number of detection channels (e.g., only one detection channel or two or more detection channels). In one such instance, the detection channel formed by collector 30, element 32, and detector 34 may form one side channel as described above, and the optical tool may include an additional detection channel (not shown) formed as another side channel that is positioned on the opposite side of the plane of incidence [wherein the first detector is positioned downstream from the sample with respect to a propagation direction of the electron beam; and wherein the second detector is positioned upstream from the sample with respect to the propagation direction of the electron beam]. Therefore, the optical tool may include the detection channel that includes collector 24, element 26, and detector 28 and that is centered in the plane of incidence and configured to collect and detect light at scattering angle(s) that are at or close to normal to the specimen surface [wherein the first detector is positioned downstream from the sample with respect to a propagation direction of the electron beam; and wherein the second detector is positioned upstream from the sample with respect to the propagation direction of the electron beam]. This detection channel may therefore be commonly referred to as a “top” channel [and wherein the second detector is positioned upstream from the sample with respect to the propagation direction of the electron beam], and the optical tool may also include two or more side channels [wherein the first detector is positioned downstream from the sample with respect to a propagation direction of the electron beam] configured as described above.
As such, the optical tool may include at least three channels (i.e., one top channel and two side channels), and each of the at least three channels has its own collector, each of which is configured to collect light at different scattering angles than each of the other collectors.)

Regarding claim 12, the rejection of claim 1 is incorporated and Ha in combination with Hig teaches the apparatus of claim 1, wherein the plurality of detectors includes a fourth detector for a third modality; (in [0113] Alignment 308 may be performed with any suitable non-learning based alignment or registration method known in the art such as NCC, sum square difference, etc. Therefore, the embodiments described herein can use a relatively simple alignment method to robustly align the images. In particular, images acquired with different modalities [wherein the plurality of detectors includes a fourth detector for a third modality as a fourth labeled detector in the plurality capturing a third modality] (e.g., a SEM image and a trivially rendered design clip) often look very different from each other due to many factors such as optical proximity errors, missing layers in design (e.g., where a feature in the design (such as a liner) does not appear in an image of the specimen on which the design is formed), various types of noise in the specimen images, or difference in contrast between specimen images and design images … [0063] The system includes one or more computer subsystems, e.g., computer subsystem(s) 102 shown in FIG. 1, that may be configured for receiving the optical and electron beam images generated for the specimen. For example, as shown in FIG. 1, computer subsystem(s) 102 may be coupled to computer subsystem 36 and computer subsystem 124 such that computer subsystem(s) 102 can receive the optical images or output generated by detectors 28 and 34 and electron beam images or output generated by detector 134. Although the computer subsystem(s) may receive the optical images or output and the electron beam images or output from other computer subsystems coupled to the optical and electron beam tools, the computer subsystem(s) may be configured to receive the optical and electron beam images or output directly from the detectors that generate the images or output (e.g., if computer subsystem(s) 102 are coupled directly to the detectors shown in FIG. 1) … [0065] The one or more virtual systems are not capable of having the specimen disposed therein. In particular, the virtual system(s) are not part of optical tool 10 or electron beam tool 122 and do not have any capability for handling the physical version of the specimen. In other words, in a system configured as a virtual system, the output of its one or more “detectors” may be output that was previously generated by one or more detectors of an actual tool and that is stored in the virtual system, and during the “imaging and/or scanning,” the virtual system may replay the stored output as though the specimen is being imaged and/or scanned [wherein the plurality of detectors includes a fourth detector for a third modality as the virtual detector of the replayed physical detector capturing a third modality].
In this manner, imaging and/or scanning the specimen with a virtual system may appear to be the same as though a physical specimen is being imaged and/or scanned with an actual system, while, in reality, the “imaging and/or scanning” involves simply replaying output for the specimen in the same manner as the specimen may be imaged and/or scanned.) and wherein the electronic controller is configured to, for each pixel of the base image, map, with the autoencoder, the respective first input vector, the respective second input vector, and a respective third input vector to the respective probability density in the latent space, the respective third input vector corresponding to the third modality. (in [0074] The different modalities are different in at least one imaging parameter of at least one imaging system. In one embodiment, the first and second modalities generate the first and second images with different pixel sizes… In one such example, an image captured using an optical imaging system and an image captured using an electron beam imaging system are captured at different frequencies [and wherein the electronic controller is configured to, for each pixel of the base image, map, with the autoencoder, the respective first input vector, the respective second input vector, and a respective third input vector to the respective probability density in the latent space, the respective third input vector corresponding to the third modality] … [0091] Deep learning is part of a broader family of machine learning methods based on learning representations of data. An observation (e.g., an image) can be represented in many ways such as a vector of intensity values per pixel, or in a more abstract way as a set of edges, regions of particular shape, etc. Some representations are better than others at simplifying the learning task (e.g., face recognition or facial expression recognition) … [0110] As shown in FIG. 3, SEM image 300 (or a first image acquired for a specimen with a first modality) is input to learning based model 302, which transforms the SEM image to thereby render it into the common space of CAD image 306. In other words, learning based model 302 transforms SEM image 300 to rendered image 304 by mapping SEM image 300 from SEM image space to CAD image space [and configured to: for each pixel of a base image of the sample generated using the imaging modality, map, with an autoencoder, a respective first input vector and a respective second input vector to a respective probability density in a latent space, with the respective first input vector, the respective second input vector]… Since the rendered image and the CAD image now look as if they were acquired from the same modality prior to alignment, alignment can be performed relatively easily as described further herein. [0111] In the embodiment shown in FIG. 3, the learning based model may be a regression model or any of the learning based models described herein. In one such example, the learning based model may be in the form of a deep convolution autoencoder (DCAE) [and configured to: for each pixel of a base image of the sample generated using the imaging modality, map, with an autoencoder, a respective first input vector and a respective second input vector to a respective probability density in a latent space, with the respective first input vector, the respective second input vector].
The encoder portion of the learning based model may include, for example, five convolutional layers with kernel sizes of, for example, 5×5, a stride of 2, and no zero padding … And in [0115] The embodiment shown in FIG. 3 shows a run time mode of one of the alignment approaches described herein, FIG. 4 shows one possible method for training such an alignment approach. As shown in FIG. 4, the training may include inputting SEM image 400 into learning based model 402, which may be a regression model or another learning based model described herein. In this embodiment, the learning based model includes encoder 404 and decoder 408, which may be configured as described further herein. In addition, although an auto-encoder is shown in this figure in the learning based model, any regression model such as CGAN or demise convolutional auto-encoder can be use …
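To make the claimed pipeline concrete: the independent claims recite preprocessing detector streams into per-pixel input vectors (claim 10), jointly encoding those vectors into a probability density in a latent space (claims 1, 4, and 14), and coloring the base image by latent-space cluster (claims 1 and 9). The sketch below is a minimal, hypothetical rendering of that pipeline, assuming PyTorch, SciPy, and scikit-learn; every name and dimension is an editorial illustration, and the k-means step is a stand-in for the claimed identification of clusters "with the autoencoder," which the excerpt does not reduce to a specific algorithm. It is not the applicant's or the cited references' code.

```python
# Hypothetical sketch of the claimed pipeline, not the application's actual code.
import numpy as np
import torch
import torch.nn as nn
from scipy.ndimage import gaussian_filter1d
from sklearn.cluster import KMeans

class JointVAE(nn.Module):
    """Jointly encodes two per-pixel input vectors (two detector modalities)
    into a Gaussian probability density in a latent space (cf. claims 1, 4)."""
    def __init__(self, dim_a=128, dim_b=64, latent_dim=8):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(dim_a + dim_b, 256), nn.ReLU())
        self.mu = nn.Linear(256, latent_dim)      # mean of the latent density
        self.logvar = nn.Linear(256, latent_dim)  # log-variance of the latent density
        self.decoder = nn.Sequential(             # reconstructs both spectra (cf. claim 4)
            nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, dim_a + dim_b))

    def forward(self, xa, xb):
        h = self.encoder(torch.cat([xa, xb], dim=-1))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        return self.decoder(z), mu, logvar

def vae_loss(recon, target, mu, logvar):
    # Sum of a reconstruction-loss term and a KL regularizer term (cf. claim 14).
    rec = ((recon - target) ** 2).sum(dim=-1).mean()
    kl = (-0.5 * (1 + logvar - mu ** 2 - logvar.exp()).sum(dim=-1)).mean()
    return rec + kl

def preprocess(stream):
    # Three of the claim-10 operations: crude background subtraction,
    # Gaussian kernel filtering, and normalization of each raw spectrum.
    x = stream - stream.min(axis=-1, keepdims=True)
    x = gaussian_filter1d(x, sigma=2.0, axis=-1)
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + 1e-9)

# Toy end-to-end run on synthetic data: H*W pixels, two spectral modalities.
H, W = 16, 16
spectra_a = preprocess(np.random.rand(H * W, 128)).astype(np.float32)
spectra_b = preprocess(np.random.rand(H * W, 64)).astype(np.float32)

model = JointVAE()
with torch.no_grad():  # untrained forward pass, for shape illustration only
    _, mu, _ = model(torch.from_numpy(spectra_a), torch.from_numpy(spectra_b))

# Cluster the per-pixel latent means and color the base image by cluster id,
# yielding a "cluster-mapped image" in the sense of claims 1 and 9.
labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(mu.numpy())
palette = np.array([[228, 26, 28], [55, 126, 184], [77, 175, 74],
                    [152, 78, 163], [255, 127, 0]], dtype=np.uint8)
cluster_mapped = palette[labels].reshape(H, W, 3)  # RGB overlay, one color per cluster
```

Note that vae_loss mirrors claim 14's "sum of a term representing reconstruction loss and a regularizer term," and preprocess exercises only a subset of the claim-10 Markush group.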

Prosecution Timeline

Feb 20, 2023
Application Filed
Mar 30, 2026
Non-Final Rejection — §103, §112 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12579409
IDENTIFYING SENSOR DRIFTS AND DIVERSE VARYING OPERATIONAL CONDITIONS USING VARIATIONAL AUTOENCODERS FOR CONTINUAL TRAINING
2y 5m to grant • Granted Mar 17, 2026
Patent 12572814
ARTIFICIAL NEURAL NETWORK BASED SEARCH ENGINE CIRCUITRY
2y 5m to grant • Granted Mar 10, 2026
Patent 12561570
METHODS AND ARRANGEMENTS TO IDENTIFY FEATURE CONTRIBUTIONS TO ERRONEOUS PREDICTIONS
2y 5m to grant • Granted Feb 24, 2026
Patent 12547890
AUTOREGRESSIVELY GENERATING SEQUENCES OF DATA ELEMENTS DEFINING ACTIONS TO BE PERFORMED BY AN AGENT
2y 5m to grant • Granted Feb 10, 2026
Patent 12536478
TRAINING DISTILLED MACHINE LEARNING MODELS
2y 5m to grant • Granted Jan 27, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 58%
With Interview: 85% (+26.3%)
Median Time to Grant: 3y 8m
PTA Risk: Low
Based on 199 resolved cases by this examiner. Grant probability derived from career allow rate.
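As a rough check (the tool's actual model is not disclosed), the headline numbers above can be reproduced from the examiner's career statistics with simple arithmetic. A minimal sketch, assuming the interview lift is applied additively to the base allow rate:

```python
# Hypothetical reconstruction of the projections from the career stats shown above.
granted, resolved = 116, 199                 # examiner's career grants / resolved cases
base = granted / resolved                    # 0.583 -> displayed as 58%
interview_lift = 0.263                       # +26.3% lift observed with interviews
with_interview = min(base + interview_lift, 1.0)  # 0.846 -> displayed as ~85%
print(f"base {base:.0%}, with interview {with_interview:.0%}")
```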
