Last updated: May 29, 2026
Application No. 18/610,339
DEVICE AND METHOD FOR ANALYZING DEFECT IN ULTRASONIC TESTING USING THREE-DIMENSIONAL DEEP LEARNING MODEL

Status: Non-Final OA (§103, §112)
Filed: Mar 20, 2024
Priority: May 11, 2023, RE 10-2023-0061162
Examiner: GEBRESLASSIE, WINTA
Art Unit: 2677
Tech Center: 2600 (Communications)
Assignee: DOOSAN ENERBILITY CO., LTD.
OA Round: 1 (Non-Final)
Interview: optional. The examiner has a relatively high allowance rate (76%) and a +25.0% interview lift, but a written response may suffice.
Based on 135 resolved cases, 2023–2026.
Examiner Intelligence

Examiner: GEBRESLASSIE, WINTA
Career allowance rate: 76% (102 granted / 135 resolved), above average, +13.6% vs TC avg
Interview lift: +25.0% (resolved cases with interview vs. without)
Typical timeline: 2y 6m average prosecution; 25 currently pending
Career history: 189 total applications across all art units

Statute-Specific Performance (based on career data from 135 resolved cases; TC avg is an estimate)

§103: 94.4% (+54.4% vs TC avg)
§102: 3.3% (-36.7% vs TC avg)
§112: 1.1% (-38.9% vs TC avg)
Office Action

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 112
Claim 11 is rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which applicant regards as the invention.
Specifically, claim 11 recites “updating, by the detecting network, the second weights of the generation network based on the prediction loss so as to maximally reduce the prediction loss while the first weights of the detecting network remain unchanged”.
However, the specification describes the corresponding operation differently, stating “updating the weights of the detecting network DN so as to maximally reduce the generation loss while the weights of the generation network GN remain unchanged”. Thus, it is unclear whether claim 11 intends to:
(i) update the weights of the generation network while holding the detecting network unchanged, or
(ii) update the weights of the detecting network while holding the generation network unchanged.
Because the claim language is inconsistent with the specification as to which network’s weights are updated and which remain fixed, the metes and bounds of the claim are unclear. Accordingly, claim 11 is indefinite.
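The two readings (i) and (ii) can be illustrated with a toy example. The following is a minimal sketch only, not the applicant's training procedure: each network is reduced to a single scalar weight, and the names `g` (generation weight), `d` (detecting weight), `loss`, and `step` are hypothetical. It shows one gradient step on a squared-error loss in which one network's weight is updated while the other is held fixed.

```python
# Toy model (assumption): output = d * (g * x), loss = (output - y) ** 2.
# "g" stands in for the generation network's weights and "d" for the
# detecting network's weights; both are hypothetical scalars.

def loss(g: float, d: float, x: float, y: float) -> float:
    """Squared error of the toy two-stage model."""
    return (d * (g * x) - y) ** 2


def step(g: float, d: float, x: float, y: float, lr: float, update: str):
    """Take one gradient step on exactly one weight; the other stays frozen."""
    err = d * g * x - y
    grad_g = 2.0 * err * d * x  # dL/dg
    grad_d = 2.0 * err * g * x  # dL/dd
    if update == "generation":   # reading (i): detecting weight unchanged
        return g - lr * grad_g, d
    if update == "detecting":    # reading (ii): generation weight unchanged
        return g, d - lr * grad_d
    raise ValueError("update must be 'generation' or 'detecting'")


if __name__ == "__main__":
    g0, d0, x, y = 1.0, 1.0, 1.0, 2.0
    for mode in ("generation", "detecting"):
        g1, d1 = step(g0, d0, x, y, lr=0.1, update=mode)
        print(mode, (g1, d1), loss(g1, d1, x, y) < loss(g0, d0, x, y))
```

Either reading reduces the loss in this toy setting; the difference is only which weight moves, which is the point the rejection asks the applicant to clarify.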
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
Claims 1-5 and 12-14 are rejected under 35 U.S.C. 103 as being unpatentable over Jack et al. (US 20210302373 A1) in view of Jong Pil et al. (KR 20220008530 A) (hereinafter Pil), and further in view of Hsieh et al. (US 20200097773 A1).
Regarding claim 1, Jack et al. teaches a method for analyzing a defect (see para [0040]; “a system for real-time visualization of defects within a material using ultrasonic non-destructive testing”), the method comprising: preparing three-dimensional raw data by collecting a plurality of two-dimensional inspection images obtained by ultrasonic testing of an inspection object (see Abstract; “a three-dimensional (3-D) image of a composite laminate constructed of a series of two-dimensional (2-D) cross sections. The GUI is capable of displaying the 3-D image as each additional 2-D cross section is scanned by an ultrasonic testing apparatus”) and stacking the plurality of two-dimensional inspection images (see para [0180]; “The system is operable to create a 3-D damage profile for the region consisting of the damage area of each C-scan in the series of reconciled C-scans stacked vertically”); deriving representation data, which is an inferenced three-dimensional image representing the defect of the inspection object (see para [0150]; “a three-dimensional (3-D) layered image 110, constructed by combining data from corresponding B-scan images 104, 106 and C-scan images 108”; note: the 3-D layered image corresponds to the representation data derived from the combination of B-scan and C-scan data). However, Jack et al. does not teach generating input data for a deep learning model by processing the three-dimensional raw data; deriving the representation data through a first feature transformation by applying first weights in a trained state to the input data within a generation network of the deep learning model; and determining a defect type of the inspection object by inputting the representation data to a detecting network of the deep learning model.
In the same field of endeavor, Pil teaches through a first feature transformation by applying first weights in a trained state to the input data within a generation network of the deep learning model (see page 2, 5th para; “a first encoder for mapping an input original image to a feature map….including at least any one of a convolutional layer, a pooling layer, an activation function, and combinations thereof”); and determining a defect type of the inspection object by inputting the representation data to a detecting network of the deep learning model (see page 3, 5th para; “the defect inspection module 20 is a module for classifying defects in products by learning the restored image 2 and the interpolated image 3 in a deep learning method, For example, the defect inspection module 20 may operate in two modes: a machine learning mode and an object detection mode. 4 is a relationship between the first encoder (E1) of the deep learning-based product defect inspection system 100 using the input/output data generation and transformation of FIG”). However, the combination of Jack et al. and Pil as a whole does not teach generating input data for a deep learning model by processing the three-dimensional raw data.
In the same field of endeavor, Hsieh et al. teaches generating input data for a deep learning model by processing the three-dimensional raw data (see para [0116]; “raw image data can be preprocessed by the reconstruction engine 1440. Preprocessing may include one or more sub-processes, such as intensity correction, resembling, filtering, etc”, see also para [0089]; “a trained network package to provide a deep learning product offering. As shown in the example of FIG. 6, an input 610 (e.g., raw data) is provided for preprocessing 620”). Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the method for real-time visualization of a material during ultrasonic non-destructive testing of Jack et al. in view of the system and method of defect inspection using generation and transformation of input and output data based on deep learning of Pil, and further in view of the apparatus to automatically generate an image quality metric for an image of Hsieh et al., in order to determine the associated image quality metric (see para [0116]).
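The data-preparation steps discussed for claim 1 (stacking two-dimensional inspection images into three-dimensional raw data, then preprocessing that data into model input) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the method of the application or of any cited reference: the helper names `stack_images`, `normalize`, and `shape` are hypothetical, images are plain nested lists, and min-max scaling is just one possible preprocessing choice.

```python
from typing import List, Tuple

Image2D = List[List[float]]   # rows x cols
Volume3D = List[Image2D]      # depth x rows x cols


def stack_images(images: List[Image2D]) -> Volume3D:
    """Stack equally sized 2-D images along a new depth axis."""
    if not images:
        raise ValueError("at least one image is required")
    rows, cols = len(images[0]), len(images[0][0])
    for img in images:
        if len(img) != rows or any(len(r) != cols for r in img):
            raise ValueError("all images must share the same shape")
    # Copy the rows so the volume does not alias the input images.
    return [[row[:] for row in img] for img in images]


def normalize(volume: Volume3D) -> Volume3D:
    """Min-max scale every voxel into [0, 1] (assumed preprocessing step)."""
    flat = [v for img in volume for row in img for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero for constant volumes
    return [[[(v - lo) / span for v in row] for row in img] for img in volume]


def shape(volume: Volume3D) -> Tuple[int, int, int]:
    """Return (depth, rows, cols) of a volume."""
    return (len(volume), len(volume[0]), len(volume[0][0]))


if __name__ == "__main__":
    scans = [[[0.0, 2.0], [4.0, 6.0]], [[8.0, 10.0], [12.0, 16.0]]]
    volume = normalize(stack_images(scans))
    print(shape(volume))
```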
Regarding claim 2, the rejection of claim 1 is incorporated herein.
Hsieh et al. in the combination further teaches wherein the determining of the defect type comprises: deriving detection data that predicts the defect type of the inspection object as a probability (see para [0204]; “an image can be categorized as belonging to class 4 with an associated probability of 90%, a 9% probability that the image belongs to class 5, and a 1% probability that the image belongs to class 3”) through a second feature transformation (see para [0209]; “a number of feature maps are created by convolving the input”) by applying second weights in the trained state to the representation data within the detecting network of the deep learning model (see para [0154]; “the diagnosis engine 1450 operates with the DDLD 1542, which is trained, validated and tested”, see also para [0208]; “A classifier 2150 (e.g., a softmax classifier, etc.) associates weights with nodes representing features of interest”, see also para [0120]; “Image output from the reconstruction engine 1440 can then be provided to the diagnosis engine 1450. The diagnosis engine 1450 can take image data from the reconstruction engine 1440 and/or non-image data from the information subsystem 1420 and process the data”); and determining the defect type of the inspection object according to the probability (see para [0177]; “A patient diagnosis can be provided with respect to various patient disease types and/or patient conditions, as well as associated severity levels”).
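The step of predicting a class as a probability and then selecting the class according to that probability can be sketched with a softmax followed by an arg-max. This is a generic illustration, not code from Hsieh et al. or from the application; the label names in `LABELS` and the function names `softmax` and `classify` are hypothetical.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical defect labels, used only for this illustration.
LABELS = ["crack", "porosity", "no_defect"]


def softmax(logits: List[float]) -> List[float]:
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits: List[float], labels: List[str]) -> Tuple[str, Dict[str, float]]:
    """Return the most probable label and the full probability table."""
    if len(logits) != len(labels):
        raise ValueError("one logit per label is required")
    by_label = dict(zip(labels, softmax(logits)))
    best = max(by_label, key=by_label.get)
    return best, by_label


if __name__ == "__main__":
    label, table = classify([0.1, 2.5, -1.0], LABELS)
    print(label, {k: round(v, 3) for k, v in table.items()})
```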
Regarding claim 3, the rejection of claim 1 is incorporated herein.
Pil in the combination further teaches wherein the deriving of the representation data comprises: deriving, by a plurality of encoding modules of the generation network a latent vector represented as a feature map by compressing the input data sequentially (see page 8, 6th para; “a deep learning model in the form of a variational auto-encoder, and features the input original image. A first encoder (E1) that maps to a map”, see also page 9, 8th para; “the CNN of the first encoder E1 may perform an operation by repeatedly passing through the plurality of convolutional layers 11 and pooling layers”, and page 10, 4th para; “these intermediate random hidden variables”); and deriving, by a latent module and a plurality of decoding modules of the generation network, the representation data by sequentially expanding the latent vector and restoring the latent vector to fit sizes of the input data (page 2, 5th para; “a first decoder for reconstructing the feature map into a first reconstructed image based on a first reconstructed jet value (Z1) and .. a second reconstructing jet value (Z2); …deconvand a second decoder for reconstructing the plurality of N-th interpolated images based on the plurality of N-th interpolated jet values ZNnew”, see also page 6, 6th para; “the first decoder or the second decoder may include: a deconvolution layer for deconvolving an input value with a filter to extract a feature map of a larger size;... and an unpooling layer for upsampling data corresponding to the size of the pooling kernel in the feature map to data of a larger size”, and page 10, 1st para; “the CNN of the first decoder D1 may perform an operation by repeatedly passing through the plurality of deconvolution layers 15 and unpooling layers 18”).
Regarding claim 4, the rejection of claim 3 is incorporated herein.
Pil in the combination further teaches wherein the deriving of the latent vector comprises: performing, by each of one or more convolution layers of each of the plurality of encoding modules (see also page 9, 8th para; “the CNN of the first encoder E1 may perform an operation by repeatedly passing through the plurality of convolutional layers 11 and pooling layers”), a convolution, a batch normalization, and an activation function on the input data or on an input feature map to generate a convolved feature map; and downsampling, by a pooling layer of each of the plurality of encoding modules, the convolved feature map through a pooling operation (see page 6, 5th para; “the first encoder comprises: a convolution layer for extracting a feature map of a smaller size by convolving an input value with a filter; a batch normalization layer that generates a normalized output value based on the average or standard deviation of the batch; an activation function that additionally activates data after the convolution operation; and a pooling layer for subsampling data corresponding to the size of the pooling kernel in the feature map into data having a smaller size”).
Regarding claim 5, the rejection of claim 3 is incorporated herein.
Pil in the combination further teaches wherein the deriving of the representation data comprises: performing, by each of one or more convolution layers of the latent module (see page 10, 4th para; “these intermediate random hidden variables are learned to be output with different weights in the process of converting input data between an input layer and an output layer”), a convolution, a batch normalization, and an activation function on an input feature map to generate a convolved feature map; and upsampling, by an up-convolution layer of the latent module, the convolved feature map through an up-convolution operation (see page 6, 6th para; “the first decoder or the second decoder may include: a deconvolution layer for deconvolving an input value with a filter to extract a feature map of a larger size; a batch normalization layer that generates a normalized output value based on the average or standard deviation of the batch; an activation function that additionally activates data after the convolution operation; and an unpooling layer for upsampling data corresponding to the size of the pooling kernel in the feature map to data of a larger size”).
Regarding claim 12, the scope of claim 12 is fully encompassed by the scope of claim 1; accordingly, the rejection of claim 1 is fully applicable here.
Regarding claim 13, the rejection of claim 12 is incorporated herein.
Hsieh et al. in the combination further teaches wherein the defect type is determined by: deriving detection data that predicts the defect type of the inspection object as a probability (see para [0204]; “an image can be categorized as belonging to class 4 with an associated probability of 90%, a 9% probability that the image belongs to class 5, and a 1% probability that the image belongs to class 3”) through a second feature transformation (see para [0209]; “a number of feature maps are created by convolving the input”) by applying second weights in the trained state to the representation data within the detecting network of the deep learning model (see para [0154]; “the diagnosis engine 1450 operates with the DDLD 1542, which is trained, validated and tested”, see also para [0208]; “A classifier 2150 (e.g., a softmax classifier, etc.) associates weights with nodes representing features of interest”, see also para [0120]; “Image output from the reconstruction engine 1440 can then be provided to the diagnosis engine 1450. The diagnosis engine 1450 can take image data from the reconstruction engine 1440 and/or non-image data from the information subsystem 1420 and process the data”); and determining the defect type of the inspection object according to the probability (see para [0177]; “A patient diagnosis can be provided with respect to various patient disease types and/or patient conditions, as well as associated severity levels”).
Regarding claim 14, the rejection of claim 13 is incorporated herein.
Pil in the combination further teaches wherein the generation network comprises: a plurality of encoding modules of the generation network a latent vector represented as a feature map by compressing the input data sequentially (see page 8, 6th para; “a deep learning model in the form of a variational auto-encoder, and features the input original image. A first encoder (E1) that maps to a map”, see also page 9, 8th para; “the CNN of the first encoder E1 may perform an operation by repeatedly passing through the plurality of convolutional layers 11 and pooling layers”, and page 10, 4th para; “these intermediate random hidden variables”); and a latent module and a plurality of decoding modules, which are configured to derive representation data by sequentially expanding the latent vector and restoring the latent vector to fit sizes of the input data (page 2, 5th para; “a first decoder for reconstructing the feature map into a first reconstructed image based on a first reconstructed jet value (Z1) and .. a second reconstructing jet value (Z2); …deconvand a second decoder for reconstructing the plurality of N-th interpolated images based on the plurality of N-th interpolated jet values ZNnew”, see also page 6, 6th para; “the first decoder or the second decoder may include: a deconvolution layer for deconvolving an input value with a filter to extract a feature map of a larger size;... and an unpooling layer for upsampling data corresponding to the size of the pooling kernel in the feature map to data of a larger size”, and page 10, 1st para; “the CNN of the first decoder D1 may perform an operation by repeatedly passing through the plurality of deconvolution layers 15 and unpooling layers 18”).
Claims 6 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Jack et al. and Pil, in view of Hsieh et al. as applied to claims 1 and 3, and further in view of Fuchs et al. (US 20210295528 A1).
Regarding claim 6, the rejection of claim 3 is incorporated herein. 
Pil in the combination further teaches performing, by each of one or more convolution layers of each of the plurality of decoding modules, a convolution, a batch normalization, and an activation function on the combined feature map to generate a convolved feature map; and upsampling, by an up-convolution layer of each of the plurality decoding modules, the convolved feature map through an up-convolution operation (see page 6, 6th para; “the first decoder or the second decoder may include: a deconvolution layer for deconvolving an input value with a filter to extract a feature map of a larger size; a batch normalization layer that generates a normalized output value based on the average or standard deviation of the batch; an activation function that additionally activates data after the convolution operation; and an unpooling layer for upsampling data corresponding to the size of the pooling kernel in the feature map to data of a larger size”). However, the combination of Jack et al., Pil and Hsieh et al. as a whole does not teach wherein the deriving of the representation data comprises: combining, by a concatenate layer of each of the plurality of decoding modules, a first feature map input from a corresponding encoding module of the plurality of encoding modules and a second feature map input from the latent module or a preceding decoding module of the plurality of decoding modules to create a combined feature map.
In the same field of endeavor, Fuchs et al. teaches wherein the deriving of the representation data comprises: combining, by a concatenate layer of each of the plurality of decoding modules, a first feature map input from a corresponding encoding module of the plurality of encoding modules and a second feature map input from the latent module or a preceding decoding module of the plurality of decoding modules to create a combined feature map (see para [0052]; “Each concatenation unit 628 may concatenate, adjoin, or otherwise add two or more feature maps prior to processing by the subsequent deconvolution block 610 (e.g., as depicted) or the convolution block 604 within the same row 620. In some embodiments, the concatenation unit 628 may be part of the deconvolution block 610 that is to process the resultant set of feature maps in the same row 620. Each received feature map may be from another network 602 within the image segmentation model 504. Upon receipt of input feature maps, the concatenation unit 628 may combine the feature maps to generate a resultant set of feature maps to feed forward along the row 620. The combination of the feature maps (e.g., feature maps 608′) by the concatenation unit 628 may include concatenation, weighted summation, and addition, among others”). Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the method for real-time visualization of a material during ultrasonic non-destructive testing of Jack et al. in view of the system and method of defect inspection using generation and transformation of input and output data based on deep learning of Pil, and further in view of the apparatus to automatically generate an image quality metric for an image of Hsieh et al. and the method of training models to segment images of Fuchs et al., in order to preserve and combine feature information across resolution levels while reconstructing higher-resolution image representations (see para [0052]).
Regarding claim 15, the rejection of claim 14 is incorporated herein. 
Pil in the combination further teaches wherein each of the plurality of encoding modules comprises: each of one or more convolution layers configured to perform a convolution, a batch normalization, and an activation function on the input data or an input feature map to generate a convolved feature map; and a pooling layer configured to downsample the convolved feature map through a pooling operation (see page 6, 5th para; “the first encoder comprises: a convolution layer for extracting a feature map of a smaller size by convolving an input value with a filter; a batch normalization layer that generates a normalized output value based on the average or standard deviation of the batch; an activation function that additionally activates data after the convolution operation; and a pooling layer for subsampling data corresponding to the size of the pooling kernel in the feature map into data having a smaller size”), the latent module comprises: each of the one or more convolution layers configured to perform the convolution, the batch normalization, and the activation function on the input feature map to generate the convolved feature map; and an up-convolution layer configured to upsample the convolved feature map through an up-convolution operation (see page 6, 6th para; “the first decoder or the second decoder may include: a deconvolution layer for deconvolving an input value with a filter to extract a feature map of a larger size; a batch normalization layer that generates a normalized output value based on the average or standard deviation of the batch; an activation function that additionally activates data after the convolution operation; and an unpooling layer for upsampling data corresponding to the size of the pooling kernel in the feature map to data of a larger size”), each of the one or more convolution layers configured to perform the convolution, the batch normalization, and the activation function on the input feature map to generate the convolved feature map; and an up-convolution layer configured to upsample the convolved feature map through an up-convolution operation (see page 6, 6th para; “the first decoder or the second decoder may include: a deconvolution layer for deconvolving an input value with a filter to extract a feature map of a larger size; a batch normalization layer that generates a normalized output value based on the average or standard deviation of the batch; an activation function that additionally activates data after the convolution operation; and an unpooling layer for upsampling data corresponding to the size of the pooling kernel in the feature map to data of a larger size”).
Fuchs et al. in the combination further teaches wherein each of the plurality of decoding modules comprises: a concatenate layer configured to combine a first input feature map input from a corresponding encoding module and a second input feature map input from the latent module or a previous decoding module to generate a combined feature map (see para [0052]; “Each concatenation unit 628 may concatenate, adjoin, or otherwise add two or more feature maps prior to processing by the subsequent deconvolution block 610 (e.g., as depicted) or the convolution block 604 within the same row 620. In some embodiments, the concatenation unit 628 may be part of the deconvolution block 610 that is to process the resultant set of feature maps in the same row 620. Each received feature map may be from another network 602 within the image segmentation model 504. Upon receipt of input feature maps, the concatenation unit 628 may combine the feature maps to generate a resultant set of feature maps to feed forward along the row 620. The combination of the feature maps (e.g., feature maps 608′) by the concatenation unit 628 may include concatenation, weighted summation, and addition, among others”).
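The concatenate layer recited in claims 6 and 15 combines a feature map from an encoding module with a feature map from the latent module or a preceding decoding module. A minimal sketch of channel-wise concatenation is given below, assuming a feature map is stored as a list of channels and each channel is a 2-D grid. The names `spatial_shape` and `concat_channels` are hypothetical and do not come from the application or from the cited references.

```python
from typing import List, Tuple

Channel = List[List[float]]    # rows x cols
FeatureMap = List[Channel]     # channels x rows x cols


def spatial_shape(fmap: FeatureMap) -> Tuple[int, int]:
    """Return (rows, cols) of a feature map, checking every channel agrees."""
    rows, cols = len(fmap[0]), len(fmap[0][0])
    for ch in fmap:
        if len(ch) != rows or any(len(r) != cols for r in ch):
            raise ValueError("channels must share one spatial shape")
    return rows, cols


def concat_channels(skip: FeatureMap, upsampled: FeatureMap) -> FeatureMap:
    """Concatenate two feature maps along the channel axis.

    'skip' stands for the map from the corresponding encoding module and
    'upsampled' for the map from the latent or preceding decoding module.
    """
    if spatial_shape(skip) != spatial_shape(upsampled):
        raise ValueError("feature maps must have matching spatial size")
    return skip + upsampled  # new list; channel counts add up


if __name__ == "__main__":
    enc = [[[1.0, 1.0], [1.0, 1.0]]] * 2   # 2 channels, 2x2
    dec = [[[0.0, 0.0], [0.0, 0.0]]] * 3   # 3 channels, 2x2
    print(len(concat_channels(enc, dec)))
```

Concatenation keeps both inputs intact side by side, whereas the weighted summation and addition options mentioned in the quoted passage would merge them into a map with the original channel count.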
Claims 7-8 and 16-17 are rejected under 35 U.S.C. 103 as being unpatentable over Jack et al. and Pil, in view of Hsieh et al. as applied to claims 1 and 3, and further in view of Taerum et al. (US 20200085382 A1).
Regarding claim 7, the rejection of claim 1 is incorporated herein. The combination of Jack et al., Pil and Hsieh et al. as a whole does not teach wherein the generating of the input data comprises dividing each of the plurality of two-dimensional inspection images into a plurality of pixel patches according to the number of pooling operations of the generation network.
In the same field of endeavor, Taerum et al. teaches wherein the generating of the input data comprises dividing each of the plurality of two-dimensional inspection images into a plurality of pixel patches according to the number of pooling operations of the generation network (see para [0205]; “Segmenting the full image therefore requires a tiling approach”, see also para [0219]; “2D U-Net model include: [0219] num_pooling_layers: the total number of pooling (and upsampling) layers”, para [0206]; “As in U-Net, a 2×2 max pooling operation with stride 2 is used to downsample the images after every set of convolutions”, and para [0208]; “after every pooling layer, the number of feature maps doubles and the spatial resolution is halved”). Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the method for real-time visualization of a material during ultrasonic non-destructive testing of Jack et al. in view of the system and method of defect inspection using generation and transformation of input and output data based on deep learning of Pil, and further in view of the apparatus to automatically generate an image quality metric for an image of Hsieh et al. and the automated end-to-end pipeline for accurate lesion detection and segmentation of Taerum et al., in order to improve both accuracy and efficiency for both detection and further quantitative assessment (see para [0205]).
Regarding claim 8, the rejection of claim 1 is incorporated herein. 
Taerum et al. in the combination further teaches wherein the generating of the input data comprises dividing each of the plurality of two-dimensional inspection images into 2n or more pixel patches, wherein the n is the number of pooling operations of the generation network (see para [0205]; “Segmenting the full image therefore requires a tiling approach”, see also para [0219]; “2D U-Net model include: [0219] num_pooling_layers: the total number of pooling (and upsampling) layers”, para [0206]; “As in U-Net, a 2×2 max pooling operation with stride 2 is used to downsample the images after every set of convolutions”, and para [0208]; “after every pooling layer, the number of feature maps doubles and the spatial resolution is halved”). Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the method for real-time visualization of a material during ultrasonic non-destructive testing of Jack et al. in view of the system and method of defect inspection using generation and transformation of input and output data based on deep learning of Pil, and further in view of the apparatus to automatically generate an image quality metric for an image of Hsieh et al. and the automated end-to-end pipeline for accurate lesion detection and segmentation of Taerum et al., in order to improve both accuracy and efficiency for both detection and further quantitative assessment (see para [0205]).
Regarding claim 16, the rejection of claim 12 is incorporated herein. 
Taerum et al. in the combination further teaches wherein the input data is generated through data augmentation, which comprises dividing each of the plurality of two-dimensional inspection images into a plurality of pixel patches according to the number of pooling operations of the generation network (see para [0205]; “Segmenting the full image therefore requires a tiling approach”, see also para [0219]; “2D U-Net model include: [0219] num_pooling_layers: the total number of pooling (and upsampling) layers”, para [0206]; “As in U-Net, a 2×2 max pooling operation with stride 2 is used to downsample the images after every set of convolutions”, and para [0208]; “after every pooling layer, the number of feature maps doubles and the spatial resolution is halved”).
Regarding claim 17, the rejection of claim 12 is incorporated herein.
Taerum et al. in the combination further teaches wherein the input data is generated through data augmentation, which comprises dividing each of the plurality of two-dimensional inspection images into 2n or more pixel patches, wherein the n is the number of pooling operations of the generation network (see para [0205]; “Segmenting the full image therefore requires a tiling approach”, see also para [0219]; “2D U-Net model include: [0219] num_pooling_layers: the total number of pooling (and upsampling) layers”, para [0206]; “As in U-Net, a 2×2 max pooling operation with stride 2 is used to downsample the images after every set of convolutions”, and para [0208]; “after every pooling layer, the number of feature maps doubles and the spatial resolution is halved”).
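The relationship between tiling and the number of pooling operations in the quoted Taerum et al. passages can be sketched as follows. The quoted text states that each 2×2, stride-2 pooling halves the spatial resolution, so a tile whose side is a multiple of 2 raised to the number of poolings survives every pooling step without a remainder. This is an illustrative sketch only; the function names `size_after_pooling`, `min_patch_side`, and `tile` are hypothetical, and the sketch does not attempt to reproduce the patch-count limitation of claims 8 and 17.

```python
from typing import List

Image2D = List[List[int]]


def size_after_pooling(side: int, n_pool: int) -> int:
    """Side length after n_pool stride-2 poolings; rejects odd sizes."""
    for _ in range(n_pool):
        if side % 2:
            raise ValueError("side is not divisible by 2 at some pooling step")
        side //= 2
    return side


def min_patch_side(n_pool: int) -> int:
    """Smallest side that can be halved n_pool times down to one pixel."""
    return 2 ** n_pool


def tile(image: Image2D, patch: int) -> List[Image2D]:
    """Split an image into non-overlapping patch x patch tiles (row-major)."""
    rows, cols = len(image), len(image[0])
    if rows % patch or cols % patch:
        raise ValueError("image size must be a multiple of the patch size")
    return [
        [row[c:c + patch] for row in image[r:r + patch]]
        for r in range(0, rows, patch)
        for c in range(0, cols, patch)
    ]


if __name__ == "__main__":
    image = [[r * 8 + c for c in range(8)] for r in range(8)]
    patches = tile(image, min_patch_side(2))   # 4x4 tiles for 2 poolings
    print(len(patches), size_after_pooling(4, 2))
```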
Claims 9 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Jack et al. and Pil, in view of Hsieh et al. as applied to claim 1, and further in view of Mentl et al. (US 20180268526 A1) and Shcherbinin et al. (US 20200387750 A1).
Regarding claim 9, the rejection of claim 1 is incorporated herein. 
Jack et al. in the combination further teaches target data, which is a three-dimensional image generated from historical three-dimensional raw data created by stacking a plurality of historical two-dimensional inspection images obtained through the ultrasonic testing of the inspection object with the defect (see Abstract; “a three-dimensional (3-D) image of a composite laminate constructed of a series of two-dimensional (2-D) cross sections. The GUI is capable of displaying the 3-D image as each additional 2-D cross section is scanned by an ultrasonic testing apparatus”, see also para [0180]; “The system is operable to create a 3-D damage profile for the region consisting of the damage area of each C-scan in the series of reconciled C-scans stacked vertically”).
Pil in the combination further teaches, before the preparing of the three-dimensional raw data, training the generation network of the deep learning model (see page 8, 13th para; “using a variational auto-encoder structure composed of an encoder and a decoder, the deep learning model can be trained so that the input image and the output image are identical or similar”) by performing: inputting the training input data into the generation network (see page 2, 5th para; “a first encoder for mapping the input original image to a feature map”).
Hsieh et al. in the combination further teaches preparing a first training data comprising training input data generated through data augmentation (see para [0105]; “data augmentation to generate more training samples”); generating the representation data through the first feature transformation by applying the first weights in an untrained state to the training input data (see para [0211]; “In a convolutional layer of an example deep convolutional network, an initial layer includes a plurality of feature maps in which node weights are initialized using parameterized normal random variables”). However, the combination of Jack et al., Pil and Hsieh et al. as a whole does not teach deriving a generation loss representing a difference between the representation data and the target data through a first loss function; and updating the first weights of the generation network based on the generation loss so as to maximally reduce the generation loss.
In the same field of endeavor, Shcherbinin et al. teach deriving a generation loss representing a difference between the representation data and the target data through a first loss function (see para [0014]; “the loss function may include an L1 difference between the low quality output image patch and the high quality input image patch”). Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the method for real-time visualization of a material during ultrasonic non-destructive testing of Jack et al. in view of the system and method of defect inspection using generation and transformation of input and output data based on deep learning of Pil, and further in view of the apparatus to automatically generate an image quality metric for an image of Hsieh et al. and the method and apparatus for training a neural network model for enhancing image detail of Shcherbinin et al., in order to generate reconstruction networks (see para [0014]).
However, the combination of Jack et al., Pil, Hsieh et al. and Shcherbinin et al. as a whole does not teach updating the first weights of the generation network based on the generation loss so as to maximally reduce the generation loss.
In the same field of endeavor, Mentl et al. teach updating the first weights of the generation network based on the generation loss so as to maximally reduce the generation loss (see para [0008]; “The weights of the deep neural networks are randomly initialized during training, and the method compares the final denoised image with target image data to update the weights of the first and second deep neural networks using backpropagation”). Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the method for real-time visualization of a material during ultrasonic non-destructive testing of Jack et al. in view of the system and method of defect inspection using generation and transformation of input and output data based on deep learning of Pil, and further in view of the apparatus to automatically generate an image quality metric for an image of Hsieh et al. and the methods for machine learning sparse image representations with deep unfolding and deploying the machine-learnt network to denoise medical images of Mentl et al., in order to generate a clean proximal mapping of the image data via a set of sparse image representations (see para [0008]).
Regarding claim 18, the rejection of claim 12 is incorporated herein. 
Jack et al. in the combination further teach and target data which is a three-dimensional image generated from historical three-dimensional raw data created by stacking a plurality of historical two-dimensional inspection images obtained through the ultrasonic testing of the inspection object with the defect (see Abstract; “a three-dimensional (3-D) image of a composite laminate constructed of a series of two-dimensional (2-D) cross sections. The GUI is capable of displaying the 3-D image as each additional 2-D cross section is scanned by an ultrasonic testing apparatus”, see also para [0180]; “The system is operable to create a 3-D damage profile for the region consisting of the damage area of each C-scan in the series of reconciled C-scans stacked vertically”).
Pil in the combination further teaches further wherein, before the three-dimensional raw data is prepared, the device is further configured to train the generation network of the deep learning model by performing (see page 8, 13th para; “using a variational auto-encoder structure composed of an encoder and a decoder, the deep learning model can be trained so that the input image and the output image are identical or similar”); inputting the training input data into the generation network (see page 2, 5th para; “a first encoder for mapping the input original image to a feature map”).
Hsieh et al. in the combination further teach preparing first training data comprising training input data generated through data augmentation (see para [0105]; “data augmentation to generate more training samples”); generating the representation data through the first feature transformation by applying the first weights in an untrained state to the training input data (see para [0211]; “In a convolutional layer of an example deep convolutional network, an initial layer includes a plurality of feature maps in which node weights are initialized using parameterized normal random variables”). 
Shcherbinin et al. in the combination further teach deriving a generation loss representing a difference between the representation data and the target data through a first loss function (see para [0014]; “the loss function may include an L1 difference between the low quality output image patch and the high quality input image patch”).
Mentl et al. in the combination further teach updating the first weights of the generation network based on the generation loss so as to maximally reduce the generation loss (see para [0008]; “The weights of the deep neural networks are randomly initialized during training, and the method compares the final denoised image with target image data to update the weights of the first and second deep neural networks using backpropagation”). Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the method for real-time visualization of a material during ultrasonic non-destructive testing of Jack et al. in view of the system and method of defect inspection using generation and transformation of input and output data based on deep learning of Pil, and further in view of the apparatus to automatically generate an image quality metric for an image of Hsieh et al. and the methods for machine learning sparse image representations with deep unfolding and deploying the machine-learnt network to denoise medical images of Mentl et al., in order to generate a clean proximal mapping of the image data via a set of sparse image representations (see para [0008]).
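The two training steps cited for claims 9 and 18 (an L1 difference used as the loss, and a weight update driven by comparing the output with the target) can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the cited references or of the application: the "network" is a single hypothetical scalar weight, the update is plain subgradient descent rather than full backpropagation, and every function name is invented for the example.

```python
def forward(weight, inputs):
    """Toy 'generation network': scales every input by one weight."""
    return [weight * x for x in inputs]


def l1_loss(output, target):
    """Mean absolute (L1) difference between output and target."""
    return sum(abs(o - t) for o, t in zip(output, target)) / len(output)


def l1_grad_wrt_weight(weight, inputs, target):
    """Subgradient of the L1 loss with respect to the single weight."""
    def sign(v):
        return (v > 0) - (v < 0)
    return sum(sign(weight * x - t) * x for x, t in zip(inputs, target)) / len(inputs)


def train(weight, inputs, target, lr=0.01, steps=200):
    """Repeatedly step the weight against the subgradient to reduce the loss."""
    for _ in range(steps):
        weight -= lr * l1_grad_wrt_weight(weight, inputs, target)
    return weight
```

Starting from an arbitrary initial weight (standing in for the random initialization in the quoted passage), `train` moves the weight toward the value that makes the output match the target. With a fixed step size the weight ends up oscillating in a small band around that value, which is expected behaviour for subgradient descent on an L1 loss.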

Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over Jack et al., Pil, and Hsieh et al. in view of Mentl et al. and Shcherbinin et al. as applied to claims 1 and 9, and further in view of Rosenfeld et al. (US 20210089842 A1) and Dutta et al. (US 20220147760 A1).
Regarding claim 11, the rejection of claim 9 is incorporated herein. 
Pil in the combination further teaches inputting the training input data into the generation network (see page 2, 5th para; “a first encoder for mapping the input original image to a feature map”); and generating, by the generation network, the representation data through the first feature transformation by applying the first weights in the trained state to the training input data (see page 8, 13th para; “using a variational auto-encoder structure composed of an encoder and a decoder, the deep learning model can be trained so that the input image and the output image are identical or similar”).
Hsieh et al. in the combination further teach generating, by the detecting network, detection data that predicts the defect type as a probability through a second feature transformation by applying second weights in the untrained state to the representation data (see para [0212]; “weights and biases for the classification layer are set to 0. An output from the softmax layer is a set of positive numbers which sum up to 1. In other words, the output from the softmax layer can be thought of as a probability distribution. Using this distribution, the network can be used to select values for desired hyper parameters”). However, the combination of Jack et al., Pil, Hsieh et al. and Shcherbinin et al. as a whole does not teach further comprising training the detecting network of the deep learning model by performing: preparing second training data comprising the training input data and a label indicating the defect type of the inspection object; inputting the representation data into the detecting network; and updating, by the detecting network, the second weights of the generation network based on the prediction loss so as to maximally reduce the prediction loss while the first weights of the detecting network remain unchanged during the updating of the second weights in the deep learning model.
In the same field of endeavor, Rosenfeld et al. teach further comprising training the detecting network of the deep learning model by performing: preparing second training data comprising the training input data and a label indicating the defect type of the inspection object (see para [0008]; “a label for a novel input data based on training data in which multiple training input data are associated with a similar label”, see para [0168]; “the training data comprising multiple training input data (x.sub.i) and corresponding labels (y.sub.i), a training input data representing physical properties of a physical system”); inputting the representation data into the detecting network (see para [0009]; “The encoder may be configured to map an input data to a latent representation while the classifier may be configured to be applied to the latent representation”); and updating, by the detecting network, the second weights of the generation network based on the prediction loss so as to maximally reduce the prediction loss while the first weights of the detecting network remain unchanged during the updating of the second weights in the deep learning model (see para [0009]; “Training the base classifier may comprise optimizing the parameters so that the base classifier fits the training data. Interestingly, in an embodiment, the encoder part of the base classifier may be regarded as fixed, while the classifier part may be re-trained…. a base training function may be configured for optimizing the parameters defining the classifier according to the training data, while leaving the parameters of the encoder untouched”). Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the method for real-time visualization of a material during ultrasonic non-destructive testing of Jack et al. in view of the system and method of defect inspection using generation and transformation of input and output data based on deep learning of Pil, and further in view of the apparatus to automatically generate an image quality metric for an image of Hsieh et al. and the method to classify sensor data with improved robustness against label noise of Rosenfeld et al., in order to improve the quality of a machine learning approach (see para [0008]).
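The arrangement quoted from Rosenfeld et al., in which the encoder is held fixed while only the classifier is re-trained, can be sketched as follows. This is a hedged illustration, not the implementation of Rosenfeld et al. or of the application: the encoder and classifier are hypothetical linear maps, the loss is ordinary softmax cross-entropy, and all names are assumptions made for the example.

```python
import math


def encode(x, enc_w):
    """Frozen encoder: a fixed linear map from input to latent representation."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in enc_w]


def softmax(logits):
    """Numerically stable softmax; outputs are positive and sum to 1."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(z, clf_w):
    """Classifier head: linear scores on the latent vector, then softmax."""
    return softmax([sum(w * zi for w, zi in zip(row, z)) for row in clf_w])


def train_classifier_only(data, enc_w, clf_w, lr=0.5, epochs=200):
    """Gradient descent on cross-entropy; only the classifier copy is updated."""
    clf_w = [row[:] for row in clf_w]  # work on a copy of the classifier weights
    for _ in range(epochs):
        for x, label in data:
            z = encode(x, enc_w)  # encoder weights are read, never written
            probs = classify(z, clf_w)
            for k, row in enumerate(clf_w):
                err = probs[k] - (1.0 if k == label else 0.0)  # dLoss/dscore_k
                for j in range(len(row)):
                    row[j] -= lr * err * z[j]
    return clf_w
```

Because `enc_w` is only ever read inside the loop, the encoder parameters are identical before and after training, while the returned classifier weights differ from their initial values.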
However, the combination of Jack et al., Pil, Hsieh et al., Shcherbinin et al. and Rosenfeld et al. as a whole does not teach deriving, by the detecting network, a prediction loss representing a difference between the detection data and the label through a second loss function.
In the same field of endeavor, Dutta et al. teach deriving, by the detecting network, a prediction loss representing a difference between the detection data and the label through a second loss function (see para [0476]; “the loss function is a custom-weighted categorical cross-entropy loss and the error 5506 is minimized on a subpixel-by-subpixel basis between predicted classification scores (e.g., softmax scores) and labelled class scores”, see also para [0490]; “an array of units that classifies each subpixel based on the predicted classification scores, with each unit in the array representing a corresponding subpixel in the input. The classification scores can be softmax scores”). Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the method for real-time visualization of a material during ultrasonic non-destructive testing of Jack et al. in view of the system and method of defect inspection using generation and transformation of input and output data based on deep learning of Pil, and further in view of the apparatus to automatically generate an image quality metric for an image of Hsieh et al., the method to classify sensor data with improved robustness against label noise of Rosenfeld et al., and the artificial intelligence based determination of analyte data for base calling of Dutta et al., in order to successively model high-level features (see para [0476]).
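A prediction loss of the kind quoted from Dutta et al. (a weighted categorical cross-entropy between softmax scores and a label) can be sketched for a single prediction as follows. The per-class weighting shown is only one assumed reading of "custom-weighted"; the reference's actual weighting scheme is not reproduced here, and the function names are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_cross_entropy(scores, label, class_weights):
    """Negative log-probability of the labelled class, scaled by its class weight."""
    probs = softmax(scores)
    return -class_weights[label] * math.log(probs[label])
```

The loss is small when the labelled class receives a high softmax score and grows as that score falls, so minimizing it pushes the predicted distribution toward the label. Raising a class weight makes errors on that class count proportionally more.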
Allowable Subject Matter
The following is a statement of reasons for the indication of allowable subject matter:
Claims 10, 19, and 20 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Regarding claims 10 and 19, none of the prior art teaches wherein, in the deriving of the generation loss, the generation loss is generated according to the equations below:
[Equations reproduced as an image in the original: media_image1.png]

 where the Lg is the generation loss, the o is the representation data, the t is the target data, the P(o, t) is a similarity between the representation data and the target data, the i, the j, and the k are indices for identifying coordinates of a three-dimensional pixel patch of each of the representation data and target data, each of the μo and the μt is a luminance of the three-dimensional pixel patch of the representation data and target data, respectively, each of the σo and the σt is a contrast of the three-dimensional pixel patch of the representation data and target data, respectively, and each of the C1 and the C2 represents a constant.
Claim 20 depends on base claim 19 and would be allowable once the objection above is resolved.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to WINTA GEBRESLASSIE, whose telephone number is (571) 272-3475. The examiner can normally be reached Monday-Friday, 9:00-5:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Andrew Bee, can be reached at 571-270-5180. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/WINTA GEBRESLASSIE/            Examiner, Art Unit 2677
Prosecution Timeline

Mar 20, 2024: Application Filed
Apr 06, 2026: Non-Final Rejection mailed (§103, §112; current)
Precedent Cases

Applications granted by this same examiner with similar technology

17/710,872, Patent 12579683: IMAGE VIEW ADJUSTMENT (3y 11m to grant; granted Mar 17, 2026)
17/876,145, Patent 12573238: BIOMETRIC FACIAL RECOGNITION AND LIVENESS DETECTOR USING AI COMPUTER VISION (3y 7m to grant; granted Mar 10, 2026)
18/177,769, Patent 12530768: SYSTEMS AND METHODS FOR IMAGE STORAGE (2y 10m to grant; granted Jan 20, 2026)
17/923,954, Patent 12524932: MACHINE LEARNING IMAGE RECONSTRUCTION (3y 2m to grant; granted Jan 13, 2026)
18/196,332, Patent 12511861: DETECTION OF ANNOTATED REGIONS OF INTEREST IN IMAGES (2y 7m to grant; granted Dec 30, 2025)
Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 76%
With Interview (+25.0%): 99%
Median Time to Grant: 2y 6m (~4m remaining)
PTA Risk: Low
Based on 135 resolved cases by this examiner. Grant probability derived from career allowance rate.