DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statement (“IDS”) filed on 12/08/2023 was reviewed and the listed references were noted.
Drawings
The drawings (3 pages) have been considered and entered into the record.
Status of Claims
Claims 1-20 are currently pending.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 19-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter as follows.
Consider independent Claim 19. Claim 19 is directed to "A computer program product comprising a computer-readable storage medium …". Paragraph [0114] of Applicant’s specification recites certain non-limiting examples of a computer-readable storage medium. In addition, this paragraph further recites “A computer readable storage medium, as that term is used in the present disclosure, is not to be construed as storage in the form of transitory signals per se…”. It should be noted that Applicant may rebut the presumption of plain meaning by clearly disavowing the full scope of the claim term in the specification, but the disavowal must be clear and unmistakable, and statements directed to claim construction itself are not special definitions or disavowals (e.g., “computer-readable media are not to be construed to cover non-statutory signals”). See Subject Matter Eligibility of Computer Readable Media, 1351 OG 212 (26 Jan 2010); see also MPEP 2111.01.
Signals are nothing but the physical characteristics of a form of energy, and as such are non-statutory natural phenomena. See, e.g., In re Nuijten, 500 F.3d 1346, 1357 (Fed. Cir. 2007) (slip op. at 18) ("A transitory, propagating signal like Nuijten's is not a 'process, machine, manufacture, or composition of matter.' Thus, such a signal cannot be patentable subject matter."). Dependent claim 20 is rejected under this section because of its dependency from claim 19. Accordingly, claims 19-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter. It is suggested that Applicant amend the claim by inserting the term “non-transitory” before “computer-readable” in the preamble of claim 19.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
Determining the scope and contents of the prior art.
Ascertaining the differences between the prior art and the claims at issue.
Resolving the level of ordinary skill in the pertinent art.
Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-6 and 9-20 are rejected under 35 U.S.C. 103 as being unpatentable over Joyce et al. (“Deep Multi-Class Segmentation Without Ground-Truth Labels”) in view of Zhang (US 2025/0078229 with effective filing date Aug 8, 2023).
Consider Claim 1. Joyce discloses “A computer-implemented method comprising: training a visual inspection machine learning model using a generative adversarial network;” (Joyce, Section 1, Paragraph 3, “to achieve segmentation, we train a Generative Adversarial Network (GAN) [4] model to synthesise realistic masks from input images.”) and “implementing within the generative adversarial network a vector bypass” (Joyce, Figure 2, Elements Zm and Zb). Examiner notes that Joyce discloses a network that implements parallel segmentation channels (Zm) and a separate, parallel residual channel (Zb) that function individually and bypass each other. Joyce further discloses “and to a generator to assist with image reconstruction.” (Joyce, Figure 2, “The residual channels Zb of the feature map along with Zm are used as the input to a reconstruction network h that synthesises the input image”).
Joyce does not explicitly disclose “a vector embedding representation of an unlabeled image”. However, Zhang teaches this limitation (Zhang, Paragraph [0039], “For example, some embodiments perform pre-processing functionality, such as converting the image into a vector or matrix”). Examiner notes the pre-processing of the input image is interpreted to occur prior to inputting the image into the semantic and normal model layers. Therefore, the normal map output by the normal model layer would be in the form of a vector. Please note, Zhang also teaches “a vector embedding representation of an unlabeled image is transmitted around the visual inspection machine learning model” (Zhang, FIG. 5). Examiner notes the normal model layer is parallel to the segmentation model layer and bypasses segmentation and analysis of the input image. Zhang further teaches “and to a generator to assist with image reconstruction” (Zhang, FIG. 3; Paragraph [0062], “The inverse shading model/layer 512 takes, as input, the semantic map 506, the normal map 508, and the input image 510 to produce or predict the inverse shading map 514, as described, for example, with respect to the inverse shading S 306 of FIG. 3.”). Examiner also notes the inverse shading model produces a map of the outlines and shadows of the input image, which is then input into the albedo generator (see figure below) for image reconstruction. It is implied that the inverse shading model must have a generator to produce a map of the input.
[Image: media_image1.png, 528 × 917 pixels, greyscale]
Accordingly, before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to combine Joyce with the teachings of Zhang to create a vector representation of the input image to assist in image reconstruction of the input. One of ordinary skill in the art would have been motivated to combine Joyce and Zhang to utilize a bypass or an ensemble of machine learning models to prevent the loss of image information/features during the segmentation process, yielding a more accurate reconstruction of the input image. Accordingly, the combination of Joyce and Zhang discloses the invention of Claim 1.
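As context for the mapping above, the bypass structure attributed to Joyce (mask channels Zm plus residual channels Zb that skip the mask discriminator and feed a reconstructor) can be sketched in Python. This is an illustrative sketch only; the function names, the placeholder feature computation, and the tensor shapes are hypothetical and are not taken from Joyce or Zhang:

```python
import numpy as np

def segmentor(x, n_mask=3, n_res=1):
    """Hypothetical stand-in for a segmentor network: emits mask channels
    Zm (normalized per pixel, one channel per class) plus residual
    channels Zb that bypass the mask discriminator entirely."""
    # Placeholder per-channel features derived from the input image.
    h = np.stack([x * (k + 1) for k in range(n_mask + n_res)])
    zm = np.exp(h[:n_mask]) / np.exp(h[:n_mask]).sum(axis=0)  # soft masks
    zb = h[n_mask:]                                           # bypass channels
    return zm, zb

def reconstructor(zm, zb, w_m, w_b):
    """Rebuilds the input from a weighted sum of the channels of Zm and
    Zb, mirroring the role of Joyce's reconstruction network h."""
    return np.tensordot(w_m, zm, axes=1) + np.tensordot(w_b, zb, axes=1)
```

Because Zb never reaches the mask discriminator, non-mask information can survive the round trip to the reconstructor, which is the "bypass" role the rejection ascribes to elements Zm and Zb of Joyce's Figure 2.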
Consider Claim 2. The combination of Joyce and Zhang discloses “The computer-implemented method of claim 1, wherein the training comprises: inputting the unlabeled image” (Joyce, Section 1, Paragraph 4, “We demonstrate the possibility for multi-class cardiac segmentation without labels on the data set of interest through adversarial training”) “into the visual inspection machine learning model to produce a segmentation result;” (Joyce, Section 4.1, Paragraph 2, “given an input image X we are interested in segmenting MYO, LV and RV, represented as a 3-channel mask Zm”); “inputting the unlabeled image into an embedding vector model to produce the vector embedding representation;” (Joyce, Section 4.2, “Thus, our model functions like an auto-encoder, with the segmentor acting as an encoder, encoding an image X”); “inputting the vector embedding representation and the segmentation result into the generator so that the generator produces a reconstructed image;” (Joyce, Figure 2 (see image below); Section 4.2, Paragraph 3, “The reconstructor network h then takes Zm, Zb and X, and following a very simple structure tries to reconstruct X from a weighted sum of the channels of Zm and Zb.” (emphasis added)). Examiner notes that Zb is interpreted to be a vector representation of the residual information of the input. The combination further discloses “inputting an unpaired ground-truth image” (Joyce, Section 1, Paragraph 2, “a method for cardiac segmentation which does not require a training set of paired images and ground-truth segmentation labels. Instead, we make use of example labels coming from any previously labelled cardiac data set, i.e. not necessarily from images of the same modality or the same patients as the images of interest.” (emphasis added)). Examiner notes the ground-truth image is implied to be input into the discriminator.
“and the segmentation result into a discriminator of the generative adversarial network;” (Joyce, Figure 2, “The first three channels, Zm, contain segmentations of LV, RV and MYO, as encouraged by a mask discriminator D.”) “optimizing a first loss for the generative adversarial network, wherein the generative adversarial network comprises the visual inspection machine learning model and the discriminator;” (Joyce, Section 4.1, Paragraph 1, “In this case segmentation can be perceived as a special case of image generation, thus an adversarial loss can be used to train a deep neural network to produce realistic results.”) “and optimizing a second loss for the visual inspection machine learning model and the generator based on a comparison of the reconstructed image and the unlabeled image. (Joyce, Section 4.2, Paragraph 4, “In additional to the LSGAN based adversarial cost defined above, which we still apply to Zm, we also introduce three additional costs. Firstly an autoencoder like reconstruction loss” (emphasis added)).
[Image: media_image2.png, 524 × 1080 pixels, greyscale]
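The two-loss training scheme mapped to claim 2 above (an adversarial loss on the mask channels plus an autoencoder-style reconstruction loss) can be illustrated as follows. This is a hedged NumPy sketch: the least-squares (LSGAN) form matches the general recipe Joyce cites, but the function names are hypothetical and not drawn from either reference:

```python
import numpy as np

def lsgan_discriminator_loss(d_real, d_fake):
    """First loss (adversarial, LSGAN form): the discriminator pushes
    scores for real ground-truth masks toward 1 and for generated
    segmentation results toward 0."""
    return np.mean((d_real - 1.0) ** 2) + np.mean(d_fake ** 2)

def lsgan_generator_loss(d_fake):
    """The segmentor, acting as the generator, tries to make its
    segmentation results score as real."""
    return np.mean((d_fake - 1.0) ** 2)

def reconstruction_loss(x, x_hat):
    """Second loss: mean squared error between the unlabeled input
    image and its reconstruction from Zm and Zb."""
    return np.mean((x - x_hat) ** 2)
```

A perfect discriminator (real scored 1, fake scored 0) and a perfect reconstruction both drive their respective losses to zero, which is what the alternating optimization of the two losses works toward.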
Consider Claim 3. The combination of Joyce and Zhang discloses “The computer-implemented method of claim 2, wherein the unlabeled image and the unpaired ground-truth image contain a common feature.” (Joyce, Section 5.3, Paragraph 1, “When training the MR model we use the segmentation masks from the CT data as 'real' examples for the discriminator, and vice versa for training on CT images.”). Examiner notes that “real examples” is interpreted as ground-truth images. Examiner also notes that the CT and MR images are both cardiac images that aim to segment the same three regions or anatomical features of the heart.
Consider Claim 4. The combination of Joyce and Zhang discloses “The computer-implemented method of claim 2, wherein the discriminator produces predictions regarding origin of input data as the unpaired ground-truth image” (Joyce, Section 5.3, Paragraph 1, “When training the MR model we use the segmentation masks from the CT data as 'real' examples for the discriminator, and vice versa for training on CT images.”). Examiner notes the real-example segmentation mask used for each respective input image type is interpreted as the ground-truth image when training the GAN. Joyce states the real examples are unpaired and obtained “from any previously labelled cardiac data set, i.e. not necessarily from images of the same modality or the same patients as the images of interest” (see Joyce, Section 1, Paragraph 1). In addition, please note that Zhang also discloses this limitation (Zhang, Paragraph [0064], “For example, the albedo generator 605 may generate the normal map 404 and the segmentation map 402 of FIG. 4 so that these individual maps can be compared to the ground truth 414 via the discriminators 607 and 609.”). The combination further discloses “or as the input segmentation result.” (Joyce, Figure 2, “The first three channels, Zm, contain segmentations of LV, RV and MYO, as encouraged by a mask discriminator D.”). The proposed combination, as well as the motivation for combining the Joyce and Zhang references presented in the rejection of claim 1, applies to claim 4 and is incorporated herein by reference. Thus, the method recited in claim 4 is met by Joyce and Zhang.
Consider Claim 5. The combination of Joyce and Zhang discloses “The computer-implemented method of claim 2, wherein the optimizing the first loss comprises performing backpropagation” (Zhang, Fig. 5, Elements 518 and 520) “on a min-max loss in which the visual inspection machine learning model seeks to minimize the min-max loss and the discriminator seeks to maximize the min-max loss.” (Zhang, Paragraph [0073], “However, the albedo generator 606's goal is to use the loss of the discriminators as an objective function to modify parameters or weights of its model in order to maximize the loss of the discriminators…. the loss from the discriminator 607 is passed to the albedo generator 605 so that it can maximize the loss (or get an incorrect prediction) of the discriminators.” (emphasis added)). The proposed combination, as well as the motivation for combining the Joyce and Zhang references presented in the rejection of claim 1, applies to claim 5 and is incorporated herein by reference. Thus, the method recited in claim 5 is met by Joyce and Zhang.
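The min-max loss at issue in claim 5 is the standard GAN value function, which the discriminator ascends and the generator descends. A minimal sketch follows; it is illustrative only (the function name and the epsilon guard are assumptions, not code from Zhang):

```python
import numpy as np

def minmax_value(d_real, d_fake, eps=1e-8):
    """GAN value function V(D, G) = E[log D(real)] + E[log(1 - D(fake))].
    The discriminator maximizes this value; the generator minimizes it,
    i.e., the generator tries to make the discriminator misclassify."""
    return np.mean(np.log(d_real + eps)) + np.mean(np.log(1.0 - d_fake + eps))
```

A discriminator that confidently separates real from fake (scores near 1 and 0 respectively) attains a higher value than one that cannot tell them apart, which is the equilibrium dynamic backpropagation optimizes.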
Consider Claim 6. The combination of Joyce and Zhang discloses “The computer-implemented method of claim 2, wherein the segmentation result comprises an identification of a first feature shown in the segmentation result and in the unlabeled image.” (Joyce, Figure 3 (see image below), input and proposed GAN prediction rows). Examiner notes the proposed GAN prediction row displays a captured first feature, i.e., an anatomical structure from its corresponding input image; said anatomical structure can also be seen in the unlabeled input image.
[Image: media_image3.png, 919 × 1052 pixels, greyscale]
Consider Claim 9. The combination of Joyce and Zhang discloses “The computer-implemented method of claim 2, wherein the vector embedding representation captures a first feature from the unlabeled image” (Joyce, Figure 3 (see image above), Residual row) “and the first feature is not present in the unpaired ground-truth image.” (Joyce, Figure 3 (see image above), Residual row and Ground-truth mask row). Examiner notes the residual mask in the Residual row contains the first feature of the input image, which is not present in the ground-truth mask. The lightened region of the Ground-truth mask row is the identified anatomical structure, and the darkened region of that row represents the first feature absent from the ground-truth image.
Consider Claim 10. The combination of Joyce and Zhang discloses “The computer-implemented method of claim 9, wherein the first feature is selected from a group consisting of a texture,” (Zhang, Paragraph [0025], “the normal map guides the geometric properties of the real world objects (e.g., human), such as grooves and patterns in hair, texture in skin, lines in lips, and the like” (emphasis added)) “a color, and a brightness.” (Zhang, Paragraph [0040], “Normal maps are typically saved in a Red-Blue-Green (RGB) format, and contain its directional information in X, Y, and Z axes”). Examiner notes information stored in RGB format typically includes color and brightness. The proposed combination, as well as the motivation for combining the Joyce and Zhang references presented in the rejection of claim 1, applies to claim 10 and is incorporated herein by reference. Thus, the method recited in claim 10 is met by Joyce and Zhang.
Consider Claim 11. The combination of Joyce and Zhang discloses “The computer-implemented method of claim 9, wherein the first feature is a background feature.” (Joyce, Figure 3 (see image above), Residual row; Section 4.2, Paragraph 2, “also produce a multi-channel residual Zb, which can store the non-mask information”). Examiner notes the non-mask information is interpreted as the background of the image.
Consider Claim 12. The combination of Joyce and Zhang discloses “The computer-implemented method of claim 9, wherein the visual inspection machine learning model attempts to produce the segmentation result to lack the first feature.” (Joyce, Figure 3 (see image above), Proposed GAN prediction row, “The next two rows show segmentation results from our proposed and a simple adversarial method.”). Examiner notes the segmented result shown in the proposed GAN prediction row of Figure 3 consists of a light region and a dark region. It is noted that the dark region is the first feature of the input image that is lacking from the segmentation result.
Consider Claim 14. The combination of Joyce and Zhang discloses “The computer-implemented method of claim 1, further comprising performing image inspection on a new image by inputting the new image to the trained visual inspection machine learning model.” (Zhang, Paragraph [0023], “a model may be trained only on images that contain a finite set of RGB values and reflectance properties. However, a testing or deployment input image may include RGB values or reflectance properties that models have not trained on.” (emphasis added)). The proposed combination, as well as the motivation for combining the Joyce and Zhang references presented in the rejection of claim 1, applies to claim 14 and is incorporated herein by reference. Thus, the method recited in claim 14 is met by Joyce and Zhang.
Consider Claim 15. The combination of Joyce and Zhang discloses “The computer-implemented method of claim 1, further comprising performing supervised training” (Joyce, Section 5, Paragraph 1, “our approach by generating binary masks of the MYO, LV and RV regions of the heart and compare with an upper bound, as obtained by fully supervised segmentation, and the naive unsupervised segmentation approach described” (emphasis added)) “of the visual inspection machine learning model by submitting a labeled image sample to the visual inspection machine learning model.” (Joyce, Table 1; Section 5.1, Paragraph 1 (see image below), “All the data has manual segmentations of the seven whole heart substructures. We removed images that did not contain at least 400 pixels of myocardium, restricting our attention to central slices, as basal and apical slices” (emphasis added)). Examiner notes that the manually segmented images input into the visual inspection machine learning model disclosed in Joyce are interpreted to be labeled because the images were limited to the preselected central, basal, or apical images and were sorted into one of the seven heart substructures.
[Image: media_image4.png, 357 × 939 pixels, greyscale]
Consider Claim 16. The combination of Joyce and Zhang discloses “The computer-implemented method of claim 1, wherein the visual inspection machine learning model performs image segmentation.” (Joyce, Section 1, Paragraph 1, “we focus on the segmentation of the Left Ventricle (LV), Right Ventricle (RV) and Myocardium (MYO) regions of cardiac MR and CT images”).
Claim 17 recites a computer system with elements corresponding to the steps recited in Claim 1. Therefore, the recited elements of this claim are mapped to the proposed combination in the same manner as the corresponding steps in its corresponding method claim. Additionally, the rationale and motivation to combine the Joyce and Zhang references, presented in the rejection of Claim 1, apply to this claim. Finally, the combination of Joyce and Zhang discloses “A computer system comprising: one or more processors, one or more computer-readable memories,” (Zhang, Paragraph [0080], “the computer-implemented method, the system (that includes at least one computing device having at least one processor and at least one computer readable storage medium), and/or the computer readable medium as described herein may perform or be caused to perform the process 800 or any other functionality described herein.”) and “program instructions stored on at least one of the one or more computer-readable memories for execution by at least one of the one or more processors” (Zhang, Paragraph [0102], “various functions may be carried out by a processor executing instructions stored in memory.”).
Claim 18 recites a computer system with elements corresponding to the steps recited in Claim 2. Therefore, the recited elements of this claim are mapped to the proposed combination in the same manner as the corresponding steps in its corresponding method claim. Additionally, the rationale and motivation to combine the Joyce and Zhang references, presented in the rejection of Claim 2, apply to this claim.
Claim 19 recites a “computer program product comprising a computer-readable storage medium” with programming instructions corresponding to the steps recited in Claim 1. Therefore, the recited elements of this claim are mapped to the proposed combination in the same manner as the corresponding steps in its corresponding method claim. Additionally, the rationale and motivation to combine the Joyce and Zhang references, presented in the rejection of Claim 1, apply to this claim. Finally, the combination of Joyce and Zhang discloses “A computer program product comprising a computer-readable storage medium having program instructions embodied therewith, the program instructions executable by a computer” (Zhang, Paragraph [0102], “various functions may be carried out by a processor executing instructions stored in memory.”).
Claim 20 recites a “computer program product comprising a computer-readable storage medium” with programming instructions corresponding to the steps recited in Claim 2. Therefore, the recited elements of this claim are mapped to the proposed combination in the same manner as the corresponding steps in its corresponding method claim. Additionally, the rationale and motivation to combine the Joyce and Zhang references, presented in the rejection of Claim 2, apply to this claim.
Claims 7-8 are rejected under 35 U.S.C. 103 as being unpatentable over Joyce et al. (“Deep Multi-Class Segmentation Without Ground-Truth Labels”) in view of Zhang (US 2025/0078229 with effective filing date Aug 8, 2023), and further in view of Sun (US 12,141,990 with effective filing date Sept 9, 2021).
Consider Claim 7. The combination of Joyce and Zhang does not disclose “The computer-implemented method of claim 2, wherein the second loss is a cycle consistency loss”. However, in an analogous field of endeavor, Sun teaches this limitation (Sun, Col. 10, Lines 12-17, “The cross-domain training of the GAN described above may be accompanied by image reconstruction training within each domain…. training operations utilizing reconstruction losses… (e.g., for cycle consistency)”). Accordingly, before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to combine Joyce and Zhang with the teachings of Sun to utilize a cycle consistency loss during reconstruction training. One of ordinary skill in the art would use a cycle consistency loss to minimize the dissimilarity between the reconstructed image and the input image to ensure an accurate, high-quality reconstruction output. Therefore, the combination of Joyce, Zhang, and Sun discloses the invention of Claim 7.
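For reference, a cycle consistency loss of the kind Sun describes penalizes the round trip through two mappings so that f(g(x)) ≈ x. A minimal sketch follows; the L1 form and the function names are illustrative assumptions, not Sun's actual implementation:

```python
import numpy as np

def cycle_consistency_loss(x, g, f):
    """L1 penalty on the forward cycle x -> g(x) -> f(g(x)); an analogous
    term is typically added for the backward cycle y -> f(y) -> g(f(y))."""
    return np.mean(np.abs(f(g(x)) - x))
```

When the two mappings invert each other the loss is zero, which is what ties the reconstructed image back to the input image during reconstruction training.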
Consider Claim 8. The combination of Joyce, Zhang, and Sun discloses “The computer-implemented method of claim 2, wherein the optimizing of the first loss comprises performing an L2 regularization” (Sun, Col. 8, Lines 5-9, “the neural ODE network may at 414 adjust the neural network parameters with … (e.g., based on any differentiable loss function such as an L2 loss function).”). Accordingly, before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to combine Joyce and Zhang with the teachings of Sun to minimize feature detection error. One of ordinary skill in the art would utilize L2 regularization to validate whether the visual inspection model is accurate in detecting a feature. Therefore, the combination of Joyce, Zhang, and Sun discloses the invention of Claim 8.
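The L2 terms at issue in claim 8 can be sketched as follows. This illustrative NumPy sketch shows both a squared-error L2 loss of the kind the quoted passage of Sun names and an L2 (squared-norm) regularization penalty added to a base loss; the function names and the lambda value are hypothetical:

```python
import numpy as np

def l2_loss(y_true, y_pred):
    """Squared-error (L2) loss between targets and predictions."""
    return np.sum((y_true - y_pred) ** 2)

def l2_regularized_loss(base_loss, weights, lam=1e-3):
    """Adds an L2 regularization penalty lam * sum(||w||^2) to a base
    loss, discouraging large parameter values during optimization."""
    return base_loss + lam * sum(np.sum(w ** 2) for w in weights)
```

Minimizing the regularized objective trades off fit against weight magnitude, which is the mechanism the rejection relies on for reducing feature detection error.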
Allowable Subject Matter
Claim 13 is objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims. The following is a statement of reasons for the indication of allowable subject matter: regarding Claim 13, none of the cited prior art, alone or in combination, teaches or suggests the ordered combination of “The computer-implemented method of claim 1, wherein the vector embedding representation comprises a one-dimensional hidden embedding representing an image secondary feature of the unlabeled image.”
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Annie Pham, whose telephone number is (571) 272-1673. The examiner can normally be reached Mon - Fri: 8:30a - 5:00p.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Amandeep Saini can be reached on (571)272-3382. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/ANNIE H PHAM/ Examiner, Art Unit 2662
/AMANDEEP SAINI/ Supervisory Patent Examiner, Art Unit 2662